Generate a department/employee roll report

Level: Intermediate/Advanced

Puzzle: Generate a department /employee roll report (with a single  SELECT statement) that would look as following:

10         20         30
---------- ---------- -------
CLARK      ADAMS      ALLEN
KING       FORD       BLAKE
MILLER     JONES      JAMES
           SCOTT      MARTIN
           SMITH      TURNER
                      WARD

Assumption: Only departments 10, 20, and 30 are expected in the output.

Note that columns in the report may and will likely contain different number of values. This makes the puzzle somewhat tricky.

Method/Workaround #1: Using FULL join on 3 in-line views

WITH d10 AS (
SELECT ename, ROW_NUMBER()OVER(ORDER BY ename) rn
FROM emp
WHERE deptno=10
),   d20 AS (
SELECT ename, ROW_NUMBER()OVER(ORDER BY ename) rn
FROM emp
WHERE deptno=20
),   d30 AS (
SELECT ename, ROW_NUMBER()OVER(ORDER BY ename) rn
FROM emp
WHERE deptno=30
)
SELECT  d10.ename "10", d20.ename "20", d30.ename "30"
FROM d10 FULL JOIN d20 ON d10.rn=d20.rn
	 FULL JOIN d30 ON d10.rn=d30.rn OR d20.rn=d30.rn
ORDER BY COALESCE(d10.rn, d20.rn, d30.rn)

Note the OR operator in the 2nd FULL JOIN condition. If you omit it, the result will be different:

WITH d10 AS (
SELECT ename, ROW_NUMBER()OVER(ORDER BY ename) rn
FROM emp
WHERE deptno=10
),   d20 AS (
SELECT ename, ROW_NUMBER()OVER(ORDER BY ename) rn
FROM emp
WHERE deptno=20
),   d30 AS (
SELECT ename, ROW_NUMBER()OVER(ORDER BY ename) rn
FROM emp
WHERE deptno=30
)
SELECT  d10.ename "10", d20.ename "20", d30.ename "30"
FROM d10 FULL JOIN d20 ON d10.rn=d20.rn
	 FULL JOIN d30 ON d10.rn=d30.rn --OR d20.rn=d30.rn
ORDER BY COALESCE(d10.rn, d20.rn, d30.rn)
/

10         20         30
---------- ---------- -------
CLARK      ADAMS      ALLEN
KING       FORD       BLAKE
MILLER     JONES      JAMES
                      MARTIN
           SCOTT
                      TURNER
           SMITH
                      WARD

Since we don’t know which department will have more employees, we can’t reliably pick the right order for joining tables, so we have to twist it with an additional OR condition.

Overall, this solution is quite simple and straightforward, but very bulky and not scallable. Imagine having 10 departments to show in the report. Not a very neat SQL.
The following 2 workarounds offer substantially better solution.

Method/Workaround #2: Using PIVOT clause

SELECT "10","20","30"
FROM (
  SELECT ROW_NUMBER()OVER(PARTITION BY deptno ORDER BY ename) rn, deptno, ename
  FROM emp
)
PIVOT
(
  MAX(ename)
  FOR deptno IN (10,20,30)
)
ORDER BY rn

Note, that aggregation is done by the “rn” column which is the only common attribute in all 3 columns. Since rn is unique in each deparment, grouping by it will make MAX(ename) evaluate to ename itself as each group will always have 1 value.

Method/Workaround #3: Traditional simulation of PIVOT clause

WITH x AS (
SELECT CASE WHEN deptno=10 THEN ename END "10",
       CASE WHEN deptno=20 THEN ename END "20",
       CASE WHEN deptno=30 THEN ename END "30",
       ROW_NUMBER()OVER(PARTITION BY deptno ORDER BY ename) rn
FROM emp
)
SELECT MAX("10") AS "10",
       MAX("20") AS "20",
       MAX("30") AS "30"
FROM x
GROUP BY rn
ORDER BY rn

It is a less compact but much more generic approach in a sense that it will work even in those RDBMS that don’t support PIVOT. The idea behind this method is identical to the one used in Method 2.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Advertisements

Find a top record without using a subquery and in-line view/with clause, by Zahar Hilkevich

We will be finding a top paid employee from the scott’s emp table.

With a help of a subquery, the task becomes trivial:

SELECT ename, job, sal
FROM emp
WHERE sal=(SELECT MAX(sal) FROM emp)

Result:

ENAME      JOB              SAL
---------- --------- ----------
KING       PRESIDENT       5000

Alternatively, you can use an in-line view:

SELECT ename, job, sal
FROM (SELECT ename, job, sal, RANK()OVER(ORDER BY sal DESC) rk
      FROM emp)
WHERE rk=1

Result:

ENAME      JOB              SAL
---------- --------- ----------
KING       PRESIDENT       5000

The exercise is to achieve the same without a help of “another” query which is usually provided by a subquery, in-line view, or WITH clause:

SELECT a.ename, a.job, a.sal
FROM emp a, emp b
GROUP BY a.ename, a.job, a.sal
HAVING a.sal=MAX(b.sal)

Result:

ENAME      JOB              SAL
---------- --------- ----------
KING       PRESIDENT       5000

We have just used a Cartesian Product of two instances of the emp table!

You can read about 15 Workarounds for Getting Top Records in my new post.

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

A Practical Use of Nested Group Functions in Oracle SQL, by Zahar Hilkevich

Nested group functions is quite a unique feature of Oracle SQL since (at least) version 7. However, it is not quite obvious where developers can benefit from using it. Below is one of the most frequent scenarios where the feature comes handy.

Suppose, you need to find a number of unique combinations of two or more columns in a table. This is a very frequent situation when you plan to create a new index on the selected columns.

Let say, we want to know how many different pairs of (deptno, job) fields we have in the emp table (scott schema). The most straight forward solution is:

SELECT COUNT(*)
FROM (SELECT DISTINCT deptno, job
      FROM emp)

Result:

COUNT(*)
--------
       9

The challenge is to do the same without the use of a sub-query (or in-line view/WITH clause):

SELECT COUNT(COUNT(*))
FROM emp
GROUP BY deptno, job

Result:

COUNT(COUNT(*))
---------------
              9

You can use any “inner” group function with the same result:

SELECT COUNT(SUM(sal)) cnt1, COUNT(AVG(sal)) cnt2, COUNT(STDDEV(sal)) cnt3
FROM emp
GROUP BY deptno, job

Result:

 CNT1       CNT2       CNT3
----- ---------- ----------
    9          9          9

P.S. If you like this trick, you can find many more in my book “Oracle SQL Tricks and Workarounds”

Synchronize 2 tables with a SINGLE SQL statement in Oracle, by Zahar Hilkevich

To illustrate the challenge we will use dept table from Oracle’s default scott schema and its copy, dept2 table. Here is the content:

SELECT *
FROM dept;

DEPTNO DNAME          LOC
------ -------------- --------
    10 ACCOUNTING     NEW YORK
    20 RESEARCH       DALLAS
    30 SALES          CHICAGO
    40 OPERATIONS     BOSTON

SELECT *
FROM dept2;

DEPTNO DNAME          LOC
------ -------------- ---------
    10 ACCOUNTING     NEW YORK
    20 RESEARCH       DALLAS
    30 SALES          LAS VEGAS
    50 HR             MIAMI

As you can see, in the dept2 table, 1 original row was deleted (deptno=40), 1 – updated (location changed for deptno=30) and 1 – inserted (deptno=50).

The challenge is to present a single SQL DML statement against dept2 table so its content would become identical with the dept table.

As you could have guessed already, it must be a MERGE statement:

MERGE INTO dept2 d2
USING (SELECT deptno, dname, loc
       FROM dept
       UNION ALL
       SELECT a.deptno, a.dname, 'DELETE ME' AS loc
       FROM dept2 a LEFT JOIN dept b ON a.deptno=b.deptno
       WHERE b.deptno IS NULL) d
ON (d2.deptno=d.deptno)
WHEN NOT MATCHED THEN
   INSERT(deptno, dname, loc)
   VALUES(d.deptno, d.dname, d.loc)
WHEN MATCHED THEN
   UPDATE SET d2.dname=d.dname,
              d2.loc=d.loc
   DELETE WHERE d2.loc='DELETE ME'

5 rows merged.

The main trick here is to SELECT all rows (using both copies of dept table) that will include rows to be updated, inserted, and deleted:

SELECT deptno, dname, loc
FROM dept
UNION ALL
SELECT a.deptno, a.dname, 'DELETE ME' AS loc
FROM dept2 a LEFT JOIN dept b ON a.deptno=b.deptno
WHERE b.deptno IS NULL

Result:

DEPTNO DNAME          LOC
------ -------------- ---------
    10 ACCOUNTING     NEW YORK
    20 RESEARCH       DALLAS
    30 SALES          CHICAGO
    40 OPERATIONS     BOSTON
    50 HR             DELETE ME

Once we have this, the rest is relatively simple: all these records will be either inserted (deptno=40) or updated (the rest) and only one of them (deptno=50) will be deleted using the indicator we created (loc=’DELETE ME’).

DELETE option in the MERGE command will only affect records updated by the UPDATE clause. That is why we need to include it in the list and make sure we can identify such record using a reliably criteria (loc=’DELETE ME’).

For more details on how MERGE works and for a variety of other tricks and workarounds, check my book “Oracle SQL Tricks and Workarounds”

Get the root record when its parent id matches the root’s id, by Zahar Hilkevich

Very often developers decide to set parent id of a hierarchy root record to the record’s id value, meaning that the parent of that record is the record itself. This works fine as long as you don’t need to show a branch (or the entire) hierarchy from a record up to its very top root.

Let’s look at a query against the famous scott schema that comes with default Oracle’s database:

SELECT empno AS ID, ename AS name, mgr AS parent_id, LEVEL
 FROM emp
 CONNECT BY empno=PRIOR mgr
 START WITH empno=7788

Result:

   ID NAME        PARENT_ID      LEVEL
----- ---------- ---------- ----------
 7788 SCOTT            7566          1
 7566 JONES            7839          2
 7839 KING                           3

so far so good – note that King’s manager employee number (parent_id) is NULL.

Let’s update it to 7839 (King’s empno) and re-run the above SQL:

UPDATE emp
SET mgr=empno
WHERE empno=7839

Now, the result is quite different:

SQL> SELECT empno AS ID, ename AS name, mgr AS parent_id, LEVEL
 2  FROM emp
 3  CONNECT BY empno=PRIOR mgr
 4  START WITH empno=7788
 5  /
 ERROR:
 ORA-01436: CONNECT BY loop in user data

Of course, you can use NOCYCLE clause:

SELECT empno AS ID, ename AS name, mgr AS parent_id, LEVEL
FROM emp
CONNECT BY NOCYCLE empno=PRIOR mgr
START WITH empno=7788

Result:

  ID NAME        PARENT_ID      LEVEL
---- ---------- ---------- ----------
7788 SCOTT            7566          1
7566 JONES            7839          2

The KING has gone! Now the challenge is quite apparent – we need to see the entire branch that includes the KING!

The trick is to mimic the original design when the King’s parent id was NULL!

A very first idea is to use WITH clause:

WITH x AS (
 SELECT empno AS ID, ename AS name, DECODE(mgr,empno,NULL,mgr) AS parent_id
 FROM emp
 )
 SELECT ID, name, parent_id, LEVEL
 FROM x
 CONNECT BY id=PRIOR parent_id
 START WITH id=7788

Result:

  ID NAME       PARENT_ID       LEVEL
---- ---------- --------------- ----------
7788 SCOTT      7566            1
7566 JONES      7839            2
7839 KING                       3

Note, that parent_id for KING is shown as NULL even though we updated it to 7839 – this can be fixed by applying NVL on the parent_id if needed.

A closer look allowed us to find a nice workaround that does not rely on the WITH clause:

SELECT empno AS ID, ename AS name, mgr AS parent_id, LEVEL
FROM emp
CONNECT BY empno=PRIOR DECODE(mgr,empno,NULL,mgr)
START WITH empno=7788

Result:

  ID NAME        PARENT_ID      LEVEL
---- ---------- ---------- ----------
7788 SCOTT            7566          1
7566 JONES            7839          2
7839 KING             7839          3

Here we go!

Don’t forget to rollback the changes to set mgr=NULL for the KING’s record

P.S. If you like this trick, you can find many more in my book “Oracle SQL Tricks and Workarounds”