12 Solutions to 2018 Oracle SQL Puzzle of the Week #10

Top Salary Puzzle

Find highest salary in each department without using MAX function

  • Use a single SELECT statement only.
  • For an added complexity (optional): try not using ANY functions at all (neither group, nor analytic, not even scalar)

Expected Result:

DEPTNO MAX_SAL
10 5000
20 3000
30 2850

Solutions:

We will begin with a simpler problem that does allow us using functions.

Solution #1. Using MIN function

Credit to: Boobal Ganesan

MIN function can be seen as an opposite to the MAX, so it is trivial to employ it here:

SELECT deptno, -MIN(-sal) max_sal
FROM scott.emp
GROUP BY deptno
ORDER BY 1;

Solution #2. Using LISTAGG and REGEXP_SUBSTR functions

This is an “order” based approach that sorts the values within a concatenated string and then uses regular expression to cut the first token.

SELECT deptno,
       REGEXP_SUBSTR(LISTAGG(sal,',') 
                     WITHIN GROUP(ORDER BY sal DESC),'[^,]+',1,1) max_sal
FROM scott.emp
GROUP BY deptno
ORDER BY 1;

Solution #3. Using AVG(…) KEEP() group function

This is another “order” based strategy whete AVG function can be replaced with MIN or any other aggregate function that returns a single value out of a set of identical ones.

SELECT deptno, AVG(sal) KEEP(DENSE_RANK FIRST ORDER BY sal DESC) max_sal
FROM scott.emp
GROUP BY deptno
ORDER BY 1;

Solution #4. Using Analytic function and CTE

ROW_NUMBER is chosen in this approach, though other analytic functions, such as RANK, DENSE_RANK, LEAD, LAG, FIRST_VALUE, etc can be used here (with some changes) as well. ROW_NUMBER is convenient to use as it allows to avoid DISTINCT option.

WITH x AS (
SELECT deptno, sal, 
       ROW_NUMBER()OVER(PARTITION BY deptno ORDER BY sal DESC) rn
FROM scott.emp
)
SELECT deptno, sal max_sal
FROM x
WHERE rn=1
ORDER BY 1;

Solution #5. Using MATCH_RECOGNIZE clause

Credit to: KATAYAMA NAOTO

This approach is similar to the previous one if we used LAG analytic function: which would return NULL for the top record.

SELECT deptno, sal max_sal 
FROM scott.emp
MATCH_RECOGNIZE (
PARTITION BY deptno
ORDER BY sal DESC
ALL ROWS PER MATCH
PATTERN (ISNULL)
DEFINE ISNULL AS PREV(ISNULL.sal) IS NULL
);

Solution #6. CONNECT BY and CONNECT_BY_ISLEAF while avoiding Analytic functions

This approach is a bit artificial. We could have used DISTINCT and avoid START WITH clause completely.  CTEs x and y are used to simulate ROW_NUMBER analytic function.

WITH x AS (
SELECT deptno, sal
FROM scott.emp
ORDER BY 1,2
), y AS (
SELECT x.*, ROWNUM rn
FROM x
)
SELECT deptno, sal
FROM y
WHERE CONNECT_BY_ISLEAF=1
CONNECT BY deptno=PRIOR deptno
       AND rn=PRIOR rn+1
START WITH (deptno, rn) IN (SELECT deptno, MIN(rn)
                            FROM y
                            GROUP BY deptno);

Solution #7. Using MODEL clause with ROW_NUMBER function

This method is pretty much the same as in the Solution #4 above. The RETURN UPDATED ROWS and dummy measures are used to only return rows with rn=1.

SELECT deptno, max_sal
FROM scott.emp
MODEL
RETURN UPDATED ROWS
PARTITION BY (deptno)
DIMENSION BY (ROW_NUMBER() OVER(PARTITION BY deptno ORDER BY sal DESC) rn)
MEASURES(sal max_sal, 0 dummy)
RULES(
 dummy[1]=1
)
ORDER BY 1;

The following 5 solutions (##8-12) satisfy the “added complexity” term and do NOT use any functions at all.

Solution #8. Using ALL predicate

Generally speaking, >=ALL filter is identical to =(SELECT MAX() …). See my book for more detailed explanations.

SELECT deptno, sal max_sal
FROM scott.emp a
WHERE sal>=ALL(SELECT sal
               FROM scott.emp
               WHERE deptno=a.deptno)
GROUP BY deptno, sal
ORDER BY 1;

Solution #9. Using NOT EXISTS predicate

See Chapter 10 of my book for details.

SELECT deptno, sal max_sal
FROM scott.emp a
WHERE NOT EXISTS(SELECT 1
                 FROM scott.emp
                 WHERE deptno=a.deptno
                   AND sal>a.sal)
GROUP BY deptno, sal
ORDER BY 1;

Solution #10. Using Outer-Join with IS NULL filter

This approach is also covered very deeply in my book, Chapter 10.

SELECT a.deptno, a.sal max_sal
FROM scott.emp a LEFT JOIN scott.emp b ON a.deptno=b.deptno
                                      AND b.sal>a.sal 
WHERE b.empno IS NULL
GROUP BY a.deptno, a.sal
ORDER BY 1;

Solution #11. Using MINUS and ANY predicate

MINUS serves 2 purposes: it removes non-top rows and eliminates duplicates, so no DISTINCT option (or GROUP BY) is required.

SELECT deptno, sal max_sal 
FROM scott.emp
MINUS
SELECT deptno, sal
FROM scott.emp a
WHERE sal<ANY(SELECT sal 
              FROM scott.emp
              WHERE deptno=a.deptno);

Solution #12. Using MINUS and EXISTS predicate

Last two approaches covered in the drill from the Chapter 10 of my book.

SELECT deptno, sal max_sal 
FROM scott.emp
MINUS
SELECT deptno, sal
FROM scott.emp a
WHERE EXISTS(SELECT 1 
             FROM scott.emp
             WHERE deptno=a.deptno
               AND sal>a.sal);

You can execute the above SQL statements in Oracle Live SQL environment.
My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Advertisements

Interview Question: Get top and bottom paid employees in each department

This is a typical interview problem: list all bottom and top paid employees in each department. A preferred solution should not be using UNION or UNION ALL operators.

Please watch this short video to learn a couple of non-obvious techniques and to impress your potential employers on your next job interview.

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

15 Workarounds for Getting Top Records

To illustrate the concept we will be solving the following problem defined for scott schema:

Find all top paid employees in each department. Display employee names, salaries, jobs, and department.

To qualify for a workaround, a solution’s execution plan should have a distinct hash value (More on that can be found in my book “Oracle SQL Tricks and Workarounds”).

Workaround #1: Correlated subquery

SELECT ename, job, sal, deptno
FROM emp a
WHERE sal=(SELECT MAX(sal)
           FROM emp b
           WHERE b.deptno=a.deptno);

Result:

ENAME      JOB              SAL     DEPTNO
---------- --------- ---------- ----------
BLAKE      MANAGER         2850         30
SCOTT      ANALYST         3000         20
KING       PRESIDENT       5000         10
FORD       ANALYST         3000         20

Execution Plan:

Plan hash value: 1245077725

--------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |     1 |    47 |     8  (25)| 00:00:01 |
|*  1 |  HASH JOIN           |         |     1 |    47 |     8  (25)| 00:00:01 |
|   2 |   VIEW               | VW_SQ_1 |     3 |    78 |     4  (25)| 00:00:01 |
|   3 |    HASH GROUP BY     |         |     3 |    21 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMP     |    14 |    98 |     3   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | EMP     |    14 |   294 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Workaround #2: Correlated subquery with arithmetic transformation

SELECT ename, job, sal, deptno
FROM emp a
WHERE 0=(SELECT MAX(b.sal)-a.sal
         FROM emp b
         WHERE b.deptno=a.deptno)

Execution Plan:

Plan hash value: 2649664444

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    21 |    24   (0)| 00:00:01 |
|*  1 |  FILTER             |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL | EMP  |    14 |   294 |     3   (0)| 00:00:01 |
|   3 |   SORT AGGREGATE    |      |     1 |     7 |            |          |
|*  4 |    TABLE ACCESS FULL| EMP  |     5 |    35 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Workaround #3: Non-Correlated subquery

SELECT ename, job, sal, deptno
FROM emp a
WHERE (deptno, sal) IN (SELECT deptno, MAX(sal)
                        FROM emp
                        GROUP BY deptno)

Execution Plan:

Plan hash value: 2491199121

---------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |     1 |    47 |     8  (25)| 00:00:01 |
|*  1 |  HASH JOIN SEMI      |          |     1 |    47 |     8  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | EMP      |    14 |   294 |     3   (0)| 00:00:01 |
|   3 |   VIEW               | VW_NSO_1 |     3 |    78 |     4  (25)| 00:00:01 |
|   4 |    HASH GROUP BY     |          |     3 |    21 |     4  (25)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| EMP      |    14 |    98 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Workaround #4: Aggregating over Cartesian Product

SELECT a.ename, a.job, a.sal, a.deptno
FROM emp a, emp b
WHERE a.deptno=b.deptno
GROUP BY a.ename, a.job, a.sal, a.deptno
HAVING a.sal=MAX(b.sal)

Execution Plan:

Plan hash value: 2435006919

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     6 |   168 |     8  (25)| 00:00:01 |
|*  1 |  FILTER              |      |       |       |            |          |
|   2 |   HASH GROUP BY      |      |     6 |   168 |     8  (25)| 00:00:01 |
|*  3 |    HASH JOIN         |      |    65 |  1820 |     7  (15)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMP  |    14 |   294 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| EMP  |    14 |    98 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Workaround #5: Outer Join with IS NULL check

SELECT a.ename, a.job, a.sal, a.deptno
FROM emp a LEFT JOIN emp b ON a.deptno=b.deptno
                          AND a.sal<b.sal
WHERE b.empno IS NULL

Execution Plan:

Plan hash value: 1201587841

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    32 |     7  (15)| 00:00:01 |
|*  1 |  FILTER             |      |       |       |            |          |
|*  2 |   HASH JOIN OUTER   |      |     1 |    32 |     7  (15)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| EMP  |    14 |   294 |     3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| EMP  |    14 |   154 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Workaround #6: Using NOT EXISTS

SELECT ename, job, sal, deptno
FROM emp a
WHERE NOT EXISTS (SELECT 1
                  FROM emp b
                  WHERE b.deptno=a.deptno
                    AND b.sal>a.sal)

Execution Plan:

Plan hash value: 3353202012

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     5 |   140 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN ANTI    |      |     5 |   140 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| EMP  |    14 |   294 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| EMP  |    14 |    98 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Synonymous Workaround for #6 (execution plan has the same hash value): Using COUNT(*)=0 Equivalent

SELECT ename, job, sal, deptno
FROM emp a
WHERE 0=(SELECT COUNT(*)
         FROM emp b
         WHERE b.deptno=a.deptno
           AND b.sal>a.sal)

Execution Plan:

Plan hash value: 3353202012

Workaround #7: Using ALL Predicate

SELECT ename, job, sal, deptno
FROM emp a
WHERE a.sal>=ALL(SELECT b.sal
                 FROM emp b
                 WHERE b.deptno=a.deptno)

Execution Plan:

Plan hash value: 2561671593

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    14 |   294 |    24   (0)| 00:00:01 |
|*  1 |  FILTER            |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| EMP  |    14 |   294 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| EMP  |     2 |    14 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Workaround #8: Using In-Line View

SELECT a.ename, a.sal, a.deptno
FROM emp a, (SELECT deptno, MAX(sal) max_sal
             FROM emp
             GROUP BY deptno) b
WHERE a.deptno=b.deptno
  AND a.sal=b.max_sal

Execution Plan:

Plan hash value: 269884559

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    39 |     8  (25)| 00:00:01 |
|*  1 |  HASH JOIN           |      |     1 |    39 |     8  (25)| 00:00:01 |
|   2 |   VIEW               |      |     3 |    78 |     4  (25)| 00:00:01 |
|   3 |    HASH GROUP BY     |      |     3 |    21 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMP  |    14 |    98 |     3   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | EMP  |    14 |   182 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Workaround #9: Using EXISTS Predicate

SELECT ename, job, sal, deptno
FROM emp a
WHERE EXISTS (SELECT 1
              FROM emp b
              WHERE b.deptno=a.deptno
              HAVING a.sal=MAX(b.sal))

Execution Plan:

Plan hash value: 3057787348

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    21 |    24   (0)| 00:00:01 |
|*  1 |  FILTER              |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL  | EMP  |    14 |   294 |     3   (0)| 00:00:01 |
|*  3 |   FILTER             |      |       |       |            |          |
|   4 |    SORT AGGREGATE    |      |     1 |     7 |            |          |
|*  5 |     TABLE ACCESS FULL| EMP  |     5 |    35 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Synonymous Workaround for #9 (execution plan has the same hash value): Using COUNT(*)>0 Equivalent

SELECT ename, job, sal, deptno
FROM emp a
WHERE 0< (SELECT COUNT(1)
          FROM emp b
          WHERE b.deptno=a.deptno
          HAVING a.sal=MAX(b.sal))

Execution Plan:

Plan hash value: 3057787348

Here is a practical example which happens to qualify as another Synonymous Workaround for #9:

SELECT ename, job, sal, deptno
FROM emp a
WHERE NOT EXISTS (SELECT 1
                  FROM emp b
                  WHERE b.deptno=a.deptno
                  HAVING a.sal<MAX(b.sal))

Execution Plan:

Plan hash value: 3057787348

Workaround #10: Using Analytical Function RANK()

WITH x AS (
SELECT ename, job, sal, deptno,
       RANK()OVER(PARTITION BY deptno ORDER BY sal DESC) rk
FROM emp a
)
SELECT ename, job, sal, deptno
FROM x
WHERE rk=1

Execution Plan:

Plan hash value: 3291446077

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |    14 |   728 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |      |    14 |   728 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |    14 |   294 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | EMP  |    14 |   294 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Workaround #11: Using Analytical Function MAX

WITH x AS (
SELECT ename, job, sal, deptno,
       MAX(sal)OVER(PARTITION BY deptno) max_sal
FROM emp a
)
SELECT ename, job, sal, deptno
FROM x
WHERE sal=max_sal

Execution Plan:

Plan hash value: 4130734685

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    14 |   728 |     4  (25)| 00:00:01 |
|*  1 |  VIEW               |      |    14 |   728 |     4  (25)| 00:00:01 |
|   2 |   WINDOW SORT       |      |    14 |   294 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| EMP  |    14 |   294 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Workaround #12: Using Analytical Function COUNT with CONNECT BY

WITH x AS (
SELECT ename, job, sal, deptno, COUNT(*)OVER(PARTITION BY empno) cnt
FROM emp a
CONNECT BY deptno=PRIOR deptno
       AND sal<PRIOR sal
)
SELECT ename, job, sal, deptno
FROM x
WHERE cnt=1

Execution Plan:

Plan hash value: 704858046

-----------------------------------------------------------------------------------------
| Id  | Operation                      | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |    14 |   728 |     2   (0)| 00:00:01 |
|*  1 |  VIEW                          |        |    14 |   728 |     2   (0)| 00:00:01 |
|   2 |   WINDOW SORT                  |        |    14 |   350 |     2   (0)| 00:00:01 |
|*  3 |    CONNECT BY WITHOUT FILTERING|        |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID| EMP    |    14 |   350 |     2   (0)| 00:00:01 |
|   5 |      INDEX FULL SCAN           | PK_EMP |    14 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Workaround #13: Using Analytical Function COUNT with CONNECT BY filtered by LEVEL

WITH x AS (
SELECT ename, job, sal, deptno, COUNT(*)OVER(PARTITION BY empno) cnt
FROM emp a
WHERE level<=2
CONNECT BY deptno=PRIOR deptno
       AND sal<PRIOR sal
)
SELECT ename, job, sal, deptno
FROM x
WHERE cnt=1

Execution Plan:

Plan hash value: 2668428643

------------------------------------------------------------------------------------------
| Id  | Operation                       | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |        |    14 |   728 |     2   (0)| 00:00:01 |
|*  1 |  VIEW                           |        |    14 |   728 |     2   (0)| 00:00:01 |
|   2 |   WINDOW SORT                   |        |    14 |   350 |     2   (0)| 00:00:01 |
|*  3 |    FILTER                       |        |       |       |            |          |
|*  4 |     CONNECT BY WITHOUT FILTERING|        |       |       |            |          |
|   5 |      TABLE ACCESS BY INDEX ROWID| EMP    |    14 |   350 |     2   (0)| 00:00:01 |
|   6 |       INDEX FULL SCAN           | PK_EMP |    14 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Workaround #14: CONNECT BY with GROUP BY and HAVING

SELECT ename, job, sal, deptno
FROM emp a
CONNECT BY deptno=PRIOR deptno
       AND sal<PRIOR sal
GROUP BY ename, job, sal, deptno
HAVING COUNT(*)=1

Execution Plan:

Plan hash value: 2144516570

---------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |     1 |    21 |     4  (25)| 00:00:01 |
|*  1 |  FILTER                        |      |       |       |            |          |
|   2 |   HASH GROUP BY                |      |     1 |    21 |     4  (25)| 00:00:01 |
|*  3 |    CONNECT BY WITHOUT FILTERING|      |       |       |            |          |
|   4 |     TABLE ACCESS FULL          | EMP  |    14 |   294 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Workaround #15: GROUP BY and HAVING over CONNECT BY filtered by LEVEL

SELECT ename, job, sal, deptno
FROM emp a
WHERE level<=2
CONNECT BY deptno=PRIOR deptno
       AND sal<PRIOR sal
GROUP BY ename, job, sal, deptno
HAVING COUNT(*)=1

Execution Plan:

Plan hash value: 1946770371

----------------------------------------------------------------------------------------
| Id  | Operation                       | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |      |     1 |    21 |     4  (25)| 00:00:01 |
|*  1 |  FILTER                         |      |       |       |            |          |
|   2 |   HASH GROUP BY                 |      |     1 |    21 |     4  (25)| 00:00:01 |
|*  3 |    FILTER                       |      |       |       |            |          |
|*  4 |     CONNECT BY WITHOUT FILTERING|      |       |       |            |          |
|   5 |      TABLE ACCESS FULL          | EMP  |    14 |   294 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

If you want to learn how to come up with numerous workarounds on your own, check my book “Oracle SQL Tricks and Workarounds” for instructions.