Explore the power of analytic functions – Part 2

May 4, 2020 | Zahar Hilkevich

In the previous blog post on this subject, we reviewed a couple of notable applications of Oracle analytic functions. Today, I came across another interesting illustration of the same concept.

Problem: List all employees from the same department and holding the same job title as ADAMS.

Expected Result:

ENAME	JOB	DEPTNO
ADAMS	CLERK	20
SMITH	CLERK	20

Like before, we start with traditional approaches that every experienced developer would easily demonstrate.

Strategy #1: Using multi-column subquery

SELECT ename, job, deptno
FROM scott.emp
WHERE (deptno, job) IN (SELECT deptno, job
                        FROM scott.emp
                        WHERE ename = 'ADAMS')
ORDER BY 2, 3, 1

Strategy #2: Using self-join

SELECT a.ename, job, deptno
FROM scott.emp a JOIN scott.emp b USING(deptno, job)
WHERE b.ename = 'ADAMS'
ORDER BY job, deptno, a.ename

Strategy #3: Using EXISTS predicate

SELECT ename, job, deptno
FROM scott.emp a
WHERE EXISTS (SELECT 1
              FROM scott.emp
              WHERE ename  = 'ADAMS'
                AND deptno = a.deptno
                AND job    = a.job)
ORDER BY 2, 3, 1

A common feature of all the strategies above is having two copies of the emp table with two joining conditions (deptno, job) and one filter (ename = 'ADAMS').

As we have seen before, with analytic functions we can get away with a single copy of the emp table.

Strategy #4: Using COUNT Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       COUNT(DECODE(ename, 'ADAMS', 1)) 
             OVER(PARTITION BY deptno, job) cnt
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE cnt > 0
ORDER BY 2, 3, 1

Of course, you can use different analytic functions here:

Strategy #5: Using MAX Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       MAX(DECODE(ename, 'ADAMS', ename)) 
           OVER(PARTITION BY deptno, job) adams
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE adams = 'ADAMS'
ORDER BY 2, 3, 1

Strategy #6: Using LISTAGG Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       LISTAGG(DECODE(ename, 'ADAMS', 'Y'), '|') WITHIN GROUP (ORDER BY 1) 
               OVER(PARTITION BY deptno, job) flag
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE flag LIKE 'Y%'
ORDER BY 2, 3, 1

We need the LIKE operator in case more than one ADAMS works in the same department and holds the same job title: the flag for that group would then be 'Y|Y' (or longer) rather than a plain 'Y'.
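
To make the reason for LIKE concrete, here is a minimal sketch against a made-up four-row data set. The inline emp CTE and its second ADAMS row are assumptions for illustration only and are not part of scott.emp. Since LISTAGG skips NULL values, both ADAMS rows and SMITH (all clerks in department 20) should get the flag 'Y|Y', which an equality test against 'Y' would miss, while JONES gets a NULL flag.

```sql
-- Hypothetical data: two employees named ADAMS in the same department and job
WITH emp AS (
  SELECT 'ADAMS' ename, 'CLERK' job, 20 deptno FROM dual UNION ALL
  SELECT 'ADAMS', 'CLERK',   20 FROM dual UNION ALL
  SELECT 'SMITH', 'CLERK',   20 FROM dual UNION ALL
  SELECT 'JONES', 'MANAGER', 20 FROM dual
)
SELECT ename, job, deptno,
       -- One 'Y' per ADAMS in the (deptno, job) group, joined with '|'
       LISTAGG(DECODE(ename, 'ADAMS', 'Y'), '|') WITHIN GROUP (ORDER BY 1)
               OVER(PARTITION BY deptno, job) flag
FROM emp
ORDER BY job, ename;
```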

***

If you find this post useful, please press the LIKE button and subscribe.

My Oracle Group on Facebook:

Also, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more clever tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

A trick that helps avoiding multiple table scans.

November 29, 2018 | Zahar Hilkevich

Let’s look at a fairly simple SQL problem:

In a traditional scott.emp table, find all employees who work in the same department as the president.

Make your query work even if more than one president record exists in the emp table.
Make Oracle scan the emp table just ONCE.

A “traditional” solution to this problem may look like this:

Strategy #1: Using a subquery

SELECT *
FROM scott.emp
WHERE deptno IN (SELECT deptno 
                 FROM scott.emp 
                 WHERE job='PRESIDENT')

or this:

Strategy #2: Using a self-join

SELECT DISTINCT a.*
FROM scott.emp a JOIN scott.emp b ON a.deptno=b.deptno
WHERE b.job='PRESIDENT'

Note that the DISTINCT option in the above query is needed to prevent duplicates if there are multiple presidents in a specific department.

Both solutions above use 2 copies of the emp table, which makes Oracle scan the same scott.emp table twice.

A trick presented below allows you to use only a single copy of the emp table to solve the problem. The trick involves different conceptual and technical approaches compared to the solutions we have seen so far.

Conceptually, we should rephrase the problem in a way that keeps it equivalent and at the same time allows us to use a different technical arsenal. This approach is explained in detail in my book “Oracle SQL Tricks and Workarounds”. We can rephrase the puzzle and say that we are looking for employees from departments with a positive number of presidents working there. Technically speaking, we need to use the analytic function COUNT and check whether it is greater than 0:

Strategy #3: Using analytic function COUNT

WITH x AS (
SELECT e.*, 
       COUNT(DECODE(job,'PRESIDENT',1))OVER(PARTITION BY deptno) cnt
FROM scott.emp e
)
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM x
WHERE cnt>0
ORDER BY empno

We do need a common table expression (or an inline view) because analytic functions are evaluated after the WHERE clause, so we cannot filter by an analytic function in the same query block where it is computed. Nevertheless, we scan the emp table just once, and during this scan the Oracle engine counts the number of presidents in each department.
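
As a sketch of the same idea without the WITH clause, the analytic COUNT can be computed in an inline view and filtered one level up. This is just an equivalent rewrite of Strategy #3 for illustration; trying to put the COUNT(...) OVER(...) expression directly into the WHERE clause of a single query block is rejected by Oracle.

```sql
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM (SELECT e.*,
             -- Number of presidents in the employee's department
             COUNT(DECODE(job,'PRESIDENT',1)) OVER(PARTITION BY deptno) cnt
      FROM scott.emp e)
WHERE cnt>0   -- the filter is applied outside the block that computes cnt
ORDER BY empno;
```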

COUNT is not the only analytic function that can be employed to solve the problem.

Strategy #4: Using analytic function LISTAGG

WITH x AS (
SELECT e.*, 
       LISTAGG(job,'|') 
         WITHIN GROUP (ORDER BY job) OVER(PARTITION BY deptno) jobs
FROM scott.emp e 
)
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM x
WHERE '|' || jobs || '|' LIKE '%|PRESIDENT|%'
ORDER BY empno

Instead of counting the presidents by department, we simply concatenate all the job titles and check if the resulting string includes a president.
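
To see what the LIKE pattern is matched against, the concatenated job lists can be inspected per department. This is only an illustrative sketch that uses the aggregate form of LISTAGG; with the standard scott.emp data, department 10 should come back as CLERK|MANAGER|PRESIDENT.

```sql
-- One row per department with its sorted, '|'-separated list of job titles
SELECT deptno,
       LISTAGG(job,'|') WITHIN GROUP (ORDER BY job) jobs
FROM scott.emp
GROUP BY deptno
ORDER BY deptno;
```

Wrapping the list in extra '|' delimiters, as Strategy #4 does, makes the pattern match whole job titles only, so a hypothetical title such as VICE PRESIDENT would not be mistaken for PRESIDENT.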

Finally, if you don’t like using sub-queries in general, we can leverage the power of the MODEL clause:

Strategy #5: Using MODEL clause to avoid sub-queries

SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM scott.emp
MODEL RETURN UPDATED ROWS 
DIMENSION BY (
  empno, 
  SIGN(COUNT(DECODE(job,'PRESIDENT',1))OVER(PARTITION BY deptno)) cnt
) 
MEASURES(ename, job, mgr, hiredate, sal, comm, deptno, 0 dummy) 
RULES(dummy[ANY, 1]=1)
ORDER BY empno

The tricky part here is using a composition of the SIGN, COUNT, and DECODE functions (i.e. SIGN on top of what we used in Strategy #3) as a secondary dimension, with empno as the primary one. The employee number is unique by itself, so adding another dimension still maintains the uniqueness required by the MODEL clause. The only MODEL rule changes the dummy measure, which “triggers” the “RETURN UPDATED ROWS” instruction and returns only those rows where the dummy measure was set to 1 – notice that its default value is 0.

You can check the execution plan for all of the above strategies to see how many times Oracle scans the emp table.
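
One way to do that, sketched here for Strategy #3, is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY. Both are standard Oracle features; picking Strategy #3 is just for illustration, and any of the other statements can be substituted. In the output, count the operations that reference EMP.

```sql
EXPLAIN PLAN FOR
WITH x AS (
SELECT e.*,
       COUNT(DECODE(job,'PRESIDENT',1)) OVER(PARTITION BY deptno) cnt
FROM scott.emp e
)
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM x
WHERE cnt>0;

-- Display the plan that was just stored in the plan table
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```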


7 Solutions to 2018 Oracle SQL Puzzle of the Week #13

April 17, 2018 | Zahar Hilkevich

Second Top Employee as of the Start of Employment

List all employees who were the 2nd top paid in the entire company at the time their employment started.

Use a single SELECT statement only.
At the time of employment start the rank of the employee by salary should be 2.
Show the top salary at the time when the employee started with the company.
We assume that no employees have ever been terminated since day 1.

Expected Result:

ENAME	JOB	SAL	HIREDATE	MAX_SAL
WARD	SALESMAN	1250	22-FEB-81	1600
BLAKE	MANAGER	2850	01-MAY-81	2975
FORD	ANALYST	3000	03-DEC-81	5000
SCOTT	ANALYST	3000	19-APR-87	5000

Solutions:

Solution #1. Using LATERAL view, RANK and cumulative MAX analytic functions (Oracle 12c+):

SELECT e.ename, e.job, e.hiredate, e.sal, r.max_sal 
FROM scott.emp e, LATERAL(SELECT a.empno,  
                                 RANK() OVER(ORDER BY a.sal DESC) rk, 
                                 MAX(a.sal) OVER() max_sal 
                          FROM scott.emp a 
                          WHERE a.hiredate<=e.hiredate) r 
WHERE e.empno=r.empno  
  AND rk=2 
ORDER BY e.hiredate

Solution #2. Using CTE, cumulative MAX analytic function and a correlated subquery with COUNT to mimic the filter by RANK:

WITH x AS ( 
SELECT ename, job, hiredate, sal, MAX(sal)OVER(ORDER BY hiredate) max_sal 
FROM scott.emp a 
) 
SELECT * 
FROM x 
WHERE 1=(SELECT COUNT(*) 
         FROM scott.emp 
         WHERE hiredate<=x.hiredate 
           AND sal>x.sal) 
ORDER BY hiredate

Solution #3. Using CTE, cumulative MAX analytic function and an in-line scalar subquery in SELECT to mimic the RANK:

WITH x AS ( 
SELECT ename, job, hiredate, sal, MAX(sal)OVER(ORDER BY hiredate) max_sal, 
       (SELECT COUNT(*)+1 
        FROM scott.emp 
        WHERE sal>e.sal  
          AND hiredate<=e.hiredate) rk 
FROM scott.emp e 
) 
SELECT ename, job, hiredate, sal, max_sal 
FROM x 
WHERE rk=2 
ORDER BY hiredate

Solution #4. Using self-join and Cartesian Product with aggregation:

SELECT a.ename, a.job, a.hiredate, a.sal, MAX(b.sal) max_sal 
FROM scott.emp a JOIN scott.emp b ON b.hiredate<=a.hiredate 
                                 AND b.sal>a.sal 
GROUP BY a.ename, a.job, a.hiredate, a.sal 
HAVING COUNT(DISTINCT b.empno)=1 
ORDER BY a.hiredate

Solution #5. Using CTE and cumulative MAX analytic function (twice):

WITH x AS ( 
SELECT ename, job, hiredate, sal, 
       MAX(sal) OVER(ORDER BY hiredate) max_sal 
FROM scott.emp  
), y  AS ( 
SELECT ename, job, hiredate, sal, max_sal, MAX(sal) OVER(ORDER BY hiredate) max_sal2 
FROM x 
WHERE sal<max_sal 
) 
SELECT ename, job, hiredate, sal, max_sal 
FROM y 
WHERE sal=max_sal2 
ORDER BY hiredate

Solution #6. Using regular and recursive CTEs, ROWNUM, GREATEST, and CASE functions (no Analytic functions!):

WITH e AS ( 
SELECT ename, job, sal, hiredate 
FROM scott.emp 
ORDER BY hiredate 
), x AS ( 
SELECT ename, job, sal, hiredate, ROWNUM rn 
FROM e 
), y(max_sal, sal2, rn) AS ( 
SELECT sal, 0, 1 
FROM x 
WHERE rn=1 
UNION ALL 
SELECT GREATEST(x.sal, y.max_sal) AS max_sal, 
       CASE WHEN x.sal>y.max_sal THEN y.max_sal 
            WHEN x.sal>y.sal2 AND x.sal<=y.max_sal THEN x.sal  
            ELSE y.sal2  
       END AS sal2, 
       x.rn 
FROM x JOIN y ON x.rn=y.rn+1 
) 
SELECT x.ename, x.job, x.sal, x.hiredate, y.max_sal 
FROM y JOIN x ON y.rn=x.rn AND y.sal2=x.sal

Solution #7. Using CTE and MODEL clause to mimic Solution #6:

WITH x AS ( 
SELECT * 
FROM scott.emp 
MODEL 
DIMENSION BY (ROW_NUMBER() OVER(ORDER BY hiredate) rn) 
MEASURES(ename, job, sal, hiredate, sal max_sal, 0 sal2) 
RULES( 
    max_sal[rn>1] = GREATEST(max_sal[CV()-1], sal[CV()]), 
    sal2[rn>1] = CASE WHEN sal[CV()]> max_sal[CV()-1] THEN max_sal[CV()-1] 
                      WHEN sal[CV()]> sal2[CV()-1]   
                       AND sal[CV()]<=max_sal[CV()-1] THEN sal[CV()]
                      ELSE sal2[CV()-1] 
                 END 
     ) 
) 
SELECT ename, job, sal, hiredate, max_sal 
FROM x 
WHERE sal=sal2

You can execute the above SQL statements in Oracle Live SQL environment.

3 Solutions to 2018 Oracle SQL Puzzle of the Week #11

March 26, 2018 | Zahar Hilkevich

Mimic ROW_NUMBER function

Write a single SELECT statement that produces the same result as the following one:

SELECT e.*, ROW_NUMBER()OVER(PARTITION BY deptno ORDER BY ename) rn
FROM scott.emp e

Use a single SELECT statement only.
Analytic functions are NOT allowed
Any SQL clauses that use PARTITION BY keywords are NOT allowed

Expected Result:

EMPNO	ENAME	JOB	MGR	HIREDATE	SAL	COMM	DEPTNO	RN
7782	CLARK	MANAGER	7839	09-JUN-81	2450	–	10	1
7839	KING	PRESIDENT	–	17-NOV-81	5000	–	10	2
7934	MILLER	CLERK	7782	23-JAN-82	1300	–	10	3
7876	ADAMS	CLERK	7788	23-MAY-87	1100	–	20	1
7902	FORD	ANALYST	7566	03-DEC-81	3000	–	20	2
7566	JONES	MANAGER	7839	02-APR-81	2975	–	20	3
7788	SCOTT	ANALYST	7566	19-APR-87	3000	–	20	4
7369	SMITH	CLERK	7902	17-DEC-80	800	–	20	5
7499	ALLEN	SALESMAN	7698	20-FEB-81	1600	300	30	1
7698	BLAKE	MANAGER	7839	01-MAY-81	2850	–	30	2
7900	JAMES	CLERK	7698	03-DEC-81	950	–	30	3
7654	MARTIN	SALESMAN	7698	28-SEP-81	1250	1400	30	4
7844	TURNER	SALESMAN	7698	08-SEP-81	1500	0	30	5
7521	WARD	SALESMAN	7698	22-FEB-81	1250	500	30	6

Solutions:

Solution #1. Using MATCH_RECOGNIZE clause

Credit to: Naoto Katayama

SELECT empno,ename,job,mgr,hiredate,sal,comm,deptno,rn  
FROM scott.emp 
MATCH_RECOGNIZE ( 
ORDER BY deptno,ename,empno 
MEASURES RUNNING COUNT(*) AS rn 
ALL ROWS PER MATCH 
PATTERN (FIRSTROW NEXTROWS*) 
DEFINE 
   FIRSTROW AS PREV(FIRSTROW.deptno) IS NULL  
OR PREV(FIRSTROW.deptno) != FIRSTROW.deptno, 
   NEXTROWS AS PREV(NEXTROWS.deptno) = NEXTROWS.deptno 
)

Solution #2. Using Self-Join with Cartesian Product and GROUP BY

Partial Credit to: Boobal Ganesan

SELECT e1.empno,e1.ename,e1.job,e1.mgr,e1.hiredate,e1.sal,e1.comm,e1.deptno,  
       COUNT(*) rn 
FROM scott.emp e1 LEFT OUTER JOIN scott.emp e2  
  ON e1.deptno = e2.deptno 
 AND e2.ename || ROWIDTOCHAR(e2.ROWID) <= e1.ename || ROWIDTOCHAR(e1.ROWID) 
GROUP BY e1.empno,e1.ename,e1.job,e1.mgr,e1.hiredate,e1.sal,e1.comm,e1.deptno 
ORDER BY e1.deptno, COUNT(*)

Solution #3. Using CTE, ROWNUM, and arithmetic formula

WITH x AS ( 
SELECT * 
FROM scott.emp 
ORDER BY deptno, ename 
), y AS ( 
SELECT deptno, MIN(ROWNUM) min_rn 
FROM x 
GROUP BY deptno 
) 
SELECT x.*, ROWNUM-y.min_rn+1 AS rn 
FROM x JOIN y ON x.deptno=y.deptno 
ORDER BY x.deptno, x.ename
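
Since the goal is to reproduce the output of the ROW_NUMBER query exactly, a candidate solution can be checked with a two-way MINUS. The sketch below is an assumption-laden helper, not part of the puzzle: the CTE names expected and candidate are made up, and the self-join inside candidate is only a placeholder that relies on employee names being unique within a department (true for the standard scott.emp data). If the two result sets match, the query returns no rows.

```sql
WITH expected AS (
  -- Reference result from the puzzle statement
  SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno,
         ROW_NUMBER() OVER(PARTITION BY deptno ORDER BY ename) rn
  FROM scott.emp
), candidate AS (
  -- Put the query under test here; it must return the same columns in the same order.
  -- Placeholder: count the names up to and including the current one in each department.
  SELECT e1.empno, e1.ename, e1.job, e1.mgr, e1.hiredate, e1.sal, e1.comm, e1.deptno,
         COUNT(*) rn
  FROM scott.emp e1 JOIN scott.emp e2
    ON e1.deptno = e2.deptno
   AND e2.ename <= e1.ename
  GROUP BY e1.empno, e1.ename, e1.job, e1.mgr, e1.hiredate, e1.sal, e1.comm, e1.deptno
)
SELECT 'missing from candidate' AS issue, d.*
FROM (SELECT * FROM expected MINUS SELECT * FROM candidate) d
UNION ALL
SELECT 'unexpected in candidate' AS issue, d.*
FROM (SELECT * FROM candidate MINUS SELECT * FROM expected) d;
```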

You can execute the above SQL statements in Oracle Live SQL environment.

8 Solutions to 2018 Oracle SQL Puzzle of the Week #9

March 12, 2018 | Zahar Hilkevich

Recent employment Puzzle

For each location, show the 2 most recently hired employees.

Use a single SELECT statement only.
The ename1 and hiredate1 columns should correspond to the most recently hired employee, and the ename2 and hiredate2 columns to the one hired just before.

Expected Result:

LOC	ENAME1	HIREDATE1	ENAME2	HIREDATE2
NEW YORK	MILLER	23-JAN-82	KING	17-NOV-81
CHICAGO	JAMES	03-DEC-81	MARTIN	28-SEP-81
DALLAS	ADAMS	23-MAY-87	SCOTT	19-APR-87

Solutions:

Solution #1. Using Self-Join and MAX functions

SELECT d.loc, 
     MAX(e1.ename) KEEP(DENSE_RANK FIRST ORDER BY e1.hiredate DESC) ename1, 
     MAX(e1.hiredate) hiredate1, 
     MAX(e2.ename) KEEP(DENSE_RANK FIRST ORDER BY e2.hiredate DESC) ename2, 
     MAX(e2.hiredate) hiredate2 
FROM scott.emp e1 JOIN scott.emp e2 ON e1.deptno=e2.deptno 
 AND e1.hiredate>=e2.hiredate 
 AND e1.ROWID!=e2.ROWID 
                  JOIN scott.dept d ON e1.deptno=d.deptno 
GROUP BY d.loc;

Solution #2. Using LISTAGG and REGEXP_SUBSTR functions

SELECT d.loc, 
       MAX(ename) KEEP(DENSE_RANK FIRST ORDER BY hiredate DESC) ename1,
       MAX(hiredate) hiredate1, 
       REGEXP_SUBSTR(LISTAGG(ename, '|') WITHIN GROUP (ORDER BY hiredate DESC),
 '[^|]+',1,2) ename2,
       REGEXP_SUBSTR(LISTAGG(hiredate, '|') WITHIN GROUP (ORDER BY hiredate DESC),
 '[^|]+',1,2) hiredate2
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
GROUP BY d.loc
ORDER BY 1;

Solution #3. Using CTE, ROW_NUMBER, and Self-Join

WITH x AS (
SELECT d.loc, e.ename, e.hiredate, 
       ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
)
SELECT a.loc, a.ename ename1, a.hiredate hiredate1,
              b.ename ename2, b.hiredate hiredate2
FROM x a JOIN x b ON a.loc=b.loc AND a.rn=1 AND b.rn=2;

Solution #4. Using Pivot

WITH x AS (
SELECT d.loc, e.ename, e.hiredate, 
       ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
)
SELECT loc, e1_ename AS ename1, e1_hdate AS hiredate1,
       e2_ename AS ename2, e2_hdate AS hiredate2
FROM x
PIVOT (
MAX(ename) ename, MAX(hiredate) hdate FOR rn IN (1 AS e1, 2 AS e2) 
)
ORDER BY 1;

Solution #5. Simulating Pivot with MAX and DECODE functions

WITH x AS (
SELECT d.loc, e.ename, e.hiredate, 
       ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
)
SELECT loc, 
       MAX(DECODE(rn,1,ename)) ename1, 
       MAX(DECODE(rn,1,hiredate)) hiredate1,
       MAX(DECODE(rn,2,ename)) ename2, 
       MAX(DECODE(rn,2,hiredate)) hiredate2
FROM x
GROUP BY loc
ORDER BY 1;

Solution #6. Using CONNECT BY

WITH x AS (
SELECT d.loc, e.ename, e.hiredate, 
       ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
)
SELECT loc, PRIOR ename ename1, PRIOR hiredate hiredate1, 
       ename ename2, hiredate hiredate2
FROM x
WHERE rn=2
START WITH rn=1
CONNECT BY loc=PRIOR loc
       AND rn=PRIOR rn+1;

Solution #7. Using LEAD and ROW_NUMBER Analytic functions

WITH x AS (
SELECT d.loc, e.ename ename1, e.hiredate hiredate1, 
 LEAD(e.ename,1) OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) ename2,
 LEAD(e.hiredate,1) OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) hiredate2,
 ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
) 
SELECT loc, ename1, hiredate1, ename2, hiredate2
FROM x
WHERE rn=1
ORDER BY 1;

Solution #8. Using Model Clause:

SELECT loc, ename1, hiredate1, ename2, hiredate2
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
MODEL
RETURN UPDATED ROWS
PARTITION BY (d.loc)
DIMENSION BY (
   ROW_NUMBER()OVER(PARTITION BY d.loc ORDER BY e.hiredate DESC) AS rn
)
MEASURES(
    ename AS ename1, hiredate AS hiredate1, 
    ename AS ename2, hiredate AS hiredate2
)
RULES(
    ename2[1]   =ename1[2],
    hiredate2[1]=hiredate1[2]
)
ORDER BY 1;

You can execute the above SQL statements in Oracle Live SQL environment.

9 Solutions to 2018 Oracle SQL Puzzle of the Week #7

February 26, 2018 | Zahar Hilkevich

Namesake Puzzle

Show groups of employees having the same last name.

Use a single SELECT statement only
Use hr.employees table

Solutions:

Solution #1: Using Subquery with HAVING clause:

SELECT first_name, last_name, department_id, employee_id 
FROM hr.employees 
WHERE last_name IN (SELECT last_name
                    FROM hr.employees
                    GROUP BY last_name
                    HAVING COUNT(*)>1)
ORDER BY 2,1

Solution #2: Using Multi-Column Subquery with NO HAVING clause

SELECT first_name, last_name, department_id, employee_id 
FROM hr.employees 
WHERE (last_name, 1) IN (SELECT last_name, SIGN(COUNT(*)-1)
                         FROM hr.employees
                         GROUP BY last_name)
ORDER BY 2,1

Solution #3: Using Subquery with IN operator

SELECT first_name, last_name, department_id, employee_id 
FROM hr.employees a 
WHERE last_name IN (SELECT b.last_name
                    FROM hr.employees b
                    WHERE a.employee_id!=b.employee_id)
ORDER BY 2,1

Solution #4: Using Self-Join with duplicate elimination in GROUP BY

SELECT a.first_name, a.last_name, a.department_id, a.employee_id 
FROM hr.employees a JOIN hr.employees b ON a.last_name=b.last_name 
                                       AND a.employee_id!=b.employee_id 
GROUP BY a.first_name, a.last_name, a.department_id, a.employee_id 
ORDER BY 2,1

Solution #5: Using a filter by COUNT analytic function with PARTITION BY

WITH x AS ( 
SELECT first_name, last_name, department_id, employee_id,  
       COUNT(*) OVER(PARTITION BY last_name) cnt 
FROM hr.employees 
)	 
SELECT first_name, last_name, department_id, employee_id 
FROM x 
WHERE cnt>1 
ORDER BY 2,1

Solution #6: Mimicking COUNT analytic function with MODEL clause

(credit to Naoto Katayama)

WITH x AS ( 
SELECT first_name, last_name, department_id, employee_id, cnt 
FROM hr.employees 
MODEL 
RETURN UPDATED ROWS 
DIMENSION BY (last_name, employee_id) 
MEASURES(first_name, department_id, 0 AS cnt) 
RULES (cnt[ANY, ANY]=COUNT(*)[CV(), ANY]) 
) 
SELECT first_name, last_name, department_id, employee_id 
FROM x 
WHERE cnt>1 
ORDER BY 2,1

Solution #7: Filtering by LEAD and LAG analytic functions

WITH x AS ( 
SELECT first_name, last_name, department_id, employee_id,  
       LAG (last_name,1) OVER(ORDER BY last_name) lag_name, 
       LEAD(last_name,1) OVER(ORDER BY last_name) lead_name 
FROM hr.employees 
)	 
SELECT first_name, last_name, department_id, employee_id 
FROM x 
WHERE last_name IN (lag_name, lead_name) 
ORDER BY 2,1

Solution #8: Using MODEL clause with dummy measure for SIGN over analytic function expression

SELECT first_name, last_name, department_id, employee_id
FROM hr.employees 
MODEL 
RETURN UPDATED ROWS 
PARTITION BY (last_name) 
DIMENSION BY (SIGN(COUNT(*) OVER(PARTITION BY last_name)-1) AS n, 
              employee_id) 
MEASURES(first_name, department_id, 0 AS dummy) 
RULES (dummy[1, ANY]=1) 
ORDER BY 2,1

Solution #9: Using UNPIVOT with DISTINCT option over CONNECT BY with PRIOR

WITH x AS (
SELECT first_name curr_first, last_name, department_id curr_dept, employee_id curr_id, 
       PRIOR first_name prior_first, PRIOR department_id prior_dept, PRIOR employee_id prior_id
FROM hr.employees
WHERE level=2
CONNECT BY last_name=PRIOR last_name AND employee_id>PRIOR employee_id
)
SELECT DISTINCT first_name, last_name, department_id, employee_id
FROM x
UNPIVOT( 
    (first_name, department_id, employee_id)  for dummy IN ((curr_first, curr_dept, curr_id),
                                                            (prior_first,prior_dept,prior_id))
)
ORDER BY 2,1;

You can execute the above SQL statements in Oracle Live SQL environment.

Term Replication Sequence SQL Puzzle

February 16, 2018 | Zahar Hilkevich

SQL Puzzle:

Generate a term replication sequence: 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, etc. in a single SELECT statement.

Level: Advanced

Expected Result (for N=4):

RN
1
2
2
3
3
3
4
4
4
4

Solutions:

#1: Using CONNECT BY (for both the range and the sequence generation)

WITH x AS (
SELECT ROWNUM rn
FROM dual
CONNECT BY LEVEL<=4
)
SELECT rn--, LEVEL
FROM x
CONNECT BY LEVEL<=rn
       AND rn>PRIOR rn
GROUP BY rn, LEVEL
ORDER BY 1;

#2: Using Recursive CTE

WITH x(rn, lvl) AS (
SELECT ROWNUM rn, 1
FROM dual
CONNECT BY LEVEL<=4
UNION ALL
SELECT rn, lvl+1
FROM x
WHERE rn>=lvl+1
)
SELECT rn
FROM x
ORDER BY 1;

#3: Using Self-Join

WITH x AS (
SELECT ROWNUM rn
FROM dual
CONNECT BY LEVEL<=4
)
SELECT a.rn
FROM x a JOIN x b ON a.rn>=b.rn
ORDER BY 1;

Naoto Katayama submitted one more elegant solution using MODEL clause:

#4: Using MODEL clause

SELECT RN
FROM (SELECT LEVEL rn
      FROM DUAL 
      CONNECT BY LEVEL<=4)
MODEL
PARTITION BY(ROWNUM AS par)
DIMENSION BY(0 AS dummy)
MEASURES(rn)
RULES ITERATE(100) UNTIL ITERATION_NUMBER+1>=rn[0]
(rn[ITERATION_NUMBER]=rn[0])
ORDER BY 1;


Three Solutions to Puzzle of the Week #15

June 20, 2016 | Zahar Hilkevich

Puzzle of the Week #15:

Find all the year-based intervals from 1975 up to now when the company did not hire employees. Use a single SELECT statement against the emp table.

Expected Result:

years
------------
1975 - 1979
1983 - 1986
1988 - 2016

Solutions

#1: Grouping by an expression on ROWNUM (no Analytic functions!)

col years for a15

WITH x AS (
SELECT 1975+LEVEL-1 yr
FROM dual
CONNECT BY 1975+LEVEL-1<=EXTRACT(YEAR FROM SYSDATE)
MINUS
SELECT EXTRACT(YEAR FROM hiredate)
FROM emp
)
SELECT MIN(yr) || ' - ' || MAX(yr) "years"
FROM x
GROUP BY yr-ROWNUM
ORDER BY yr-ROWNUM;

years
---------------
1975 - 1979
1983 - 1986
1988 - 2016
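
The grouping expression works because, within a run of consecutive years, the year and its ordinal position grow in step, so their difference stays constant and changes only when a gap occurs. The sketch below exposes that difference; note that it deliberately uses ROW_NUMBER (an analytic function, unlike Solution #1) purely to make the numbering order explicit, and the aliases rn and grp are illustrative. With the standard emp data, the years 1975 through 1979 should all get grp = 1974, and 1983 through 1986 should get grp = 1977.

```sql
WITH x AS (
SELECT 1975+LEVEL-1 yr
FROM dual
CONNECT BY 1975+LEVEL-1<=EXTRACT(YEAR FROM SYSDATE)
MINUS
SELECT EXTRACT(YEAR FROM hiredate)
FROM emp
)
SELECT yr,
       ROW_NUMBER() OVER(ORDER BY yr) AS rn,       -- position among the no-hire years
       yr - ROW_NUMBER() OVER(ORDER BY yr) AS grp  -- constant within each consecutive run
FROM x
ORDER BY yr;
```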

#2: Calculating steps with Analytic function and grouping by a sum of step.

WITH x AS (
SELECT 1975+LEVEL-1 yr
FROM dual
CONNECT BY 1975+LEVEL-1<=EXTRACT(YEAR FROM SYSDATE)
MINUS
SELECT EXTRACT(YEAR FROM hiredate)
FROM emp
), y AS (
SELECT DECODE(yr, LAG(yr,1)OVER(ORDER BY yr)+1, 0, 1) AS step, yr
FROM x
), z AS (
SELECT yr, SUM(step)OVER(ORDER BY yr) grp
FROM y
)
SELECT MIN(yr) || ' - ' || MAX(yr) "years"
FROM z
GROUP BY grp
ORDER BY grp;

years
---------------
1975 - 1979
1983 - 1986
1988 - 2016

#3: Using Self Outer Join to calculate steps

WITH x AS (
SELECT 1975+LEVEL-1 yr
FROM dual
CONNECT BY 1975+LEVEL-1<=EXTRACT(YEAR FROM SYSDATE)
MINUS
SELECT EXTRACT(YEAR FROM hiredate)
FROM emp
), y AS (
SELECT x1.yr, NVL2(x2.yr, 0, 1) step
FROM x x1 LEFT JOIN x x2 ON x1.yr=x2.yr+1
), z AS (
SELECT yr, SUM(step)OVER(ORDER BY yr) grp
FROM y
)
SELECT MIN(yr) || ' - ' || MAX(yr) "years"
FROM z
GROUP BY grp
ORDER BY grp;

years
---------------
1975 - 1979
1983 - 1986
1988 - 2016


List all employees in 2 columns based on the salary ranking.

December 16, 2015 | Zahar Hilkevich

Problem: List all employee names and their respective salaries in 2 columns based on the salary ranking (from the highest to the lowest).

Expected Result:

 ID LEFT_NAME      LEFT_SAL RIGHT_NAME    RIGHT_SAL
--- ------------ ---------- ------------ ----------
  1 KING               5000 FORD               3000
  2 SCOTT              3000 JONES              2975
  3 BLAKE              2850 CLARK              2450
  4 ALLEN              1600 TURNER             1500
  5 MILLER             1300 WARD               1250
  6 MARTIN             1250 ADAMS              1100
  7 JAMES               950 SMITH               800

Problem Level: Intermediate/Advanced

Solution:
I have picked the 5 best-performing methods to solve this problem. The idea behind each method can be found in my book: “Oracle SQL Tricks and Workarounds”

Method/Workaround #1: Using Hierarchical Query (Level: Advanced)

WITH X AS (
SELECT ename, sal, ROW_NUMBER()OVER(ORDER BY sal DESC) RN
FROM EMP
)
SELECT  rn/2 AS id, PRIOR ename left_name, PRIOR sal left_sal, ename right_name, sal right_sal
FROM X
WHERE MOD(level,2)=0
START WITH rn=1
CONNECT BY rn=1+PRIOR rn

Method/Workaround #2: Using Analytical Function (Level: Advanced)

WITH X AS (
SELECT ename left_name, sal left_sal, 
       LEAD(ename, 1) OVER(ORDER BY sal DESC) AS right_name, 
       LEAD(sal, 1) OVER(ORDER BY sal DESC) as right_sal,
       ROW_NUMBER() OVER(ORDER BY sal DESC) rn
from emp
)
SELECT (rn+1)/2 AS ID, left_name, left_sal,
                       right_name, right_sal
FROM X
WHERE MOD(rn,2)=1
ORDER BY rn

Method/Workaround #3: Using PIVOT Clause (Level: Advanced)

SELECT *
FROM (SELECT CEIL(rn/2) AS ID, ename, sal, 2-MOD(rn,2) AS col_no
      FROM (SELECT ename, sal, ROW_NUMBER() OVER(ORDER BY sal DESC) rn
            FROM emp
            )
      )
PIVOT (MAX(ename) AS name,
       MAX(sal)   AS sal
       FOR (col_no) IN (1 AS left, 2 AS right)
       )
ORDER BY 1

Method/Workaround #4: Using MAX function on concatenated column expression (Level: Advanced)

WITH X AS (
SELECT LPAD(sal, 5, '0') || ename as sname, ROW_NUMBER()OVER(ORDER BY sal DESC) rn
FROM EMP
)
SELECT CEIL(rn/2) ID, SUBSTR(MAX(SNAME), 6) left_name,  TO_NUMBER(SUBSTR(MAX(SNAME), 1, 5)) left_sal, 
                      SUBSTR(MIN(SNAME), 6) right_name, TO_NUMBER(SUBSTR(MIN(SNAME), 1, 5)) right_sal
FROM X
GROUP BY CEIL(rn/2)
ORDER BY 1

Method/Workaround #5: Using Self-Join (Level: Intermediate)

WITH X AS (
SELECT ename, sal, ROW_NUMBER()OVER(ORDER BY sal DESC) rn
FROM EMP
)
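-- ID is derived from a.rn so that it stays non-NULL when the last left-hand row has no right-hand partner
SELECT (a.rn+1)/2 AS ID, a.ename AS left_name, a.sal AS left_sal,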
                     b.ename AS right_name, b.sal AS right_sal
FROM x a LEFT JOIN x b ON a.rn+1=b.rn 
WHERE mod(a.rn,2)=1


Substitute a self outer join with Connect By, by Zahar Hilkevich

August 21, 2013 | Zahar Hilkevich

Self [outer] join is a very common and useful technique in Oracle SQL and in other flavors of SQL.

Let’s consider a trivial task of retrieving the manager’s name and number next to each employee’s record (we are using the scott schema):

SELECT a.ename, a.empno, a.job, a.deptno,
       b.ename AS manager,
       b.empno AS mgrno
FROM emp a LEFT JOIN emp b ON a.mgr=b.empno
ORDER BY 1

Result:

ENAME           EMPNO JOB           DEPTNO MANAGER         MGRNO
---------- ---------- --------- ---------- ---------- ----------
ADAMS            7876 CLERK             20 SCOTT            7788
ALLEN            7499 SALESMAN          30 BLAKE            7698
BLAKE            7698 MANAGER           30 KING             7839
CLARK            7782 MANAGER           10 KING             7839
FORD             7902 ANALYST           20 JONES            7566
JAMES            7900 CLERK             30 BLAKE            7698
JONES            7566 MANAGER           20 KING             7839
KING             7839 PRESIDENT         10
MARTIN           7654 SALESMAN          30 BLAKE            7698
MILLER           7934 CLERK             10 CLARK            7782
SCOTT            7788 ANALYST           20 JONES            7566
SMITH            7369 CLERK             20 FORD             7902
TURNER           7844 SALESMAN          30 BLAKE            7698
WARD             7521 SALESMAN          30 BLAKE            7698

Execution Plan
----------------------------------------------------------
Plan hash value: 2322654302

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    14 |   490 |     8  (25)| 00:00:01 |
|   1 |  SORT ORDER BY      |      |    14 |   490 |     8  (25)| 00:00:01 |
|*  2 |   HASH JOIN OUTER   |      |    14 |   490 |     7  (15)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| EMP  |    14 |   350 |     3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| EMP  |    14 |   140 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."MGR"="B"."EMPNO"(+))

Note that an outer join (left join) is used here to retrieve KING’s record: that employee is the president and does not have a manager.

The following workaround is based on the CONNECT BY clause without the use of START WITH as we need to retrieve all employee records:

SELECT ename, empno, job, deptno,
       MAX(PRIOR ename) AS manager,
       MAX(PRIOR empno) AS mgrno
FROM emp
WHERE LEVEL<=2
CONNECT BY mgr=PRIOR empno
GROUP BY ename, empno, job, deptno
ORDER BY 1

Result:

ENAME           EMPNO JOB           DEPTNO MANAGER         MGRNO
---------- ---------- --------- ---------- ---------- ----------
ADAMS            7876 CLERK             20 SCOTT            7788
ALLEN            7499 SALESMAN          30 BLAKE            7698
BLAKE            7698 MANAGER           30 KING             7839
CLARK            7782 MANAGER           10 KING             7839
FORD             7902 ANALYST           20 JONES            7566
JAMES            7900 CLERK             30 BLAKE            7698
JONES            7566 MANAGER           20 KING             7839
KING             7839 PRESIDENT         10
MARTIN           7654 SALESMAN          30 BLAKE            7698
MILLER           7934 CLERK             10 CLARK            7782
SCOTT            7788 ANALYST           20 JONES            7566
SMITH            7369 CLERK             20 FORD             7902
TURNER           7844 SALESMAN          30 BLAKE            7698
WARD             7521 SALESMAN          30 BLAKE            7698

Execution Plan
----------------------------------------------------------
Plan hash value: 2826654915

---------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |    14 |   350 |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY                 |      |    14 |   350 |     4  (25)| 00:00:01 |
|*  2 |   FILTER                       |      |       |       |            |          |
|*  3 |    CONNECT BY WITHOUT FILTERING|      |       |       |            |          |
|   4 |     TABLE ACCESS FULL          | EMP  |    14 |   350 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(LEVEL<=2)
   3 - access("MGR"=PRIOR "EMPNO")

This execution plan looks somewhat better than the previous one (an estimated cost of 4 versus 8), mainly because it reads the emp table only once, while the first query reads it twice.
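If the role of MAX and GROUP BY in the CONNECT BY workaround is not obvious, it may help to look at the rows before they are aggregated. The query below is only an exploratory sketch; the lvl alias is introduced here for illustration.

```sql
-- Without START WITH, every employee is treated as a root (LEVEL = 1),
-- where PRIOR ename and PRIOR empno are NULL.
-- Every employee who has a manager also appears a second time at LEVEL = 2,
-- as a child of that manager, where PRIOR ename / PRIOR empno
-- hold the manager's name and number.
SELECT LEVEL AS lvl, ename, empno,
       PRIOR ename AS manager,
       PRIOR empno AS mgrno
FROM emp
WHERE LEVEL <= 2
CONNECT BY mgr = PRIOR empno
ORDER BY ename, lvl
```

Each employee except KING shows up twice here: once with NULLs and once with the manager's details. Grouping by the employee columns collapses the pair into a single row, and since MAX ignores NULLs, it keeps the manager's values. KING only has the LEVEL = 1 row, so both manager columns stay NULL.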

P.S. If you like this trick, you can find many more in my book “Oracle SQL Tricks and Workarounds”

Second Top Employee as of the Start of Employment

Solution #1. Using LATERAL view, RANK and cumulative MAX analytic functions (Oracle 12g+):

Solution #2. Using CTE, cumulative MAX analytic function and a correlated subquery with COUNT to mimic the filter by RANK:

Solution #3. Using CTE, cumulative MAX analytic function and an in-line scalar subquery in SELECT to mimic the RANK:

Solution #4. Using self-join and Cartesian Product with aggregation:

Solution #5. Using CTE and cumulative MAX analytic function (twice):

Solution #6. Using regular and recursive CTEs, ROWNUM, GREATEST, and CASE functions (no Analytic functions!):

Solution #7. Using CTE and MODEL clause to mimic Solution #6:

Further Reading:

Solution #1. Using MATCH_RECOGNIZE clause

Solution #2. Using Self-Join with Cartesian Product and GROUP BY

Solution #3. Using CTE, ROWNUM, and arithmetic formula

Solution #1. Using Self-Join and MAX functions

Solution #2. Using LISTAGG and REGEXP_SUBSTR functions

Solution #3. Using CTE, ROW_NUMBER, and Self-Join

Solution #4. Using Pivot

Solution #5. Simulating Pivot with MAX and DECODE functions

Solution #6. Using CONNECT BY

Solution #7. Using LEAD and ROW_NUMBER Analytic functions

Solution #8. Using Model Clause:

Solution #1: Using Subquery with HAVING clause:

Solution #2: Using Multi-Column Subquery with NO HAVING clause

Solution #3: Using Subquery with IN operator

Solution #4: Using Self-Join with duplicate elimination in GROUP BY

Solution #5: Using a filter by COUNT analytic function with PARTITION BY

Solution #6: Mimicking COUNT analytic function with MODEL clause

Solution #7: Filtering by LEAD and LAG analytic functions

Solution #8: Using MODEL clause with dummy measure for SIGN over analytic function expression

Solution #9: Using UNPIVOT with DISTINCT option over CONNECT BY with PRIOR

SQL Puzzle:

Solutions:

My Oracle Group on Facebook:

Would you like to read about many more tricks and puzzles?

Expected Result:

Solutions

#1: Grouping by an expression on ROWNUM (no Analytic functions!)

#2: Calculating steps with Analytic function and grouping by a sum of step.

#3: Using Self Outer Join to calculate steps
