A trick that helps avoiding multiple table scans.

Let’s look at a fairly simple SQL problem:

In a traditional scott.emp table, find all employees who work in the same department as the president.

  • Make your query work even if there are more than 1 president records exist in emp table
  • Make Oracle scan emp table just ONCE

A “traditional” solution to this problem may look like this:

Strategy #1: Using a subquery

SELECT *
FROM scott.emp
WHERE deptno IN (SELECT deptno 
                 FROM scott.emp 
                 WHERE job='PRESIDENT')

or this:

Strategy #2: Using a self-join

SELECT DISTINCT a.*
FROM scott.emp a JOIN scott.emp b ON a.deptno=b.deptno
WHERE b.job='PRESIDENT'

Note, that DISTINCT option in the above query is needed to prevent duplicates if there were multiple presidents in а specific department.

Both solutions above use 2 copies of the emp table which makes oracle scan the same scott.emp table twice.

A trick presented below allows you to use only a single copy of the emp table to solve the problem. The trick involves different conceptual and technical approaches compared to the solutions we have seen so far.

Conceptually, we should rephrase the problem in a way that would keep it identical and at the same time allows us to use different technical arsenal. This approach is explained in a detailed manner in my book “Oracle SQL Tricks and Workarounds”. We can rephrase the puzzle and say that we are looking for employees from departments with some “positive” number of presidents working there. Technically speaking, we need to use analytic function COUNT and check if it is greater than 0:

Strategy #3: Using analytic function COUNT

WITH x AS (
SELECT e.*, 
       COUNT(DECODE(job,'PRESIDENT',1))OVER(PARTITION BY deptno) cnt
FROM scott.emp e
)
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM x
WHERE cnt>0
ORDER BY empno

We do need to use a common table expression as we cannot filter out by analytic function in the same query where the function is used. Nevertheless, we scan the emp table just once, and during this scan, Oracle engine counts the number of presidents in each department.

COUNT is not the only analytic function that can be employed to solve the problem.

Strategy #4: Using analytic function LISTAGG

WITH x AS (
SELECT e.*, 
       LISTAGG(job,'|') 
         WITHIN GROUP (ORDER BY job) OVER(PARTITION BY deptno) jobs
FROM scott.emp e 
)
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM x
WHERE '|' || jobs || '|' LIKE '%|PRESIDENT|%'
ORDER BY empno

Instead of counting the presidents by department, we simply concatenate all the job titles and check if the resulting string includes a president.

Finally, if you don’t like using sub-queries in general, we can leverage the power of MODEL clause:

Strategy #5: Using MODEL clause to avoid sub-queries

SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM scott.emp
MODEL RETURN UPDATED ROWS 
DIMENSION BY (
  empno, 
  SIGN(COUNT(DECODE(job,'PRESIDENT',1))OVER(PARTITION BY deptno)) cnt
) 
MEASURES(ename, job, mgr, hiredate, sal, comm, deptno, 0 dummy) 
RULES(dummy[ANY, 1]=1)
ORDER BY empno

The tricky part here is using a composition of SIGN, COUNT, and DECODE functions (i.e. SIGN on top of what we used in Strategy #3) as a secondary dimension and empno as primary. Employee number is unique by itself, so adding another dimension will still maintain uniqueness required by MODEL clause. The only MODEL RULE changes the dummy measure which “triggers” the “RETURN UPDATED ROWS” instruction and returns only those rows where the dummy dimension was set to 1 – notice that its default value is 0.

You can check the execution plan for all of the above strategies to see how many times Oracle scans the emp table.

***

If you find this post useful, please press the LIKE button and subscribe.

My Oracle Group on Facebook:

Also, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more clever tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

 

 

Advertisements

Monday Hiring SQL Puzzle and Lateral View Usage Nuance

Let’s solve a fairly simple SQL problem:

For each department count the number of people hired on Monday

• Use scott.emp table
• Show department number and count columns
• If no employees from a given department was hired on Monday, we should list such
department with 0 in the count column
• Sort the result by the department number

Expected Result:

DEPTNO MON_HIRES
10 0
20 0
30 1

We will start with the in-line scalar subquery approach as it is probably one of the most intuitive:

Strategy #1: In-Line Scalar Subquery

SELECT deptno, (SELECT COUNT(*)
                FROM scott.emp
                WHERE deptno=e.deptno
                AND TO_CHAR(hiredate, 'DY')='MON') mon_hires
FROM scott.emp e
GROUP BY deptno
ORDER BY 1

When you only need a single value/column/expression from a correlated subquery, the in-line subquery in SELECT clause works just fine. If we needed more than one: count and let say total salary, we would need to use LATERAL view (or a completely different approach – see below).

Strategy #2: Lateral View

WITH x AS (
SELECT DISTINCT deptno 
FROM scott.emp
)
SELECT x.deptno, m.mon_hires, m.total_sal
FROM x, LATERAL (SELECT COUNT(*) mon_hires, SUM(sal) total_sal
                 FROM scott.emp e
                 WHERE e.deptno=x.deptno
                   AND TO_CHAR(e.hiredate, 'DY')='MON') m
ORDER BY 1

Result:

DEPTNO MON_HIRES TOTAL_SAL
10 0
20 0
30 1 1250

So far, all is good. I know that some database developers don’t like using table aliases too much and when an opportunity comes they use ANSI standard JOIN syntax with USING clause. Can it be applied in the lateral view?

WITH x AS (
SELECT DISTINCT deptno 
FROM scott.emp
)
SELECT x.deptno, m.mon_hires
FROM x, LATERAL (SELECT COUNT(empno) mon_hires
                 FROM scott.emp JOIN x USING (deptno)
                 WHERE TO_CHAR(hiredate, 'DY')='MON') m
ORDER BY 1

Result:

DEPTNO MON_HIRES
10 1
20 1
30 1

The syntax is correct but the result is apparently not! What happened?

We replaced the WHERE clause condition of e.deptno=x.deptno with the JOIN x USING(deptno) – it first looked legitimate to me until I realized that we have just added an extra join instead of a reference to an external table (CTE x). Essentially, our last (incorrect) query is the same as the following:

WITH x AS (
SELECT DISTINCT deptno 
FROM scott.emp
)
SELECT x.deptno, m.mon_hires
FROM x, LATERAL (SELECT COUNT(*) mon_hires
                 FROM scott.emp e, x
                 WHERE e.deptno=x.deptno
                   AND TO_CHAR(e.hiredate, 'DY')='MON') m
ORDER BY 1

We simply introduced a new (and unwanted) join on the CTE x and turned the correlated reference (in the WHERE clause) into an old-style joining condition which did not even require the LATERAL view functionality:

WITH x AS (
SELECT DISTINCT deptno 
FROM scott.emp
)
SELECT x.deptno, m.mon_hires
FROM x, (SELECT COUNT(*) mon_hires
         FROM scott.emp e, x
         WHERE e.deptno=x.deptno
           AND TO_CHAR(e.hiredate, 'DY')='MON') m
ORDER BY 1

Result:

DEPTNO MON_HIRES
10 1
20 1
30 1

The above examples prove that we need to be very careful when mixing up different techniques and new syntax options.

Finally, the last approach will show that only a single scan of the emp table is needed to get the result:

Strategy #3: Conditional Counting

SELECT deptno, COUNT(DECODE(TO_CHAR(hiredate, 'DY'), 'MON', 1)) mon_hires
FROM scott.emp
GROUP BY deptno
ORDER BY 1

Result:

DEPTNO MON_HIRES
10 0
20 0
30 1

This is the best and incidentally the shortest solution that once again demonstrates the power of conditional counting (aggregation) right in SELECT clause.

Likewise we can also show the total salary of those hired on Monday:

SELECT deptno, 
       COUNT(DECODE(TO_CHAR(hiredate, 'DY'), 'MON', 1)) mon_hires,
       SUM(DECODE(TO_CHAR(hiredate, 'DY'), 'MON', sal)) total_sal
FROM scott.emp
GROUP BY deptno
ORDER BY 1

Result:

DEPTNO MON_HIRES TOTAL_SAL
10 0
20 0
30 1 1250

***

If you find this post useful, please press the LIKE button and subscribe.

My Oracle Group on Facebook:

Also, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more clever tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

8 Solutions to 2018 Oracle SQL Puzzle of the Week #9

Recent employment Puzzle

For each location, show 2 most recently hired employees

  • Use a single SELECT statement only.
  • ename1 and hiredate1 columns should correspond the latest hired employee while ename1 and hiredate1 columns – the previous one

Expected Result:

LOC ENAME1 HIREDATE1 ENAME2 HIREDATE2
NEW YORK MILLER 23-JAN-82 KING 17-NOV-81
CHICAGO JAMES 03-DEC-81 MARTIN 28-SEP-81
DALLAS ADAMS 23-MAY-87 SCOTT 19-APR-87

Solutions:

Solution #1. Using Self-Join and MAX functions

SELECT d.loc, 
     MAX(e1.ename) KEEP(DENSE_RANK FIRST ORDER BY e1.hiredate DESC) ename1, 
     MAX(e1.hiredate) hiredate1, 
     MAX(e2.ename) KEEP(DENSE_RANK FIRST ORDER BY e2.hiredate DESC) ename2, 
     MAX(e2.hiredate) hiredate2 
FROM scott.emp e1 JOIN scott.emp e2 ON e1.deptno=e2.deptno 
 AND e1.hiredate>=e2.hiredate 
 AND e1.ROWID!=e2.ROWID 
                  JOIN scott.dept d ON e1.deptno=d.deptno 
GROUP BY d.loc;

Solution #2. Using LISTAGG and REGEXP_SUBSTR functions

SELECT d.loc, 
       MAX(ename) KEEP(DENSE_RANK FIRST ORDER BY hiredate DESC) ename1,
       MAX(hiredate) hiredate1, 
       REGEXP_SUBSTR(LISTAGG(ename, '|') WITHIN GROUP (ORDER BY hiredate DESC),
 '[^|]+',1,2) ename2,
       REGEXP_SUBSTR(LISTAGG(hiredate, '|') WITHIN GROUP (ORDER BY hiredate DESC),
 '[^|]+',1,2) hiredate2
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
GROUP BY d.loc
ORDER BY 1;

Solution #3. Using CTE, ROW_NUMBER, and Self-Join

WITH x AS (
SELECT d.loc, e.ename, e.hiredate, 
       ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
)
SELECT a.loc, a.ename ename1, a.hiredate hiredate1,
              b.ename ename2, b.hiredate hiredate2
FROM x a JOIN x b ON a.loc=b.loc AND a.rn=1 AND b.rn=2;

Solution #4. Using Pivot

WITH x AS (
SELECT d.loc, e.ename, e.hiredate, 
       ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
)
SELECT loc, e1_ename AS ename1, e1_hdate AS hiredate1,
       e2_ename AS ename2, e2_hdate AS hiredate2
FROM x
PIVOT (
MAX(ename) ename, MAX(hiredate) hdate FOR rn IN (1 AS e1, 2 AS e2) 
)
ORDER BY 1;

Solution #5. Simulating Pivot with MAX and DECODE functions

WITH x AS (
SELECT d.loc, e.ename, e.hiredate, 
       ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
)
SELECT loc, 
       MAX(DECODE(rn,1,ename)) ename1, 
       MAX(DECODE(rn,1,hiredate)) hiredate1,
       MAX(DECODE(rn,2,ename)) ename2, 
       MAX(DECODE(rn,2,hiredate)) hiredate2
FROM x
GROUP BY loc
ORDER BY 1;

Solution #6. Using CONNECT BY

WITH x AS (
SELECT d.loc, e.ename, e.hiredate, 
       ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
)
SELECT loc, PRIOR ename ename1, PRIOR hiredate hiredate1, 
       ename ename2, hiredate hriedate2
FROM x
WHERE rn=2
START WITH rn=1
CONNECT BY loc=PRIOR loc
       AND rn=PRIOR rn+1;

Solution #7. Using LEAD and ROW_NUMBER Analytic functions

WITH x AS (
SELECT d.loc, e.ename ename1, e.hiredate hiredate1, 
 LEAD(e.ename,1) OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) ename2,
 LEAD(e.hiredate,1) OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) hiredate2,
 ROW_NUMBER()OVER(PARTITION BY d.deptno ORDER BY e.hiredate DESC) rn
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
) 
SELECT loc, ename1, hiredate1, ename2, hiredate2
FROM x
WHERE rn=1
ORDER BY 1;

Solution #8. Using Model Clause:

SELECT loc, ename1, hiredate1, ename2, hiredate2
FROM scott.emp e JOIN scott.dept d ON e.deptno=d.deptno
MODEL
RETURN UPDATED ROWS
PARTITION BY (d.loc)
DIMENSION BY (
   ROW_NUMBER()OVER(PARTITION BY d.loc ORDER BY e.hiredate DESC) AS rn
)
MEASURES(
    ename AS ename1, hiredate AS hiredate1, 
    ename AS ename2, hiredate AS hiredate2
)
RULES(
    ename2[1]   =ename1[2],
    hiredate2[1]=hiredate1[2]
)
ORDER BY 1;

You can execute the above SQL statements in Oracle Live SQL environment.
My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

3 Solutions to 2018 Oracle SQL Puzzle of the Week #2

2018 Puzzle of the Week #2:

For each of the following salary ranges select two randomly chosen employees:
0-999
1000-1999
2000-2999
3000+

Expected Result (in SQL*Plus):

ENAME      SAL        RANGE
---------- ---------- ---------
SCOTT            3000 3000+
FORD             3000 3000+
BLAKE            2850 2000-2999
CLARK            2450 2000-2999
TURNER           1500 1000-1999
MILLER           1300 1000-1999
JAMES             950 0-999
SMITH             800 0-999
  • Remember to use only a single SELECT statement.
  • Use table emp (from Oracle scott schema)

Solutions:

Solution #1: Using ROW_NUMBER with random.value functions:

We are applying a random sorting order to each of the salary ranges and take 2 top records from each range:

WITH x AS (
SELECT CASE WHEN sal<=999  THEN '0-999'
            WHEN sal<=1999 THEN '1000-1999'
            WHEN sal<=2999 THEN '2000-2999'
            ELSE                '3000+'
       END range,
       ename, sal
FROM emp
), y AS (
SELECT ename, sal, range, 
       ROW_NUMBER()OVER(PARTITION BY range 
                        ORDER BY dbms_random.value) rn
FROM x
)
SELECT range, ename, sal
FROM y
WHERE rn<=2
ORDER BY range

Result:

RANGE     ENAME      SAL
--------- ---------- ----------
0-999     JAMES      950
0-999     SMITH      800
1000-1999 WARD 1250
1000-1999 TURNER 1500
2000-2999 JONES 2975
2000-2999 CLARK 2450
3000+     FORD 3000
3000+     KING 5000

Result (of subsequent execution):

RANGE     ENAME             SAL
--------- ---------- ----------
0-999     SMITH             800
0-999     JAMES             950
1000-1999 WARD             1250
1000-1999 MARTIN           1250
2000-2999 BLAKE            2850
2000-2999 JONES            2975
3000+     SCOTT            3000
3000+     KING             5000

Solution #2: Using DECODE, MAX() KEEP and UNION ALL:

Instead of taking top 2 records (randomly sorted), we are taking top 1 and bottom 1 and combine them together. DECODE function mimics the CASE from the previous solution.

WITH x AS (
SELECT DECODE(1, SIGN(999-sal), '0-999', SIGN(1999-sal), '1000-1999',
                 SIGN(2999-sal), '2000-2999', '3000+') range,
       ename, sal, ROWNUM || dbms_random.value rnd
FROM scott.emp
)
SELECT range, MAX(ename)KEEP(DENSE_RANK FIRST ORDER BY rnd) ename,
              MAX(sal)  KEEP(DENSE_RANK FIRST ORDER BY rnd) sal
FROM x
GROUP BY range
UNION ALL
SELECT range, MAX(ename)KEEP(DENSE_RANK LAST ORDER BY rnd) ename,
              MAX(sal)  KEEP(DENSE_RANK LAST ORDER BY rnd) sal
FROM x
GROUP BY range
ORDER BY range

Result:

RANGE     ENAME             SAL
--------- ---------- ----------
0-999     JAMES             950
0-999     SMITH             800
1000-1999 MARTIN           1250
1000-1999 WARD             1250
2000-2999 JONES            2975
2000-2999 BLAKE            2850
3000+     FORD             3000
3000+     KING             5000

Note, that we concatenated ROWNUM with dbms_random.value to produce UNIQUE random value. Without ROWNUM (or any other KEY) there is always a chance that dbms_random.value will repeat on different rows and hence top and bottom values could be mixed and the same employee will be repeated twice.

Solution #3: Using SIN for random value simulation and multi-column UNPIVOT with MAX() KEEP function:

Instead of combining top and bottom records from two statements using UNION ALL, here were calculating top and bottom values as 1 record and UNPIVOT them to produce two rows per salary range:

WITH x AS (
SELECT DECODE(1, SIGN(999-sal), '0-999', SIGN(1999-sal), '1000-1999',
                 SIGN(2999-sal), '2000-2999', '3000+') range,
       ename, sal, 
       SIN(ROWNUM*TO_NUMBER(SUBSTR(
                             extract(second 
                                     from current_timestamp),-3))
           ) rnd
FROM scott.emp
), y AS (
SELECT range, MAX(ename)KEEP(DENSE_RANK FIRST ORDER BY rnd) ename1,
              MAX(sal)  KEEP(DENSE_RANK FIRST ORDER BY rnd) sal1,
              MAX(ename)KEEP(DENSE_RANK LAST ORDER BY rnd) ename2,
              MAX(sal)  KEEP(DENSE_RANK LAST ORDER BY rnd) sal2
FROM x
GROUP BY range
)
SELECT range, ename, sal
FROM y
UNPIVOT (
  (ename, sal) for (t1, t2) in ((ename1,sal1), (ename2,sal2))
)
ORDER BY range

Result:

RANGE     ENAME             SAL
--------- ---------- ----------
0-999     SMITH             800
0-999     JAMES             950
1000-1999 MILLER           1300
1000-1999 MARTIN           1250
2000-2999 CLARK            2450
2000-2999 BLAKE            2850
3000+     FORD             3000
3000+     SCOTT            3000

Note the use of multi-column UNPIVOT. Randomization simulation is based on a fairly random selection of the last 3 digits in the current timestamp’s second value. This number is used as a “seed”. When this seed is multiplied by the rownum, the result is used as a SIN function argument which makes the outcome pseudo-random.

You can execute the above SQL statements in Oracle Live SQL environment.

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

9 Solutions to Puzzle of the Week #21

Puzzle of the Week #21:

Produce a report that shows employee name, his/her immediate manager name, and the next level manager name. The following conditions should be met:

  • Use Single SELECT statement only
  • Use mgr column to identify employee’s immediate manager
  • The query should work in Oracle 11g.
  • A preferred solution should use only a single instance of emp table.

Expected Result:

NAME1      NAME2      NAME3
---------- ---------- ------
SMITH      FORD       JONES
ALLEN      BLAKE      KING
WARD       BLAKE      KING
JONES      KING
MARTIN     BLAKE      KING
BLAKE      KING
CLARK      KING
SCOTT      JONES      KING
KING
TURNER     BLAKE      KING
ADAMS      SCOTT      JONES
JAMES      BLAKE      KING
FORD       JONES      KING
MILLER     CLARK      KING

Solutions:

#1. Using connect_by_root, sys_connect_by_path, and regexp_substr functions

col name1 for a10
col name2 for a10
col name3 for a10
WITH x AS(
SELECT CONNECT_BY_ROOT(ename) name,
       SYS_CONNECT_BY_PATH(ename, ',') path,
       CONNECT_BY_ROOT(empno) empno
FROM emp
WHERE LEVEL<=3
CONNECT BY empno=PRIOR mgr
)
SELECT name, REGEXP_SUBSTR(MAX(path), '[^,]+', 1, 2) name2,
             REGEXP_SUBSTR(MAX(path), '[^,]+', 1, 3) name3
FROM x
GROUP BY name, empno
ORDER BY empno;

#2. Using CONNECT BY twice

WITH x AS (
SELECT ename, PRIOR ename mname, empno, mgr
FROM emp
WHERE LEVEL=2 OR mgr IS NULL
CONNECT BY PRIOR empno=mgr
)
SELECT ename name1, mname name2, MAX(PRIOR mname) name3
FROM x
WHERE LEVEL<=2
CONNECT BY PRIOR empno=mgr
GROUP BY ename, mname, empno
ORDER BY empno

#3. Using CONNECT BY and Self Outer Join

WITH x AS (
SELECT ename, PRIOR ename mname, PRIOR mgr AS mgr, empno
FROM emp
WHERE LEVEL=2 OR mgr IS NULL
CONNECT BY PRIOR empno=mgr
)
SELECT x.ename name1, x.mname name2, e.ename name3
FROM x LEFT JOIN emp e ON x.mgr=e.empno
ORDER BY x.empno

#4. Using 2 Self Outer Joins

SELECT a.ename name1, b.ename name2, c.ename name3
FROM emp a LEFT JOIN emp b ON a.mgr=b.empno
           LEFT JOIN emp c ON b.mgr=c.empno
ORDER BY a.empno

#5. Using CONNECT BY and PIVOT

SELECT name1, name2, name3
FROM (
SELECT ename, LEVEL lvl, CONNECT_BY_ROOT(empno) empno
FROM emp
WHERE LEVEL<=3
CONNECT BY empno=PRIOR mgr
)
PIVOT(
MAX(ename)
FOR lvl IN (1 AS name1, 2 AS name2, 3 AS name3)
)
ORDER BY empno;

#6. PIVOT Simulation

WITH x AS (
SELECT ename, LEVEL lvl, CONNECT_BY_ROOT(empno) empno
FROM emp
WHERE LEVEL<=3
CONNECT BY empno=PRIOR mgr
)
SELECT MAX(DECODE(lvl, 1, ename)) name1,
       MAX(DECODE(lvl, 2, ename)) name2,
       MAX(DECODE(lvl, 3, ename)) name3
FROM x
GROUP BY empno
ORDER BY empno;

#7. Using CONNECT BY and no WITH/Subqueries (Credit to Krishna Jamal)

SELECT ename Name1, PRIOR ename Name2,
DECODE(LEVEL, 
    3, CONNECT_BY_ROOT(ename), 
    4, TRIM(BOTH ' ' FROM 
        REPLACE(
            REPLACE(SYS_CONNECT_BY_PATH(PRIOR ename, ' '), PRIOR ename), 
        CONNECT_BY_ROOT(ename)))
        ) Name3
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr
ORDER BY empno;

#8. A composition of Methods 1 and 7:

SELECT ename Name1, PRIOR ename Name2,
       CASE WHEN LEVEL IN (3,4) 
          THEN REGEXP_SUBSTR(SYS_CONNECT_BY_PATH(ename, ','),'[^,]+',1,LEVEL-2) 
       END AS Name3
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr
ORDER BY empno;

#9. Using NTH_VALUE Analytic function (Oracle 11.2 and up):

WITH x AS (
SELECT CONNECT_BY_ROOT(ename) n1, CONNECT_BY_ROOT(empno) empno,
 NTH_VALUE(ename, 2) OVER(PARTITION BY CONNECT_BY_ROOT(ename) ORDER BY LEVEL) n2,
 NTH_VALUE(ename, 3) OVER(PARTITION BY CONNECT_BY_ROOT(ename) ORDER BY LEVEL) n3
FROM emp
WHERE LEVEL<=3
CONNECT BY empno=PRIOR mgr
)
SELECT n1 name1, MAX(n2) name2, MAX(n3) name3
FROM x
GROUP BY n1, empno
ORDER BY empno

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

3 Solutions to Puzzle of the Week #20

Puzzle of the Week #20:

Produce the historical highest/lowest salary report that should comply with the following requirements:

  • Use Single SELECT statement only
  • Only employees who was paid the highest or lowest salary in their respective department at the moment of hiring should be selected
  • Show name, date of hire, department number, job title, salary table (emp) columns and two additional calculated columns/flags: min_flag and max_flag to indicate that the employee was hired with the min/max salary in their respective department as of the time of hiring.
  • If two or more employees in the same department are paid the same max/min salary, only the one who was hired first should be picked for the report.
  • The query should work in Oracle 11g.

Expected Result:

POW20ER

#1. Using Common Table Expression (CTE) or Recursive WITH clause

WITH y AS (
SELECT ename, job, deptno, hiredate, sal, 
       ROW_NUMBER()OVER(PARTITION BY deptno ORDER BY hiredate) rn
FROM emp
), x (ename, job, deptno, hiredate, sal, min_sal, max_sal, min_flag, max_flag, rn) AS (
SELECT ename, job, deptno, hiredate, sal, sal, sal, 1, 1, 1
FROM y
WHERE rn=1
UNION ALL
SELECT y.ename, y.job, y.deptno, y.hiredate, y.sal, 
       LEAST(x.min_sal, y.sal), GREATEST(x.max_sal, y.sal),
       CASE WHEN y.sal<x.min_sal THEN 1 END, 
       CASE WHEN y.sal>x.max_sal THEN 1 END, y.rn
FROM y JOIN x ON y.deptno=x.deptno AND y.rn=x.rn+1
)
SELECT ename, job, deptno, hiredate, sal, min_flag, max_flag
FROM x
WHERE 1 IN (min_flag, max_flag)
ORDER BY deptno, hiredate;

#2. Using Cumulative Analytic Functions MIN, MAX, and ROW_NUMBER

WITH x AS (
SELECT ename, job, deptno, hiredate, sal,
       MIN(sal)OVER(PARTITION BY deptno ORDER BY hiredate) min_sal,
       MAX(sal)OVER(PARTITION BY deptno ORDER BY hiredate) max_sal,
       ROW_NUMBER()OVER(PARTITION BY deptno, sal ORDER BY hiredate) rn
FROM emp
)
SELECT ename, job, deptno, hiredate, sal,
       DECODE(sal, min_sal, 1) min_flag,
       DECODE(sal, max_sal, 1) max_flag
FROM x
WHERE sal IN (min_sal, max_sal)
  AND rn=1;

#3. Using Cumulative Analytic Functions MIN, MAX, and COUNT

WITH x AS (
SELECT ename, job, deptno, hiredate, sal,
       CASE WHEN MIN(sal)OVER(PARTITION BY deptno ORDER BY hiredate)=sal
             AND COUNT(*)OVER(PARTITION BY deptno, sal ORDER BY hiredate)=1 THEN 1 
       END min_flag,
       CASE WHEN MAX(sal)OVER(PARTITION BY deptno ORDER BY hiredate)=sal
             AND COUNT(*)OVER(PARTITION BY deptno, sal ORDER BY hiredate)=1 THEN 1 
       END max_flag
FROM emp
)
SELECT *
FROM x
WHERE 1 IN (min_flag, max_flag);

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

 

 

Interview Question: get 2 random employees for each salary range?

Interview Question

Level: Intermediate/Advanced

For each of the following salary ranges select two randomly chosen employees:

0-999
1000-1999
2000-2999
3000+

Expected Result:

ENAME             SAL RANGE
---------- ---------- ---------
SCOTT            3000 3000+
FORD             3000 3000+
BLAKE            2850 2000-2999
CLARK            2450 2000-2999
TURNER           1500 1000-1999
MILLER           1300 1000-1999
JAMES             950 0-999
SMITH             800 0-999

Solution:

WITH x AS (
SELECT ename, sal,
       CASE WHEN sal>=3000 THEN '3000+'
            WHEN sal>=2000 THEN '2000-2999'
            WHEN sal>=1000 THEN '1000-1999'
            ELSE                '0-999'
       END as range,
       ROW_NUMBER() OVER(PARTITION BY DECODE(GREATEST(sal, 3000), sal, 0, 1) +
                                      DECODE(GREATEST(sal, 2000), sal, 0, 1) +
                                      DECODE(GREATEST(sal, 1000), sal, 0, 1)
                         ORDER BY DBMS_RANDOM.VALUE) rn
FROM emp
)
SELECT ename, sal, range
FROM x
WHERE rn<=2
ORDER BY sal DESC

 

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Two ways to build a salary range report without using CASE function

Interview Question: Produce a salary range report with a single SELECT statement. Decode function is allowed, CASE function – is not.

Level: Intermediate

Expected Result:

RANGE                             Employees
-------------------------------- ----------
0-999                                     2
1000-2999                                 9
3000-5999                                 3

Strategy #1:

SELECT COALESCE(DECODE(LEAST(sal, 999), sal, '0-999'),
                DECODE(LEAST(sal, 2999), GREATEST(sal, 1000), '1000-2999'),
                DECODE(LEAST(sal, 9999), GREATEST(sal, 3000), '3000-5999')
                ) AS range,
       COUNT(*) "Employees"
FROM emp
GROUP BY COALESCE(DECODE(LEAST(sal, 999), sal, '0-999'),
                  DECODE(LEAST(sal, 2999), GREATEST(sal, 1000), '1000-2999'),
                  DECODE(LEAST(sal, 9999), GREATEST(sal, 3000), '3000-5999')
                  )
ORDER BY 1

Explanation:

In Oracle SQL terms, a mathematical condition

a <=x <=b

can be  interpreted as

x BETWEEN a AND b

however, this condition is good only for CASE function, and not for DECODE. The trick is to use another interpretation:

LEAST(b,x)=GREATEST(x,a)

– that can be used in DECODE.

CASE-based Solution:

SELECT CASE WHEN sal<=999 THEN '0-999'
            WHEN sal BETWEEN 1000 AND 2999 THEN '1000-2999'
            WHEN sal BETWEEN 3000 AND 5999 THEN '3000-5999'
       END AS range,
       COUNT(*) "Employees"
FROM emp
GROUP BY CASE WHEN sal<=999 THEN '0-999'
              WHEN sal BETWEEN 1000 AND 2999 THEN '1000-2999'
              WHEN sal BETWEEN 3000 AND 5999 THEN '3000-5999'
         END
ORDER BY 1

Strategy #2:

WITH x AS (
SELECT DECODE(1, (SELECT COUNT(*) FROM dual WHERE emp.sal<=999), '0-999',
                 (SELECT COUNT(*) FROM dual WHERE emp.sal BETWEEN 1000 AND 2999), '1000-2999',
                 (SELECT COUNT(*) FROM dual WHERE emp.sal BETWEEN 3000 AND 5999), '3000-5999'
             ) AS range
FROM emp
)
SELECT range, COUNT(*) AS "Employees"
FROM x
GROUP BY range
ORDER BY 1

Explanation:
This query demonstrates how to mimic CASE function using DECODE and in-line scalar subquery from dual.

Suggested further reading:

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions. The book is also available on Amazon and in all major book stores.

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

How to split numeric and text values into separate columns

Puzzle: There is a table T with a single column C VARCHAR2(20) that contains random values. Some of the values are numeric. Write a single SELECT statement that outputs 2 columns: NUM and TEXT with numeric and non-numeric values correspondingly.

To mimic the T table, we will create a view:

CREATE OR REPLACE VIEW T
AS
WITH x AS (
SELECT CASE WHEN MOD(level,2)=0 THEN dbms_random.string('x',3)
            ELSE TO_CHAR(TRUNC(dbms_random.VALUE(-999,999)))
       END rnd,
       ROW_NUMBER()OVER(ORDER BY dbms_random.value) rk
FROM dual
CONNECT BY ROWNUM<=40
)
SELECT CAST(rnd AS VARCHAR2(4)) C
FROM x
WHERE rk<=10;

Expected Result:
Due to the random nature of the values in T view, actual results will be different every time you run a query; however, all the results will look somewhat like this:

NUM  TEXT
---- ----
-146 4R9
-362 78R
-762 ICY
236  U3W
     VIK
     Y21

Solution:

WITH x AS (
SELECT c, CASE WHEN REGEXP_LIKE(c,'^-?[[:digit:]]+$') THEN 1 ELSE 0 END is_int,
       RANK()OVER(PARTITION BY CASE WHEN REGEXP_LIKE(c,'^-?[[:digit:]]+$') THEN 1 ELSE 0 END ORDER BY c) rk
FROM t	   
)
SELECT MAX(DECODE(is_int, 1, c)) NUM,
       MAX(DECODE(is_int, 0, c)) TEXT
FROM x
GROUP BY rk
ORDER BY rk

Explanation:

The above solution uses regular expression for identifying numeric integer values (positive and negative):

^-?[[:digit:]]+$

The way to break down the results into two columns was demonstrated in my previous post: Generate a department/employee roll report

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

 

Interview Question: Show Location for every employee without using joins

Interview Question: Show Location for every employee without using joins

Level: Intermediate

Expected Result:

ENAME          DEPTNO LOCATION
---------- ---------- ---------
CLARK              10 NEW YORK
KING               10 NEW YORK
MILLER             10 NEW YORK
ADAMS              20 DALLAS
FORD               20 DALLAS
JONES              20 DALLAS
SCOTT              20 DALLAS
SMITH              20 DALLAS
ALLEN              30 CHICAGO
BLAKE              30 CHICAGO
JAMES              30 CHICAGO
MARTIN             30 CHICAGO
TURNER             30 CHICAGO
WARD               30 CHICAGO

Method/Workaround #1: Use Aggregation over Cartesian Product

SELECT e.ename, e.deptno, MAX(DECODE(e.deptno,d.deptno, d.loc)) location
FROM emp e, dept d
GROUP BY e.ename, e.deptno
ORDER BY 2,1;

Method/Workaround #2: Use In-Line Scalar Subquery

SELECT e.ename, e.deptno, 
       (SELECT loc FROM dept d WHERE deptno=e.deptno) location
FROM emp e
ORDER BY 2,1;

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.