Substitution SQL Puzzle

May 17, 2020Zahar HilkevichLeave a comment

Level: Advanced

A colleague of mine approached me recently with a puzzle he struggled with: you have a table (let’s call it data_table) with id and val (i.e. value) columns. You are given two parameters: value_to_overwrite and value_to_use that should transform the content of the data_table in a special way:

If both parameters exist in the data_table in the val column for the same id, then the one that is equal to value_to_overwrite should be substituted with value_to_use
If none or just one of the parameters exist in the data_table.val column, than the val column should remain the same
List all the rows from the data_table after the transformation.

Let’s create the data_table using the following DDL command:

CREATE TABLE data_table AS
SELECT 1 id, 'a' val FROM dual
UNION ALL
SELECT 1 id, 'b' val FROM dual
UNION ALL
SELECT 1 id, 'c' val FROM dual
UNION ALL
SELECT 2 id, 'b' val FROM dual
UNION ALL
SELECT 2 id, 'd' val FROM dual

ID	VAL
1	a
1	b
1	c
2	b
2	d

For parameters value_to_overwrite = ‘a’ and value_to_use = ‘b’ the expected result should look like this:

ID	ORIGINAL_VALUE	NEW_VALUE
1	a	a
1	b	a
1	c	c
2	b	b
2	d	d

Note, that for id = 1, value ‘b’ is substituted with new value ‘a’ because both, value_to_overwrite (‘a’) and value_to_use (‘b’) exist in the val column. All other values should remain the same as substitution condition is not met.

To mimic the parameter use in the query we will create another table (rule_table) with a single row in it.

CREATE TABLE rule_table AS
SELECT 'a' value_to_use, 'b' value_to_overwrite
FROM dual

Translating requirements from English to SQL will likely result in a bulky and inefficient query. Let’s demonstrate that:

/* Values that need to be substituted */
SELECT d.id, d.val AS original_value, r.value_to_use AS new_value
FROM data_table d JOIN rule_table r ON d.val = r.value_to_overwrite
WHERE r.value_to_use IN (SELECT val
                         FROM data_table
                         WHERE id = d.id)
UNION ALL
/* Values that remain the same as only value_to_overwrite exist for given id */
SELECT d.id, d.val, d.val
FROM data_table d JOIN rule_table r ON d.val = r.value_to_overwrite
WHERE r.value_to_use NOT IN (SELECT val
                             FROM data_table
                             WHERE id = d.id)
UNION ALL
/* Values that remain the same as value_to_overwrite does not match val */
SELECT d.id, d.val, d.val
FROM data_table d
WHERE d.val NOT IN (SELECT value_to_overwrite
                    FROM rule_table)

As you can see, there are multiple (five) copies of the data_table used, which will lead to a poor performance when the size of the table increases dramatically.

A way better approach is to take the first SELECT from the UNIONed statement above and turn the INNER JOIN into an LEFT OUTER JOIN. At the same time, we need to move the filtering condition from the WHERE clause to the JOIN (otherwise, the LEFT JOIN will work as INNER JOIN):

SELECT d.id,
       d.val                      AS original_value,
       NVL(r.value_to_use, d.val) AS new_value
FROM data_table d LEFT JOIN rule_table r 
                  ON d.val = r.value_to_overwrite
                 AND r.value_to_use IN (SELECT val
                                        FROM data_table
                                        WHERE id = d.id)

This is a quite efficient and fairly short query that uses only two copies of the data_table. Can we do better than that? Yes, we can!

WITH x AS (
SELECT id, val,
       MIN(CASE WHEN val IN (value_to_use, value_to_overwrite) 
                THEN val 
           END)
       OVER(PARTITION BY id, value_to_overwrite)  min_val,
       MAX(CASE WHEN val IN (value_to_use, value_to_overwrite) 
                THEN val 
           END)
       OVER(PARTITION BY id, value_to_overwrite)  max_val,
       LEAST(value_to_use, value_to_overwrite)    min_ow,
       GREATEST(value_to_use, value_to_overwrite) max_ow,
       value_to_use, value_to_overwrite
FROM data_table CROSS JOIN rule_table 
)
SELECT id, val AS original_value,
       CASE WHEN min_val=min_ow AND
                 max_val=max_ow AND
                 val=value_to_overwrite THEN value_to_use
       ELSE val
       END AS new_value
FROM x

Analytic functions MIN and MAX let us scan the data_table vertically while LEAST and GREATEST do the same horizontally. The later pair of functions come very handy when you need to compare pairs of values, so the smaller of the values should match LEAST and the other – GREATEST.

And still, the last strategy has one flaw: we used a Cartesian Product (CROSS JOIN) which means that had we have more than one substitution rule, the method would not work properly. Let’s fix it.

First, we will add one more rule:

INSERT INTO rule_table VALUES('b', 'c')

Now, the expected result should looks as the following:

ID	ORIGINAL_VALUE	NEW_VALUE
1	a	a
1	b	a
1	c	b
2	b	b
2	d	d

Note, that the second rule turns original ‘c’ value into ‘b’.

And again, Analytic functions do all the magic:

WITH x AS (
SELECT id, val, value_to_overwrite, value_to_use,
       LEAST(value_to_overwrite, value_to_use) || '|' ||
       GREATEST(value_to_overwrite, value_to_use) rule_vals,
       LISTAGG(DISTINCT val, '|') WITHIN GROUP(ORDER BY val)
       OVER(PARTITION BY id) vals
FROM data_table LEFT JOIN rule_table ON val = value_to_overwrite
)
SELECT id, val AS original_value,
       CASE WHEN value_to_overwrite IS NULL THEN val
            WHEN INSTR(vals, rule_vals)=0 THEN val
            ELSE value_to_use
       END     AS new_value
FROM x

This time, LISTAGG analytic function (with DISTINCT option – recently supported by Oracle) helps matching the val against value_to_overwrite and value_to_use pair.

I strongly recommend executing parts of the above queries to gain a better understanding of the demonstrated strategies. livesql.oracle.com site offers you a great query tool with the latest version of Oracle database.

***

If you find this post useful, please press the LIKE button and subscribe.

My Oracle Group on Facebook:

Also, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more clever tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

How to pass arbitrary number of arguments to a PL/SQL procedure?

October 16, 2018Zahar HilkevichLeave a comment

I was looking for an answer to this question for quite some time and ended up developing my own approach.

As of version 12.1, Oracle does not offer this feature which is widely available in most modern procedural languages. I found only one Oracle feature that somehow resembles the one in question – a built-in default constructor for PL/SQL collection types. We can pass arbitrary number of values into such constructors.

Here is a simple example:

DECLARE
   TYPE vc_table IS TABLE OF VARCHAR2(30);
   v_table vc_table;
BEGIN
   v_table:=vc_table('ABC', 'DEF', 'GHI');
   FOR i IN 1..v_table.COUNT LOOP
       DBMS_OUTPUT.PUT_LINE(v_table(i));
   END LOOP;
END;
/

Result:

ABC
DEF
GHI

The line that initializes v_table variable references a constructor that takes 3 values as arguments. It can take more (or less) values as well.

How could we exploit this feature to accept variable number of arguments in our procedures/functions?

The problem with the above example is that we have to have a collection type before we can use it and this would make our procedure/function dependent on such custom type.

A necessary help comes from Oracle’s built-in collection types:

sys.odcivarchar2list, sys.odcinumberlist, etc.

Let say we need to mimic Oracle’s built-in GREATEST function for a variable number of numeric arguments. Here is how we could use sys.odcinumberlist type:

CREATE OR REPLACE FUNCTION my_greatest(p_list sys.odcinumberlist)
    RETURN NUMBER
AS
    v_result NUMBER;
BEGIN
    SELECT CASE WHEN SUM(NVL2(COLUMN_VALUE,0,1))>0 THEN TO_NUMBER(NULL)
                ELSE MAX(COLUMN_VALUE)
           END
     INTO v_result
    FROM TABLE(p_list);
    RETURN v_result;
END;
/

Remember, the GREATEST function returns NULL if at least one of its arguments is NULL. That’s why we need to check for NULLs in the p_list collection.

Here is how we could test the function:

SELECT my_greatest(sys.odcinumberlist(45,2,46,65,2,1,0)) "greatest",
       my_greatest(sys.odcinumberlist(45,2,NULL,65,2,1)) "null_greatest"
FROM dual

Result:

greatest	null_greatest
65	–

The use of sys.odcinumberlist constructor is not very elegant as the data type name is very long, but it does do the trick. You can pass as many arguments to the constructor as you wish. To make things look a bit prettier, we can create a short synonym:

CREATE OR REPLACE SYNONYM nl FOR sys.odcinumberlist
/

Now, the last (testing) query will transform to the following:

SELECT my_greatest(nl(45,2,46,65,2,1,0)) "greatest",
       my_greatest(nl(45,2,NULL,65,2,1)) "null_greatest"
FROM dual

It still does not look like true “parameter array” with the arbitrary length, but it is very close.

The following is a short list of Oracle’s built-in collection types that you can use for mimicking “arbitrary number of arguments”:

sys.odcidatelist
sys.odciobjectlist
sys.odcirawlist
sys.odcinumberlist
sys.odcivarchar2list

For anything more complex, you may need to create your own collection type.

If you find this post useful, please press the LIKE button and subscribe.

My Oracle Group on Facebook:

Also, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more clever tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

3 Solutions to 2018 Oracle SQL Puzzle of the Week #15

May 7, 2018May 7, 2018Zahar HilkevichLeave a comment

800 Phone Puzzle

For a given 800 phone number (like 1-800-123-4567) find all number-letter representations.

Use a single SELECT statement only.
Only last 4 digits of the phone number have to be replaced with letters.
Exactly 1 letter (out of 4) must be vowel, the rest – consonant
The following table shows all possible mappings:

Digit	Maps to
1	1
2	A, B, C
3	D, E, F
4	G, H, I
5	J, K, L
6	M, N, O
7	P, Q, R, S
8	T, U, V
9	W, X, Y, Z
0	0

Solutions:

Essentially, all solutions below share the same idea of generating the letter based phone numbers. The differences are in a way the mapping CTE is created and a way to limit the number of vowels to 1.

Solution #1. Compact form of creating the map CTE with recursive check for the vowels:

WITH map AS (
SELECT digit, letter, '4357' phone
FROM TABLE(sys.odcivarchar2list('00','11','2ABC','3DEF','4GHI',
                                '5JKL','6MNO','7PQRS','8TUV','9WXYZ')) t,
 LATERAL(SELECT SUBSTR(t.column_value,1,1) digit, 
                SUBSTR(t.column_value,1+LEVEL,1) letter
         FROM dual
         CONNECT BY SUBSTR(t.column_value,1+LEVEL,1) IS NOT NULL) x
), res(str, lvl, phone,has_vowel) AS ( 
SELECT letter, 1, phone, 
 CASE WHEN letter IN ('A','E','I','O','U') THEN 1 ELSE 0 END
FROM map 
WHERE SUBSTR(phone,1,1)=TO_CHAR(map.digit) 
UNION ALL 
SELECT res.str || letter, res.lvl+1, res.phone,
       CASE WHEN letter IN ('A','E','I','O','U') 
               OR res.has_vowel=1 THEN 1 ELSE 0 END
FROM res JOIN map ON SUBSTR(res.phone, res.lvl+1,1)=TO_CHAR(map.digit) 
WHERE res.lvl+1<=LENGTH(res.phone) 
  AND NOT (letter IN ('A','E','I','O','U') AND res.has_vowel=1)
) 
SELECT '1-800-123-' || str phone 
FROM res
WHERE lvl=LENGTH(phone)
  AND has_vowel=1

Solution #2. Using more efficient way of creating the map CTE :

WITH x AS (
SELECT ROWNUM-1 digit,COLUMN_VALUE letters
FROM TABLE(sys.odcivarchar2list('0','1','ABC','DEF','GHI','JKL',
                                'MNO','PQRS','TUV','WXYZ'))
), map AS (
SELECT digit, SUBSTR(letters, level, 1) letter, '4357' phone
FROM x
CONNECT BY SUBSTR(letters, level, 1) IS NOT NULL
       AND PRIOR digit = digit 
       AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL
), res(str, lvl, phone,has_vowel) AS ( 
SELECT letter, 1, phone, 
       CASE WHEN letter IN ('A','E','I','O','U') THEN 1 ELSE 0 END
FROM map 
WHERE SUBSTR(phone,1,1)=TO_CHAR(map.digit) 
UNION ALL 
SELECT res.str || letter, res.lvl+1, res.phone,
       CASE WHEN letter IN ('A','E','I','O','U') 
              OR res.has_vowel=1 THEN 1 ELSE 0 END
FROM res JOIN map ON SUBSTR(res.phone, res.lvl+1,1)=TO_CHAR(map.digit) 
WHERE res.lvl+1<=LENGTH(res.phone) 
  AND NOT (letter IN ('A','E','I','O','U') AND res.has_vowel=1)
) 
SELECT '1-800-123-' || str phone 
FROM res
WHERE lvl=LENGTH(phone)
 AND has_vowel=1

Solution #3. Much more efficient way of creating the map CTE and using Regular Expression to limit the vowels :

WITH d AS ( 
SELECT LEVEL+1 n, CASE WHEN LEVEL+1 IN (7,9) THEN 4 ELSE 3 END cnt,
       '4357' phone
FROM dual 
CONNECT BY LEVEL<=8 
), a AS ( 
SELECT CHR(ASCII('A')+LEVEL-1) letter, ROWNUM rn 
FROM dual 
CONNECT BY CHR(ASCII('A')+LEVEL-1)<='Z' 
), x AS ( 
SELECT n, 
       1+NVL(SUM(cnt) OVER(ORDER BY n ROWS BETWEEN UNBOUNDED PRECEDING 
                                           AND 1 PRECEDING),0) c1, 
       SUM(cnt) OVER(ORDER BY n) c2,
       phone
FROM d 
), map AS ( 
SELECT n digit, letter, x.phone
FROM x JOIN a ON a.rn BETWEEN x.c1 AND x.c2 
UNION 
SELECT ROWNUM-1, TO_CHAR(ROWNUM-1), x.phone
FROM x
WHERE ROWNUM<=2
), res(str, lvl) AS ( 
SELECT letter, 1 
FROM map 
WHERE SUBSTR(map.phone,1,1)=TO_CHAR(map.digit) 
UNION ALL 
SELECT res.str || letter, res.lvl+1
FROM res JOIN map ON SUBSTR(map.phone, res.lvl+1,1)=TO_CHAR(map.digit) 
WHERE res.lvl+1<=LENGTH(map.phone) 
 AND REGEXP_COUNT(res.str || letter,'[AEIOU]')<=1
) 
SELECT str phone 
FROM res 
WHERE lvl=4
 AND REGEXP_COUNT(str,'[AEIOU]')=1

You can execute the above SQL statements in Oracle Live SQL environment.
My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Tip of the day: use LNNVL function

April 20, 2018April 20, 2018Zahar Hilkevich1 Comment

Recently, I came across LNNVL Oracle function and decided to assess its usability.

According to Oracle documentation, it “provides a concise way to evaluate a condition when one or both operands of the condition may be null.”

Let’s see a couple of problems this function can be applied to:

For each department count the number of employees who get no commission.

There are over 10 possible solutions that I am aware of, I will focus only on a couple where we can leverage LNNVL function.

Solution #1. Filter with LNNVL in WHERE clause:

SELECT deptno, COUNT(*) cnt 
FROM scott.emp 
WHERE LNNVL(comm>0)
GROUP BY deptno
ORDER BY 1;

Result:

DEPTNO        CNT
------ ----------
    10          3
    20          5
    30          3

In the above query, LNNVL(comm>0) filter is equivalent to: (comm<=0 OR comm IS NULL) and since comm cannot be negative, we can say that it is the same as NVL(comm,0)=0.

According to that same Oracle documentation, LNNVL “can be used only in the WHERE clause of a query.” This does not seem to be true, at least in 12g+ releases:

Solution #2. Using LNNVL in conditional aggregation:

SELECT deptno, SUM(CASE WHEN LNNVL(comm>0) THEN 1 ELSE 0 END) cnt
FROM scott.emp 
GROUP BY deptno
ORDER BY 1;

LNNVL can also be used with IN operator. We will illustrate it with a solution (one of the many) to the following problem:

List all employees who is not a manager of somebody else.

SELECT empno, ename, deptno, job, mgr
FROM scott.emp
WHERE LNNVL(empno IN (SELECT mgr FROM scott.emp))
ORDER BY 1;

The following boolean expression defines a condition to see managers only:

empno IN (SELECT mgr FROM scott.emp)

We need the opposite one that handles NULLs – as mgr is a nullable column. And this is where LNNVL helps us.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

7 Solutions to 2018 Oracle SQL Puzzle of the Week #13

April 17, 2018April 17, 2018Zahar HilkevichLeave a comment

Second Top Employee as of the Start of Employment

List all employees who were 2nd top paid in the entire company as of the time their employment started

Use a single SELECT statement only.
At the time of employment start the rank of the employee by salary should be 2.
Show the top salary at the time when the employee started with the company.
We assume that no employees have ever been terminated since day 1.

Expected Result:

ENAME	JOB	SAL	HIREDATE	MAX_SAL
WARD	SALESMAN	1250	22-FEB-81	1600
BLAKE	MANAGER	2850	01-MAY-81	2975
FORD	ANALYST	3000	03-DEC-81	5000
SCOTT	ANALYST	3000	19-APR-87	5000

Solutions:

Solution #1. Using LATERAL view, RANK and cumulative MAX analytic functions (Oracle 12g+):

SELECT e.ename, e.job, e.hiredate, e.sal, r.max_sal 
FROM scott.emp e, LATERAL(SELECT a.empno,  
                                 RANK() OVER(ORDER BY a.sal DESC) rk, 
                                 MAX(a.sal) OVER() max_sal 
                          FROM scott.emp a 
                          WHERE a.hiredate<=e.hiredate) r 
WHERE e.empno=r.empno  
  AND rk=2 
ORDER BY e.hiredate

Solution #2. Using CTE, cumulative MAX analytic function and a correlated subquery with COUNT to mimic the filter by RANK:

WITH x AS ( 
SELECT ename, job, hiredate, sal, MAX(sal)OVER(ORDER BY hiredate) max_sal 
FROM scott.emp a 
) 
SELECT * 
FROM x 
WHERE 1=(SELECT COUNT(*) 
         FROM scott.emp 
         WHERE hiredate<=x.hiredate 
           AND sal>x.sal) 
ORDER BY hiredate

Solution #3. Using CTE, cumulative MAX analytic function and an in-line scalar subquery in SELECT to mimic the RANK:

WITH x AS ( 
SELECT ename, job, hiredate, sal, MAX(sal)OVER(ORDER BY hiredate) max_sal, 
       (SELECT COUNT(*)+1 
        FROM scott.emp 
        WHERE sal>e.sal  
          AND hiredate<=e.hiredate) rk 
FROM scott.emp e 
) 
SELECT ename, job, hiredate, sal, max_sal 
FROM x 
WHERE rk=2 
ORDER BY hiredate

Solution #4. Using self-join and Cartesian Product with aggregation:

SELECT a.ename, a.job, a.hiredate, a.sal, MAX(b.sal) max_sal 
FROM scott.emp a JOIN scott.emp b ON b.hiredate<=a.hiredate 
                                 AND b.sal>a.sal 
GROUP BY a.ename, a.job, a.hiredate, a.sal 
HAVING COUNT(DISTINCT b.empno)=1 
ORDER BY a.hiredate

Solution #5. Using CTE and cumulative MAX analytic function (twice):

WITH x AS ( 
SELECT ename, job, hiredate, sal, 
       MAX(sal) OVER(ORDER BY hiredate) max_sal 
FROM scott.emp  
), y  AS ( 
SELECT ename, job, hiredate, sal, max_sal, MAX(sal) OVER(ORDER BY hiredate) max_sal2 
FROM x 
WHERE sal<max_sal 
) 
SELECT ename, job, hiredate, sal, max_sal 
FROM y 
WHERE sal=max_sal2 
ORDER BY hiredate

Solution #6. Using regular and recursive CTEs, ROWNUM, GREATEST, and CASE functions (no Analytic functions!):

WITH e AS ( 
SELECT ename, job, sal, hiredate 
FROM scott.emp 
ORDER BY hiredate 
), x AS ( 
SELECT ename, job, sal, hiredate, ROWNUM rn 
FROM e 
), y(max_sal, sal2, rn) AS ( 
SELECT sal, 0, 1 
FROM x 
WHERE rn=1 
UNION ALL 
SELECT GREATEST(x.sal, y.max_sal) AS max_sal, 
       CASE WHEN x.sal>y.max_sal THEN y.max_sal 
            WHEN x.sal>y.sal2 AND x.sal<=y.max_sal THEN x.sal  
            ELSE y.sal2  
       END AS sal2, 
       x.rn 
FROM x JOIN y ON x.rn=y.rn+1 
) 
SELECT x.ename, x.job, x.sal, x.hiredate, y.max_sal 
FROM y JOIN x ON y.rn=x.rn AND y.sal2=x.sal

Solution #7. Using CTE and MODEL clause to mimic Solution #6:

WITH x AS ( 
SELECT * 
FROM scott.emp 
MODEL 
DIMENSION BY (ROW_NUMBER() OVER(ORDER BY hiredate) rn) 
MEASURES(ename, job, sal, hiredate, sal max_sal, 0 sal2) 
RULES( 
    max_sal[rn>1] = GREATEST(max_sal[CV()-1], sal[CV()]), 
    sal2[rn>1] = CASE WHEN sal[CV()]> max_sal[CV()-1] THEN max_sal[CV()-1] 
                      WHEN sal[CV()]> sal2[CV()-1]   
		       AND sal[CV()]<=max_sal[CV()-1] THEN sal[CV()]  
                      ELSE sal2[CV()-1] 
                 END 
     ) 
) 
SELECT ename, job, sal, hiredate, max_sal 
FROM x 
WHERE sal=sal2

You can execute the above SQL statements in Oracle Live SQL environment.
My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

3 Solutions to 2018 Oracle SQL Puzzle of the Week #2

January 22, 2018January 22, 2018Zahar HilkevichLeave a comment

2018 Puzzle of the Week #2:

For each of the following salary ranges select two randomly chosen employees:
0-999
1000-1999
2000-2999
3000+

Expected Result (in SQL*Plus):

ENAME      SAL        RANGE
---------- ---------- ---------
SCOTT            3000 3000+
FORD             3000 3000+
BLAKE            2850 2000-2999
CLARK            2450 2000-2999
TURNER           1500 1000-1999
MILLER           1300 1000-1999
JAMES             950 0-999
SMITH             800 0-999

Remember to use only a single SELECT statement.
Use table emp (from Oracle scott schema)

Solutions:

Solution #1: Using ROW_NUMBER with random.value functions:

We are applying a random sorting order to each of the salary ranges and take 2 top records from each range:

WITH x AS (
SELECT CASE WHEN sal<=999  THEN '0-999'
            WHEN sal<=1999 THEN '1000-1999'
            WHEN sal<=2999 THEN '2000-2999'
            ELSE                '3000+'
       END range,
       ename, sal
FROM emp
), y AS (
SELECT ename, sal, range, 
       ROW_NUMBER()OVER(PARTITION BY range 
                        ORDER BY dbms_random.value) rn
FROM x
)
SELECT range, ename, sal
FROM y
WHERE rn<=2
ORDER BY range

Result:

RANGE     ENAME      SAL
--------- ---------- ----------
0-999     JAMES      950
0-999     SMITH      800
1000-1999 WARD 1250
1000-1999 TURNER 1500
2000-2999 JONES 2975
2000-2999 CLARK 2450
3000+     FORD 3000
3000+     KING 5000

Result (of subsequent execution):

RANGE     ENAME             SAL
--------- ---------- ----------
0-999     SMITH             800
0-999     JAMES             950
1000-1999 WARD             1250
1000-1999 MARTIN           1250
2000-2999 BLAKE            2850
2000-2999 JONES            2975
3000+     SCOTT            3000
3000+     KING             5000

Solution #2: Using DECODE, MAX() KEEP and UNION ALL:

Instead of taking top 2 records (randomly sorted), we are taking top 1 and bottom 1 and combine them together. DECODE function mimics the CASE from the previous solution.

WITH x AS (
SELECT DECODE(1, SIGN(999-sal), '0-999', SIGN(1999-sal), '1000-1999',
                 SIGN(2999-sal), '2000-2999', '3000+') range,
       ename, sal, ROWNUM || dbms_random.value rnd
FROM scott.emp
)
SELECT range, MAX(ename)KEEP(DENSE_RANK FIRST ORDER BY rnd) ename,
              MAX(sal)  KEEP(DENSE_RANK FIRST ORDER BY rnd) sal
FROM x
GROUP BY range
UNION ALL
SELECT range, MAX(ename)KEEP(DENSE_RANK LAST ORDER BY rnd) ename,
              MAX(sal)  KEEP(DENSE_RANK LAST ORDER BY rnd) sal
FROM x
GROUP BY range
ORDER BY range

Result:

RANGE     ENAME             SAL
--------- ---------- ----------
0-999     JAMES             950
0-999     SMITH             800
1000-1999 MARTIN           1250
1000-1999 WARD             1250
2000-2999 JONES            2975
2000-2999 BLAKE            2850
3000+     FORD             3000
3000+     KING             5000

Note, that we concatenated ROWNUM with dbms_random.value to produce UNIQUE random value. Without ROWNUM (or any other KEY) there is always a chance that dbms_random.value will repeat on different rows and hence top and bottom values could be mixed and the same employee will be repeated twice.

Solution #3: Using SIN for random value simulation and multi-column UNPIVOT with MAX() KEEP function:

Instead of combining top and bottom records from two statements using UNION ALL, here were calculating top and bottom values as 1 record and UNPIVOT them to produce two rows per salary range:

WITH x AS (
SELECT DECODE(1, SIGN(999-sal), '0-999', SIGN(1999-sal), '1000-1999',
                 SIGN(2999-sal), '2000-2999', '3000+') range,
       ename, sal, 
       SIN(ROWNUM*TO_NUMBER(SUBSTR(
                             extract(second 
                                     from current_timestamp),-3))
           ) rnd
FROM scott.emp
), y AS (
SELECT range, MAX(ename)KEEP(DENSE_RANK FIRST ORDER BY rnd) ename1,
              MAX(sal)  KEEP(DENSE_RANK FIRST ORDER BY rnd) sal1,
              MAX(ename)KEEP(DENSE_RANK LAST ORDER BY rnd) ename2,
              MAX(sal)  KEEP(DENSE_RANK LAST ORDER BY rnd) sal2
FROM x
GROUP BY range
)
SELECT range, ename, sal
FROM y
UNPIVOT (
  (ename, sal) for (t1, t2) in ((ename1,sal1), (ename2,sal2))
)
ORDER BY range

Result:

RANGE     ENAME             SAL
--------- ---------- ----------
0-999     SMITH             800
0-999     JAMES             950
1000-1999 MILLER           1300
1000-1999 MARTIN           1250
2000-2999 CLARK            2450
2000-2999 BLAKE            2850
3000+     FORD             3000
3000+     SCOTT            3000

Note the use of multi-column UNPIVOT. Randomization simulation is based on a fairly random selection of the last 3 digits in the current timestamp’s second value. This number is used as a “seed”. When this seed is multiplied by the rownum, the result is used as a SIN function argument which makes the outcome pseudo-random.

You can execute the above SQL statements in Oracle Live SQL environment.

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

3 Solutions to Puzzle of the Week #20

August 15, 2016August 15, 2016Zahar HilkevichLeave a comment

Puzzle of the Week #20:

Produce the historical highest/lowest salary report that should comply with the following requirements:

Use Single SELECT statement only
Only employees who was paid the highest or lowest salary in their respective department at the moment of hiring should be selected
Show name, date of hire, department number, job title, salary table (emp) columns and two additional calculated columns/flags: min_flag and max_flag to indicate that the employee was hired with the min/max salary in their respective department as of the time of hiring.
If two or more employees in the same department are paid the same max/min salary, only the one who was hired first should be picked for the report.
The query should work in Oracle 11g.

Expected Result:

POW20ER

#1. Using Common Table Expression (CTE) or Recursive WITH clause

WITH y AS (
SELECT ename, job, deptno, hiredate, sal, 
       ROW_NUMBER()OVER(PARTITION BY deptno ORDER BY hiredate) rn
FROM emp
), x (ename, job, deptno, hiredate, sal, min_sal, max_sal, min_flag, max_flag, rn) AS (
SELECT ename, job, deptno, hiredate, sal, sal, sal, 1, 1, 1
FROM y
WHERE rn=1
UNION ALL
SELECT y.ename, y.job, y.deptno, y.hiredate, y.sal, 
       LEAST(x.min_sal, y.sal), GREATEST(x.max_sal, y.sal),
       CASE WHEN y.sal<x.min_sal THEN 1 END, 
       CASE WHEN y.sal>x.max_sal THEN 1 END, y.rn
FROM y JOIN x ON y.deptno=x.deptno AND y.rn=x.rn+1
)
SELECT ename, job, deptno, hiredate, sal, min_flag, max_flag
FROM x
WHERE 1 IN (min_flag, max_flag)
ORDER BY deptno, hiredate;

#2. Using Cumulative Analytic Functions MIN, MAX, and ROW_NUMBER

WITH x AS (
SELECT ename, job, deptno, hiredate, sal,
       MIN(sal)OVER(PARTITION BY deptno ORDER BY hiredate) min_sal,
       MAX(sal)OVER(PARTITION BY deptno ORDER BY hiredate) max_sal,
       ROW_NUMBER()OVER(PARTITION BY deptno, sal ORDER BY hiredate) rn
FROM emp
)
SELECT ename, job, deptno, hiredate, sal,
       DECODE(sal, min_sal, 1) min_flag,
       DECODE(sal, max_sal, 1) max_flag
FROM x
WHERE sal IN (min_sal, max_sal)
  AND rn=1;

#3. Using Cumulative Analytic Functions MIN, MAX, and COUNT

WITH x AS (
SELECT ename, job, deptno, hiredate, sal,
       CASE WHEN MIN(sal)OVER(PARTITION BY deptno ORDER BY hiredate)=sal
             AND COUNT(*)OVER(PARTITION BY deptno, sal ORDER BY hiredate)=1 THEN 1 
       END min_flag,
       CASE WHEN MAX(sal)OVER(PARTITION BY deptno ORDER BY hiredate)=sal
             AND COUNT(*)OVER(PARTITION BY deptno, sal ORDER BY hiredate)=1 THEN 1 
       END max_flag
FROM emp
)
SELECT *
FROM x
WHERE 1 IN (min_flag, max_flag);

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Three Solutions to Puzzle of the Week #16

June 27, 2016June 27, 2016Zahar HilkevichLeave a comment

Puzzle of the Week #16:

With a single SELECT statement find the biggest prime factor of a given integer value (N).

Expected Result:

--For N=100:

Biggest Prime Factor
--------------------
                  5

--For N=52:

Biggest Prime Factor
--------------------
                 13

--For N=21:

Biggest Prime Factor
--------------------
                   7

Solutions

#1: Using CTE (recursive WITH)

WITH input AS (
SELECT &N n
FROM dual
), x(num, flag) AS (
SELECT 2, CASE WHEN MOD(n, 2)=0 THEN 1 ELSE 0 END AS flag
FROM input
UNION ALL
SELECT x.num+1, CASE WHEN MOD(i.n, x.num+1)=0 THEN 1 ELSE 0 END
FROM input i, x
WHERE x.num+1<=i.n
), y AS (
SELECT num, (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
             FROM dual
             WHERE MOD(x.num,LEVEL)=0
             CONNECT BY LEVEL<=x.num) is_prime
FROM x
WHERE flag=1
)
SELECT MAX(num) "Biggest Prime Factor"
FROM y
WHERE is_prime=1;

Enter value for n: 100
old   2: SELECT &N n
new   2: SELECT 100 n

Biggest Prime Factor
--------------------
                   5

SQL> /
Enter value for n: 52
old   2: SELECT &N n
new   2: SELECT 52 n

Biggest Prime Factor
--------------------
                  13

SQL> /
Enter value for n: 21
old   2: SELECT &N n
new   2: SELECT 21 n

Biggest Prime Factor
--------------------
                   7

#2: Using CONNECT BY clause , version 1

WITH input AS (
SELECT &N n
FROM dual
), x AS (
SELECT LEVEL num
FROM input i
WHERE MOD(i.N, LEVEL)=0
CONNECT BY LEVEL<=i.N
), y AS (
SELECT num, (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
             FROM dual
             WHERE MOD(x.num,LEVEL)=0
             CONNECT BY LEVEL<=x.num) is_prime
FROM x
WHERE flag=1
)
SELECT MAX(num) "Biggest Prime Factor"
FROM y
WHERE is_prime=1;

#3: Using CONNECT BY clause, version 2

WITH input AS (
SELECT &N n
FROM dual
), range AS (
SELECT LEVEL num
FROM input i
CONNECT BY LEVEL <= i.N
), x AS(
SELECT r1.num
FROM range r1, range r2, input i
WHERE MOD(i.N, r1.num)=0
GROUP BY r1.num
HAVING COUNT(CASE WHEN MOD(r1.num, r2.num)=0 THEN 1 END)=2
)
SELECT MAX(num) "Biggest Prime Factor"
FROM x;

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

4 Solutions to Puzzle of the Week #12

May 23, 2016May 17, 2017Zahar Hilkevich1 Comment

Puzzle of the Week #12

With a single SELECT statement produce a list of first 10 prime numbers above a given number of N.

Expected Result: (for N=15)

     Prime
----------
        17
        19
        23
        29
        31
        37
        41
        43
        47
        53

10 rows selected.

Expected Result: (for N=50)

     Prime
----------
        53
        59
        61
        67
        71
        73
        79
        83
        89
        97

10 rows selected.

Solutions:

#1: Liming number of found prime numbers in CTE (Recursive WITH clsue)

WITH y AS (
SELECT 500 fromN
FROM dual
), x (n, cnt, flag) AS (
SELECT fromN,
      (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
       FROM dual
       WHERE MOD(fromN, LEVEL)=0
       CONNECT BY LEVEL<=fromN),
      (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
       FROM dual
       WHERE MOD(fromN, LEVEL)=0
       CONNECT BY LEVEL<=fromN)
FROM y
UNION ALL
SELECT x.n+1, (SELECT x.cnt+CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
               FROM dual
               WHERE MOD(x.n+1, LEVEL)=0
               CONNECT BY LEVEL<=x.n+1),
              (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
               FROM dual
               WHERE MOD(x.n+1, LEVEL)=0
               CONNECT BY LEVEL<=x.n+1)
FROM x
WHERE x.cnt

#2: Limiting number of found prime numbers outside of CTE (Recursive WITH clsue)

WITH y AS (
SELECT 50 fromN
FROM dual
), x (n, flag) AS (
SELECT fromN,
      (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
       FROM dual
       WHERE MOD(fromN, LEVEL)=0
       CONNECT BY LEVEL<=fromN)
FROM y
UNION ALL
SELECT x.n+1, (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
               FROM dual
               WHERE MOD(x.n+1, LEVEL)=0
               CONNECT BY LEVEL<=x.n+1) FROM x WHERE x.n>0
)
SELECT n AS prime
FROM x
WHERE flag=1
  AND ROWNUM<=10;

     PRIME
----------
        53
        59
        61
        67
        71
        73
        79
        83
        89
        97

10 rows selected.

Elapsed: 00:00:00.02

#3: Using TABLE and MULTISET functions

WITH y AS (
SELECT 16 fromN
FROM dual
), x (n, flag) AS (
SELECT fromN, column_value flag 
FROM y, TABLE(CAST(MULTISET(SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END flag
                            FROM dual
                            WHERE MOD(fromN, LEVEL)=0
                            CONNECT BY LEVEL<=fromN) AS sys.odcinumberlist))  
UNION ALL
SELECT x.n+1, column_value flag  
FROM x, TABLE(CAST(MULTISET(SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END flag
                            FROM dual
                            WHERE MOD(x.n+1, LEVEL)=0
                            CONNECT BY LEVEL<=x.n+1) AS sys.odcinumberlist))  WHERE x.n>0
)
SELECT n AS prime
FROM x
WHERE flag=1
  AND ROWNUM<=10;

     PRIME
----------
        17
        19
        23
        29
        31
        37
        41
        43
        47
        53

10 rows selected.

Elapsed: 00:00:00.12

#4: Using LATERAL views

WITH y AS (
SELECT 16 fromN
FROM dual
), x (n, flag) AS (
SELECT fromN, is_prime
FROM y, LATERAL (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END is_prime
                 FROM dual
                 WHERE MOD(fromN, LEVEL)=0
                 CONNECT BY LEVEL<=fromN)
UNION ALL
SELECT x.n+1, is_prime 
FROM x, LATERAL (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END is_prime
                 FROM dual
                 WHERE MOD(x.n+1, LEVEL)=0
                 CONNECT BY LEVEL<=x.n+1) WHERE x.n>0
)
SELECT n AS prime
FROM x
WHERE flag=1
 AND ROWNUM<=10;

     PRIME
----------
        17
        19
        23
        29
        31
        37
        41
        43
        47
        53

10 rows selected.

Elapsed: 00:00:00.11

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Interview Question: get 2 random employees for each salary range?

May 19, 2016Zahar HilkevichLeave a comment

Interview Question

Level: Intermediate/Advanced

For each of the following salary ranges select two randomly chosen employees:

Expected Result:

ENAME             SAL RANGE
---------- ---------- ---------
SCOTT            3000 3000+
FORD             3000 3000+
BLAKE            2850 2000-2999
CLARK            2450 2000-2999
TURNER           1500 1000-1999
MILLER           1300 1000-1999
JAMES             950 0-999
SMITH             800 0-999

Solution:

WITH x AS (
SELECT ename, sal,
       CASE WHEN sal>=3000 THEN '3000+'
            WHEN sal>=2000 THEN '2000-2999'
            WHEN sal>=1000 THEN '1000-1999'
            ELSE                '0-999'
       END as range,
       ROW_NUMBER() OVER(PARTITION BY DECODE(GREATEST(sal, 3000), sal, 0, 1) +
                                      DECODE(GREATEST(sal, 2000), sal, 0, 1) +
                                      DECODE(GREATEST(sal, 1000), sal, 0, 1)
                         ORDER BY DBMS_RANDOM.VALUE) rn
FROM emp
)
SELECT ename, sal, range
FROM x
WHERE rn<=2
ORDER BY sal DESC

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Solution #1. Compact form of creating the map CTE with recursive check for the vowels:

Solution #2. Using more efficient way of creating the map CTE :

Solution #3. Much more efficient way of creating the map CTE and using Regular Expression to limit the vowels :

Further Reading:

Further Reading:

Second Top Employee as of the Start of Employment

Solution #1. Using LATERAL view, RANK and cumulative MAX analytic functions (Oracle 12g+):

Solution #2. Using CTE, cumulative MAX analytic function and a correlated subquery with COUNT to mimic the filter by RANK:

Solution #3. Using CTE, cumulative MAX analytic function and an in-line scalar subquery in SELECT to mimic the RANK:

Solution #4. Using self-join and Cartesian Product with aggregation:

Solution #5. Using CTE and cumulative MAX analytic function (twice):

Solution #6. Using regular and recursive CTEs, ROWNUM, GREATEST, and CASE functions (no Analytic functions!):

Solution #7. Using CTE and MODEL clause to mimic Solution #6:

Further Reading:

Solutions:

Solution #1: Using ROW_NUMBER with random.value functions:

Solution #2: Using DECODE, MAX() KEEP and UNION ALL:

Solution #3: Using SIN for random value simulation and multi-column UNPIVOT with MAX() KEEP function:

My Oracle Group on Facebook:

Would you like to read about many more tricks and puzzles?

Expected Result:

My Oracle Group on Facebook:

Would you like to read about many more tricks and puzzles?

Expected Result:

Solutions

#1: Using CTE (recursive WITH)

#2: Using CONNECT BY clause , version 1

#3: Using CONNECT BY clause, version 2

My Oracle Group on Facebook:

Would you like to read about many more tricks and puzzles?

Solutions:

My Oracle Group on Facebook:

Would you like to read about many more tricks and puzzles?

Interview Question

Expected Result:

Solution:

My Oracle Group on Facebook:

Would you like to read about many more tricks and puzzles?