Puzzle of the Week #9

Puzzle: All employees are sorted by employee number and need to be split in 3 groups equal in size (as close as possible).  Employees with the smallest numbers will get into the 1st group, the 2nd group will have employees with the next (bigger) range of numbers, etc.

Write a single SELECT statement (against emp table) that would show group number, the range of employee numbers, and the size of each group.

Expected Result:

Group RANGE           Count
----- ---------- ----------
    1 7369-7654           5
    2 7698-7844           5
    3 7876-7934           4

To submit your answer (one or more!) please start following this blog and add a comment to this post.

A correct answer (and workarounds!) will be published here in a week.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

Solutions to Puzzle of the Week #8

Puzzle of the Week #8:

Find job titles represented in every department. Write a single SELECT statement only.

Expected Result: (Only clerks and managers work in all 3 departments: 10,20, and 30)

JOB
--------
CLERK
MANAGER

Solutions:

#1: Two COUNT(DISTINCT ..) in HAVING

SELECT job
FROM emp
GROUP BY job
HAVING COUNT(DISTINCT deptno)=(SELECT COUNT(DISTINCT deptno) FROM emp)

#2: Analytic COUNT(DISTINCT ..) with CONNECT BY

SELECT DISTINCT job
FROM (
SELECT job, deptno, LEVEL level#, COUNT(DISTINCT deptno) OVER() cnt
FROM emp
CONNECT BY job=PRIOR job
AND deptno>PRIOR deptno
)
WHERE level#=cnt

#3: Two Analytic COUNT(DISTINCT..)

WITH x AS (
SELECT deptno, job, COUNT(DISTINCT deptno)OVER() cnt, COUNT(DISTINCT deptno)OVER(PARTITION BY job) cnt2
FROM emp
)
SELECT DISTINCT job
FROM x
WHERE cnt=cnt2

OR

WITH x AS (
SELECT deptno, job, COUNT(DISTINCT deptno)OVER() cnt, COUNT(DISTINCT deptno)OVER(PARTITION BY job) cnt2
FROM emp
)
SELECT job
FROM x
WHERE cnt=cnt2
GROUP BY job

#4: Cartesian Product and Two COUNT(DISTINCT ..)

SELECT a.job
FROM emp a, emp b
GROUP BY a.job
HAVING COUNT(DISTINCT a.deptno)=COUNT(DISTINCT b.deptno)

#5: ROLLUP with RANK OVER COUNT(DISTINCT..)

WITH x AS (
SELECT job, COUNT(DISTINCT deptno) cnt, 
       RANK()OVER(ORDER BY COUNT(DISTINCT deptno)  DESC) rk
FROM emp
GROUP BY ROLLUP(job)
)
SELECT job
FROM x
WHERE rk=1
  AND job IS NOT NULL

#6: Analytic COUNT(DITSINCT..) comparison with MINUS

WITH x AS (
SELECT job, 
       CASE WHEN COUNT(DISTINCT deptno)OVER()=COUNT(DISTINCT deptno)OVER(PARTITION BY job) THEN 1 END
FROM emp
MINUS
SELECT job, NULL
FROM emp
)
SELECT job
FROM x

#7: No COUNT(DISTINCT ..) solution:

WITH x AS (
SELECT a.deptno, b.job, NVL(COUNT(c.empno),0) idx
FROM (SELECT DISTINCT deptno FROM emp) a CROSS JOIN (SELECT DISTINCT job FROM emp) b
    LEFT JOIN emp c ON a.deptno=c.deptno AND b.job=c.job
GROUP BY a.deptno, b.job
)
SELECT job
FROM x
GROUP BY job
HAVING MIN(idx)>0

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Two ways to build a salary range report without using CASE function

Interview Question: Produce a salary range report with a single SELECT statement. Decode function is allowed, CASE function – is not.

Level: Intermediate

Expected Result:

RANGE                             Employees
-------------------------------- ----------
0-999                                     2
1000-2999                                 9
3000-5999                                 3

Strategy #1:

SELECT COALESCE(DECODE(LEAST(sal, 999), sal, '0-999'),
                DECODE(LEAST(sal, 2999), GREATEST(sal, 1000), '1000-2999'),
                DECODE(LEAST(sal, 9999), GREATEST(sal, 3000), '3000-5999')
                ) AS range,
       COUNT(*) "Employees"
FROM emp
GROUP BY COALESCE(DECODE(LEAST(sal, 999), sal, '0-999'),
                  DECODE(LEAST(sal, 2999), GREATEST(sal, 1000), '1000-2999'),
                  DECODE(LEAST(sal, 9999), GREATEST(sal, 3000), '3000-5999')
                  )
ORDER BY 1

Explanation:

In Oracle SQL terms, a mathematical condition

a <=x <=b

can be  interpreted as

x BETWEEN a AND b

however, this condition is good only for CASE function, and not for DECODE. The trick is to use another interpretation:

LEAST(b,x)=GREATEST(x,a)

– that can be used in DECODE.

CASE-based Solution:

SELECT CASE WHEN sal<=999 THEN '0-999'
            WHEN sal BETWEEN 1000 AND 2999 THEN '1000-2999'
            WHEN sal BETWEEN 3000 AND 5999 THEN '3000-5999'
       END AS range,
       COUNT(*) "Employees"
FROM emp
GROUP BY CASE WHEN sal<=999 THEN '0-999'
              WHEN sal BETWEEN 1000 AND 2999 THEN '1000-2999'
              WHEN sal BETWEEN 3000 AND 5999 THEN '3000-5999'
         END
ORDER BY 1

Strategy #2:

WITH x AS (
SELECT DECODE(1, (SELECT COUNT(*) FROM dual WHERE emp.sal<=999), '0-999',
                 (SELECT COUNT(*) FROM dual WHERE emp.sal BETWEEN 1000 AND 2999), '1000-2999',
                 (SELECT COUNT(*) FROM dual WHERE emp.sal BETWEEN 3000 AND 5999), '3000-5999'
             ) AS range
FROM emp
)
SELECT range, COUNT(*) AS "Employees"
FROM x
GROUP BY range
ORDER BY 1

Explanation:
This query demonstrates how to mimic CASE function using DECODE and in-line scalar subquery from dual.

Suggested further reading:

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions. The book is also available on Amazon and in all major book stores.

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

How to split numeric and text values into separate columns

Puzzle: There is a table T with a single column C VARCHAR2(20) that contains random values. Some of the values are numeric. Write a single SELECT statement that outputs 2 columns: NUM and TEXT with numeric and non-numeric values correspondingly.

To mimic the T table, we will create a view:

CREATE OR REPLACE VIEW T
AS
WITH x AS (
SELECT CASE WHEN MOD(level,2)=0 THEN dbms_random.string('x',3)
            ELSE TO_CHAR(TRUNC(dbms_random.VALUE(-999,999)))
       END rnd,
       ROW_NUMBER()OVER(ORDER BY dbms_random.value) rk
FROM dual
CONNECT BY ROWNUM<=40
)
SELECT CAST(rnd AS VARCHAR2(4)) C
FROM x
WHERE rk<=10;

Expected Result:
Due to the random nature of the values in T view, actual results will be different every time you run a query; however, all the results will look somewhat like this:

NUM  TEXT
---- ----
-146 4R9
-362 78R
-762 ICY
236  U3W
     VIK
     Y21

Solution:

WITH x AS (
SELECT c, CASE WHEN REGEXP_LIKE(c,'^-?[[:digit:]]+$') THEN 1 ELSE 0 END is_int,
       RANK()OVER(PARTITION BY CASE WHEN REGEXP_LIKE(c,'^-?[[:digit:]]+$') THEN 1 ELSE 0 END ORDER BY c) rk
FROM t	   
)
SELECT MAX(DECODE(is_int, 1, c)) NUM,
       MAX(DECODE(is_int, 0, c)) TEXT
FROM x
GROUP BY rk
ORDER BY rk

Explanation:

The above solution uses regular expression for identifying numeric integer values (positive and negative):

^-?[[:digit:]]+$

The way to break down the results into two columns was demonstrated in my previous post: Generate a department/employee roll report

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

 

How to delete duplicate records without using Joins and aggregate functions

Interview Question: How to delete duplicate records without using Joins and aggregate functions

Level: Intermediate/Advanced

Answer/Solution:

DELETE
FROM emp_dups a
WHERE ROWID<ANY(SELECT ROWID
                FROM emp_dups b
                WHERE b.empno=a.empno)

Explanation:

If we were allowed to use group functions, we could have used a very well known strategy:

DELETE 
FROM emp_dups a
WHERE ROWID<(SELECT MAX(ROWID)
             FROM emp b
             WHERE b.empno=a.empno)

Since group functions are not allowed, we need to find a substitution. In our case, the solution is based on the equivalence of the following 2 conditions:

a<(SELECT MAX(a) ...)
a<ANY(SELECT a ...)

Another example of such equivalent conditions:

a=(SELECT MAX(a) ...)
a>=ALL(SELECT a ...)

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

 

Puzzle of the Week #8: Find job titles represented in every department

Find job titles represented in every department. Write a single SELECT statement only.

Note: Use only emp table.

Expected Result: (Only clerks and managers work in all 3 departments: 10,20, and 30)

JOB
--------
CLERK
MANAGER

To submit your answer (one or more!) please start following this blog and add a comment to this post.

A correct answer (and workarounds!) will be published here in a week.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Solutions to Puzzle of the Week #7

Puzzle of the Week #7

For every employee find the sum of ASCII codes of all the characters in their names. Write a single SELECT statement only.

Expected Result:

EMPNO ENAME       SUM_ASCII
----- ---------- ----------
 7788 SCOTT             397
 7876 ADAMS             358
 7566 JONES             383
 7499 ALLEN             364
 7521 WARD              302
 7934 MILLER            453
 7902 FORD              299
 7369 SMITH             389
 7844 TURNER            480
 7698 BLAKE             351
 7782 CLARK             365
 7654 MARTIN            459
 7839 KING              297
 7900 JAMES             368

Solutions:

Solution/Workaround #1: Oracle 12c and up Only (submitted by Zohar Elkayam)

WITH
    FUNCTION sumascii(p_str in varchar2)  RETURN NUMBER 
    IS 
       x NUMBER:= 0;
    BEGIN
      FOR i IN 1..LENGTH(p_str) LOOP
          x := x + ASCII(SUBSTR(p_str, i, 1)) ;
      END LOOP;
      RETURN x;
    END;
SELECT empno, ename, sumascii(ename) AS sum_ascii
FROM emp
/

Variation of Solution #1 (Recursive function):

WITH
    FUNCTION sumascii(p_str in varchar2)  RETURN NUMBER 
    IS 
    BEGIN
      IF p_str IS NULL THEN 
	RETURN 0;
      END IF;
      RETURN ASCII(p_str) + sumascii(SUBSTR(p_str,2));      
    END;
SELECT empno, ename, sumascii(ename) AS sum_ascii
FROM emp
/

Solution/Workaround #2: Cartesian Product with Generated Numeric Range (by Zohar Elkayam)

SELECT empno, ename, SUM(ASCII(ename_char)) sum_ascii
FROM (SELECT empno, ename, SUBSTR(ename, i, 1) ename_char
      FROM emp, (SELECT LEVEL i
                 FROM dual
                 CONNECT BY LEVEL<=(SELECT MAX(LENGTH(ename))                                      FROM emp)                  )       WHERE LENGTH(ename)>=i
      )
GROUP BY empno, ename
/

Simplified variation of Workaround #2:

SELECT empno, ename, 
       SUM(ASCII(SUBSTR(ename, i, 1))) sum_ascii      
FROM emp, (SELECT LEVEL i
           FROM dual
           CONNECT BY LEVEL<=(SELECT MAX(LENGTH(ename))                                FROM emp)            )  WHERE LENGTH(ename)>=i
GROUP BY empno, ename 
/

Solution/Workaround #3: In-Line Scalar Subquery

SELECT empno, ename, 
      (SELECT SUM(ASCII(SUBSTR(a.ename, LEVEL, 1)))
       FROM dual
       CONNECT BY LEVEL<=LENGTH(a.ename)) AS sum_ascii
FROM emp a
/

Solution #4/Workaround : Recursive WITH clause

WITH x(n, empno, ename, letter) AS (
SELECT 1 AS n, empno, ename, SUBSTR(ename, 1, 1)
FROM emp
UNION ALL
SELECT x.n+1, empno, ename, SUBSTR(ename, n+1, 1)
FROM x
WHERE LENGTH(ename)>=n+1
)
SELECT empno, ename, SUM(ASCII(letter)) sum_ascii
FROM x
GROUP BY empno, ename
/

Solution/Workaround #5: Use DUMP function and Regular Expressions (submitted by Sunitha)

SELECT empno, ename, SUM(REGEXP_SUBSTR(nm, '\d+', 1, occ)) AS sum_ascii
FROM (SELECT empno, ename, REGEXP_REPLACE(DUMP(ename), '.*: (\d.*)$', '\1') nm
      FROM emp), 
     (SELECT LEVEL occ FROM dual CONNECT BY LEVEL <=ANY(SELECT LENGTH(ename) FROM emp))
GROUP BY empno, ename
/

Solution/Workaround #6: Use LATERAL View (Oracle 12c and up)

SELECT empno, ename, sum_ascii
FROM emp e, LATERAL (SELECT SUM(ASCII(SUBSTR(e.ename,LEVEL,1)) ) sum_ascii
                     FROM dual
                     CONNECT BY LEVEL<=LENGTH(e.ename) ) x

Solution/Workaround #7: Use TABLE/CAST/MULTISET function composition

SELECT empno, ename, x.column_value AS sum_ascii
FROM emp e, 
     TABLE(CAST(MULTISET(SELECT SUM(ASCII(SUBSTR(e.ename,LEVEL,1)) ) sum_ascii
                         FROM dual
                         CONNECT BY LEVEL<=LENGTH(e.ename) 
                         ) AS sys.odcinumberlist
                )
          ) x

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

 

Round-Robin Schedule SQL Puzzle

Puzzle: With a single SELECT statement create a schedule of play for a round-robin tournament. The query should work for odd or even number of players. For odd number of players, the player with bye should be listed last for that round.

You can read about round-robin tournament definition here: https://en.wikipedia.org/wiki/Round-robin_tournament

Expected Results:

4 players:

 ROUND PAIRING
------ --------
     1 1 vs 2
     1 3 vs 4
     2 1 vs 4
     2 2 vs 3
     3 3 vs 1
     3 4 vs 2

OR

ROUND PAIRING
----- --------
    1 1 vs 2
      3 vs 4

    2 1 vs 4
      2 vs 3

    3 3 vs 1
      4 vs 2

5 players:

  ROUND PAIRING
------ --------
     1 5 vs 1
     1 4 vs 2
     1 3 - bye
     2 1 vs 2
     2 5 vs 3
     2 4 - bye
     3 1 vs 4
     3 2 vs 3
     3 5 - bye
     4 2 vs 5
     4 3 vs 4
     4 1 - bye
     5 3 vs 1
     5 4 vs 5
     5 2 - bye

OR

ROUND PAIRING
----- -------
    1 5 vs 1
      4 vs 2
      3 - bye

    2 1 vs 2
      5 vs 3
      4 - bye

    3 1 vs 4
      2 vs 3
      5 - bye

    4 2 vs 5
      3 vs 4
      1 - bye

    5 3 vs 1
      4 vs 5
      2 - bye

6 players:

ROUND PAIRING
----- --------
    1 5 vs 1
    1 4 vs 2
    1 3 vs 6
    2 1 vs 2
    2 5 vs 3
    2 6 vs 4
    3 1 vs 4
    3 2 vs 3
    3 5 vs 6
    4 1 vs 6
    4 2 vs 5
    4 3 vs 4
    5 3 vs 1
    5 6 vs 2
    5 4 vs 5

OR

ROUND PAIRING
----- -------
    1 5 vs 1
      4 vs 2
      3 vs 6

    2 1 vs 2
      5 vs 3
      6 vs 4

    3 1 vs 4
      2 vs 3
      5 vs 6

    4 1 vs 6
      2 vs 5
      3 vs 4

    5 3 vs 1
      6 vs 2
      4 vs 5

 Solution (SQL*Plus script):

accept players prompt "Enter the number of players: "
set pagesize 100
break on "ROUND" skip 1
col pairing for a10

WITH prompt AS (
   SELECT &players AS oplayers
   FROM dual
), x AS (
SELECT LEVEL n, oplayers,  oplayers + MOD(oplayers,2) AS players
FROM dual, prompt
CONNECT BY LEVEL<=oplayers + MOD(oplayers,2)-1
), w AS (
SELECT a.n AS rnd, 1+MOD(a.n+b.n,a.players-1) AS p,
       CASE WHEN ROW_NUMBER()OVER(PARTITION BY a.n ORDER BY 1)<=a.players/2 THEN ROW_NUMBER()OVER(PARTITION BY a.n ORDER BY a.n, b.n )-1
            ELSE a.players-ROW_NUMBER()OVER(PARTITION BY a.n ORDER BY 1)
       END AS match_id, a.oplayers  
FROM x a, x b 
UNION ALL
SELECT n, players AS p, 0 AS match_id, oplayers
FROM x
)
SELECT rnd AS "ROUND", 
       CASE WHEN MAX(p)>oplayers THEN  MIN(p) || ' - bye ' 
            WHEN MOD(MAX(p)-MIN(p),2)=1 THEN MIN(p) || ' vs ' || MAX(p)
            ELSE MAX(p) || ' vs ' || MIN(p) 
       END AS pairing
FROM w
GROUP BY rnd, match_id, oplayers
ORDER BY 1, CASE WHEN MAX(p)>oplayers THEN oplayers ELSE MIN(p) END;

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Puzzle of the Week #7: find the sum of ASCII codes of employee names

Puzzle of the Week #7

For every employee find the sum of ASCII codes of all the characters in their names. Write a single SELECT statement only.

Expected Result:

EMPNO ENAME       SUM_ASCII
----- ---------- ----------
 7788 SCOTT             397
 7876 ADAMS             358
 7566 JONES             383
 7499 ALLEN             364
 7521 WARD              302
 7934 MILLER            453
 7902 FORD              299
 7369 SMITH             389
 7844 TURNER            480
 7698 BLAKE             351
 7782 CLARK             365
 7654 MARTIN            459
 7839 KING              297
 7900 JAMES             368

To submit your answer (one or more!) please start following this blog and add a comment to this post.

A correct answer (and workarounds!) will be published here in a week.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Solutions to Puzzle of the Week #6

Puzzle of the Week #6

Find all employees who is paid one of top 4 salaries in the entire company without using sub-queries and in-line views/WITH clause.

Expected Result:

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850

 

Method/Workaround #1: Use Aggregation over Cartesian Product

This method works in all RDBMS systems, such as Oracle, SQL Server, Sybase, Teradata, etc.

SELECT a.ename, a.sal
FROM emp a, emp b
WHERE a.sal<=b.sal
GROUP BY a.ename, a.sal
HAVING COUNT(DISTINCT b.sal)<=4
ORDER BY 2 DESC;

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850

This method is described in details in my book Oracle SQL Tricks and Workarounds.

Method/Workaround #2: Use DENSE_RANK and MINUS

This method is more Oracle specific, but it is as good as the 1st one. It was suggested by one of the contest participants:

SELECT ename, 
       CASE WHEN DENSE_RANK()OVER(ORDER BY sal DESC)<=4 THEN sal END sal
FROM emp
MINUS
SELECT ename, NULL
FROM emp
ORDER BY 2 DESC;

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850

This is a very elegant variation of the following query with MINUS substituting the filter for analitic function result:

SELECT ename, sal
FROM (SELECT ename, sal, DENSE_RANK()OVER(ORDER BY sal DESC) rk
      FROM emp)
WHERE rk<=4;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850

Interestingly enough, Teradata SQL has a special clause QUALIFY for that matter. It is very elegant and maybe one day it will become an ANSII standard:

SELECT ename, sal
FROM emp
QUALIFY DENSE_RANK()OVER(ORDER BY sal DESC)<=4

Finally, I would like to share a couple of more traditional approaches that DO USE subqueries:

SELECT ename, sal
FROM emp a
WHERE 4>=(SELECT COUNT(DISTINCT b.sal)
          FROM emp b
          WHERE b.sal>=a.sal)
ORDER BY sal DESC;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850
SELECT ename, sal
FROM emp a
WHERE 4>(SELECT COUNT(DISTINCT b.sal)
          FROM emp b
          WHERE b.sal>a.sal)
ORDER BY sal DESC;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.