7 Solutions to Puzzle of the Week #11

Puzzle of the Week #11

Produce the Employee Roll Report that satisfies the following list of requirements:

  • Use a single SELECT statement
  • The single column “Names” should contain a comma-separated list of employee names
  • Values in the “Names” column should be at most 23 characters long
  • The report should have as few rows as possible
  • All employee names should be concatenated in alphabetical order

Expected Result:

(the Length column is added for length verification only)

Names                                        Length
---------------------------------------- ----------
ADAMS,ALLEN,BLAKE,CLARK                          23
FORD,JAMES,JONES,KING                            21
MARTIN,MILLER,SCOTT                              19
SMITH,TURNER,WARD                                17
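
The requirements boil down to a greedy packing rule: walk the names in alphabetical order and start a new row whenever appending the next name (plus a comma) would push the row past 23 characters. Here is a minimal Python sketch of that rule, meant only as a cross-check for the SQL solutions that follow. The function name pack_names and the hard-coded list (the 14 names visible in the expected result) are assumptions made for illustration.

```python
def pack_names(names, max_len=23, sep=","):
    """Greedily pack alphabetically sorted names into rows of at most max_len characters."""
    rows = []
    current = ""
    for name in sorted(names):
        candidate = name if not current else current + sep + name
        if current and len(candidate) > max_len:
            # The next name does not fit: close the current row and start a new one.
            rows.append(current)
            current = name
        else:
            current = candidate
    if current:
        rows.append(current)
    return rows


# The 14 employee names that appear in the expected result (hard-coded assumption).
EMP_NAMES = ["SMITH", "ALLEN", "WARD", "JONES", "MARTIN", "BLAKE", "CLARK",
             "SCOTT", "KING", "TURNER", "ADAMS", "JAMES", "FORD", "MILLER"]

for row in pack_names(EMP_NAMES):
    print(row, len(row))
```

Because the names must stay in alphabetical order, filling each row as far as possible before opening the next one should also give the smallest number of rows.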

Solutions

#1 – Using Recursive WITH clause (Common Table Expression) – to concatenate names

WITH e AS (
SELECT ename, ROW_NUMBER()OVER(ORDER BY ename) rn, 23 AS maxlen
FROM emp
), x (rn, txt, grp) AS (
SELECT 1, CAST(ename AS VARCHAR2(100)), 1
FROM e
WHERE rn=1
UNION ALL
SELECT e.rn,
       CASE WHEN LENGTH(x.txt||','||e.ename)>e.maxlen THEN e.ename
            ELSE x.txt||','||e.ename
       END,
       CASE WHEN LENGTH(x.txt||','||e.ename)>e.maxlen THEN x.grp+1
            ELSE x.grp
       END
FROM e JOIN x ON e.rn=x.rn+1
)
SELECT MAX(txt) "Names", LENGTH(MAX(txt)) "Length"
FROM x
GROUP BY grp
ORDER BY grp;

Names                         Length
------------------------- ----------
ADAMS,ALLEN,BLAKE,CLARK           23
FORD,JAMES,JONES,KING             21
MARTIN,MILLER,SCOTT               19
SMITH,TURNER,WARD                 17

#2 – Using Recursive WITH clause (Common Table Expression) – to group names AND LISTAGG function

WITH t (ename, len, rn) AS (  
SELECT ename, LENGTH(ename) + 1, ROW_NUMBER() OVER(ORDER BY ename)  
FROM emp  
), r (ename, running_len, rn, gp) AS (  
SELECT ename, len, rn, 1 
FROM t 
WHERE rn = 1  
UNION ALL  
SELECT t.ename,  
       CASE WHEN t.len > 24 - r.running_len THEN t.len ELSE r.running_len + t.len END,  
       t.rn,  
       r.gp + CASE WHEN t.len > 24 - r.running_len THEN 1 ELSE 0 END  
FROM t JOIN r ON t.rn = r.rn + 1  
)  
SELECT LISTAGG(ename, ',') WITHIN GROUP(ORDER BY rn) AS "Names", MAX(running_len) - 1 AS "Length"  
FROM r  
GROUP BY gp  
ORDER BY gp
/

#3: Using Recursive WITH clause (CTE) – to group names in a different way

WITH data (ename, grp, pass) AS (  
SELECT ename,  
    CASE WHEN SUM(LENGTH(ename) + 1) OVER(ORDER BY  ename) - 1 <= 23  
   THEN 1  
   ELSE 0  
    END, 1  
FROM emp  
UNION ALL  
SELECT ename,  
    CASE WHEN SUM(LENGTH(ename) + 1) OVER (ORDER BY  ename) - 1 <= 23  
   THEN 1  
    END, pass + 1  
FROM data  
WHERE (grp = 0 AND pass = 1) OR grp IS NULL  
), x AS (
SELECT LISTAGG(ename, ',') WITHIN GROUP(ORDER BY ename) AS names, pass  
FROM data  
WHERE grp = 1  
GROUP BY pass 
)  
SELECT names "Names", LENGTH(names) AS "Length"  
FROM x 
ORDER BY 1;

#4: Using XMLAGG with Regular Expressions

WITH t AS (
SELECT TRIM(',' FROM XMLAGG(xmlelement(e, ename||',') ORDER BY ename).EXTRACT('//text()')) AS txt
FROM  emp
), x AS (
SELECT LEVEL AS l,
       TRIM(',' FROM TRIM(REGEXP_SUBSTR(txt,'.{1,23}(,|$)',1,LEVEL))) AS names
       FROM t
       CONNECT BY TRIM(',' FROM TRIM(REGEXP_SUBSTR(txt,'.{1,23}(,|$)',1,LEVEL))) IS NOT NULL
)
SELECT names "Names", LENGTH(names) "Length"
FROM x
/

#5: Using LISTAGG with Regular Expressions

WITH  x AS (
SELECT LISTAGG (ename, ',') WITHIN GROUP (ORDER BY ename) str  -- ORDER BY 1 would sort by the constant 1, not by name
FROM emp
)
SELECT RTRIM(REGEXP_SUBSTR (str, '.{1,23}(,|$)', 1, LEVEL), ',')  "Names",
       LENGTH(RTRIM(REGEXP_SUBSTR (str, '.{1,23}(,|$)', 1, LEVEL), ',')) "Length"
FROM x
CONNECT BY RTRIM(REGEXP_SUBSTR (str, '.{1,23}(,|$)', 1, LEVEL), ',') IS NOT NULL
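
The pattern '.{1,23}(,|$)' used in solutions #4 and #5 greedily takes up to 23 characters and then backtracks until the chunk is followed by a comma or by the end of the string. The Python sketch below applies the same pattern to the full comma-separated list to show where the cuts fall. It assumes that walking successive non-overlapping matches with re.finditer plays the role of REGEXP_SUBSTR with an increasing LEVEL; the helper name split_rows is hypothetical.

```python
import re


def split_rows(csv_names, max_len=23):
    """Cut a comma-separated list into chunks of at most max_len characters,
    breaking only at commas (same idea as '.{1,23}(,|$)' in the SQL)."""
    pattern = re.compile(r".{1,%d}(?:,|$)" % max_len)
    # Each match may end with the separating comma; strip it like RTRIM(..., ',').
    return [m.group(0).rstrip(",") for m in pattern.finditer(csv_names)]


names = ("ADAMS,ALLEN,BLAKE,CLARK,FORD,JAMES,JONES,KING,"
         "MARTIN,MILLER,SCOTT,SMITH,TURNER,WARD")
for row in split_rows(names):
    print(row, len(row))
```

Note that this approach relies on no single name being longer than the limit; a longer name could not be matched as a whole chunk.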

#6: Using MODEL clause for grouping names

WITH m AS (
SELECT i, ename, grp, len, prevlen  
FROM emp  
MODEL  
   DIMENSION BY (ROW_number() OVER (ORDER BY  ename) AS i)  
   MEASURES 
    (
       ename AS ename, 
    CAST('' AS VARCHAR2(24)) AS names,
    0 AS grp,
    0 AS len, 
    0 AS prevlen
 )  
    RULES 
 (
   len[i] = LENGTH(ename[CV()]),
   prevlen[i] = CASE WHEN (CASE WHEN NVL(prevlen[CV()-1],0) = 0 THEN NVL(len[CV()-1],0) 
           ELSE NVL(prevlen[CV()-1],0) + 1 +  NVL(len[CV()-1],0) 
         END) > 23  
         THEN NVL(len[CV()-1],0)  
         ELSE CASE WHEN NVL(prevlen[CV()-1],0) = 0 THEN NVL(len[CV()-1],0) 
          ELSE NVL(prevlen[CV()-1],0) + 1 +  NVL(len[CV()-1],0) 
           END  
       END,
   grp[i] = NVL(grp[CV()-1],0) + CASE WHEN prevlen[CV()+1] < prevlen[CV()] THEN 1 ELSE 0 END   
 )  
)             
SELECT LISTAGG(ename,',') WITHIN GROUP (ORDER BY ename) AS "Names" , LENGTH(listagg(ename,',') WITHIN GROUP (ORDER BY  ename)) AS "Length"  
FROM m  
GROUP BY grp;  
 

#7: Oracle 12c Solution – Using MATCH_RECOGNIZE clause

SELECT  LISTAGG(name,',') WITHIN GROUP(ORDER BY name) "Names",
        LENGTH(LISTAGG(name,',') WITHIN GROUP(ORDER BY name)) "Length"
FROM  EMP
MATCH_RECOGNIZE
 (
  ORDER BY ENAME
  MEASURES
 MATCH_NUMBER() rn,
 UP.ENAME name
 ALL ROWS PER MATCH
 PATTERN (UP+)
 DEFINE
   UP AS SUM(LENGTH(UP.ENAME || ',')) <= 24
  )
GROUP BY RN
ORDER BY RN

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques, check my book “Oracle SQL Tricks and Workarounds”.

How to split numeric and text values into separate columns

Puzzle: There is a table T with a single column C VARCHAR2(20) that contains random values. Some of the values are numeric. Write a single SELECT statement that outputs two columns, NUM and TEXT, holding the numeric and non-numeric values respectively.

To mimic the T table, we will create a view:

CREATE OR REPLACE VIEW T
AS
WITH x AS (
SELECT CASE WHEN MOD(level,2)=0 THEN dbms_random.string('x',3)
            ELSE TO_CHAR(TRUNC(dbms_random.VALUE(-999,999)))
       END rnd,
       ROW_NUMBER()OVER(ORDER BY dbms_random.value) rk
FROM dual
CONNECT BY ROWNUM<=40
)
SELECT CAST(rnd AS VARCHAR2(4)) C
FROM x
WHERE rk<=10;

Expected Result:
Due to the random nature of the values in T view, actual results will be different every time you run a query; however, all the results will look somewhat like this:

NUM  TEXT
---- ----
-146 4R9
-362 78R
-762 ICY
236  U3W
     VIK
     Y21

Solution:

WITH x AS (
SELECT c, CASE WHEN REGEXP_LIKE(c,'^-?[[:digit:]]+$') THEN 1 ELSE 0 END is_int,
       -- ROW_NUMBER (rather than RANK) keeps duplicate values on separate rows
       ROW_NUMBER()OVER(PARTITION BY CASE WHEN REGEXP_LIKE(c,'^-?[[:digit:]]+$') THEN 1 ELSE 0 END ORDER BY c) rk
FROM t	   
)
SELECT MAX(DECODE(is_int, 1, c)) NUM,
       MAX(DECODE(is_int, 0, c)) TEXT
FROM x
GROUP BY rk
ORDER BY rk

Explanation:

The above solution uses a regular expression to identify integer values (positive and negative):

^-?[[:digit:]]+$

The way to break down the results into two columns was demonstrated in my previous post: Generate a department/employee roll report
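
The two ideas in the solution, classifying each value with the integer pattern and then pairing the N-th numeric value with the N-th text value, can be sketched in Python as a cross-check. Python's re module has no [[:digit:]] class, so [0-9] is substituted here. The function name split_columns and the sample values (taken from the sample output above) are assumptions for illustration.

```python
import re
from itertools import zip_longest

# Python spelling of ^-?[[:digit:]]+$ (fullmatch supplies the anchors).
INT_PATTERN = re.compile(r"-?[0-9]+")


def split_columns(values):
    """Return (NUM, TEXT) pairs: integers and non-integers, each sorted as
    strings and paired by position, with None where one side runs out."""
    nums = sorted(v for v in values if INT_PATTERN.fullmatch(v))
    texts = sorted(v for v in values if not INT_PATTERN.fullmatch(v))
    return list(zip_longest(nums, texts))


sample = ["ICY", "-146", "4R9", "236", "VIK", "-362", "78R", "Y21", "-762", "U3W"]
for num, text in split_columns(sample):
    print(num or "", text or "")
```

Sorting the values as strings mirrors ORDER BY c on a VARCHAR2 column, which is why -146 sorts before 236 here.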

 

Solutions to Puzzle of the Week #7

Puzzle of the Week #7

For every employee find the sum of ASCII codes of all the characters in their names. Write a single SELECT statement only.

Expected Result:

EMPNO ENAME       SUM_ASCII
----- ---------- ----------
 7788 SCOTT             397
 7876 ADAMS             358
 7566 JONES             383
 7499 ALLEN             364
 7521 WARD              302
 7934 MILLER            453
 7902 FORD              299
 7369 SMITH             389
 7844 TURNER            480
 7698 BLAKE             351
 7782 CLARK             365
 7654 MARTIN            459
 7839 KING              297
 7900 JAMES             368
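
For reference, the quantity being asked for is easy to state outside SQL. The short Python sketch below computes it for a few of the names above and can be used to sanity-check the expected result. It assumes plain ASCII names, where Python's ord agrees with Oracle's ASCII function; the function name sum_ascii simply mirrors the column alias.

```python
def sum_ascii(name):
    """Sum of the character codes of every character in name."""
    return sum(ord(ch) for ch in name)


for ename in ["SCOTT", "KING", "MILLER", "TURNER"]:
    print(ename, sum_ascii(ename))
```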

Solutions:

Solution/Workaround #1: Oracle 12c and up Only (submitted by Zohar Elkayam)

WITH
    FUNCTION sumascii(p_str in varchar2)  RETURN NUMBER 
    IS 
       x NUMBER:= 0;
    BEGIN
      FOR i IN 1..LENGTH(p_str) LOOP
          x := x + ASCII(SUBSTR(p_str, i, 1)) ;
      END LOOP;
      RETURN x;
    END;
SELECT empno, ename, sumascii(ename) AS sum_ascii
FROM emp
/

Variation of Solution #1 (Recursive function):

WITH
    FUNCTION sumascii(p_str in varchar2)  RETURN NUMBER 
    IS 
    BEGIN
      IF p_str IS NULL THEN 
	RETURN 0;
      END IF;
      RETURN ASCII(p_str) + sumascii(SUBSTR(p_str,2));      
    END;
SELECT empno, ename, sumascii(ename) AS sum_ascii
FROM emp
/

Solution/Workaround #2: Cartesian Product with Generated Numeric Range (by Zohar Elkayam)

SELECT empno, ename, SUM(ASCII(ename_char)) sum_ascii
FROM (SELECT empno, ename, SUBSTR(ename, i, 1) ename_char
      FROM emp, (SELECT LEVEL i
                 FROM dual
                 CONNECT BY LEVEL<=(SELECT MAX(LENGTH(ename))
                                    FROM emp)
                 )
      WHERE LENGTH(ename)>=i
      )
GROUP BY empno, ename
/

Simplified variation of Workaround #2:

SELECT empno, ename, 
       SUM(ASCII(SUBSTR(ename, i, 1))) sum_ascii      
FROM emp, (SELECT LEVEL i
           FROM dual
           CONNECT BY LEVEL<=(SELECT MAX(LENGTH(ename))
                              FROM emp)
           )
WHERE LENGTH(ename)>=i
GROUP BY empno, ename 
/

Solution/Workaround #3: In-Line Scalar Subquery

SELECT empno, ename, 
      (SELECT SUM(ASCII(SUBSTR(a.ename, LEVEL, 1)))
       FROM dual
       CONNECT BY LEVEL<=LENGTH(a.ename)) AS sum_ascii
FROM emp a
/

Solution/Workaround #4: Recursive WITH clause

WITH x(n, empno, ename, letter) AS (
SELECT 1 AS n, empno, ename, SUBSTR(ename, 1, 1)
FROM emp
UNION ALL
SELECT x.n+1, empno, ename, SUBSTR(ename, n+1, 1)
FROM x
WHERE LENGTH(ename)>=n+1
)
SELECT empno, ename, SUM(ASCII(letter)) sum_ascii
FROM x
GROUP BY empno, ename
/

Solution/Workaround #5: Use DUMP function and Regular Expressions (submitted by Sunitha)

SELECT empno, ename, SUM(REGEXP_SUBSTR(nm, '\d+', 1, occ)) AS sum_ascii
FROM (SELECT empno, ename, REGEXP_REPLACE(DUMP(ename), '.*: (\d.*)$', '\1') nm
      FROM emp), 
     (SELECT LEVEL occ FROM dual CONNECT BY LEVEL <=ANY(SELECT LENGTH(ename) FROM emp))
GROUP BY empno, ename
/
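
The DUMP-based approach works because DUMP returns the byte codes as a comma-separated list after a "Typ=... Len=...: " prefix, so the regular expressions only need to strip the prefix and pick out the numbers. The Python sketch below imitates that flow. The helper fake_dump is a hypothetical stand-in that builds a string in the shape DUMP is assumed to return for a VARCHAR2 value (for example 'Typ=1 Len=5: 83,67,79,84,84'); it is not an Oracle API.

```python
import re


def fake_dump(s):
    """Hypothetical stand-in for Oracle's DUMP() on an ASCII VARCHAR2 value."""
    codes = ",".join(str(b) for b in s.encode("ascii"))
    return "Typ=1 Len=%d: %s" % (len(s), codes)


def sum_from_dump(dump_text):
    """Strip the 'Typ=.. Len=..: ' prefix, then add up the listed codes."""
    # Same pattern as REGEXP_REPLACE(DUMP(ename), '.*: (\d.*)$', '\1')
    code_list = re.sub(r".*: (\d.*)$", r"\1", dump_text)
    # Same idea as summing REGEXP_SUBSTR(nm, '\d+', 1, occ) over all occurrences
    return sum(int(tok) for tok in re.findall(r"\d+", code_list))


print(fake_dump("SCOTT"))
print(sum_from_dump(fake_dump("SCOTT")))
```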

Solution/Workaround #6: Use LATERAL View (Oracle 12c and up)

SELECT empno, ename, sum_ascii
FROM emp e, LATERAL (SELECT SUM(ASCII(SUBSTR(e.ename,LEVEL,1)) ) sum_ascii
                     FROM dual
                     CONNECT BY LEVEL<=LENGTH(e.ename) ) x

Solution/Workaround #7: Use TABLE/CAST/MULTISET function composition

SELECT empno, ename, x.column_value AS sum_ascii
FROM emp e, 
     TABLE(CAST(MULTISET(SELECT SUM(ASCII(SUBSTR(e.ename,LEVEL,1)) ) sum_ascii
                         FROM dual
                         CONNECT BY LEVEL<=LENGTH(e.ename) 
                         ) AS sys.odcinumberlist
                )
          ) x

 

How to generate a random sample of numeric and alpha-numeric values

Puzzle: Generate a random sample of numeric and alpha-numeric values

Level: Intermediate/Advanced

This problem often arises when you need to run a test and you don’t have a table to experiment with. Not a problem any more!

Related Post: Use TRUNC function to generate various date ranges

Step 1: Generate a random list of 10 alpha-numeric values:

SELECT dbms_random.string('x',3) rnd
FROM dual
CONNECT BY ROWNUM<=10

RND
-----
3TI
1JB
CIP
9SE
79K
YNZ
VEG
V0B
KPN
ILR

Step 2: Generate a random list of 10 integer values:

SELECT TRUNC(dbms_random.VALUE(-999,999)) rnd
FROM dual
CONNECT BY ROWNUM<=10

RND
----
 539
 153
 979
 689
 212
 267
  -5
-832
-160
 665

Step 3: Mix the above lists together (with equal shares):

SELECT CASE WHEN MOD(level,2)=0 THEN dbms_random.string('x',3)
            ELSE TO_CHAR(TRUNC(dbms_random.VALUE(-999,999)))
       END rnd
FROM dual
CONNECT BY ROWNUM<=10

RND
-------
-513
SVA
-475
NRM
-903
G45
-654
2S3
415
0HG

Step 4: Let’s have a fairly random number of integers and strings in the output

WITH x AS (
SELECT CASE WHEN MOD(level,2)=0 THEN dbms_random.string('x',3)
            ELSE TO_CHAR(TRUNC(dbms_random.VALUE(-999,999)))
       END rnd,
       ROW_NUMBER()OVER(ORDER BY dbms_random.value) rk
FROM dual
CONNECT BY ROWNUM<=40
)
SELECT CAST(rnd AS VARCHAR2(4)) rnd
FROM x
WHERE rk<=10

RND
----
-985
0TG
8JZ
-714
500
199
7IJ
249
RNI
F2G

Step 5: Final touch – let’s add a column that would flag integers

WITH x AS (
SELECT CASE WHEN MOD(level,2)=0 THEN dbms_random.string('x',3)
            ELSE TO_CHAR(TRUNC(dbms_random.VALUE(-999,999)))
       END rnd,
       ROW_NUMBER()OVER(ORDER BY dbms_random.value) rk
FROM dual
CONNECT BY ROWNUM<=40
)
SELECT CAST(rnd AS VARCHAR2(4)) rnd, CASE WHEN REGEXP_LIKE(rnd,'^-?[[:digit:]]+$') THEN 1 ELSE 0 END is_int
FROM x
WHERE rk<=10

RND      IS_INT
---- ----------
WIS           0
-558          1
0QR           0
-433          1
RB0           0
PT8           0
409           1
YOV           0
969           1
FFI           0
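
The same five steps can be mirrored outside the database, which may help when checking what the query is expected to produce. The Python sketch below is only an approximation under stated assumptions: random.Random stands in for dbms_random, upper-case letters and digits stand in for the 'x' string option, and the function name random_sample with its size, pool and seed parameters is hypothetical.

```python
import random
import re
import string

INT_PATTERN = re.compile(r"-?[0-9]+")  # Python spelling of ^-?[[:digit:]]+$


def random_sample(size=10, pool=40, seed=None):
    """Build `pool` values (alternating strings and integers), shuffle them,
    keep `size` of them, and flag which ones look like integers."""
    rng = random.Random(seed)
    alnum = string.ascii_uppercase + string.digits
    values = []
    for level in range(1, pool + 1):
        if level % 2 == 0:
            # Like dbms_random.string('x', 3): three upper-case alphanumerics
            values.append("".join(rng.choice(alnum) for _ in range(3)))
        else:
            # Like TRUNC(dbms_random.value(-999, 999)): truncate toward zero
            values.append(str(int(rng.uniform(-999, 999))))
    rng.shuffle(values)  # plays the role of ORDER BY dbms_random.value
    return [(v, 1 if INT_PATTERN.fullmatch(v) else 0) for v in values[:size]]


for rnd, is_int in random_sample(seed=42):
    print(rnd, is_int)
```

One detail worth noticing: a random alphanumeric string can consist of digits only (for example 123), in which case it is flagged as an integer, just as it would be by the REGEXP_LIKE test.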

 

A function that converts a specially formatted character string to Oracle’s TIMESTAMP data type.

Problem: Convert a character string in the format ‘YYYY-MM-DD HH24:MI:SS.FF’ to TIMESTAMP. The string has to be validated before the VARCHAR2 value is converted to TIMESTAMP.

The following function does the trick:

CREATE OR REPLACE FUNCTION TO_TS(p_ts_text VARCHAR2) RETURN TIMESTAMP
AS
BEGIN
    -- Anchor the pattern at the end and allow 2 to 9 fractional digits (the FF limit)
    IF REGEXP_LIKE(p_ts_text, '^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{2,9}$') THEN
        RETURN TO_TIMESTAMP(p_ts_text, 'YYYY-MM-DD HH24:MI:SS.FF');
    END IF;
    
    RAISE_APPLICATION_ERROR(-20012,'Invalid timestamp format is used');
END;
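
The function follows a validate-then-convert pattern: check the shape of the string with a regular expression first, and only then hand it to the conversion routine. For comparison, here is a minimal Python sketch of the same pattern. It is an analogue rather than a translation: the name to_ts is borrowed from the function above, ValueError stands in for RAISE_APPLICATION_ERROR, and the fraction is limited to six digits because that is what Python's %f directive accepts.

```python
import re
from datetime import datetime

# Shape check: YYYY-MM-DD HH:MM:SS.F with 1 to 6 fractional digits (Python's %f limit).
TS_PATTERN = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{1,6}"
)


def to_ts(ts_text):
    """Validate the text against the expected shape, then convert it."""
    if ts_text is not None and TS_PATTERN.fullmatch(ts_text):
        # strptime still rejects impossible dates such as month 13 (ValueError).
        return datetime.strptime(ts_text, "%Y-%m-%d %H:%M:%S.%f")
    raise ValueError("Invalid timestamp format is used")


print(to_ts("2016-05-01 13:45:10.25"))
```

The regular expression only checks the shape of the input; range errors such as a 13th month are still caught by the conversion step itself.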

Passing a comma-delimited string to a stored procedure

It would be nice if we could always avoid dynamic SQL in our PL/SQL. The execution path would be mostly known upfront, the code would be easy to debug and troubleshoot, and so on. However, there are times when it seems almost impossible to write SQL queries in PL/SQL without relying on dynamic SQL. A typical example is passing multiple ids separated by a delimiter (usually a comma).

Let’s consider a stored procedure with the following signature:

PROCEDURE sp_get_empployees(p_dept_ids VARCHAR2, p_result OUT SYS_REFCURSOR)

We need to retrieve all employees who work in one of the departments whose numbers are listed in the p_dept_ids string.

The query may look something like this:

SELECT empno, ename, job, deptno
FROM emp
WHERE deptno IN ([list of ids])
ORDER BY deptno, ename

Let’s first review a “traditional” approach based on dynamic SQL:

CREATE OR REPLACE PROCEDURE sp_get_empployees(p_dept_ids VARCHAR2, p_result OUT SYS_REFCURSOR)
AS
  v_sql VARCHAR2(2000);
BEGIN
  v_sql:='SELECT empno, ename, job, deptno
	  FROM emp
	  WHERE deptno IN (' || p_dept_ids || ')
	  ORDER BY deptno, ename';
  OPEN p_result FOR v_sql;

END sp_get_empployees;
/

If you need to test this procedure in SQL*Plus, here is how to do it:

SQL> var c refcursor
SQL> exec sp_get_empployees('10,20', :c)

PL/SQL procedure successfully completed.

SQL> print c

     EMPNO ENAME      JOB           DEPTNO
---------- ---------- --------- ----------
      7782 CLARK      MANAGER           10
      7839 KING       PRESIDENT         10
      7934 MILLER     CLERK             10
      7876 ADAMS      CLERK             20
      7902 FORD       ANALYST           20
      7566 JONES      MANAGER           20
      7788 SCOTT      ANALYST           20
      7369 SMITH      CLERK             20

8 rows selected.

Everything looks good at first glance, but there are some issues.

What if the p_dept_ids value is NULL (empty), which could mean that we don’t want any employees returned?

Passing NULL will lead to an error:

SQL> exec sp_get_empployees('', :c)
BEGIN sp_get_empployees('', :c); END;

*
ERROR at line 1:
ORA-00936: missing expression
ORA-06512: at "SCOTT.SP_GET_EMPPLOYEES", line 9
ORA-06512: at line 1

Yes, we could add validation of the input parameter, and that is not a bad idea in general. However, the main purpose of this post is to show a workaround that relies neither on dynamic SQL nor on input parameter validation.
We are going to leverage the power of regular expressions to split the comma-delimited string:

CREATE OR REPLACE PROCEDURE sp_get_empployees2(p_dept_ids VARCHAR2, p_result OUT SYS_REFCURSOR)
AS
BEGIN
  OPEN p_result FOR 
  SELECT empno, ename, job, deptno
  FROM emp
  WHERE deptno IN (SELECT REGEXP_SUBSTR(p_dept_ids, '[^,]+', 1, LEVEL) deptno
                   FROM dual
                   CONNECT BY LEVEL <= LENGTH(p_dept_ids) - LENGTH(REPLACE(p_dept_ids,','))+1 
                  )
  ORDER BY deptno, ename;
END sp_get_empployees2;
/

Here is the test:

SQL> exec sp_get_empployees2('10,20', :c)

PL/SQL procedure successfully completed.

SQL> print c

     EMPNO ENAME      JOB           DEPTNO
---------- ---------- --------- ----------
      7782 CLARK      MANAGER           10
      7839 KING       PRESIDENT         10
      7934 MILLER     CLERK             10
      7876 ADAMS      CLERK             20
      7902 FORD       ANALYST           20
      7566 JONES      MANAGER           20
      7788 SCOTT      ANALYST           20
      7369 SMITH      CLERK             20

8 rows selected.

SQL> exec sp_get_empployees2('', :c)

PL/SQL procedure successfully completed.

SQL> print c

no rows selected
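
The splitting step inside sp_get_empployees2 can be illustrated outside the database. The Python sketch below uses the same '[^,]+' pattern to break the list into tokens and then filters an in-memory list of rows. It is a rough analogue under stated assumptions: the EMP list contains the eight rows shown in the output above plus one department 30 row added for contrast (job and department for ALLEN are assumed), and the names split_ids and get_employees are hypothetical.

```python
import re

# (empno, ename, job, deptno); the ALLEN row is an assumed extra for contrast.
EMP = [
    (7369, "SMITH", "CLERK", 20), (7566, "JONES", "MANAGER", 20),
    (7782, "CLARK", "MANAGER", 10), (7788, "SCOTT", "ANALYST", 20),
    (7839, "KING", "PRESIDENT", 10), (7876, "ADAMS", "CLERK", 20),
    (7902, "FORD", "ANALYST", 20), (7934, "MILLER", "CLERK", 10),
    (7499, "ALLEN", "SALESMAN", 30),
]


def split_ids(dept_ids):
    """Split a comma-delimited string into tokens; empty or None gives no tokens."""
    if not dept_ids:
        return []
    return re.findall(r"[^,]+", dept_ids)


def get_employees(dept_ids, emp=EMP):
    """Rows whose deptno is in the list, ordered by deptno and then ename."""
    wanted = {int(tok) for tok in split_ids(dept_ids)}
    rows = [row for row in emp if row[3] in wanted]
    return sorted(rows, key=lambda row: (row[3], row[1]))


for row in get_employees("10,20"):
    print(row)
print(get_employees(""))  # an empty list, like "no rows selected"
```

As in the PL/SQL version, an empty input needs no special error handling: there are simply no tokens to match, so no rows come back.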

In subsequent posts I will show some more workarounds to the presented solution.

If you want to learn how to come up with numerous workarounds on your own, check my book “Oracle SQL Tricks and Workarounds” for instructions.