Puzzle of the Week #13: Duplicate Names

Puzzle of the Week #13

Consider the following table:

SQL> SELECT *
  2  FROM name_list;

   NAME_ID FULL_NAME
---------- -----------------------------------
         1 John Smith
         2 John L. Smith
         3 Tom Blake
         4 Smith, John
         5 Tom Khan
         6 Pete Ward
         7 Peter Ward
         8 Oscar Pedro Fernando Rodriguez
         9 Rodriguez, Oscar Pedro Fernando
        10 Oscar Fernando Rodriguez Pedro
        11 KHAN, TOM S.
        12 Ward, Peter Peter

12 rows selected.

With a single SELECT statement, retrieve a list of all duplicate names, assuming the following:

  • Case should be ignored
  • Punctuation characters should be ignored
  • Single character initials should be ignored
  • Names are considered matching if they consist of the same name tokens in any order. For example, “John Smith” should match “SMITH, John L.” because both names reduce to the same two tokens, John and Smith (case, punctuation, and the middle initial are ignored).
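To make the matching rules concrete, here is a rough Python sketch of the normalization (not a SQL answer to the puzzle). The function names are made up for illustration. Two things are assumptions read off the expected result rather than stated in the rules: a repeated token counts twice (so “Ward, Peter Peter” does not match “Peter Ward”), and groups are numbered in alphabetical order of their sorted-token key.

```python
import re
from collections import defaultdict

# Rows copied from the name_list table in this post.
NAME_LIST = [
    (1, "John Smith"), (2, "John L. Smith"), (3, "Tom Blake"),
    (4, "Smith, John"), (5, "Tom Khan"), (6, "Pete Ward"),
    (7, "Peter Ward"), (8, "Oscar Pedro Fernando Rodriguez"),
    (9, "Rodriguez, Oscar Pedro Fernando"),
    (10, "Oscar Fernando Rodriguez Pedro"),
    (11, "KHAN, TOM S."), (12, "Ward, Peter Peter"),
]

def name_key(full_name):
    """Build an order-independent comparison key for a name."""
    # Ignore case and punctuation: keep only runs of letters.
    tokens = re.findall(r"[a-z]+", full_name.lower())
    # Drop single-character initials, then sort so token order is irrelevant.
    # Repeated tokens are kept (assumption based on the expected result).
    return " ".join(sorted(t for t in tokens if len(t) > 1))

def duplicate_names(rows):
    """Return (name_id, full_name, group_id) for names sharing a key."""
    groups = defaultdict(list)
    for name_id, full_name in rows:
        groups[name_key(full_name)].append((name_id, full_name))
    # Keep only keys that occur more than once; number them alphabetically
    # (assumption: this ordering happens to reproduce the GROUP_ID column).
    dup_keys = sorted(k for k, members in groups.items() if len(members) > 1)
    result = []
    for group_id, key in enumerate(dup_keys, start=1):
        for name_id, full_name in sorted(groups[key]):
            result.append((name_id, full_name, group_id))
    return result

for row in duplicate_names(NAME_LIST):
    print(row)
```

A SQL solution would need the same three steps: tokenize, normalize and sort the tokens per name, then group on the resulting key.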

Expected Result:

   ID FULL_NAME                             GROUP_ID
---- ----------------------------------- ----------
   8 Oscar Pedro Fernando Rodriguez               1
   9 Rodriguez, Oscar Pedro Fernando              1
  10 Oscar Fernando Rodriguez Pedro               1
   1 John Smith                                   2
   2 John L. Smith                                2
   4 Smith, John                                  2
   5 Tom Khan                                     3
  11 KHAN, TOM S.                                 3

Table Setup:

CREATE TABLE name_list
(
    name_id   NUMBER,
    full_name VARCHAR2(100)
);

INSERT INTO name_list VALUES(1,'John Smith');
INSERT INTO name_list VALUES(2,'John L. Smith');
INSERT INTO name_list VALUES(3,'Tom Blake');
INSERT INTO name_list VALUES(4,'Smith, John');
INSERT INTO name_list VALUES(5,'Tom Khan');
INSERT INTO name_list VALUES(6,'Pete Ward');
INSERT INTO name_list VALUES(7,'Peter Ward');
INSERT INTO name_list VALUES(8,'Oscar Pedro Fernando Rodriguez');
INSERT INTO name_list VALUES(9,'Rodriguez, Oscar Pedro Fernando');
INSERT INTO name_list VALUES(10,'Oscar Fernando Rodriguez Pedro');
INSERT INTO name_list VALUES(11,'KHAN, TOM S.');
INSERT INTO name_list VALUES(12,'Ward, Peter Peter');

COMMIT;

To submit your answer (one or more!) please start following this blog and add a comment to this post.

A correct answer (and workarounds!) will be published here in a week.

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques, check out my book “Oracle SQL Tricks and Workarounds”.

4 Solutions to Puzzle of the Week #12

Puzzle of the Week #12

With a single SELECT statement, produce a list of the first 10 prime numbers above a given number N.

Expected Result: (for N=15)

     Prime
----------
        17
        19
        23
        29
        31
        37
        41
        43
        47
        53

10 rows selected.

Expected Result: (for N=50)

     Prime
----------
        53
        59
        61
        67
        71
        73
        79
        83
        89
        97

10 rows selected.

Solutions:

#1: Limiting the number of found prime numbers inside the CTE (Recursive WITH clause)

WITH y AS (
SELECT 500 fromN
FROM dual
), x (n, cnt, flag) AS (
SELECT fromN,
      (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
       FROM dual
       WHERE MOD(fromN, LEVEL)=0
       CONNECT BY LEVEL<=fromN),
      (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
       FROM dual
       WHERE MOD(fromN, LEVEL)=0
       CONNECT BY LEVEL<=fromN)
FROM y
UNION ALL
SELECT x.n+1, (SELECT x.cnt+CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
               FROM dual
               WHERE MOD(x.n+1, LEVEL)=0
               CONNECT BY LEVEL<=x.n+1),
              (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
               FROM dual
               WHERE MOD(x.n+1, LEVEL)=0
               CONNECT BY LEVEL<=x.n+1)
FROM x
WHERE x.cnt<10
)
SELECT n AS prime
FROM x
WHERE flag=1;

#2: Limiting the number of found prime numbers outside of the CTE (Recursive WITH clause)

WITH y AS (
SELECT 50 fromN
FROM dual
), x (n, flag) AS (
SELECT fromN,
      (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
       FROM dual
       WHERE MOD(fromN, LEVEL)=0
       CONNECT BY LEVEL<=fromN)
FROM y
UNION ALL
SELECT x.n+1, (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END
               FROM dual
               WHERE MOD(x.n+1, LEVEL)=0
               CONNECT BY LEVEL<=x.n+1) FROM x WHERE x.n>0
)
SELECT n AS prime
FROM x
WHERE flag=1
  AND ROWNUM<=10;

     PRIME
----------
        53
        59
        61
        67
        71
        73
        79
        83
        89
        97

10 rows selected.

Elapsed: 00:00:00.02

#3: Using TABLE and MULTISET functions

WITH y AS (
SELECT 16 fromN
FROM dual
), x (n, flag) AS (
SELECT fromN, column_value flag 
FROM y, TABLE(CAST(MULTISET(SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END flag
                            FROM dual
                            WHERE MOD(fromN, LEVEL)=0
                            CONNECT BY LEVEL<=fromN) AS sys.odcinumberlist))  
UNION ALL
SELECT x.n+1, column_value flag  
FROM x, TABLE(CAST(MULTISET(SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END flag
                            FROM dual
                            WHERE MOD(x.n+1, LEVEL)=0
                            CONNECT BY LEVEL<=x.n+1) AS sys.odcinumberlist))  WHERE x.n>0
)
SELECT n AS prime
FROM x
WHERE flag=1
  AND ROWNUM<=10;

     PRIME
----------
        17
        19
        23
        29
        31
        37
        41
        43
        47
        53

10 rows selected.

Elapsed: 00:00:00.12

#4: Using LATERAL views

WITH y AS (
SELECT 16 fromN
FROM dual
), x (n, flag) AS (
SELECT fromN, is_prime
FROM y, LATERAL (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END is_prime
                 FROM dual
                 WHERE MOD(fromN, LEVEL)=0
                 CONNECT BY LEVEL<=fromN)
UNION ALL
SELECT x.n+1, is_prime 
FROM x, LATERAL (SELECT CASE WHEN COUNT(*)=2 THEN 1 ELSE 0 END is_prime
                 FROM dual
                 WHERE MOD(x.n+1, LEVEL)=0
                 CONNECT BY LEVEL<=x.n+1) WHERE x.n>0
)
SELECT n AS prime
FROM x
WHERE flag=1
 AND ROWNUM<=10;

     PRIME
----------
        17
        19
        23
        29
        31
        37
        41
        43
        47
        53

10 rows selected.

Elapsed: 00:00:00.11
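All of the solutions above rely on the same primality test: a number n is prime when exactly two values of LEVEL in 1..n divide it. As a minimal sketch of that idea outside SQL, the Python below (function names are hypothetical) counts divisors the same way and collects the first 10 hits. One difference to note: the queries start at fromN itself (for example 16 for N=15), while this sketch starts at N+1.

```python
from itertools import count, islice

def is_prime(n):
    # Same test as the CONNECT BY subqueries: count the values in 1..n
    # that divide n evenly; a prime has exactly two (1 and n).
    divisors = sum(1 for level in range(1, n + 1) if n % level == 0)
    return divisors == 2

def primes_above(n, how_many=10):
    # Walk n+1, n+2, ... and keep the first `how_many` primes,
    # which plays the role of the ROWNUM<=10 filter.
    candidates = count(n + 1)
    return list(islice((c for c in candidates if is_prime(c)), how_many))

print(primes_above(15))
print(primes_above(50))
```

Counting every divisor is simple but does O(n) work per candidate, which is fine for a puzzle and slow for large N.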

How to get Session ID in Teradata vs Oracle

Getting Session ID in Teradata is trivial:

SELECT SESSION

In Oracle, if you are not permitted to query v$mystat or other dynamic performance views, use the SYS_CONTEXT function (Oracle 10g and up) or the legacy USERENV function:

SELECT SYS_CONTEXT('userenv','sid') sid
FROM dual;

SELECT USERENV('sid') sid
FROM dual;

Interview Question: get 2 random employees for each salary range?

Interview Question

Level: Intermediate/Advanced

For each of the following salary ranges select two randomly chosen employees:

0-999
1000-1999
2000-2999
3000+

Expected Result:

ENAME             SAL RANGE
---------- ---------- ---------
SCOTT            3000 3000+
FORD             3000 3000+
BLAKE            2850 2000-2999
CLARK            2450 2000-2999
TURNER           1500 1000-1999
MILLER           1300 1000-1999
JAMES             950 0-999
SMITH             800 0-999

Solution:

WITH x AS (
SELECT ename, sal,
       CASE WHEN sal>=3000 THEN '3000+'
            WHEN sal>=2000 THEN '2000-2999'
            WHEN sal>=1000 THEN '1000-1999'
            ELSE                '0-999'
       END as range,
       ROW_NUMBER() OVER(PARTITION BY DECODE(GREATEST(sal, 3000), sal, 0, 1) +
                                      DECODE(GREATEST(sal, 2000), sal, 0, 1) +
                                      DECODE(GREATEST(sal, 1000), sal, 0, 1)
                         ORDER BY DBMS_RANDOM.VALUE) rn
FROM emp
)
SELECT ename, sal, range
FROM x
WHERE rn<=2
ORDER BY sal DESC
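The PARTITION BY expression is the least obvious part of the query: each DECODE(GREATEST(sal, t), sal, 0, 1) yields 0 when sal >= t and 1 otherwise, so the sum counts how many thresholds lie above the salary and gives one bucket number per range. The Python sketch below mirrors that logic and the "shuffle, keep two" step. It is only an illustration; the function names are hypothetical, and the salaries are assumed to be those of the standard SCOTT.EMP demo table, which this post does not list.

```python
import random
from collections import defaultdict

# Assumed contents of the standard SCOTT.EMP demo table (ename, sal).
EMP = [
    ("SMITH", 800), ("ALLEN", 1600), ("WARD", 1250), ("JONES", 2975),
    ("MARTIN", 1250), ("BLAKE", 2850), ("CLARK", 2450), ("SCOTT", 3000),
    ("KING", 5000), ("TURNER", 1500), ("ADAMS", 1100), ("JAMES", 950),
    ("FORD", 3000), ("MILLER", 1300),
]

def salary_range(sal):
    # Same labels as the CASE expression in the query.
    if sal >= 3000:
        return "3000+"
    if sal >= 2000:
        return "2000-2999"
    if sal >= 1000:
        return "1000-1999"
    return "0-999"

def bucket(sal):
    # DECODE(GREATEST(sal, t), sal, 0, 1): 0 when sal >= t, else 1.
    # The sum is the number of thresholds above the salary (0..3).
    return sum(0 if max(sal, t) == sal else 1 for t in (3000, 2000, 1000))

def two_per_range(emp, rng=random):
    by_bucket = defaultdict(list)
    for ename, sal in emp:
        by_bucket[bucket(sal)].append((ename, sal))
    picked = []
    for rows in by_bucket.values():
        # ORDER BY DBMS_RANDOM.VALUE plus rn<=2: take two rows at random.
        picked.extend(rng.sample(rows, min(2, len(rows))))
    labelled = [(e, s, salary_range(s)) for e, s in picked]
    return sorted(labelled, key=lambda r: -r[1])  # ORDER BY sal DESC

for row in two_per_range(EMP):
    print(row)
```

Because the pick is random, repeated runs return different employees for the ranges that hold more than two people.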

 

Puzzle of the Week #12: Prime Numbers

Puzzle of the Week #12

With a single SELECT statement, produce a list of the first 10 prime numbers above a given number N.

Expected Result: (for N=15)

     Prime
----------
        17
        19
        23
        29
        31
        37
        41
        43
        47
        53

10 rows selected.

Expected Result: (for N=50)

     Prime
----------
        53
        59
        61
        67
        71
        73
        79
        83
        89
        97

10 rows selected.

To submit your answer (one or more!) please start following this blog and add a comment to this post.

A correct answer (and workarounds!) will be published here in a week.

Create Database Link using EZConnect Syntax

EZConnect syntax eliminates the need for service name lookup in tnsnames.ora files when connecting to an Oracle Database across a TCP/IP network. A previous post demonstrated how to use this syntax to connect in SQL*Plus.

The same syntax also works for creating database links (assuming you have the privilege to create database links).

CREATE DATABASE LINK dbl_test
CONNECT TO scott
IDENTIFIED BY tiger
USING '192.168.1.180:1521/ORCL';

Database link created.

See http://www.orafaq.com/wiki/EZCONNECT for more information on the syntax and prerequisites.

7 Solutions to Puzzle of the Week #11

Puzzle of the Week #11

Produce the Employee Roll Report that satisfies the following list of requirements:

  • Use single SELECT statement
  • Single column “Names” should have a list of the employee names separated by comma
  • The maximum size of the values in the “Names” column should be 23
  • The report should have as few rows as possible
  • All the employee names should be concatenated in the alphabetical order

Expected Result:

(the Length column is added for length verification only)

Names                                        Length
---------------------------------------- ----------
ADAMS,ALLEN,BLAKE,CLARK                          23
FORD,JAMES,JONES,KING                            21
MARTIN,MILLER,SCOTT                              19
SMITH,TURNER,WARD                                17

Solutions

#1 – Using Recursive WITH clause (Common Table Expression) – to concatenate names

WITH e AS (
SELECT ename, ROW_NUMBER()OVER(ORDER BY ename) rn, 23 AS maxlen
FROM emp
), x (rn, txt, grp) AS (
SELECT 1, CAST(ename AS VARCHAR2(100)), 1
FROM e
WHERE rn=1
UNION ALL
SELECT e.rn,
       CASE WHEN LENGTH(x.txt||','||e.ename)>e.maxlen THEN e.ename
            ELSE x.txt||','||e.ename
       END,
       CASE WHEN LENGTH(x.txt||','||e.ename)>e.maxlen THEN x.grp+1
            ELSE x.grp
       END
FROM e JOIN x ON e.rn=x.rn+1
)
SELECT MAX(txt) "Names", LENGTH(MAX(txt)) "Length"
FROM x
GROUP BY grp
ORDER BY grp;

Names                         Length
------------------------- ----------
ADAMS,ALLEN,BLAKE,CLARK           23
FORD,JAMES,JONES,KING             21
MARTIN,MILLER,SCOTT               19
SMITH,TURNER,WARD                 17
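The recursive member in solution #1 is a greedy loop in disguise: append the next name to the running text if the result still fits in 23 characters, otherwise start a new group. A minimal Python sketch of that loop follows (pack_names is a hypothetical helper; the names are the ones shown in the expected result):

```python
# Employee names as they appear in the expected result of this puzzle.
ENAMES = ["SMITH", "ALLEN", "WARD", "JONES", "MARTIN", "BLAKE", "CLARK",
          "SCOTT", "KING", "TURNER", "ADAMS", "JAMES", "FORD", "MILLER"]

def pack_names(names, maxlen=23, sep=","):
    lines = []
    current = ""
    for name in sorted(names):  # alphabetical order, like ROW_NUMBER() OVER(ORDER BY ename)
        candidate = name if not current else current + sep + name
        if current and len(candidate) > maxlen:
            # Appending would exceed the limit: close this group, start a new one.
            lines.append(current)
            current = name
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines

for line in pack_names(ENAMES):
    print(line, len(line))
```

Greedy filling gives the fewest rows here because the names must stay in alphabetical order, so the only choice at each step is where to break.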

#2 – Using Recursive WITH clause (Common Table Expression) – to group names AND LISTAGG function

WITH t (ename, len, rn) AS (  
SELECT ename, LENGTH(ename) + 1, ROW_NUMBER() OVER(ORDER BY ename)  
FROM emp  
), r (ename, running_len, rn, gp) AS (  
SELECT ename, len, rn, 1 
FROM t 
WHERE rn = 1  
UNION ALL  
SELECT t.ename,  
       CASE WHEN t.len > 24 - r.running_len THEN t.len ELSE r.running_len + t.len END,  
       t.rn,  
       r.gp + CASE WHEN t.len > 24 - r.running_len THEN 1 ELSE 0 END  
FROM t JOIN r ON t.rn = r.rn + 1  
)  
SELECT LISTAGG(ename, ',') WITHIN GROUP(ORDER BY rn) AS "Names", MAX(running_len) - 1 AS "Length"  
FROM r  
GROUP BY gp  
ORDER BY gp
/

#3: Using Recursive WITH clause (CTE) – to group names in a different way

WITH data (ename, grp, pass) AS (  
SELECT ename,  
    CASE WHEN SUM(LENGTH(ename) + 1) OVER(ORDER BY  ename) - 1 <= 23  
   THEN 1  
   ELSE 0  
    END, 1  
FROM emp  
UNION ALL  
SELECT ename,  
    CASE WHEN SUM(LENGTH(ename) + 1) OVER (ORDER BY  ename) - 1 <= 23  
   THEN 1  
    END, pass + 1  
FROM data  
WHERE (grp = 0 AND pass = 1) OR grp IS NULL  
), x AS (
SELECT LISTAGG(ename, ',') WITHIN GROUP(ORDER BY ename) AS names, pass  
FROM data  
WHERE grp = 1  
GROUP BY pass 
)  
SELECT names "Names", LENGTH(names) AS "Length"  
FROM x 
ORDER BY 1;

#4: Using XMLAGG with Regular Expressions

WITH t AS (
SELECT TRIM(',' FROM XMLAGG(xmlelement(e, ename||',') ORDER BY ename).EXTRACT('//text()')) AS txt
FROM  emp
), x AS (
SELECT LEVEL AS l,
       TRIM(',' FROM TRIM(REGEXP_SUBSTR(txt,'.{1,23}(,|$)',1,LEVEL))) AS names
       FROM t
       CONNECT BY TRIM(',' FROM TRIM(REGEXP_SUBSTR(txt,'.{1,23}(,|$)',1,LEVEL))) IS NOT NULL
)
SELECT names "Names", LENGTH(names) "Length"
FROM x
/

#5: Using LISTAGG with Regular Expressions

WITH  x AS (
SELECT LISTAGG (ename, ',') WITHIN GROUP (ORDER BY ename) str
FROM emp
)
SELECT RTRIM(REGEXP_SUBSTR (str, '.{1,23}(,|$)', 1, LEVEL), ',')  "Names",
       LENGTH(RTRIM(REGEXP_SUBSTR (str, '.{1,23}(,|$)', 1, LEVEL), ',')) "Length"
FROM x
CONNECT BY RTRIM(REGEXP_SUBSTR (str, '.{1,23}(,|$)', 1, LEVEL), ',') IS NOT NULL
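Solutions #4 and #5 both depend on the pattern '.{1,23}(,|$)': take as many characters as possible, up to 23, as long as the chunk is followed by a comma or the end of the string. The sketch below tries the same pattern with Python's re module, as an analogy only (Python's regex engine is not Oracle's, and split_by_regex is a hypothetical name). It assumes no single name is longer than the limit.

```python
import re

# Employee names as they appear in the expected result of this puzzle.
ENAMES = ["SMITH", "ALLEN", "WARD", "JONES", "MARTIN", "BLAKE", "CLARK",
          "SCOTT", "KING", "TURNER", "ADAMS", "JAMES", "FORD", "MILLER"]

def split_by_regex(names, maxlen=23):
    # Equivalent of LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename).
    text = ",".join(sorted(names))
    # Up to `maxlen` characters, then a comma or the end of the string.
    pattern = re.compile(r".{1,%d}(?:,|$)" % maxlen)
    # Each match plays the role of REGEXP_SUBSTR(str, pattern, 1, LEVEL);
    # rstrip(",") mirrors the RTRIM that drops the trailing comma.
    return [m.group(0).rstrip(",") for m in pattern.finditer(text)]

for chunk in split_by_regex(ENAMES):
    print(chunk, len(chunk))
```

The greedy quantifier backtracks one character at a time until the chunk ends on a name boundary, which is why every chunk holds whole names.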

#6: Using MODEL clause for grouping names

WITH m AS (
SELECT i, ename, grp, len, prevlen  
FROM emp  
MODEL  
   DIMENSION BY (ROW_number() OVER (ORDER BY  ename) AS i)  
   MEASURES 
    (
       ename AS ename, 
    CAST('' AS VARCHAR2(24)) AS names,
    0 AS grp,
    0 AS len, 
    0 AS prevlen
 )  
    RULES 
 (
   len[i] = LENGTH(ename[CV()]),
   prevlen[i] = CASE WHEN (CASE WHEN NVL(prevlen[CV()-1],0) = 0 THEN NVL(len[CV()-1],0) 
           ELSE NVL(prevlen[CV()-1],0) + 1 +  NVL(len[CV()-1],0) 
         END) > 23  
         THEN NVL(len[CV()-1],0)  
         ELSE CASE WHEN NVL(prevlen[CV()-1],0) = 0 THEN NVL(len[CV()-1],0) 
          ELSE NVL(prevlen[CV()-1],0) + 1 +  NVL(len[CV()-1],0) 
           END  
       END,
   grp[i] = NVL(grp[CV()-1],0) + CASE WHEN prevlen[CV()+1] < prevlen[CV()] THEN 1 ELSE 0 END   
 )  
)             
SELECT LISTAGG(ename,',') WITHIN GROUP (ORDER BY ename) AS "Names" , LENGTH(listagg(ename,',') WITHIN GROUP (ORDER BY  ename)) AS "Length"  
FROM m  
GROUP BY grp;  
 

#7: Oracle 12c Solution – Using MATCH_RECOGNIZE clause

SELECT  LISTAGG(name,',') WITHIN GROUP(ORDER BY name) "Names",
        LENGTH(LISTAGG(name,',') WITHIN GROUP(ORDER BY name)) "Length"
FROM  EMP
MATCH_RECOGNIZE
 (
  ORDER BY ENAME
  MEASURES
 MATCH_NUMBER() rn,
 UP.ENAME name
 ALL ROWS PER MATCH
 PATTERN (UP+)
 DEFINE
   UP AS SUM(LENGTH(UP.ENAME || ',')) <= 24
  )
GROUP BY RN
ORDER BY RN

How to Copy a Table: Teradata vs Oracle syntax

In Oracle, you copy a table with the following CREATE TABLE statement:

CREATE TABLE <table_copy> AS 
SELECT * FROM <table>

If you only want to copy the structure (with no content), add a WHERE clause that filters out all rows:

CREATE TABLE <table_copy> AS 
SELECT * FROM <table>
WHERE 1=2

In Teradata, you have two ways to do the same task:

CREATE TABLE <table_copy> AS (
SELECT * FROM <table>
) WITH [NO] DATA

and a shorter way:

CREATE TABLE <table_copy> AS <table> WITH [NO] DATA

In both cases, the WITH DATA (or WITH NO DATA) clause is required.

Puzzle of the Week #11

Puzzle: Produce the Employee Roll Report that satisfies the following list of requirements:

  • Use single SELECT statement
  • Single column “Names” should have a list of the employee names separated by comma
  • The maximum size of the values in the “Names” column should be 23
  • The report should have as few rows as possible
  • All the employee names should be concatenated in the alphabetical order

Expected Result: (the Length column is added for length verification only)

Names                                        Length
---------------------------------------- ----------
ADAMS,ALLEN,BLAKE,CLARK                          23
FORD,JAMES,JONES,KING                            21
MARTIN,MILLER,SCOTT                              19
SMITH,TURNER,WARD                                17

To submit your answer (one or more!) please start following this blog and add a comment to this post.

A correct answer (and workarounds!) will be published here in a week.

3 Solutions to Puzzle of the Week #10: Fibonacci

Puzzle of the Week #10: Fibonacci

With a single SELECT statement, calculate the first 20 Fibonacci numbers without using Binet’s formula.

Expected Result:

   N     Fib(n)
---- ----------
   1          1
   2          1
   3          2
   4          3
   5          5
   6          8
   7         13
   8         21
   9         34
  10         55
  11         89
  12        144
  13        233
  14        377
  15        610
  16        987
  17       1597
  18       2584
  19       4181
  20       6765

Solutions:

#1. Oracle 10g solution (using MODEL clause):

SELECT n, f AS "Fib(n)"
FROM dual
MODEL
  DIMENSION BY (0 d)
  MEASURES (0 n, 0 f)
  RULES ITERATE(20) (
     f[iteration_number]=DECODE(iteration_number, 0,1, 1,1,
                                f[iteration_number-2]+f[iteration_number-1]),
     n[iteration_number]=iteration_number+1
  );

  N     Fib(n)
--- ----------
  1          1
  2          1
  3          2
  4          3
  5          5
  6          8
  7         13
  8         21
  9         34
 10         55
 11         89
 12        144
 13        233
 14        377
 15        610
 16        987
 17       1597
 18       2584
 19       4181
 20       6765

#2. Oracle 11g Release 2 solution (using Recursive WITH clause):

WITH x(n, f1, f2) AS (
SELECT 1, 1, 1
FROM dual
UNION ALL
SELECT n+1, f2, f1+f2
FROM x
WHERE n<20
)
SELECT n, f1 AS "Fib(n)"
FROM x
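The recursive WITH query carries a sliding pair (f1, f2) from row to row: each step shifts f2 into f1 and stores the sum in f2. A small Python sketch of the same iteration (the function name is made up for illustration) may make the row-by-row flow easier to follow:

```python
def fibonacci_rows(limit=20):
    rows = []
    n, f1, f2 = 1, 1, 1              # anchor member: SELECT 1, 1, 1 FROM dual
    while True:
        rows.append((n, f1))         # final SELECT n, f1
        if not n < limit:            # recursive member stops at WHERE n<20
            break
        n, f1, f2 = n + 1, f2, f1 + f2   # SELECT n+1, f2, f1+f2 FROM x
    return rows

for n, fib in fibonacci_rows():
    print(n, fib)
```

Each row is computed from the previous row only, so the work grows linearly with the number of rows requested.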

#3. Oracle 12c solution (using WITH for PL/SQL function):

WITH
  FUNCTION fib(n INTEGER) RETURN NUMBER DETERMINISTIC
  AS
  BEGIN
     RETURN CASE WHEN n IN (1,2) THEN 1
                 ELSE fib(n-2)+fib(n-1)
            END;
  END;
SELECT LEVEL n, fib(LEVEL) AS "Fib(n)"
FROM dual
CONNECT BY LEVEL<=20

Suggested further reading:

For more tricks and cool techniques, check out my book “Oracle SQL Tricks and Workarounds”. The book is also available on Amazon and in all major bookstores.