Three Soluitions to Puzzle of the Week #17

Puzzle of the Week #17:

Write a single SELECT statement that would show the years of hire in each department. The result should have 3 columns (see below): deptno, year1, and year2. If a department only hired during 1 calendar year, this year should be shown in year1 column (see deptno 30) and year2 column should be blank. If a department hired during 2 calendar years, the first year should be should be shown in year1 column, and the 2nd year should be shown in year2 column (see deptno 10). In all other cases, show 1st year in year1 column and “More (N)” where N is the number of years that department did the hiring (see deptno 20).

Expected Result:

DEPTNO Year 1   Year 2
------ -------- --------
    10 1981     1982
    20 1980     More (3)
    30 1981

Solutions

#1: Using COUNT(DISTINCT ..)

SELECT deptno, MIN(EXTRACT(YEAR FROM hiredate)) AS "Year 1",
       CASE COUNT(DISTINCT EXTRACT(YEAR FROM hiredate))
            WHEN 1 THEN ''
            WHEN 2 THEN TO_CHAR(MAX(EXTRACT(YEAR FROM hiredate)))
            ELSE 'More (' || COUNT(DISTINCT EXTRACT(YEAR FROM hiredate)) || ')'
       END AS "Year 2"
FROM emp
GROUP BY deptno
ORDER BY 1;

DEPTNO     Year 1 Year 2
------ ---------- --------
    10       1981 1982
    20       1980 More (3)
    30       1981

#2: Using DENSE_RANK Analytic Function

WITH x AS (
SELECT deptno, EXTRACT(YEAR FROM hiredate) hire_year,
       DENSE_RANK()OVER(PARTITION BY deptno ORDER BY EXTRACT(YEAR FROM hiredate)) rk
FROM emp
)
SELECT deptno, MIN(hire_year) "Year 1",
       CASE MAX(rk) WHEN 1 THEN ''
                    WHEN 2 THEN CAST(MAX(hire_year) AS CHAR(4))
                    ELSE 'More (' || MAX(rk) || ')'
       END AS "Year 2"
FROM x
GROUP BY deptno
ORDER BY 1;

DEPTNO     Year 1 Year 2
------ ---------- --------
    10       1981 1982
    20       1980 More (3)
    30       1981

#3: Using PIVOT clause

SELECT deptno, Y1 "Year 1",
       CASE WHEN cnt>2 THEN 'More (' || cnt || ')'
       ELSE TO_CHAR(Y2)
       END "Year 2"
FROM
(
SELECT deptno, EXTRACT(YEAR FROM hiredate) yr,
       CASE DENSE_RANK()OVER(PARTITION BY deptno ORDER BY EXTRACT(YEAR FROM hiredate))
       WHEN 1 THEN 'Y1'
            WHEN 2 THEN 'Y2'
       END AS Y,
       COUNT(DISTINCT EXTRACT(YEAR FROM hiredate))OVER(PARTITION BY deptno) cnt
FROM emp
)
PIVOT
(
  MAX(yr)
  FOR y IN ('Y1' y1,'Y2' y2)
);

DEPTNO     Year 1 Year 2
------ ---------- --------
    10       1981 1982
    20       1980 More (3)
    30       1981

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

 

 

 

 

Advertisements

Solutions to Puzzle of the Week #13

Puzzle of the Week #13:

Table Setup and Puzzle description can be located here

Expected Result:

  ID FULL_NAME                             GROUP_ID
---- ----------------------------------- ----------
   8 Oscar Pedro Fernando Rodriguez               1
   9 Rodriguez, Oscar Pedro Fernando              1
  10 Oscar Fernando Rodriguez Pedro               1
   1 John Smith                                   2
   2 John L. Smith                                2
   4 Smith, John                                  2
   5 Tom Khan                                     3
  11 KHAN, TOM S.                                 3

Solutions:

#1. Using CTE (Recursive WITH) and LISTAGG

WITH x AS (
SELECT name_id, UPPER(REGEXP_REPLACE(full_name,'[[:punct:]]')) full_name
FROM name_list
), y(id, token, lvl) AS (
SELECT name_id, REGEXP_SUBSTR(full_name, '[^ ]+', 1, 1), 1 
FROM x
UNION ALL
SELECT x.name_id, REGEXP_SUBSTR(full_name, '[^ ]+', 1, y.lvl+1), y.lvl+1
FROM x JOIN y ON x.name_id=y.id AND REGEXP_SUBSTR(full_name, '[^ ]+', 1, y.lvl+1) IS NOT NULL
), z AS (
SELECT id, LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token) ordered_name, 
       COUNT(*)OVER(PARTITION BY LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token)) cnt,
       DENSE_RANK()OVER(ORDER BY LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token)) group_id
FROM y
WHERE LENGTH(token)>1
GROUP BY id
)
SELECT z.id, n.full_name, DENSE_RANK()OVER(ORDER BY group_id) group_id
FROM z JOIN name_list n ON z.id=n.name_id
WHERE z.cnt>1
ORDER BY 3, 1;

  ID FULL_NAME                                  GROUP_ID
--- ---------------------------------------- ----------
  8 Oscar Pedro Fernando Rodrigues                    1
  9 Rodrigues, Oscar Pedro Fernando                   1
 10 Oscar Fernando Rodrigues Pedro                    1
  1 John Smith                                        2
  2 John L. Smith                                     2
  4 Smith, John                                       2
  5 Tom Khan                                          3
 11 KHAN, TOM S.                                      3

Explanation:

The key idea is to split each name into multiple name tokens, then sort and merge them back into a single line. Matching (duplicate) names will have the same merged line so we could use it to identify duplicates. DENSE_RANK analytic function is used to generate sequential group id values.

The same idea is used in the solution below. The only difference is the way to split the names into tokens.

#2: Using CONNECT BY and TABLE/CAST/MULTISET functions

 WITH x AS (
SELECT name_id, UPPER(REGEXP_REPLACE(full_name,'[[:punct:]]')) full_name
FROM name_list
), y AS (
SELECT name_id AS id, y.column_value AS token
FROM x,
     TABLE(CAST(MULTISET(SELECT REGEXP_SUBSTR(x.full_name, '[^ ]+', 1, LEVEL) token
                    FROM dual
                    CONNECT BY LEVEL <= LENGTH(full_name)-LENGTH(REPLACE(full_name,' '))+1
                        )
                AS sys.odcivarchar2list)
          ) y
WHERE LENGTH(y.column_value)>1
), z AS (
SELECT id, LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token) ordered_name,
       COUNT(*)OVER(PARTITION BY LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token)) cnt,
       DENSE_RANK()OVER(ORDER BY LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token)) group_id
FROM y
WHERE LENGTH(token)>1
GROUP BY id
)
SELECT z.id, n.full_name, DENSE_RANK()OVER(ORDER BY group_id) group_id
FROM z JOIN name_list n ON z.id=n.name_id
WHERE z.cnt>1
ORDER BY 3, 1;

  ID FULL_NAME                                  GROUP_ID
---- ---------------------------------------- ----------
   8 Oscar Pedro Fernando Rodrigues                    1
   9 Rodrigues, Oscar Pedro Fernando                   1
  10 Oscar Fernando Rodrigues Pedro                    1
   1 John Smith                                        2
   2 John L. Smith                                     2
   4 Smith, John                                       2
   5 Tom Khan                                          3
  11 KHAN, TOM S.                                      3

 

Solutions to Puzzle of the Week #6

Puzzle of the Week #6

Find all employees who is paid one of top 4 salaries in the entire company without using sub-queries and in-line views/WITH clause.

Expected Result:

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850

 

Method/Workaround #1: Use Aggregation over Cartesian Product

This method works in all RDBMS systems, such as Oracle, SQL Server, Sybase, Teradata, etc.

SELECT a.ename, a.sal
FROM emp a, emp b
WHERE a.sal<=b.sal
GROUP BY a.ename, a.sal
HAVING COUNT(DISTINCT b.sal)<=4
ORDER BY 2 DESC;

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850

This method is described in details in my book Oracle SQL Tricks and Workarounds.

Method/Workaround #2: Use DENSE_RANK and MINUS

This method is more Oracle specific, but it is as good as the 1st one. It was suggested by one of the contest participants:

SELECT ename, 
       CASE WHEN DENSE_RANK()OVER(ORDER BY sal DESC)<=4 THEN sal END sal
FROM emp
MINUS
SELECT ename, NULL
FROM emp
ORDER BY 2 DESC;

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850

This is a very elegant variation of the following query with MINUS substituting the filter for analitic function result:

SELECT ename, sal
FROM (SELECT ename, sal, DENSE_RANK()OVER(ORDER BY sal DESC) rk
      FROM emp)
WHERE rk<=4;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850

Interestingly enough, Teradata SQL has a special clause QUALIFY for that matter. It is very elegant and maybe one day it will become an ANSII standard:

SELECT ename, sal
FROM emp
QUALIFY DENSE_RANK()OVER(ORDER BY sal DESC)<=4

Finally, I would like to share a couple of more traditional approaches that DO USE subqueries:

SELECT ename, sal
FROM emp a
WHERE 4>=(SELECT COUNT(DISTINCT b.sal)
          FROM emp b
          WHERE b.sal>=a.sal)
ORDER BY sal DESC;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850
SELECT ename, sal
FROM emp a
WHERE 4>(SELECT COUNT(DISTINCT b.sal)
          FROM emp b
          WHERE b.sal>a.sal)
ORDER BY sal DESC;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Interview Question: Get the 2nd highest salary in the company

Question: Get the 2nd highest salary in the company.

Question level: Intermediate

We picked just 6 workarounds for this fairly simple problem. Overall, there are at least 15 different approaches available.

Method/Workaround #1 (Level: Beginner)

SELECT MAX(SAL)
FROM (SELECT SAL
      FROM EMP
      WHERE SAL<(SELECT MAX(SAL) 
                 FROM EMP)
     )

Method/Workaround #2 (Level: Intermediate)

SELECT sal
FROM (SELECT sal, ROWNUM rn
      FROM (SELECT SAL
            FROM EMP
            GROUP BY SAL
            ORDER BY 1 DESC)
      WHERE ROWNUM<=2
      )
WHERE RN=2 

Method/Workaround #3 (Level: Intermediate)

SELECT DISTINCT sal
FROM (SELECT SAL, DENSE_RANK()OVER(ORDER BY SAL DESC) RK
      FROM EMP)
WHERE RK=2      

Method/Workaround #4 (Level: Intermediate)

SELECT SAL
FROM (SELECT DISTINCT SAL, DENSE_RANK()OVER(ORDER BY SAL DESC) RK
      FROM EMP)
WHERE RK=2     

Method/Workaround #5 (Level: Advanced)

SELECT A.SAL
FROM EMP A JOIN EMP B ON A.SAL<=b.SAL
GROUP BY a.sal
HAVING COUNT(DISTINCT b.sal)=2

Method/Workaround #6 (Level: Intermediate)

SELECT DISTINCT SAL
FROM emp a
WHERE 2=(SELECT COUNT(DISTINCT sal)
         FROM emp b
         WHERE b.sal>=a.sal) 

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.