Puzzle of the Week #5:

Find the shortest and longest last names of the employees in each department.

• If two or more employees tie for the shortest or longest name, pick the name that comes first in alphabetical order
• Use hr.employees or scott.emp tables
• Use a single SELECT statement only
• Ideally, the solution should NOT rely on any sub-queries, CTEs  (WITH clause), or inline views
• Exclude unknown (NULL) departments

Solutions:

#1. Using MIN() KEEP Group Function

```SELECT department_id,
MIN(last_name) KEEP(DENSE_RANK FIRST
ORDER BY LENGTH(last_name)) shortest,
MIN(last_name) KEEP(DENSE_RANK FIRST
ORDER BY LENGTH(last_name) DESC) longest
FROM hr.employees
WHERE department_id IS NOT NULL
GROUP BY department_id```

#2. Using FIRST_VALUE Analytic Function and DISTINCT option

(Credit to Igor Shpungin)

```SELECT DISTINCT department_id,
FIRST_VALUE(last_name) OVER(PARTITION BY department_id
ORDER BY LENGTH(last_name))      shortest,
FIRST_VALUE(last_name) OVER(PARTITION BY department_id
ORDER BY LENGTH(last_name) DESC) longest
FROM hr.employees
WHERE department_id IS NOT NULL
ORDER BY 1```

#3. Using MODEL clause

(Credit to Naoto Katayama)

```SELECT department_id, shortest, longest
FROM hr.employees
WHERE department_id IS NOT NULL
MODEL
RETURN UPDATED ROWS
PARTITION BY (department_id)
DIMENSION BY (
ROW_NUMBER()OVER(PARTITION BY department_id
ORDER BY LENGTH(last_name), last_name) rn1,
ROW_NUMBER()OVER(PARTITION BY department_id
ORDER BY LENGTH(last_name) DESC, last_name) rn2)
MEASURES(last_name,
CAST(NULL AS VARCHAR2(25)) AS shortest,
CAST(NULL AS VARCHAR2(25)) AS longest)
RULES(
shortest[0,0]=MAX(last_name)[1,ANY],
longest [0,0]=MAX(last_name)[ANY,1]
)
ORDER BY department_id```

You can execute the above SQL statements in Oracle Live SQL environment.

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

2018 Puzzle of the Week #1:

For a given text string, find the first (from the beginning) longest sub-string that does not have repeating characters.

Solutions:

Solution #1: Using CONNECT BY clause (for range generation), REGEXP_COUNT, and RANK() functions:

```WITH w AS (
SELECT 'arkansas' AS word
FROM dual
), r AS (
SELECT ROWNUM rn
FROM w
CONNECT BY LEVEL<=LENGTH(word)
), x AS (
SELECT SUBSTR(w.word, r1.rn, r2.rn - r1.rn + 1) substr,
RANK() OVER(ORDER BY r2.rn - r1.rn DESC, r1.rn) rk
FROM r r1, r r2, w
WHERE r1.rn<=r2.rn
AND REGEXP_COUNT(SUBSTR(w.word, r1.rn, r2.rn - r1.rn + 1), '(.).*\1') = 0
)
SELECT substr
FROM x
WHERE rk=1```

Result of execution in Oracle Live SQL client:

SUBSTR
rkans

Solution #2: Using CONNECT BY clause (for range generation), REGEXP_LIKE, and MAX() KEEP functions:

```WITH w AS (
SELECT 'arkansas' AS word
FROM dual
), r AS (
SELECT ROWNUM rn
FROM w
CONNECT BY LEVEL<=LENGTH(word)
)
SELECT MAX(SUBSTR(w.word, r1.rn, r2.rn - r1.rn + 1))
KEEP(DENSE_RANK FIRST ORDER BY r2.rn - r1.rn DESC, r1.rn) substr
FROM r r1, r r2, w
WHERE r1.rn<=r2.rn
AND NOT REGEXP_LIKE(SUBSTR(w.word, r1.rn, r2.rn - r1.rn + 1), '(.).*\1')```

Solution #3: Using CONNECT BY clause (twice), LATERAL view, REGEXP_COUNT, and RANK() functions:

```WITH w AS (
SELECT 'arkansas' AS word
FROM dual
), s AS (
SELECT SUBSTR(word, LEVEL) word, LEVEL rn
FROM w
CONNECT BY LEVEL<=LENGTH(word)
)
SELECT MAX(x.substr)
KEEP(DENSE_RANK FIRST ORDER BY LENGTH(x.substr) DESC, s.rn) substr
FROM s, LATERAL(SELECT SUBSTR(s.word, 1, LEVEL) substr
FROM dual
CONNECT BY LEVEL<=LENGTH(s.word)) x
WHERE REGEXP_COUNT(x.substr, '(.).*\1') = 0```

Solution #4: Using XMLTable function (for range generation), Correlated subquery with COUNT(DISTINCT), and MAX() KEEP function:

```WITH w AS (
SELECT 'arkansas' AS word
FROM dual
), r AS (
SELECT ROWNUM rn, word
FROM w, XMLTABLE('for \$i in 1 to \$N cast as xs:integer return \$i'
PASSING LENGTH(w.word) AS N) x
)
SELECT MAX(SUBSTR(r1.word, r1.rn, r2.rn - r1.rn + 1))
KEEP(DENSE_RANK FIRST ORDER BY r2.rn - r1.rn DESC, r1.rn) substr
FROM r r1, r r2
WHERE r1.rn<=r2.rn
AND r2.rn - r1.rn + 1 =
(SELECT COUNT(DISTINCT SUBSTR(SUBSTR(r1.word, r1.rn, r2.rn - r1.rn + 1),
LEVEL, 1))
FROM dual
CONNECT BY LEVEL<=r2.rn - r1.rn + 1
)```

Solution #5: Using CONNECT BY, Recursive CTE, INSTR, SUBSTR, and MAX() KEEP functions:

```WITH w AS (
SELECT 'arkansas' word
FROM dual
), s(sub, word, lvl, rn) AS (
SELECT SUBSTR(word, LEVEL, 1), SUBSTR(word, LEVEL) word, 1, ROWNUM
FROM w
CONNECT BY SUBSTR(word, LEVEL) IS NOT NULL
UNION ALL
SELECT SUBSTR(word, 1, lvl+1), word, lvl+1, ROWNUM
FROM s
WHERE LENGTH(SUBSTR(word, 1, lvl+1))=lvl+1
AND INSTR(sub, SUBSTR(word, lvl+1, 1))=0
)
SELECT MAX(sub) KEEP (DENSE_RANK FIRST ORDER BY lvl DESC, rn) substr
FROM s```

You can execute the above SQL statements in Oracle Live SQL environment.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Puzzle of the Week #17:

Write a single SELECT statement that would show the years of hire in each department. The result should have 3 columns (see below): deptno, year1, and year2. If a department only hired during 1 calendar year, this year should be shown in year1 column (see deptno 30) and year2 column should be blank. If a department hired during 2 calendar years, the first year should be should be shown in year1 column, and the 2nd year should be shown in year2 column (see deptno 10). In all other cases, show 1st year in year1 column and “More (N)” where N is the number of years that department did the hiring (see deptno 20).

Expected Result:

```DEPTNO Year 1   Year 2
------ -------- --------
10 1981     1982
20 1980     More (3)
30 1981

```

Solutions

#1: Using COUNT(DISTINCT ..)

```SELECT deptno, MIN(EXTRACT(YEAR FROM hiredate)) AS "Year 1",
CASE COUNT(DISTINCT EXTRACT(YEAR FROM hiredate))
WHEN 1 THEN ''
WHEN 2 THEN TO_CHAR(MAX(EXTRACT(YEAR FROM hiredate)))
ELSE 'More (' || COUNT(DISTINCT EXTRACT(YEAR FROM hiredate)) || ')'
END AS "Year 2"
FROM emp
GROUP BY deptno
ORDER BY 1;

DEPTNO     Year 1 Year 2
------ ---------- --------
10       1981 1982
20       1980 More (3)
30       1981```

#2: Using DENSE_RANK Analytic Function

```WITH x AS (
SELECT deptno, EXTRACT(YEAR FROM hiredate) hire_year,
DENSE_RANK()OVER(PARTITION BY deptno ORDER BY EXTRACT(YEAR FROM hiredate)) rk
FROM emp
)
SELECT deptno, MIN(hire_year) "Year 1",
CASE MAX(rk) WHEN 1 THEN ''
WHEN 2 THEN CAST(MAX(hire_year) AS CHAR(4))
ELSE 'More (' || MAX(rk) || ')'
END AS "Year 2"
FROM x
GROUP BY deptno
ORDER BY 1;

DEPTNO     Year 1 Year 2
------ ---------- --------
10       1981 1982
20       1980 More (3)
30       1981```

#3: Using PIVOT clause

```SELECT deptno, Y1 "Year 1",
CASE WHEN cnt>2 THEN 'More (' || cnt || ')'
ELSE TO_CHAR(Y2)
END "Year 2"
FROM
(
SELECT deptno, EXTRACT(YEAR FROM hiredate) yr,
CASE DENSE_RANK()OVER(PARTITION BY deptno ORDER BY EXTRACT(YEAR FROM hiredate))
WHEN 1 THEN 'Y1'
WHEN 2 THEN 'Y2'
END AS Y,
COUNT(DISTINCT EXTRACT(YEAR FROM hiredate))OVER(PARTITION BY deptno) cnt
FROM emp
)
PIVOT
(
MAX(yr)
FOR y IN ('Y1' y1,'Y2' y2)
);

DEPTNO     Year 1 Year 2
------ ---------- --------
10       1981 1982
20       1980 More (3)
30       1981```

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Solutions to Puzzle of the Week #13

Puzzle of the Week #13:

Table Setup and Puzzle description can be located here

Expected Result:

```  ID FULL_NAME                             GROUP_ID
---- ----------------------------------- ----------
8 Oscar Pedro Fernando Rodriguez               1
9 Rodriguez, Oscar Pedro Fernando              1
10 Oscar Fernando Rodriguez Pedro               1
1 John Smith                                   2
2 John L. Smith                                2
4 Smith, John                                  2
5 Tom Khan                                     3
11 KHAN, TOM S.                                 3```

Solutions:

#1. Using CTE (Recursive WITH) and LISTAGG

```WITH x AS (
SELECT name_id, UPPER(REGEXP_REPLACE(full_name,'[[:punct:]]')) full_name
FROM name_list
), y(id, token, lvl) AS (
SELECT name_id, REGEXP_SUBSTR(full_name, '[^ ]+', 1, 1), 1
FROM x
UNION ALL
SELECT x.name_id, REGEXP_SUBSTR(full_name, '[^ ]+', 1, y.lvl+1), y.lvl+1
FROM x JOIN y ON x.name_id=y.id AND REGEXP_SUBSTR(full_name, '[^ ]+', 1, y.lvl+1) IS NOT NULL
), z AS (
SELECT id, LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token) ordered_name,
COUNT(*)OVER(PARTITION BY LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token)) cnt,
DENSE_RANK()OVER(ORDER BY LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token)) group_id
FROM y
WHERE LENGTH(token)>1
GROUP BY id
)
SELECT z.id, n.full_name, DENSE_RANK()OVER(ORDER BY group_id) group_id
FROM z JOIN name_list n ON z.id=n.name_id
WHERE z.cnt>1
ORDER BY 3, 1;

ID FULL_NAME                                  GROUP_ID
--- ---------------------------------------- ----------
8 Oscar Pedro Fernando Rodrigues                    1
9 Rodrigues, Oscar Pedro Fernando                   1
10 Oscar Fernando Rodrigues Pedro                    1
1 John Smith                                        2
2 John L. Smith                                     2
4 Smith, John                                       2
5 Tom Khan                                          3
11 KHAN, TOM S.                                      3```

Explanation:

The key idea is to split each name into multiple name tokens, then sort and merge them back into a single line. Matching (duplicate) names will have the same merged line so we could use it to identify duplicates. DENSE_RANK analytic function is used to generate sequential group id values.

The same idea is used in the solution below. The only difference is the way to split the names into tokens.

#2: Using CONNECT BY and TABLE/CAST/MULTISET functions

``` WITH x AS (
SELECT name_id, UPPER(REGEXP_REPLACE(full_name,'[[:punct:]]')) full_name
FROM name_list
), y AS (
SELECT name_id AS id, y.column_value AS token
FROM x,
TABLE(CAST(MULTISET(SELECT REGEXP_SUBSTR(x.full_name, '[^ ]+', 1, LEVEL) token
FROM dual
CONNECT BY LEVEL <= LENGTH(full_name)-LENGTH(REPLACE(full_name,' '))+1
)
AS sys.odcivarchar2list)
) y
WHERE LENGTH(y.column_value)>1
), z AS (
SELECT id, LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token) ordered_name,
COUNT(*)OVER(PARTITION BY LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token)) cnt,
DENSE_RANK()OVER(ORDER BY LISTAGG(token, ' ') WITHIN GROUP(ORDER BY token)) group_id
FROM y
WHERE LENGTH(token)>1
GROUP BY id
)
SELECT z.id, n.full_name, DENSE_RANK()OVER(ORDER BY group_id) group_id
FROM z JOIN name_list n ON z.id=n.name_id
WHERE z.cnt>1
ORDER BY 3, 1;

ID FULL_NAME                                  GROUP_ID
---- ---------------------------------------- ----------
8 Oscar Pedro Fernando Rodrigues                    1
9 Rodrigues, Oscar Pedro Fernando                   1
10 Oscar Fernando Rodrigues Pedro                    1
1 John Smith                                        2
2 John L. Smith                                     2
4 Smith, John                                       2
5 Tom Khan                                          3
11 KHAN, TOM S.                                      3```

Puzzle of the Week #6

Find all employees who is paid one of top 4 salaries in the entire company without using sub-queries and in-line views/WITH clause.

Expected Result:

```ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850
```

Method/Workaround #1: Use Aggregation over Cartesian Product

This method works in all RDBMS systems, such as Oracle, SQL Server, Sybase, Teradata, etc.

```SELECT a.ename, a.sal
FROM emp a, emp b
WHERE a.sal<=b.sal
GROUP BY a.ename, a.sal
HAVING COUNT(DISTINCT b.sal)<=4
ORDER BY 2 DESC;

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850
```

This method is described in details in my book Oracle SQL Tricks and Workarounds.

Method/Workaround #2: Use DENSE_RANK and MINUS

This method is more Oracle specific, but it is as good as the 1st one. It was suggested by one of the contest participants:

```SELECT ename,
CASE WHEN DENSE_RANK()OVER(ORDER BY sal DESC)<=4 THEN sal END sal
FROM emp
MINUS
SELECT ename, NULL
FROM emp
ORDER BY 2 DESC;

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850
```

This is a very elegant variation of the following query with MINUS substituting the filter for analitic function result:

```SELECT ename, sal
FROM (SELECT ename, sal, DENSE_RANK()OVER(ORDER BY sal DESC) rk
FROM emp)
WHERE rk<=4;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850
```

Interestingly enough, Teradata SQL has a special clause QUALIFY for that matter. It is very elegant and maybe one day it will become an ANSII standard:

```SELECT ename, sal
FROM emp
QUALIFY DENSE_RANK()OVER(ORDER BY sal DESC)<=4
```

Finally, I would like to share a couple of more traditional approaches that DO USE subqueries:

```SELECT ename, sal
FROM emp a
WHERE 4>=(SELECT COUNT(DISTINCT b.sal)
FROM emp b
WHERE b.sal>=a.sal)
ORDER BY sal DESC;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850
```
```SELECT ename, sal
FROM emp a
WHERE 4>(SELECT COUNT(DISTINCT b.sal)
FROM emp b
WHERE b.sal>a.sal)
ORDER BY sal DESC;

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
JONES            2975
BLAKE            2850
```

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Interview Question: Get the 2nd highest salary in the company

Question: Get the 2nd highest salary in the company.

Question level: Intermediate

We picked just 6 workarounds for this fairly simple problem. Overall, there are at least 15 different approaches available.

Method/Workaround #1 (Level: Beginner)

```SELECT MAX(SAL)
FROM (SELECT SAL
FROM EMP
WHERE SAL<(SELECT MAX(SAL)
FROM EMP)
)
```

Method/Workaround #2 (Level: Intermediate)

```SELECT sal
FROM (SELECT sal, ROWNUM rn
FROM (SELECT SAL
FROM EMP
GROUP BY SAL
ORDER BY 1 DESC)
WHERE ROWNUM<=2
)
WHERE RN=2
```

Method/Workaround #3 (Level: Intermediate)

```SELECT DISTINCT sal
FROM (SELECT SAL, DENSE_RANK()OVER(ORDER BY SAL DESC) RK
FROM EMP)
WHERE RK=2
```

Method/Workaround #4 (Level: Intermediate)

```SELECT SAL
FROM (SELECT DISTINCT SAL, DENSE_RANK()OVER(ORDER BY SAL DESC) RK
FROM EMP)
WHERE RK=2
```

```SELECT A.SAL
FROM EMP A JOIN EMP B ON A.SAL<=b.SAL
GROUP BY a.sal
HAVING COUNT(DISTINCT b.sal)=2
```

Method/Workaround #6 (Level: Intermediate)

```SELECT DISTINCT SAL
FROM emp a
WHERE 2=(SELECT COUNT(DISTINCT sal)
FROM emp b
WHERE b.sal>=a.sal)
```

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.