Explore the power of analytic functions – Part 2

In the previous blog post on that subject we reviewed a couple a notable applications of Oracle analytic functions. Today, I came across another interesting illustration of the same concept.

Problem: List all employees from the same department and holding the same job title as ADAMS.

Expected Result:

ENAME JOB DEPTNO
ADAMS CLERK 20
SMITH CLERK 20

Like before, we start with traditional approaches that every experienced developer would easily demonstrate.

Strategy #1:  Using multi-column subquery

SELECT ename, job, deptno
FROM scott.emp
WHERE (deptno, job) IN (SELECT deptno, job
                        FROM scott.emp
                        WHERE ename = 'ADAMS')
ORDER BY 2, 3, 1

Strategy #2:  Using self-join

SELECT a.ename, job, deptno
FROM scott.emp a JOIN scott.emp b USING(deptno, job)
WHERE b.ename = 'ADAMS'
ORDER BY job, deptno, a.ename

Strategy #3:  Using EXISTS predicate

SELECT ename, job, deptno
FROM scott.emp a
WHERE EXISTS (SELECT 1
              FROM scott.emp
              WHERE ename  = 'ADAMS'
                AND deptno = a.deptno
                AND job    = a.job)
ORDER BY 2, 3, 1

A common feature of all the strategies above is having two copies of the emp table with two joining conditions (deptno, job) and one filter (ename = ‘ADAMS’)

As we have seen before, with analytic functions, we can get away with a single copy of th emp table.

Strategy #4:  Using COUNT Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       COUNT(DECODE(ename, 'ADAMS', 1)) 
             OVER(PARTITION BY deptno, job) cnt
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE cnt > 0
ORDER BY 2, 3, 1

Of course, you can use different analytic functions here:

Strategy #5:  Using MAX Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       MAX(DECODE(ename, 'ADAMS', ename)) 
           OVER(PARTITION BY deptno, job) adams
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE adams = 'ADAMS'
ORDER BY 2, 3, 1

Strategy #6:  Using LISTAGG Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       LISTAGG(DECODE(ename, 'ADAMS', 'Y'), '|') WITHIN GROUP (ORDER BY 1) 
               OVER(PARTITION BY deptno, job) flag
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE flag LIKE 'Y%'
ORDER BY 2, 3, 1

We need to use LIKE operator in case we have more than a single Adams working in the same department and holding the same job title.

***

If you find this post useful, please press the LIKE button and subscribe.

My Oracle Group on Facebook:

Also, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more clever tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

Interview Question: For each department count the number of employees who get no commission.

Interview Question:

Write a single SELECT statement that returns  the number of employees who get no commission broken down by department. (Use scott.emp table)

Level:

Intermediate

Expected Result:

DEPTNO NO_COMM_COUNT
10 3
20 5
30 3

 Solutions

A very typical attempt to solve this problem results in the following query:

SELECT deptno, COUNT(*) no_comm_count
FROM scott.emp
WHERE comm IS NULL OR comm=0
GROUP BY deptno
ORDER BY 1

Yes, the result looks correct, but is the query correct?

The answer is NO! It would become apparent if we had a department where all employees get paid commission, so the number of those who does not would be 0.

Let’s change the requirement a bit – we will show all department and number of employees hired on Friday:

SELECT deptno, COUNT(*) fri_count
FROM scott.emp
WHERE TO_CHAR(hiredate, 'DY')='FRI'
GROUP BY deptno 
ORDER BY 1

The result of this query is clearly not what we want:

DEPTNO FRI_COUNT
30 2

We would expect the following instead:

DEPTNO FRI_COUNT
10 0
20 0
30 2

Why don’t we get the departments 10 and 20? The answer is very simple – because we filter “all” those department rows with our WHERE clause. So how should we work around?

Let’s start with more intuitive but less efficient approaches – we will use the same query as before and UNION it with another query that returns “empty” departments. Essentially, the original problem transforms into a new one – find all department where no employees were hired on Friday.

Strategy #1: Using UNION ALL with multi-column non-correlated subquery:

SELECT deptno, COUNT(*) fri_count 
FROM scott.emp 
WHERE TO_CHAR(hiredate, 'DY')='FRI' 
GROUP BY deptno
UNION ALL
SELECT deptno, 0 fri_count 
FROM scott.emp
WHERE (deptno, 'FRI') NOT IN (SELECT deptno, TO_CHAR(hiredate, 'DY')
                              FROM scott.emp)
GROUP BY deptno
ORDER BY 1
DEPTNO FRI_COUNT
10 0
20 0
30 2

Strategy #2: Using UNION ALL with ALL predicate on correlated subquery:

SELECT deptno, COUNT(*) fri_count
FROM scott.emp
WHERE TO_CHAR(hiredate, 'DY')='FRI'
GROUP BY deptno
UNION ALL 
SELECT deptno, 0 no_comm_count 
FROM scott.emp a
WHERE 'FRI'!=ALL(SELECT TO_CHAR(hiredate, 'DY')
                 FROM scott.emp b
                 WHERE a.deptno=b.deptno) 
GROUP BY deptno
ORDER BY 1

It is apparent that the ALL predicate ensures that no employees were hired on Friday.

Now we will mimic the behavior of the UNION ALL operator using LEFT JOIN:

Strategy #3: Using LEFT JOIN:

SELECT a.deptno, COUNT(DISTINCT b.empno) fri_count
FROM scott.emp a LEFT JOIN scott.emp b ON a.deptno=b.deptno
                                      AND TO_CHAR(b.hiredate, 'DY')='FRI'
GROUP BY a.deptno
ORDER BY 1

COUNT(DISTINCT …) is needed to handle a Cartesian Product as the join by deptno column produces many to many  relationship, i.e. Cartesian product.

Strategy #4: Generic substitution technique for an outer-join using UNION ALL

WITH e AS (
SELECT deptno, COUNT(*) fri_count
FROM scott.emp
WHERE TO_CHAR(hiredate, 'DY') = 'FRI'
GROUP BY deptno
UNION ALL
SELECT deptno, 0
FROM scott.emp
GROUP BY deptno
)
SELECT deptno, MAX(fri_count) fri_count
FROM e
GROUP BY deptno
ORDER BY 1

All the above techniques may look cool but they are clearly an overkill for such a simple problem. There is a simple rule worth remembering:

If you need to conditionally aggregate all records in the table but you fail doing so due to a WHERE clause filter, consider moving the filter into the GROUP function you use in SELECT.

Strategy #5: Conditional Aggregation

SELECT deptno, COUNT(DECODE(TO_CHAR(hiredate, 'DY'), 'FRI', 1)) fri_count
FROM scott.emp
GROUP BY deptno
ORDER BY 1

Alternatively, you can use CASE function inside of COUNT. It is especially convenient for our original question/problem, i.e. to count employees who is not paid a commission:

SELECT deptno, COUNT(CASE WHEN LNNVL(comm>0) THEN 1 END) no_comm_count
FROM scott.emp
GROUP BY deptno
ORDER BY 1
DEPTNO NO_COMM_COUNT
10 3
20 5
30 3

This approach is the most efficient as it makes Oracle scanning the emp table only once.

Notice the use of the LNNVL function. You can read more about it in my recent post here.

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

9 Solutions to 2018 Oracle SQL Puzzle of the Week #7

Namesake Puzzle

Show groups of employees having the same last name.

  • Use a single SELECT statement only
  • Use hr.employees table

Solutions:

Solution #1: Using Subquery with HAVING clause:

SELECT first_name, last_name, department_id, employee_id 
FROM hr.employees 
WHERE last_name IN (SELECT last_name 
                    FROM hr.employees 
		    GROUP BY last_name 
		    HAVING COUNT(*)>1) 
ORDER BY 2,1

Solution #2: Using Multi-Column Subquery with NO HAVING clause

SELECT first_name, last_name, department_id, employee_id 
FROM hr.employees 
WHERE (last_name, 1) IN (SELECT last_name, SIGN(COUNT(*)-1) 
                         FROM hr.employees 
			 GROUP BY last_name) 
ORDER BY 2,1

Solution #3: Using Subquery with IN operator

SELECT first_name, last_name, department_id, employee_id 
FROM hr.employees a 
WHERE last_name IN (SELECT b.last_name 
                    FROM hr.employees b 
		    WHERE a.employee_id!=b.employee_id) 
ORDER BY 2,1

Solution #4: Using Self-Join with duplicate elimination in GROUP BY

SELECT a.first_name, a.last_name, a.department_id, a.employee_id 
FROM hr.employees a JOIN hr.employees b ON a.last_name=b.last_name 
                                       AND a.employee_id!=b.employee_id 
GROUP BY a.first_name, a.last_name, a.department_id, a.employee_id 
ORDER BY 2,1

Solution #5: Using a filter by COUNT analytic function with PARTITION BY

WITH x AS ( 
SELECT first_name, last_name, department_id, employee_id,  
       COUNT(*) OVER(PARTITION BY last_name) cnt 
FROM hr.employees 
)	 
SELECT first_name, last_name, department_id, employee_id 
FROM x 
WHERE cnt>1 
ORDER BY 2,1

Solution #6: Mimicking COUNT analytic function with MODEL clause

(credit to Naoto Katayama)

WITH x AS ( 
SELECT first_name, last_name, department_id, employee_id, cnt 
FROM hr.employees 
MODEL 
RETURN UPDATED ROWS 
DIMENSION BY (last_name, employee_id) 
MEASURES(first_name, department_id, 0 AS cnt) 
RULES (cnt[ANY, ANY]=COUNT(*)[CV(), ANY]) 
) 
SELECT first_name, last_name, department_id, employee_id 
FROM x 
WHERE cnt>1 
ORDER BY 2,1

Solution #7: Filtering by LEAD and LAG analytic functions

WITH x AS ( 
SELECT first_name, last_name, department_id, employee_id,  
       LAG (last_name,1) OVER(ORDER BY last_name) lag_name, 
       LEAD(last_name,1) OVER(ORDER BY last_name) lead_name 
FROM hr.employees 
)	 
SELECT first_name, last_name, department_id, employee_id 
FROM x 
WHERE last_name IN (lag_name, lead_name) 
ORDER BY 2,1

Solution #8: Using MODEL clause with dummy measure for SIGN over analytic function expression

SELECT first_name, last_name, department_id, employee_id
FROM hr.employees 
MODEL 
RETURN UPDATED ROWS 
PARTITION BY (last_name) 
DIMENSION BY (SIGN(COUNT(*) OVER(PARTITION BY last_name)-1) AS n, 
              employee_id) 
MEASURES(first_name, department_id, 0 AS dummy) 
RULES (dummy[1, ANY]=1) 
ORDER BY 2,1

Solution #9: Using UNPIVOT with DISTINCT option over CONNECT BY with PRIOR

WITH x AS (
SELECT first_name curr_first, last_name, department_id curr_dept, employee_id curr_id, 
       PRIOR first_name prior_first, PRIOR department_id prior_dept, PRIOR employee_id prior_id
FROM hr.employees
WHERE level=2
CONNECT BY last_name=PRIOR last_name AND employee_id>PRIOR employee_id
)
SELECT DISTINCT first_name, last_name, department_id, employee_id
FROM x
UNPIVOT( 
    (first_name, department_id, employee_id)  for dummy IN ((curr_first, curr_dept, curr_id),
                                                            (prior_first,prior_dept,prior_id))
)
ORDER BY 2,1;

You can execute the above SQL statements in Oracle Live SQL environment.
My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Interview Question: Get top and bottom paid employees in each department

This is a typical interview problem: list all bottom and top paid employees in each department. A preferred solution should not be using UNION or UNION ALL operators.

Please watch this short video to learn a couple of non-obvious techniques and to impress your potential employers on your next job interview.

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

SQL puzzle: Find unique specialists in every department

Puzzle of the day.
This is a fairly simple problem but from time to time I am being approached by developers who need help with very similar problems.

Find all employees who has a unique job title in their respective department.

Solution #1: Using NOT EXISTS

SELECT ename, deptno, job, sal
FROM emp a
WHERE NOT EXISTS(SELECT 1
                 FROM emp b
                 WHERE a.deptno=b.deptno
                   AND a.job=b.job
                   AND a.empno!=b.empno)
ORDER BY deptno, job

Result:

ENAME          DEPTNO JOB              SAL
---------- ---------- --------- ----------
MILLER             10 CLERK           1300
CLARK              10 MANAGER         2450
KING               10 PRESIDENT       5000
JONES              20 MANAGER         2975
JAMES              30 CLERK            950
BLAKE              30 MANAGER         2850

Solution #1.1 – Generic substitution for NOT EXISTS

SELECT ename, deptno, job, sal
FROM emp a
WHERE 0=(SELECT COUNT(b.empno)
         FROM emp b
         WHERE a.deptno=b.deptno
           AND a.job=b.job
           AND a.empno!=b.empno)
ORDER BY deptno, job

Solution #2: Using NOT IN

SELECT ename, deptno, job, sal
FROM emp a
WHERE job NOT IN(SELECT job
                 FROM emp b
                 WHERE a.deptno=b.deptno
                   AND a.empno!=b.empno)
ORDER BY deptno, job

Result:

ENAME          DEPTNO JOB              SAL
---------- ---------- --------- ----------
MILLER             10 CLERK           1300
CLARK              10 MANAGER         2450
KING               10 PRESIDENT       5000
JONES              20 MANAGER         2975
JAMES              30 CLERK            950
BLAKE              30 MANAGER         2850

Solution #2.1: Using NOT IN with Multi-column subquery – it is pretty much the same as Solution #2.

SELECT ename, deptno, job, sal
FROM emp a
WHERE (job, deptno) NOT IN(SELECT job, deptno
                           FROM emp b
                           WHERE a.empno!=b.empno)
ORDER BY deptno, job

Solution #3.1: Using COUNT in subquery (very similar to Solution #1.1 but has different execution plan)

SELECT ename, deptno, job, sal
FROM emp a
WHERE 1=(SELECT COUNT(b.empno)
         FROM emp b
         WHERE a.deptno=b.deptno
           AND a.job=b.job)
ORDER BY deptno, job

Result:

ENAME          DEPTNO JOB              SAL
---------- ---------- --------- ----------
MILLER             10 CLERK           1300
CLARK              10 MANAGER         2450
KING               10 PRESIDENT       5000
JONES              20 MANAGER         2975
JAMES              30 CLERK            950
BLAKE              30 MANAGER         2850

Solution #3.2: A mixed version of Solutions #1.1 and #3.1:

SELECT ename, deptno, job, sal
FROM emp a
WHERE 0=(SELECT SUM(CASE WHEN a.empno=b.empno THEN 0 ELSE 1 END)
         FROM emp b
         WHERE a.deptno=b.deptno
           AND a.job=b.job)
ORDER BY deptno, job

Solution #4: Using Analytical function COUNT

WITH x AS (
SELECT ename, deptno, job, sal, COUNT(*) OVER(PARTITION BY deptno, job) cnt
FROM emp a
)
SELECT ename, deptno, job, sal
FROM x
WHERE cnt=1
ORDER BY deptno, job

Result:

ENAME          DEPTNO JOB              SAL
---------- ---------- --------- ----------
MILLER             10 CLERK           1300
CLARK              10 MANAGER         2450
KING               10 PRESIDENT       5000
JONES              20 MANAGER         2975
JAMES              30 CLERK            950
BLAKE              30 MANAGER         2850

Solution #4.1 – using MIN/MAX analytical functions – essentially, it is the same as solution #4

WITH x AS (
SELECT ename, deptno, job, sal, 
       MAX(empno) OVER(PARTITION BY deptno, job) max_no,
       MIN(empno) OVER(PARTITION BY deptno, job) min_no
FROM emp a
)
SELECT ename, deptno, job, sal
FROM x
WHERE max_no=min_no
ORDER BY deptno, job

Solution #5: Using In-Line view (WITH)

WITH x AS (
SELECT deptno, job
FROM emp
GROUP BY deptno, job
HAVING COUNT(*)=1
)
SELECT ename, deptno, job, sal
FROM emp JOIN x USING (deptno, job)
ORDER BY deptno, job

Result:

ENAME          DEPTNO JOB              SAL
---------- ---------- --------- ----------
MILLER             10 CLERK           1300
CLARK              10 MANAGER         2450
KING               10 PRESIDENT       5000
JONES              20 MANAGER         2975
JAMES              30 CLERK            950
BLAKE              30 MANAGER         2850

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

SQL Puzzle: Find top management in each department – with multiple workaround solutions, by Zahar Hilkevich

I was recently asked by a colleague at work to help him solving a problem that can be interpreted in scott’s emp table terms as the following:

In each department, list all managers and a president, but in the department where the president works, all managers (if any) should be filtered out.

If we look at all managers/president(s), we will see the following result:

ENAME      JOB           DEPTNO
---------- --------- ----------
JONES      MANAGER           20
BLAKE      MANAGER           30
CLARK      MANAGER           10
KING       PRESIDENT         10

Our desired result should exclude CLARK.

Solution/Workaround #1:

SELECT ename, job, deptno
FROM emp a
WHERE job='PRESIDENT'
  OR (job='MANAGER' AND NOT EXISTS(SELECT 1
                                   FROM emp b
                                   WHERE b.deptno=a.deptno
                                     AND job='PRESIDENT')
     )
ORDER BY deptno

Result:
ENAME      JOB           DEPTNO
---------- --------- ----------
KING       PRESIDENT         10
JONES      MANAGER           20
BLAKE      MANAGER           30

This is the most straight-forward solution and it barely requires any explanation.

Workaround #2:

WITH x AS (
SELECT ename, job, deptno, 
       RANK()OVER(PARTITION BY deptno 
                  ORDER BY DECODE(job,'PRESIDENT',1,'MANAGER',2)) rk
FROM emp a
WHERE job IN ('PRESIDENT', 'MANAGER')
)
SELECT ename, job, deptno
FROM x
WHERE rk=1
ORDER BY deptno

Result:
ENAME      JOB           DEPTNO
---------- --------- ----------
KING       PRESIDENT         10
JONES      MANAGER           20
BLAKE      MANAGER           30

Analytical functions make the solution very simple. Here, we use custom sorting (in the ORDER BY) with the RANK function.

Workaround #3:

WITH x AS (
SELECT ename, job, deptno,
       COUNT(DISTINCT job)OVER(PARTITION BY deptno) cnt
FROM emp a
WHERE job IN ('MANAGER', 'PRESIDENT')
)
SELECT ename, job, deptno
FROM x
WHERE job='PRESIDENT' OR cnt=1
ORDER BY deptno

Result:
ENAME      JOB           DEPTNO
---------- --------- ----------
KING       PRESIDENT         10
JONES      MANAGER           20
BLAKE      MANAGER           30

This solution shows a use of COUNT(DISTINCT …)OVER() analytical function.

Workaround #4:

SELECT ename, job, deptno 
FROM emp 
WHERE (deptno, DECODE(job,'PRESIDENT',1,'MANAGER',2)) IN 
  (SELECT deptno, MIN(DECODE(job,'PRESIDENT',1,'MANAGER',2))
   FROM emp
   GROUP BY deptno)
ORDER BY deptno

Result:
ENAME      JOB           DEPTNO
---------- --------- ----------
KING       PRESIDENT         10
JONES      MANAGER           20
BLAKE      MANAGER           30

Another example of the use of custom order hidden in the multi-column subquery.

There are at least 3-5 other workarounds available for this puzzle.

You will have no problems uncovering them after reading my book “Oracle SQL Tricks and Workarounds”.