Explore the power of analytic functions – Part 2

In the previous blog post on that subject we reviewed a couple a notable applications of Oracle analytic functions. Today, I came across another interesting illustration of the same concept.

Problem: List all employees from the same department and holding the same job title as ADAMS.

Expected Result:

ENAME JOB DEPTNO
ADAMS CLERK 20
SMITH CLERK 20

Like before, we start with traditional approaches that every experienced developer would easily demonstrate.

Strategy #1:  Using multi-column subquery

SELECT ename, job, deptno
FROM scott.emp
WHERE (deptno, job) IN (SELECT deptno, job
                        FROM scott.emp
                        WHERE ename = 'ADAMS')
ORDER BY 2, 3, 1

Strategy #2:  Using self-join

SELECT a.ename, job, deptno
FROM scott.emp a JOIN scott.emp b USING(deptno, job)
WHERE b.ename = 'ADAMS'
ORDER BY job, deptno, a.ename

Strategy #3:  Using EXISTS predicate

SELECT ename, job, deptno
FROM scott.emp a
WHERE EXISTS (SELECT 1
              FROM scott.emp
              WHERE ename  = 'ADAMS'
                AND deptno = a.deptno
                AND job    = a.job)
ORDER BY 2, 3, 1

A common feature of all the strategies above is having two copies of the emp table with two joining conditions (deptno, job) and one filter (ename = ‘ADAMS’)

As we have seen before, with analytic functions, we can get away with a single copy of th emp table.

Strategy #4:  Using COUNT Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       COUNT(DECODE(ename, 'ADAMS', 1)) 
             OVER(PARTITION BY deptno, job) cnt
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE cnt > 0
ORDER BY 2, 3, 1

Of course, you can use different analytic functions here:

Strategy #5:  Using MAX Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       MAX(DECODE(ename, 'ADAMS', ename)) 
           OVER(PARTITION BY deptno, job) adams
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE adams = 'ADAMS'
ORDER BY 2, 3, 1

Strategy #6:  Using LISTAGG Analytic function

WITH x AS (
SELECT ename, job, deptno, 
       LISTAGG(DECODE(ename, 'ADAMS', 'Y'), '|') WITHIN GROUP (ORDER BY 1) 
               OVER(PARTITION BY deptno, job) flag
FROM scott.emp
)
SELECT ename, job, deptno
FROM x
WHERE flag LIKE 'Y%'
ORDER BY 2, 3, 1

We need to use LIKE operator in case we have more than a single Adams working in the same department and holding the same job title.

***

If you find this post useful, please press the LIKE button and subscribe.

My Oracle Group on Facebook:

Also, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more clever tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

A trick that helps avoiding multiple table scans.

Let’s look at a fairly simple SQL problem:

In a traditional scott.emp table, find all employees who work in the same department as the president.

  • Make your query work even if there are more than 1 president records exist in emp table
  • Make Oracle scan emp table just ONCE

A “traditional” solution to this problem may look like this:

Strategy #1: Using a subquery

SELECT *
FROM scott.emp
WHERE deptno IN (SELECT deptno 
                 FROM scott.emp 
                 WHERE job='PRESIDENT')

or this:

Strategy #2: Using a self-join

SELECT DISTINCT a.*
FROM scott.emp a JOIN scott.emp b ON a.deptno=b.deptno
WHERE b.job='PRESIDENT'

Note, that DISTINCT option in the above query is needed to prevent duplicates if there were multiple presidents in а specific department.

Both solutions above use 2 copies of the emp table which makes oracle scan the same scott.emp table twice.

A trick presented below allows you to use only a single copy of the emp table to solve the problem. The trick involves different conceptual and technical approaches compared to the solutions we have seen so far.

Conceptually, we should rephrase the problem in a way that would keep it identical and at the same time allows us to use different technical arsenal. This approach is explained in a detailed manner in my book “Oracle SQL Tricks and Workarounds”. We can rephrase the puzzle and say that we are looking for employees from departments with some “positive” number of presidents working there. Technically speaking, we need to use analytic function COUNT and check if it is greater than 0:

Strategy #3: Using analytic function COUNT

WITH x AS (
SELECT e.*, 
       COUNT(DECODE(job,'PRESIDENT',1))OVER(PARTITION BY deptno) cnt
FROM scott.emp e
)
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM x
WHERE cnt>0
ORDER BY empno

We do need to use a common table expression as we cannot filter out by analytic function in the same query where the function is used. Nevertheless, we scan the emp table just once, and during this scan, Oracle engine counts the number of presidents in each department.

COUNT is not the only analytic function that can be employed to solve the problem.

Strategy #4: Using analytic function LISTAGG

WITH x AS (
SELECT e.*, 
       LISTAGG(job,'|') 
         WITHIN GROUP (ORDER BY job) OVER(PARTITION BY deptno) jobs
FROM scott.emp e 
)
SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM x
WHERE '|' || jobs || '|' LIKE '%|PRESIDENT|%'
ORDER BY empno

Instead of counting the presidents by department, we simply concatenate all the job titles and check if the resulting string includes a president.

Finally, if you don’t like using sub-queries in general, we can leverage the power of MODEL clause:

Strategy #5: Using MODEL clause to avoid sub-queries

SELECT empno, ename, job, mgr, hiredate, sal, comm, deptno
FROM scott.emp
MODEL RETURN UPDATED ROWS 
DIMENSION BY (
  empno, 
  SIGN(COUNT(DECODE(job,'PRESIDENT',1))OVER(PARTITION BY deptno)) cnt
) 
MEASURES(ename, job, mgr, hiredate, sal, comm, deptno, 0 dummy) 
RULES(dummy[ANY, 1]=1)
ORDER BY empno

The tricky part here is using a composition of SIGN, COUNT, and DECODE functions (i.e. SIGN on top of what we used in Strategy #3) as a secondary dimension and empno as primary. Employee number is unique by itself, so adding another dimension will still maintain uniqueness required by MODEL clause. The only MODEL RULE changes the dummy measure which “triggers” the “RETURN UPDATED ROWS” instruction and returns only those rows where the dummy dimension was set to 1 – notice that its default value is 0.

You can check the execution plan for all of the above strategies to see how many times Oracle scans the emp table.

***

If you find this post useful, please press the LIKE button and subscribe.

My Oracle Group on Facebook:

Also, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more clever tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

 

 

Interview Question: Count all rows in a table without using COUNT

This is a fairly simple interview question:

Count all rows in a table without using COUNT and any Analytic functions.

A bit more advanced version of the puzzle could be to count rows without using ANY functions at all, neither aggregate nor analytic.

The following video will give you answers to this and some other questions:

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.