Interview Question: For each department count the number of employees who get no commission.

Interview Question:

Write a single SELECT statement that returns  the number of employees who get no commission broken down by department. (Use scott.emp table)

Level:

Intermediate

Expected Result:

DEPTNO NO_COMM_COUNT
10 3
20 5
30 3

 Solutions

A very typical attempt to solve this problem results in the following query:

SELECT deptno, COUNT(*) no_comm_count
FROM scott.emp
WHERE comm IS NULL OR comm=0
GROUP BY deptno
ORDER BY 1

Yes, the result looks correct, but is the query correct?

The answer is NO! It would become apparent if we had a department where all employees get paid commission, so the number of those who does not would be 0.

Let’s change the requirement a bit – we will show all department and number of employees hired on Friday:

SELECT deptno, COUNT(*) fri_count
FROM scott.emp
WHERE TO_CHAR(hiredate, 'DY')='FRI'
GROUP BY deptno 
ORDER BY 1

The result of this query is clearly not what we want:

DEPTNO FRI_COUNT
30 2

We would expect the following instead:

DEPTNO FRI_COUNT
10 0
20 0
30 2

Why don’t we get the departments 10 and 20? The answer is very simple – because we filter “all” those department rows with our WHERE clause. So how should we work around?

Let’s start with more intuitive but less efficient approaches – we will use the same query as before and UNION it with another query that returns “empty” departments. Essentially, the original problem transforms into a new one – find all department where no employees were hired on Friday.

Strategy #1: Using UNION ALL with multi-column non-correlated subquery:

SELECT deptno, COUNT(*) fri_count 
FROM scott.emp 
WHERE TO_CHAR(hiredate, 'DY')='FRI' 
GROUP BY deptno
UNION ALL
SELECT deptno, 0 fri_count 
FROM scott.emp
WHERE (deptno, 'FRI') NOT IN (SELECT deptno, TO_CHAR(hiredate, 'DY')
                              FROM scott.emp)
GROUP BY deptno
ORDER BY 1
DEPTNO FRI_COUNT
10 0
20 0
30 2

Strategy #2: Using UNION ALL with ALL predicate on correlated subquery:

SELECT deptno, COUNT(*) fri_count
FROM scott.emp
WHERE TO_CHAR(hiredate, 'DY')='FRI'
GROUP BY deptno
UNION ALL 
SELECT deptno, 0 no_comm_count 
FROM scott.emp a
WHERE 'FRI'!=ALL(SELECT TO_CHAR(hiredate, 'DY')
                 FROM scott.emp b
                 WHERE a.deptno=b.deptno) 
GROUP BY deptno
ORDER BY 1

It is apparent that the ALL predicate ensures that no employees were hired on Friday.

Now we will mimic the behavior of the UNION ALL operator using LEFT JOIN:

Strategy #3: Using LEFT JOIN:

SELECT a.deptno, COUNT(DISTINCT b.empno) fri_count
FROM scott.emp a LEFT JOIN scott.emp b ON a.deptno=b.deptno
                                      AND TO_CHAR(b.hiredate, 'DY')='FRI'
GROUP BY a.deptno
ORDER BY 1

COUNT(DISTINCT …) is needed to handle a Cartesian Product as the join by deptno column produces many to many  relationship, i.e. Cartesian product.

Strategy #4: Generic substitution technique for an outer-join using UNION ALL

WITH e AS (
SELECT deptno, COUNT(*) fri_count
FROM scott.emp
WHERE TO_CHAR(hiredate, 'DY') = 'FRI'
GROUP BY deptno
UNION ALL
SELECT deptno, 0
FROM scott.emp
GROUP BY deptno
)
SELECT deptno, MAX(fri_count) fri_count
FROM e
GROUP BY deptno
ORDER BY 1

All the above techniques may look cool but they are clearly an overkill for such a simple problem. There is a simple rule worth remembering:

If you need to conditionally aggregate all records in the table but you fail doing so due to a WHERE clause filter, consider moving the filter into the GROUP function you use in SELECT.

Strategy #5: Conditional Aggregation

SELECT deptno, COUNT(DECODE(TO_CHAR(hiredate, 'DY'), 'FRI', 1)) fri_count
FROM scott.emp
GROUP BY deptno
ORDER BY 1

Alternatively, you can use CASE function inside of COUNT. It is especially convenient for our original question/problem, i.e. to count employees who is not paid a commission:

SELECT deptno, COUNT(CASE WHEN LNNVL(comm>0) THEN 1 END) no_comm_count
FROM scott.emp
GROUP BY deptno
ORDER BY 1
DEPTNO NO_COMM_COUNT
10 3
20 5
30 3

This approach is the most efficient as it makes Oracle scanning the emp table only once.

Notice the use of the LNNVL function. You can read more about it in my recent post here.

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Suggested Reading:

Would you like to read about many more tricks and puzzles? For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds”.

Interview Question: Count all rows in a table without using COUNT

This is a fairly simple interview question:

Count all rows in a table without using COUNT and any Analytic functions.

A bit more advanced version of the puzzle could be to count rows without using ANY functions at all, neither aggregate nor analytic.

The following video will give you answers to this and some other questions:

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

Interview Question: Get top and bottom paid employees in each department

This is a typical interview problem: list all bottom and top paid employees in each department. A preferred solution should not be using UNION or UNION ALL operators.

Please watch this short video to learn a couple of non-obvious techniques and to impress your potential employers on your next job interview.

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

2018 Oracle SQL Puzzle of the Week #1

This puzzle is sometimes offered at non-SQL technical interviews with companies like Google or Amazon:

For a given text string, find the first (from the beginning) longest sub-string that does not have repeating characters.

For example, the longest sub-string with no repeating characters in the word anaconda is acond; and in the word: stereotype – it is reotyp.

We move the bar higher and will solve this quite challenging problem in Oracle SQL:

  • Remember to use only a single SELECT statement.
  • The character case should be ignored (use or convert the value to a lower case)
  • You have about 1 week to solve the puzzle and submit your solution(s) but whoever does it sooner will earn more points.
  • The scoring rules can be found here.

If this puzzle is a bit tricky for you, start with a simpler one:

For a given text string, list all possible sub-strings.

If you are ready to give up, please don’t – try to come up with a PL/SQL solution (at very least). It could be a function that takes a VARCHAR2 argument and returns a VARCHAR2 result value.

A correct answer (and workarounds!) will be published here in a week.

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Interview Question: Count number of every week day in a year

Interview Question: With a single SELECT statement get the number of each week day in the current year.

Level: Intermediate

Expected Result:

Day                                  Days in Year
------------------------------------ ------------
SUNDAY                                         52
MONDAY                                         52
TUESDAY                                        52
WEDNESDAY                                      52
THURSDAY                                       52
FRIDAY                                         53
SATURDAY                                       53

Solution #1:

WITH x AS (
SELECT LEVEL-1+TRUNC(SYSDATE, 'YYYY') AS dd
FROM dual
CONNECT BY TRUNC(LEVEL-1+TRUNC(SYSDATE, 'YYYY'),'YYYY')=TRUNC(SYSDATE, 'YYYY')
)
SELECT TO_CHAR(dd, 'DAY') "Day", COUNT(*) "Days in Year"
FROM x
GROUP BY TO_CHAR(dd, 'DAY'), TO_CHAR(dd, 'D')
ORDER BY TO_CHAR(dd, 'D');

Explanation:

The WITH clause returns all days in the current year, this is a common trick used in majority of sql puzzle related to a calendar. The connect by query used in the WITH generated a date range which starts on TRUNC(SYSDATE, ‘YYYY’) – i.e. the 1st day of the year – and continues as long as the next day falls into the same year (see condition in the CONNECT BY clause). The main query groups by day name – TO_CHAR(dd, ‘DAY’) – and sorts by day number (in a week) – TO_CHAR(dd, ‘D’).

Solution #2:

WITH x AS (
SELECT TO_CHAR(ADD_MONTHS(TRUNC(SYSDATE,'YYYY'), 12)-1, 'DDD') days_in_year
FROM dual
)
SELECT TO_CHAR(LEVEL-1+TRUNC(SYSDATE, 'YYYY'),'DAY') "Day",
       CASE WHEN MOD(days_in_year,52)>=LEVEL THEN 53
            ELSE 52
       END "Days in Year"
FROM x
CONNECT BY LEVEL<=7
ORDER BY TO_CHAR(LEVEL-1+TRUNC(SYSDATE, 'YYYY'),'D');

Explanation:

The idea behind this solution is totally different than in the 1st one. A year has 52 weeks and 1 or 2 days depending on whether it is a  leap year or not. So each day of the week happens 52 times a year + first one or two days of the year make corresponding week days have 53 days in that same year. If we know the number of days in a year (365 or 366) we can find out which days of the week will happen 53 times. For that matter we can take MOD(days_in_year, 52) expression that will return either 1 or 2. If the day order number within a year is 1 (or 2 for the leap year) we know that the corresponding week day will occur 53 times, otherwise – 52.

The WITH clause returns number of days in the current year. We get that by taking the 1st day of the current year: TRUNC(SYSDATE,’YYYY’), adding 12 months to it and subtract 1 day to get the last day of the current year. Taking TO_CHAR(…, ‘DDD’) – gives us the order number of that day in the year which is exactly the number of days in the current year.

The main query generates the date range from Jan-1 to Jan-7 in the current year, and assigns 52 or 53 to the 2nd column based on the logic described above.

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

 

 

Two ways to build a salary range report without using CASE function

Interview Question: Produce a salary range report with a single SELECT statement. Decode function is allowed, CASE function – is not.

Level: Intermediate

Expected Result:

RANGE                             Employees
-------------------------------- ----------
0-999                                     2
1000-2999                                 9
3000-5999                                 3

Strategy #1:

SELECT COALESCE(DECODE(LEAST(sal, 999), sal, '0-999'),
                DECODE(LEAST(sal, 2999), GREATEST(sal, 1000), '1000-2999'),
                DECODE(LEAST(sal, 9999), GREATEST(sal, 3000), '3000-5999')
                ) AS range,
       COUNT(*) "Employees"
FROM emp
GROUP BY COALESCE(DECODE(LEAST(sal, 999), sal, '0-999'),
                  DECODE(LEAST(sal, 2999), GREATEST(sal, 1000), '1000-2999'),
                  DECODE(LEAST(sal, 9999), GREATEST(sal, 3000), '3000-5999')
                  )
ORDER BY 1

Explanation:

In Oracle SQL terms, a mathematical condition

a <=x <=b

can be  interpreted as

x BETWEEN a AND b

however, this condition is good only for CASE function, and not for DECODE. The trick is to use another interpretation:

LEAST(b,x)=GREATEST(x,a)

– that can be used in DECODE.

CASE-based Solution:

SELECT CASE WHEN sal<=999 THEN '0-999'
            WHEN sal BETWEEN 1000 AND 2999 THEN '1000-2999'
            WHEN sal BETWEEN 3000 AND 5999 THEN '3000-5999'
       END AS range,
       COUNT(*) "Employees"
FROM emp
GROUP BY CASE WHEN sal<=999 THEN '0-999'
              WHEN sal BETWEEN 1000 AND 2999 THEN '1000-2999'
              WHEN sal BETWEEN 3000 AND 5999 THEN '3000-5999'
         END
ORDER BY 1

Strategy #2:

WITH x AS (
SELECT DECODE(1, (SELECT COUNT(*) FROM dual WHERE emp.sal<=999), '0-999',
                 (SELECT COUNT(*) FROM dual WHERE emp.sal BETWEEN 1000 AND 2999), '1000-2999',
                 (SELECT COUNT(*) FROM dual WHERE emp.sal BETWEEN 3000 AND 5999), '3000-5999'
             ) AS range
FROM emp
)
SELECT range, COUNT(*) AS "Employees"
FROM x
GROUP BY range
ORDER BY 1

Explanation:
This query demonstrates how to mimic CASE function using DECODE and in-line scalar subquery from dual.

Suggested further reading:

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions. The book is also available on Amazon and in all major book stores.

My Oracle Group on Facebook:

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

How to delete duplicate records without using Joins and aggregate functions

Interview Question: How to delete duplicate records without using Joins and aggregate functions

Level: Intermediate/Advanced

Answer/Solution:

DELETE
FROM emp_dups a
WHERE ROWID<ANY(SELECT ROWID
                FROM emp_dups b
                WHERE b.empno=a.empno)

Explanation:

If we were allowed to use group functions, we could have used a very well known strategy:

DELETE 
FROM emp_dups a
WHERE ROWID<(SELECT MAX(ROWID)
             FROM emp b
             WHERE b.empno=a.empno)

Since group functions are not allowed, we need to find a substitution. In our case, the solution is based on the equivalence of the following 2 conditions:

a<(SELECT MAX(a) ...)
a<ANY(SELECT a ...)

Another example of such equivalent conditions:

a=(SELECT MAX(a) ...)
a>=ALL(SELECT a ...)

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

 

Interview Question: Show Odd/Even rows without using any functions and pseudo-columns

Sushil Kumar, Database Developer at JP Morgan Chase & Co, has recently asked me this question on my Facebook group page. My first reaction was: “What a silly question! Of course it is impossible to identify odd and even rows without using functions”. But shortly after that, I realized that this is a great SQL puzzle. It took me about 30 minutes (which is a lot!) to solve it.

Interview Question: Show Odd/Even rows without using any functions and pseudo-columns

Level: Advanced

Sample Expected Result:

ENAME           EMPNO   ODD_EVEN
---------- ---------- ----------
MILLER           7934          1
FORD             7902          0
JAMES            7900          1
ADAMS            7876          0
TURNER           7844          1
KING             7839          0
SCOTT            7788          1
CLARK            7782          0
BLAKE            7698          1
MARTIN           7654          0
JONES            7566          1
WARD             7521          0
ALLEN            7499          1
SMITH            7369          0

--Note: Rows are sorted by empno

The idea behind the following solution is quite simple: substitute functions with operators and predicates. Several similar techniques were described in my book Oracle SQL Tricks and Workarounds.

Solution:

WITH x (ename, empno, odd_even) AS
(
SELECT ename, empno, 1 as odd_even
FROM emp
WHERE empno>=ALL(SELECT empno FROM emp)
UNION ALL
SELECT e.ename, e.empno, 1-odd_even
FROM emp e, x
WHERE e.empno>=ALL(SELECT empno FROM emp WHERE empno<x.empno)
)
SELECT *
FROM x
/

ENAME           EMPNO   ODD_EVEN
---------- ---------- ----------
MILLER           7934          1
FORD             7902          0
JAMES            7900          1
ADAMS            7876          0
TURNER           7844          1
KING             7839          0
SCOTT            7788          1
CLARK            7782          0
BLAKE            7698          1
MARTIN           7654          0
JONES            7566          1
WARD             7521          0
ALLEN            7499          1
SMITH            7369          0


If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

Interview Question: Show Location for every employee without using joins

Interview Question: Show Location for every employee without using joins

Level: Intermediate

Expected Result:

ENAME          DEPTNO LOCATION
---------- ---------- ---------
CLARK              10 NEW YORK
KING               10 NEW YORK
MILLER             10 NEW YORK
ADAMS              20 DALLAS
FORD               20 DALLAS
JONES              20 DALLAS
SCOTT              20 DALLAS
SMITH              20 DALLAS
ALLEN              30 CHICAGO
BLAKE              30 CHICAGO
JAMES              30 CHICAGO
MARTIN             30 CHICAGO
TURNER             30 CHICAGO
WARD               30 CHICAGO

Method/Workaround #1: Use Aggregation over Cartesian Product

SELECT e.ename, e.deptno, MAX(DECODE(e.deptno,d.deptno, d.loc)) location
FROM emp e, dept d
GROUP BY e.ename, e.deptno
ORDER BY 2,1;

Method/Workaround #2: Use In-Line Scalar Subquery

SELECT e.ename, e.deptno, 
       (SELECT loc FROM dept d WHERE deptno=e.deptno) location
FROM emp e
ORDER BY 2,1;

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.



Common rule to create or validate the GROUP BY clause

Level: Intermediate

Do you know that the content of the GROUP BY clause can be derived from the rest of the clauses in SELECT statement?

Well, it is true to a great extent. What can be derived is a combination of columns and expressions that must be included in GROUP BY.

We will start a demonstration with a simple puzzle:

Complete GROUP BY Clause in the following query:

SELECT 'Salesmen Report' AS "Report Title", 
       d.deptno, COUNT(e.empno) cnt
FROM dept d, emp e 
WHERE d.deptno=e.deptno
GROUP BY _______________________________
HAVING e.job=’SALESMAN’
   AND SUM(e.sal)>2000
ORDER BY TO_CHAR(e.hiredate,’YYYY’), SUM(e.sal) DESC;

If you are not sure about the answer, learn the following rule.

Common rule to create or validate the GROUP BY clause:

It should at least include all columns and scalar expressions (constants, aggregate functions, and analytical functions are excluded) referenced in SELECT, HAVING, and ORDER BY clauses. Other columns can be specified as well, but they don’t have to.

Here is a solution to the above puzzle:

SELECT 'Salesmen Report' AS "Report Title", 
       d.deptno, COUNT(e.empno) cnt
FROM dept d, emp e 
WHERE d.deptno=e.deptno
GROUP BY d.deptno, e.job, TO_CHAR(e.hiredate,'YYYY')
HAVING e.job=’SALESMAN’
   AND SUM(e.sal)>2000
ORDER BY TO_CHAR(e.hiredate,’YYYY’), SUM(e.sal) DESC;

Note that logically e.job=’SALESMAN’ condition does not belong to HAVING clause. If we push it to WHERE clause (where it belongs to), the group by clause may change:

SELECT 'Salesmen Report' AS "Report Title", 
       d.deptno, COUNT(e.empno) cnt
FROM dept d, emp e 
WHERE d.deptno=e.deptno AND e.job=’SALESMAN’
GROUP BY d.deptno, TO_CHAR(e.hiredate,'YYYY')
HAVING SUM(e.sal)>2000
ORDER BY TO_CHAR(e.hiredate,’YYYY’), SUM(e.sal) DESC;

Since e.job is no longer in the HAVING clause, we don’t have to put it in GROUP BY, but we CAN. The Rule on GROUP BY clause says that at very least, the columns referenced in SELECT, HAVING, and ORDER BY clauses must be referenced in GROUP BY as well, but there might be other columns.

If you like this post, you may want to join my new Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.