2018 Oracle SQL Puzzle of the Week #1

This puzzle is sometimes offered at non-SQL technical interviews with companies like Google or Amazon:

For a given text string, find the first (from the beginning) longest sub-string that does not have repeating characters.

For example, the longest sub-string with no repeating characters in the word anaconda is acond; and in the word: stereotype – it is reotyp.

We move the bar higher and will solve this quite challenging problem in Oracle SQL:

  • Remember to use only a single SELECT statement.
  • The character case should be ignored (use or convert the value to a lower case)
  • You have about 1 week to solve the puzzle and submit your solution(s) but whoever does it sooner will earn more points.
  • The scoring rules can be found here.

If this puzzle is a bit tricky for you, start with a simpler one:

For a given text string, list all possible sub-strings.

If you are ready to give up, please don’t – try to come up with a PL/SQL solution (at very least). It could be a function that takes a VARCHAR2 argument and returns a VARCHAR2 result value.

A correct answer (and workarounds!) will be published here in a week.

My Oracle Group on Facebook:

If you like this post, you may want to join my Oracle group on Facebook: https://www.facebook.com/groups/sqlpatterns/

Would you like to read about many more tricks and puzzles?

For more tricks and cool techniques check my book “Oracle SQL Tricks and Workarounds” for instructions.

15 thoughts on “2018 Oracle SQL Puzzle of the Week #1

  1. KATAYAMA NAOTO January 9, 2018 / 11:06 am

    WITH
    —————————
    DATA(COL)AS(
    SELECT ‘anaconda’ FROM DUAL UNION ALL
    SELECT ‘stereotype’ FROM DUAL),
    —————————
    DATAVIEW(RN,COL)AS(
    SELECT ROWNUM,COL FROM DATA),
    —————————
    WORKVIEW1(RN,POS,COL)AS(
    SELECT
    RN,ROW_NUMBER()OVER(PARTITION BY RN ORDER BY NULL),COL
    FROM DATAVIEW
    CROSS JOIN XMLTABLE(‘for $i in 1 to $LENGTH_COL cast as xs:integer return $i’
    PASSING LENGTH(COL) AS LENGTH_COL)),
    —————————
    WORKVIEW2(RN,POS,COL,R)AS(
    SELECT
    RN,POS,COL,QTYCHK
    FROM WORKVIEW1
    CROSS JOIN XMLTABLE(‘for $i in 1 to $LENGTH_COL cast as xs:integer return $i’
    PASSING LENGTH(COL)-POS+1 AS LENGTH_COL
    COLUMNS QTYCHK NUMBER(32) PATH ‘.’)),
    —————————
    WORKVIEW3(RN,POS,COL,R,R2,COL_PART,COL_PART_DETAIL)AS(
    SELECT
    RN,POS,COL,R,COLUMN_VALUE,SUBSTR(COL,POS,R),SUBSTR(SUBSTR(COL,POS,R),QTYCHK,1)
    FROM WORKVIEW2
    CROSS JOIN XMLTABLE(‘for $i in 1 to $R cast as xs:integer return $i’
    PASSING R AS R
    COLUMNS QTYCHK NUMBER(32) PATH ‘.’)),
    —————————
    WORKVIEW4(RN,POS,COL,R,R2,COL_PART,COL_PART_DETAIL,COL_PART_DETAIL_CNT)AS(
    SELECT RN,POS,COL,R,R2,COL_PART,COL_PART_DETAIL,COUNT(DISTINCT COL_PART_DETAIL)OVER(PARTITION BY RN,POS,R)
    FROM WORKVIEW3),
    —————————
    WORKVIEW5(RN,POS,COL,R,R2,COL_PART,COL_PART_DETAIL,COL_PART_DETAIL_CNT,MAX_R)AS(
    SELECT RN,POS,COL,R,R2,COL_PART,COL_PART_DETAIL,COL_PART_DETAIL_CNT,MAX(R)OVER(PARTITION BY RN)
    FROM WORKVIEW4
    WHERE R = COL_PART_DETAIL_CNT),
    —————————
    WORKVIEW6(RN,POS,COL,R,R2,COL_PART,COL_PART_DETAIL,COL_PART_DETAIL_CNT,MAX_R,MIN_POS)AS(
    SELECT RN,POS,COL,R,R2,COL_PART,COL_PART_DETAIL,COL_PART_DETAIL_CNT,MAX_R,MIN(POS)OVER(PARTITION BY RN)
    FROM WORKVIEW5
    WHERE R = MAX_R)
    —————————
    SELECT MAX(COL) COL,MAX(COL_PART) COL_PART FROM WORKVIEW6
    WHERE POS = MIN_POS
    GROUP BY RN
    ORDER BY RN;

    • Zahar Hilkevich January 10, 2018 / 1:28 pm

      CORRECT though overly complex. The use of XMLTable is interesting. The whole thing can be greatly simplified. Anyway, it’s a good start!

  2. Taroon Ray January 9, 2018 / 2:59 pm

    WITH temp
    AS ( SELECT LEVEL ll
    FROM DUAL
    CONNECT BY LEVEL <= LENGTH ('anaconda')),
    temp2
    AS ( SELECT DISTINCT SUBSTR ('anaconda', ll, LEVEL) str, ll
    FROM DUAL, temp
    CONNECT BY LEVEL <= LENGTH ('anaconda')),
    temp3
    AS (SELECT str, ll
    FROM temp2
    WHERE NOT REGEXP_LIKE (str, '([A-Za-z]).*\1')),
    temp4
    AS (SELECT str, ll
    FROM temp3
    WHERE LENGTH (str) = (SELECT MAX (LENGTH (str)) FROM temp3))
    SELECT str
    FROM temp4
    WHERE ll = (SELECT MIN (ll) FROM temp4);

    –takes around 34 seconds

    • Zahar Hilkevich January 10, 2018 / 1:09 pm

      This seems to work, but so inefficient … At the same time, your solution has ALL components of a very efficient strategy that I will show in a subsequent post with solutions. Keep trying!

  3. Taroon Ray January 10, 2018 / 1:30 pm

    with x as
    (
    select level i from dual connect by level length(str) then i+1 else i end, case when j <= length(str) then j+1 else i+1 end, substr(str, i, j) from y
    where i<=j
    and i<= length(str)
    ),
    z as
    (
    select distinct letter, i from y
    where letter is not null
    and NOT REGEXP_LIKE (letter, '([A-Za-z]).*\1')
    ),
    final as
    (
    select * from z
    where LENGTH (letter) = (SELECT MAX (LENGTH (letter)) FROM z)
    )
    select letter from final
    where i = (select min(i) from final);

    –takes miliseconds

  4. Taroon Ray January 10, 2018 / 2:00 pm

    with x as
    (
    select level i from dual connect by level length(str) then i+1 else i end, case when j length(str) then substr(str, i+1, 1) else substr(str, i, j) end from y
    where i<=j
    and i<= length(str)
    ),
    z as
    (
    select distinct letter, i from y
    where letter is not null
    and NOT REGEXP_LIKE (letter, '([A-Za-z]).*\1')
    ),
    final as
    (
    select * from z
    where LENGTH (letter) = (SELECT MAX (LENGTH (letter)) FROM z)
    )
    select letter from final
    where i = (select min(i) from final);

    –bit of correct in y to get all the possible sub-strings (the logic is same and it takes milliseconds)

  5. Taroon Ray January 10, 2018 / 2:04 pm

    — Some lines are missing out at the time of commenting here.. posting again

    WITH x
    AS ( SELECT LEVEL i
    FROM DUAL
    CONNECT BY LEVEL LENGTH (str) THEN i + 1 ELSE i END,
    CASE WHEN j LENGTH (str) THEN SUBSTR (str, i + 1, 1)
    ELSE SUBSTR (str, i, j)
    END
    FROM y
    WHERE i <= j AND i <= LENGTH (str)),
    z
    AS (SELECT DISTINCT letter, i
    FROM y
    WHERE letter IS NOT NULL
    AND NOT REGEXP_LIKE (letter, '([A-Za-z]).*\1')),
    final
    AS (SELECT *
    FROM z
    WHERE LENGTH (letter) = (SELECT MAX (LENGTH (letter)) FROM z))
    SELECT letter
    FROM final
    WHERE i = (SELECT MIN (i) FROM final);

  6. Deepak Mahto January 12, 2018 / 8:17 am

    WITH ALIAS1 AS (SELECT ‘stereotype’ AS INPUT FROM DUAL),
    ALIAS2(LVL) AS
    (
    SELECT LENGTH(INPUT)-1 AS LVL FROM ALIAS1
    UNION ALL
    SELECT LVL-1
    FROM ALIAS2
    WHERE LVL > 0
    )
    SELECT SUBSTRINGS
    FROM
    (SELECT SUBSTRINGS , ROW_NUMBER() OVER (ORDER BY INITIALS) AS RN
    FROM
    (SELECT MAX(LENGTH(SUBSTRINGS)) OVER () AS MAX_LEN, SUBSTRINGS , MAX(LENGTH(SUBSTRINGS)) AS LEN, MAX(LVL) AS INITIALS FROM
    (SELECT DISTINCT SUBSTR((SELECT INPUT FROM ALIAS1) , A1.LVL , LVLL) AS SUBSTRINGS , A1.LVL FROM ALIAS2 A1 CROSS APPLY (SELECT A2.LVL AS LVLL FROM ALIAS2 A2 WHERE A1.LVL < A2.LVL))
    WHERE NOT REGEXP_LIKE (SUBSTRINGS, '([A-Za-z]).*\1')
    GROUP BY SUBSTRINGS)
    WHERE MAX_LEN = LEN)
    WHERE RN = 1

    • Taroon Ray January 13, 2018 / 12:27 pm

      The above query works but I guess it would fail at certain point. Lets say I am taking the word “aana”, then the longest substring with out non-repetitive characters would be “an”.
      But your query as it is not finding all possible substrings, it would give “a”, which is wrong.

      • Deepak Mahto January 15, 2018 / 9:54 pm

        Thanks a lot for your feedback.
        Had updated, hopefully these one will suit well for all cases!

        WITH ALIAS1 AS (SELECT ‘anaconda’ AS INPUT FROM DUAL),
        ALIAS2(LVL) AS
        (
        SELECT LENGTH(INPUT) AS LVL FROM ALIAS1
        UNION ALL
        SELECT LVL-1
        FROM ALIAS2
        WHERE LVL > 1
        )
        SELECT SUBSTRINGS
        FROM
        (SELECT SUBSTRINGS , ROW_NUMBER() OVER (ORDER BY INITIALS) AS RN
        FROM
        (SELECT MAX(LENGTH(SUBSTRINGS)) OVER () AS MAX_LEN, SUBSTRINGS , MAX(LENGTH(SUBSTRINGS)) AS LEN, MAX(LVL) AS INITIALS FROM
        (SELECT DISTINCT SUBSTR((SELECT INPUT FROM ALIAS1) , A1.LVL , LVLL) AS SUBSTRINGS , A1.LVL FROM ALIAS2 A1 CROSS APPLY (SELECT A2.LVL AS LVLL FROM ALIAS2 A2 WHERE A1.LVL <= A2.LVL))
        WHERE NOT REGEXP_LIKE (SUBSTRINGS, '([A-Za-z]).*\1')
        GROUP BY SUBSTRINGS)
        WHERE MAX_LEN = LEN)
        WHERE RN = 1

  7. KATAYAMA NAOTO January 12, 2018 / 10:17 am

    WITH
    DATA(COL)AS(
    SELECT ‘anaconda’ FROM DUAL UNION ALL
    SELECT NULL FROM DUAL UNION ALL
    SELECT ‘stereotype’ FROM DUAL),
    DATAVIEW(RN,COL)AS(
    SELECT ROWNUM,COL FROM DATA),
    WORKVIEW(RN,POS,COL,COLPART)AS(
    SELECT RN,ROW_NUMBER()OVER(PARTITION BY RN ORDER BY NULL),COL,SUBSTR(COL,QTYCHK)
    FROM DATAVIEW
    CROSS JOIN XMLTABLE(‘for $i in 1 to $LENGTH_COL cast as xs:integer return $i’
    PASSING NVL(LENGTH(COL),0) AS LENGTH_COL
    COLUMNS QTYCHK NUMBER(32) PATH ‘.’))

    SELECT MAX(COL),MAX(ANS)KEEP(DENSE_RANK LAST ORDER BY LENGTH(ANS),POS DESC)
    FROM(
    SELECT RN,COL,POS,ANS
    FROM WORKVIEW
    MODEL
    PARTITION BY(RN,POS)
    DIMENSION BY(0 AS SOEJI)
    MEASURES(COL,COLPART,CAST(NULL AS VARCHAR2(4000)) AS ANS,CAST(NULL AS VARCHAR2(4000)) AS WILLANS)
    RULES ITERATE(4000) UNTIL NVL(LENGTH(COLPART[0]),0) <= ITERATION_NUMBER+1
    (WILLANS[0] = WILLANS[0]||SUBSTR(COLPART[0],ITERATION_NUMBER+1,1)
    ,WILLANS[0] = CASE WHEN REGEXP_COUNT(WILLANS[0],SUBSTR(COLPART[0],ITERATION_NUMBER+1,1)) = 1 THEN WILLANS[0] ELSE SUBSTR(COLPART[0],ITERATION_NUMBER+1,1) END
    ,ANS[0]=CASE WHEN NVL(LENGTH(ANS[0]),0)<NVL(LENGTH(WILLANS[0]),0) THEN WILLANS[0] ELSE ANS[0] END)
    )
    GROUP BY RN;

Leave a comment