The Unique Result Oracle Database Coding Challenge

28072011

July 28, 2011

I must say that I am impressed with the number of unique solutions that were developed for the previous coding challenge (FizzBuzz). While not all solutions were extremely efficient (a couple were intentionally designed to be as inefficient as possible), the various techniques provide views of different approaches to solving a problem that was not well defined (even though at first glance it appeared to be well defined). While not all of the solutions presented are optimal for the FizzBuzz challenge, derivatives of those solutions might be perfect for real-world problems (for instance, side-tracking a pesky DBA or developer with a performance challenge).

Time for another coding challenge. This challenge is adapted from one that was posed in a Usenet thread several years ago. Remember that if a specification is not well defined, feel free to interpret the specification – one of the goals of this blog article is to see a variety of solutions, but a secondary goal is to determine what might happen when weak specifications are provided in a request for a solution.

Assume that there are two tables, T1 and T2, as designed and populated below. The two tables contain the results from experiments performed by two different analysts.

Part 1: Return all rows from table T1 where the C2 value is not found in table T2’s column C2 values, and return all rows from table T2 where the C2 value is not found in table T1’s column C2 values. In short, we want the rows that have unique results found in column C2 so that we may identify the differences found by the two analysts.

Part 2: Using a single select statement, return all rows found in table T1 that are not duplicated in table T2, and all rows found in table T2 that are not duplicated in table T1. The resulting output must identify the table from which the row was found.

Part 3: A third analyst performed experiments and entered his results in table T3, as designed and populated below. Using a single SQL statement, retrieve the rows from tables T1, T2, and T3 that are not duplicated in the other tables – find the unique rows from the three tables. Each row should identify its source table (T1, T2, or T3).

When you post your solutions, please identify Part 1, Part 2, or Part 3 next to the solution.

General tips for comments:

Include your code portion of the solution in a monospaced font (Courier) to retain the alignment spaces in the solution. To do this, use a <pre> tag just before the code portion of the solution and a </pre> tag just after the solution ([s ourcecode] and [/s ourcecode] tags (without the space between the first two letters) should have the same result, just with a smaller font size and a non-scrolling code area).

Less than (<) and greater than (>) signs have a special meaning in HTML web pages. As such, specify &lt; for a less than sign and &gt; for a greater than sign in code sections to avoid having portions of your code sections magically disappear.

Think about how you would build the solutions before scrolling down to the comments section.

–

–

–

–

–

–

–

–

Part 2:

SELECT NVL(T1.C1,T2.C1) C1, NVL(T1.C2,T2.C2) C2, NVL(T1.C3,T2.C3) C3, NVL(T1.C4,T2.C4) C4, NVL2(T1.C1,'T1','T2') FROM_TABLE FROM T1 FULL OUTER JOIN T2 ON T1.C1=T2.C1 AND T1.C2=T2.C2 AND T1.C3=T2.C3 AND T1.C4=T2.C4 WHERE T1.C1 IS NULL OR T2.C1 IS NULL;

There are a lot of unique solutions for this particular challenge – I would say at least 10 solutions for the second part alone. The solutions posted by Raj are very good, and are not solutions that I had previously consider – he apparently read all three parts of the challenge, and carefully considered SQL code reuse.

Here is a sample of one of the other solutions for the second part of the challenge:

The above is one possible outcome of expanding the solution for the first part of the challenge into a solution for the second part of the challenge. If we were to expand the above solution to work for the third part of the challenge, considered how might it appear.

If we know how to read a 10053 trace file, might we be able to cheat to find other solutions?

Just got back from a trip. In case you’re still reading the comments, here is a set of solutions inspired by an asktom thread. Basically, I use aggregation instead of analytics. In parts 2 and 3 I assume that I can throw out duplicate lines in each analyst’s results; this is not necessary, but allows me to add the third table with less change to the code. Besides, you want us to make dangerous assumptions, right?

My RSS feed only includes a small part of my articles for a variety of reasons. It seems that various sites strive to be more than blog aggregators, and have a habit of either not identifying the original author of some articles (oraclecommunity.in.ua and http://www.oracle.cc/the-new-order-oracle-coding-challenge-3-%E2%80%93-mind-boggle.html seems to come to mind), make it appear that people are writing articles specifically for a site (itnewscast.com/database/new-order-oracle-coding-challenge-2 seems to come to mind), or attempt to charge people to read articles that the author makes freely readable (sswug.org seems to come to mind). It is not uncommon for me (and I suspect several others who write technical blog articles) to spend several hours developing each article – limiting the RSS feed simply means that someone must make more of an effort to republish my articles without authorization.(I have a much longer explanation why my RSS feed includes only partial articles, but this should be a sufficient explanation for now.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:
<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpretted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: