I noticed an interesting request on the comp.databases.oracle.misc usenet group, and I thought that I would make an attempt at solving the request. The original poster (OP) stated that he had a table with data similar to the following:

DEPT_RANK for ID 99 is 1 for A because sum(LOAD) = 22 is the max
LOC_RANK for ID 99 is 1 for LA because sum(LOAD) = 15 is the max

At first glance, that request seems to be reasonably easy to accomplish. Let’s start by creating a table with the sample data (ideally, the OP should have provided the DDL and DML to create and populate this table):

The first step, if we were to think about creating the solution in logical steps, is to find a way to calculate the SUM values that were mentioned by the OP. So, as a starting point, we might try this:

If I am understanding the OP’s request correctly, the above is a good starting point (even though the alias for the first analytic function could have been better selected).

We are then able to take the above SQL statement and push it into an inline view to hopefully produce the output that is expected by the OP (note that the PARTITION clause differs for the LOC_RANK column from what is specified in the inline view for the function that is used to create that column):

Almost for the LOC_RANK column (not even close for the DEPT_RANK column), but not quite right. The problem is that when attempting to calculate the RANK columns in the above output, we should only PARTITION on the ID column, not the ID column and some other column, as was the case when we used the SUM analytic function.

Notice in the above that when two rows have the same SUM_LOAD_ value, the displayed rank is correct, but that repeated rank value then causes a rank value to be skipped (compare the DEPT_RANK column value on row 3, the LOC_RANK column value on row 1, and the LOC_RANK column value on row 5).

Now what? The ROW_NUMBER function could be used to produce sequential rank numbers without gaps, for example:

Write the DDL and DML statements to create the source table and populate it with the non-aggregated original data, and then produce the output requested by the OP (without looking at the updated usenet thread).

An interesting request came in from an ERP mailing list – how would you write a SQL statement that indicates the next operation in a manufacturing process. Sounds like an easy requirement. Let’s take a look at a graphical view of one example (the graphical view is created using a program that I wrote probably 7 years ago):

The cyan colored rectangles in the above picture are the various operations (found in the OPERATION table) in the manufacturing process. The light green colored rectangles are the material requirements (found in the REQUIREMENT table) that are used by the operation that is immediately above the sequence of light green rectangles. The white rectangles are the sub-assembly headers for the manufacturing process. The white rectangle at the far left is considered the main sub-assembly 0 (WORKORDER_SUB_ID = ‘0’), the middle white rectangle in this case is sub-assembly 1 (WORKORDER_SUB_ID = ‘1’), and the right-most white rectangle in this case is sub-assembly 200 (WORKORDER_SUB_ID = ‘200’). All sub-assemblies except the main sub-assembly 0 are tied to another “parent” operation that consumes the sub-assembly just as if it were another material requirement; therefore a dummy row is added to the material requirements table (REQUIREMENT) with the SUBORD_WO_SUB_ID column set to the sub-assembly number of the connected operation.

In the above picture, on the main sub-assembly 0, operation 888 (at resource ID INSPECT) is the first operation, and operation 999 (at resource ID SHIP) is the second operation. On sub-assembly 1, operation 10 (at resource ID 92) is the first operation, operation 20 (at resource ID 62) is the second operation, and operation 541 (at resource ID KRW) is the third operation. On sub-assembly 200, operation 10 (at resource ID WSD70-TACK) is the first operation, operation 20 (at resource ID WELD-CBA) is the second operation, operation 40 (at resource ID BLAST) is the third operation, and operation 50 (at resource ID PAINT) is the fourth operation.

Since I am working with Oracle Database, I can construct a SQL statement using the LEAD analytic function to find the next operation number and the next resource ID:

Looks like a pretty easy solution… unless we recognize that after the last operation on sub-assembly 200 the next operation is really the parent operation of that sub-assembly (operation 10 on sub-assembly 1). Likewise, after the last operation on sub-assembly 1, the next operation is really the parent operation of that sub-assembly (operation 888 on sub-assembly 0). There is no next operation after operation 999 on the main sub-assembly 0. How can we fix this problem with the NULL next operations in the previous output? We just need an outer join to the dummy row in the REQUIREMENT table to pick up the parent’s sub-assembly number, and then join that dummy row back to a second reference of the OPERATION table. The NVL2 and COALESCE functions are used to handle the cases where the original next operation and next resource ID would have been output as NULL values:

The above takes less that 0.01 seconds to execute, in part because I was careful to preserve the leading columns in the primary key indexes’ columns when building the inline view.

But, what if the OP (as he mentioned) was running with an older version of SQL Server that did not support the LEAD analytic function? Is the OP out of luck (it could be argued that is a trick question)?

Let’s start again, this time we will self-join the OPERATION table rather than using an analytic function to determine the next operation, and just for fun we will use ANSI join syntax:

Now, we will slide the above SQL statement into an inline view and use the COALESCE function to return the first non-NULL value in a list of columns, and the Oracle only NVL2 function to return one of two column values depending on whether or not a third column contains a NULL value:

So, now that we have two different SQL statements that both solve the problem that was presented by the OP, which version of the SQL statement is the most efficient, and which would you use in production?

In the previous article of this series we examined a couple of moderately simple data retrieval requirements, and used inline views and/or analytic functions to solve those data retrieval requirements. I recently saw another opportunity to help a person on the ERP mailing list with a SQL statement. The original poster (OP) provided a screen capture that contained information similar to the following:

Initially, the OP requested that the most recent TRANSACTION_DATE (the last column in the above output) for the PART_ID value to be shown on each row of the output, rather than the TRANSACTION_DATE that is directly associated with a specific row. However, there are two problems:

The OP did not mention what tables were used as the data source for the included screen capture.

The OP did not mention whether a recent release of Oracle Database or some other brand of database was in use.

For times like these, it helps a lot to be familiar with the data model used in the database. The INVENTORY_TRANS table in this particular ERP system has columns that match several of the columns that were included in the OP’s screen capture: CUST_ORDER_ID, PART_ID, TYPE, CLASS, WAREHOUSE_ID, LOCATION_ID, and TRANSACTION_DATE. The question remains, where did the QTY_ON_HAND column come from – that column is not in the INVENTORY_TRANS table. We could find out, at least on Oracle Database, with a SQL statement similar to the following:

SELECT
DTC.TABLE_NAME,
DT.TABLE_NAME
FROM
DBA_TAB_COLUMNS DTC,
DBA_TABLES DT
WHERE
DTC.COLUMN_NAME='QTY_ON_HAND'
AND DTC.OWNER='SCHEMA_OWNER_HERE'
AND DTC.OWNER=DT.OWNER(+)
AND DTC.TABLE_NAME=DT.TABLE_NAME(+)
ORDER BY
DTC.TABLE_NAME;
TABLE_NAME TABLE_NAME
------------------------------ ----------
CR_EC_LINE_PART
CR_OM_CO_LINE_PART
CR_PART_PO_LINE
CR_WO_PART
PART PART
PART_USER_DEFINED_VIEW

Note that in the above, I had to join the DBA_TAB_COLUMNS dictionary view with the DBA_TABLES dictionary view (I could have used DBA_OBJECTS instead) to determine that 5 of the 6 rows returned from the DBA_TAB_COLUMNS view are in fact not tables, but views. Thus, the QTY_ON_HAND column in the OP’s screen capture must have originated from the PART table (unless, of course, the column in the screen capture is actually an aliased calculation).

Now that the data sources are known for the query, we can take a best guess approach at reproducing the output that the OP provided in the screen capture:

SELECT
IT.CUST_ORDER_ID,
IT.PART_ID,
TYPE,
CLASS,
IT.WAREHOUSE_ID,
IT.LOCATION_ID,
P.QTY_ON_HAND,
IT.TRANSACTION_DATE
FROM
INVENTORY_TRANS IT,
PART P
WHERE
IT.CUST_ORDER_ID IS NOT NULL
AND IT.TYPE='O'
AND IT.CLASS='I'
AND IT.PART_ID=P.ID
AND P.QTY_ON_HAND>0
ORDER BY
IT.PART_ID,
IT.TRANSACTION_DATE;

Next, there are a couple of ways to have the most recent transaction date appear on each row of the output (partially dependent on the database version). First, let’s build a query that returns the most recent transaction date where the TYPE is O and the CLASS is I and the CUST_ORDER_ID column does not contain a NULL:

Next, we will just plug the above inline view into the FROM clause of our original SQL statement and add an additional predicate into the WHERE clause to tell the query optimizer how to associate the inline view with the other tables in the SQL statement:

Let’s assume that the OP’s intended result changes – the OP just wants to know which PART_IDs have not had any transactions in the last 89 day. So, the OP is interested in those PART_IDs with a last transaction that was either 90 or more days ago, or there has never been a transaction for the PART_ID. Oh, scrap… start over.

First, we will identify the list of PART_IDs that have had transactions in the last 89 days (CLASS values of ‘A’ are adjustments, not true inventory transactions, so rows with those CLASS values will be excluded). Once we have that list, we can simply subtract the list from the entire list of PART_IDs:

The OP is interested in seeing the actual warehouse location containing the PART_IDs, so we need to use the PART_LOCATION table, not the PART table as we did before. We then need to create a left outer join between the above list and the PART_LOCATION table – essentially, we will find cases where there is a row in the PART_LOCATION table without a corresponding part ID in the above list. ANSI syntax will be used here for the outer join, because it is still unknown whether the OP is running an Oracle Database backend or a SQL Server backend:

In the above, notice that there is a seeminly odd ON clause: “PL.PART_ID=IT.PART_ID AND IT.PART_ID IS NULL” – that simply means that when resolving the left outer join, a value must be found on the left side, but a corresponding value should not be found on the right side of the join. This is the “minus” syntax that I selected to use here (the SQL MINUS syntax could not be used because the WAREHOUSE_ID, LOCATION_ID, and QTY column values returned by the query of the INVENTORY_TRANS table would have to exactly match the values found in the PART_LOCATION table… and that is more difficult to accomplish).

If I was certain that the OP was running with an Oracle Database backend, I would have suggested the following syntax, rather than the ANSI syntax as shown above:

SELECT
PL.PART_ID,
PL.WAREHOUSE_ID,
PL.LOCATION_ID,
IT.PART_ID,
PL.QTY
FROM
PART_LOCATION PL,
(SELECT DISTINCT
PART_ID
FROM
INVENTORY_TRANS
WHERE
TRANSACTION_DATE > TRUNC(SYSDATE-90)
AND PART_ID IS NOT NULL
AND CLASS<>'A') IT
WHERE
PL.PART_ID=IT.PART_ID(+)
AND IT.PART_ID IS NULL
AND PL.QTY<>0
ORDER BY
PL.PART_ID,
PL.WAREHOUSE_ID,
PL.LOCATION_ID;

Why the NULL values in the fourth column (the second PART_ID column) of the output? That NULL simply means that the query is working exactly as intended. That fourth column would be showing only those PART_IDs that have had transactions in the last 89 days (and NULL if no transactions in the last 89 days) if the restriction “AND IT.PART_ID IS NULL” was NOT included in the ON clause – but such rows should not be included in the output if the output is to be used as intended.

Sometimes it is interesting to take a step back from some of the more difficult to construct SQL statements, and just put together something simple. The following request recently arrived through an ERP mailing list:

“I want to find the last ship date for all the open part numbers in Customer Order Entry. The last ship date is not necessarily from the Open order in Customer Order Entry…”

The original poster (OP) provided more information, and even provided a description of what was already tried as a solution. It is not exceptionally clear what the OP wanted, but because I have several years of experience working with this particular ERP system, I had a feeling that the OP might be headed in the wrong direction. This ERP system uses tables that are well normalized, which would make it a bit more difficult to locate the most recent ship date for a part number that is recorded with one customer order’s tables’ rows when another customer order’s tables’ rows are reported. Unless, of course, we remember that there is also a transaction table that records every inventory movement of part numbers through the system – we just need to know what those rows in the transaction table look like. In this case, the rows have the following characteristics:

The CUST_ORDER_ID column value must not be NULL

The TYPE column must have a value of ‘O’

The CLASS column must have a value of ‘I’

With that knowledge, we could then find the most recent date that any part number shipped with a SQL statement similar to the following:

SELECT
PART_ID,
MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
INVENTORY_TRANS
WHERE
CUST_ORDER_ID IS NOT NULL
AND TYPE='O'
AND CLASS='I'
GROUP BY
PART_ID;

In the above, the AS keyword is optional on an Oracle Database platform, but it is not optional on some other database platforms.

With an Oracle Database backend, inline views may be used to permit the above SQL statement to be used to retrieve additional information for a parent SQL statement. If we wrap the above SQL statement in ( ) and place it in the FROM clause of the SQL statement, Oracle will treat the data returned by the above SQL statement quite similar to how it would handle a regular table (the Oracle query optimizer might decide to rewrite the combined SQL statement into an equivalent form that no longer contains an inline view). There is always a chance that a part number that is on order may have never shipped to a customer, so we will need to outer join (in this case using a left outer join) to the newly created inline view:

SELECT
COL.CUST_ORDER_ID,
COL.PART_ID,
LS.LAST_TRANSACTION_DATE
FROM
CUSTOMER_ORDER CO,
CUST_ORDER_LINE COL,
(SELECT
PART_ID,
MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
INVENTORY_TRANS
WHERE
CUST_ORDER_ID IS NOT NULL
AND TYPE='O'
AND CLASS='I'
GROUP BY
PART_ID) LS
WHERE
CO.ID=COL.CUST_ORDER_ID
AND CO.STATUS IN ('U','F','R')
AND COL.LINE_STATUS='A'
AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY
AND COL.PART_ID=LS.PART_ID(+);

In the above, I joined the CUSTOMER_ORDER and CUST_ORDER_LINE tables to the inline view that I aliased as LS. A third table, CUST_LINE_DEL, that optionally contains the delivery schedule for some of the rows in the CUST_ORDER_LINE table, could have also been outer joined to the CUST_ORDER_LINE table.

Running SQL Server, or just feeling ANSI? If so, the above may be rewritten as follows:

SELECT
COL.CUST_ORDER_ID,
COL.PART_ID,
LS.LAST_TRANSACTION_DATE
FROM
CUSTOMER_ORDER CO
JOIN
CUST_ORDER_LINE COL
ON
CO.ID=COL.CUST_ORDER_ID
LEFT OUTER JOIN
(SELECT
PART_ID,
MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
INVENTORY_TRANS
WHERE
CUST_ORDER_ID IS NOT NULL
AND TYPE='O'
AND CLASS='I'
GROUP BY
PART_ID) LS
ON
COL.PART_ID=LS.PART_ID
WHERE
CO.STATUS IN ('U','F','R')
AND COL.LINE_STATUS='A'
AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY;

Unfortunately, the OP is actually using a SQLBase database backend that does not support inline views. I remember the feeling before I discovered that Oracle Database supported inline views… in that case I would do something like the following:

Create a statically defined view.

Join to that statically defined view just as if the view were a table.

CREATE VIEW CUST_ORDER_PART_LS AS
SELECT
PART_ID,
MAX(TRANSACTION_DATE) AS LAST_TRANSACTION_DATE
FROM
INVENTORY_TRANS
WHERE
CUST_ORDER_ID IS NOT NULL
AND TYPE='O'
AND CLASS='I'
GROUP BY
PART_ID;
SELECT
COL.CUST_ORDER_ID,
COL.PART_ID,
LS.LAST_TRANSACTION_DATE
FROM
CUSTOMER_ORDER CO,
CUST_ORDER_LINE COL,
CUST_ORDER_PART_LS LS
WHERE
CO.ID=COL.CUST_ORDER_ID
AND CO.STATUS IN ('U','F','R')
AND COL.LINE_STATUS='A'
AND COL.ORDER_QTY>COL.TOTAL_SHIPPED_QTY
AND COL.PART_ID=LS.PART_ID(+);

I guess that it is a good exercise once in a while to practice simple SQL.

—

A second example from the ERP mailing list showed a SQL Server solution for a particular problem. The particular problem that the SQL Server solution set out to solve is as follows:

“Someone posed an interesting question to me – How can you do a mass insert of Operations? Well most of us know how to manage that without too much trouble but this one came with a couple of caveats – The Master’s first operation (lowest numbered) has to have a particular ResourceID and then a different but specific ResourceID must be on a subsequent Operation. It is after the second operation where the insert must be placed.

Sounds like fun – So I figured that it could be done in a single SQL statement – well after a 4-way join of the Operation table to itself I got it to work how I expected…”

Interesting, I think that there are times in the past that I have used multiple self-joins to a table in order to solve similar SQL problems. However, there is an easier way using analytic functions. The following SQL statement attempts to indicate: the sequence of the operation within the work order sub ID, the sequence number that follows sequentially, the second sequence number that follows sequentially, and the resource ID of the next operation in sequential order by the operation number:

SELECT
WORKORDER_BASE_ID,
WORKORDER_LOT_ID,
WORKORDER_SPLIT_ID,
WORKORDER_SUB_ID,
SEQUENCE_NO,
RESOURCE_ID,
ROW_NUMBER() OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
ORDER BY RESOURCE_ID) AS RN,
LEAD(RESOURCE_ID,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
ORDER BY RESOURCE_ID) AS NEXT_RESOURCE_ID,
LEAD(SEQUENCE_NO,1) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
ORDER BY RESOURCE_ID) AS NEXT_SEQ_NO,
LEAD(SEQUENCE_NO,2) OVER (PARTITION BY WORKORDER_BASE_ID, WORKORDER_LOT_ID, WORKORDER_SPLIT_ID, WORKORDER_SUB_ID
ORDER BY RESOURCE_ID) AS NEXT_NEXT_SEQ_NO
FROM
OPERATION
WHERE
WORKORDER_TYPE='M';

Next, we only care about those rows when our resource ID of interest (69 in my example) is specified in the first operation on the work order sub ID and the second operation on that work order sub ID specifies the other resource ID that is of interest (FW in my example). If we are to avoid primary key violations, we should select a new sequence number that is half way between the next-next operation and the next operation:

With the above, we are well on our way to perfoming a mass insert into this table. If I recall correctly, the above must first be inserted into another table (a temp table, preferrably) and then the rows may be inserted into the OPERATION table (the table that is the row source for the analytic functions used in the SQL statement).

While taking a look at the OTN threads this morning, I found an interesting test case, where the OP was asking why a query used a full table scan and not an index range scan. The OP would like for the query to use the index without using a hint in the query. So, why doesn’t the OP’s test case use an index range scan? Just for fun I will state that my first two initial guesses were not quite on target.

A slightly modified table creation script of the setup for the OP’s test case:

Tic Tac Toe, the game of X’s and O’s, was an oddly popular game in elementary school. When playing the game you quickly learn a couple of rules:

Because X always places his mark first (alternating between X and O), there is an unfair advantage for the player placing the X marks.

The player placing his mark in the center square typically has an advantage in the game.

Once a player scores three marks in a row horizontally, vertically, or diagonally, there is no point in continuing the game.

–

–

–

–

–

–

–

Consider the above game matches. In the first two matches both X and O score three in a row, but which player won the match – it depends on the order in which the marks were placed. In the third match, X won with a diagnal three in a row. The final match resulted in a tie (neither player won). You can use a SQL statement similar to the following to output all of the roughly 362,000 combinations (note that not all combinations are unique – rotating the board 90 or 180 degrees to generate a unique combination probably is not fair, and the order in which the marks are placed could matter):

WITH
N AS
(SELECT
ROWNUM N
FROM
DUAL
CONNECT BY
LEVEL<=9),
C AS
(SELECT
'XOX'||'OXO'||'XOX' C
FROM
DUAL)
SELECT
SUBSTR(C.C,N1.N,1) ||
SUBSTR(C.C,N2.N,1) ||
SUBSTR(C.C,N3.N,1) || CHR(10) ||
SUBSTR(C.C,N4.N,1) ||
SUBSTR(C.C,N5.N,1) ||
SUBSTR(C.C,N6.N,1) || CHR(10) ||
SUBSTR(C.C,N7.N,1) ||
SUBSTR(C.C,N8.N,1) ||
SUBSTR(C.C,N9.N,1) GAME
FROM
N N1,
N N2,
N N3,
N N4,
N N5,
N N6,
N N7,
N N8,
N N9,
C
WHERE
N1.N<>N2.N
AND N1.N<>N3.N
AND N1.N<>N4.N
AND N1.N<>N5.N
AND N1.N<>N6.N
AND N1.N<>N7.N
AND N1.N<>N8.N
AND N1.N<>N9.N
AND N2.N<>N3.N
AND N2.N<>N4.N
AND N2.N<>N5.N
AND N2.N<>N6.N
AND N2.N<>N7.N
AND N2.N<>N8.N
AND N2.N<>N9.N
AND N3.N<>N4.N
AND N3.N<>N5.N
AND N3.N<>N6.N
AND N3.N<>N7.N
AND N3.N<>N8.N
AND N3.N<>N9.N
AND N4.N<>N5.N
AND N4.N<>N6.N
AND N4.N<>N7.N
AND N4.N<>N8.N
AND N4.N<>N9.N
AND N5.N<>N6.N
AND N5.N<>N7.N
AND N5.N<>N8.N
AND N5.N<>N9.N
AND N6.N<>N7.N
AND N6.N<>N8.N
AND N6.N<>N9.N
AND N7.N<>N8.N
AND N7.N<>N9.N
AND N8.N<>N9.N;

The output of the above SQL statement should appear similar to the following:

The big problem with the above SQL statement is that it is not clear which player won in all cases. Ideally, the N1.N value would be used to output an X mark in the specified position, the N2.N value would be used to output an O mark in a specified position, the N3.N value would be used to output an X in the specified position, etc. until one of the players places three marks in a row. For example, if N1.N is 5, an X would be placed in the center square, if N2.N is 9, an O would be placed in the bottom right square.

Now that we know that the order in which the marks are placed is important, how would be know when a player wins? You could experiment with the following:

Vertical win, with three positions having the same resulting values:

SELECT
MOD(position - 1, 3) + 1 V
FROM
DUAL;

Horizontal win, with three positions having the same resulting values:

OK, now that I have explained the game, given you a couple of SQL statements to possibly help you with the solution… on to the challenge. With the help of Oracle Database build a tic tac toe solver that will help a player win at tic tac toe. Provided a current description of the board, for example ‘_X_’||’O_X’||’OX_’, display all board solutions that allows player O (or player X if his turn is next) to win. (Side note: I have not yet completed the solution to this challenge – it might be possible to accomplish this challenge with just a SQL statement.)

In part 1 of this series the challenge was to simply reverse the order of digits in the numbers from 1 to 1,000,000 to find that cases where the numbers formed by the reverse ordered digits evenly divided into the original number. In part 2 of this series the challenge required examining all of the numbers between 1000 and 9999, where arranging the digits of the original number into any of 23 valid combinations resulted in a new number that cleanly divided into the original four digit number. There were several different solutions provided to the two challenges, so now it is time to move on to part three of the series.

In part 1 of this blog article series I mentioned playing a game years ago that used letters on the face of dice – the dice were rolled, and then the challenge was to find all words that could be completely spelled using the letters on the top of the dice. I was not very good at the game, so I enlisted the help of a computer. One such dice game is called Boggle, and that game’s name is probably fitting for today’s challenge. Imagine that you played this game and the following letters appeared on the top of the dice:

One of the rules of the game requires that words must be at least 3 letters in length, for example: you say melee eye (I) see elfs file some mail (OK, the word I is too short, but we can have some fun with the words that are found). As you might be able to guess, there are a lot of possible combinations of the 16 letters found on the dice, some of which are valid words. If we just consider the 5 letter, 4 letter, and 3 letter combinations of the dice, there are more than a half million possible combinations (in the following table, multiply the numbers across and add the results for each row) – no wonder I needed the computer’s help with these puzzles.

16

15

14

13

12

= 16! / 11!

16

15

14

13

= 16! / 12!

16

15

14

= 16! / 13!

= 571,200

To make the full challenge of finding words a little easier, let’s break the challenge into a couple of parts:

Part 1: Consider the 2 x 2 letter arrangement at the left. With the help of Oracle Database, list all of the three letter combinations of those four letters. There will be 4 * 3 * 2 = 24 possible combinations of the letters.

Part 2: Consider the 4 x 4 letter arrangement at the left. With the help of Oracle Database, list all of the four letter combinations of those 16 letters. There will be 16 * 15 * 14 * 13 = 43,680 possible combinations of the letters.

–

–

–

–

–

Part 3: Consider the 4 x 4 letter arrangement above. With the help of Oracle Database, list all of the three, four, five, and six letter combinations of those 16 letters. If you see any seven letter words in the above set of letters, you might as well retrieve those letter combinations also. How many letter combinations do you have in total for part 3?

Part 4:Extra Credit: How many of the letter combinations generated in part 3 above are valid U.S. or U.K. English words? List the words.

Part 5: Extra, Extra Credit: List any words found in the letters at the left that have any connection to Oracle Corporation. Remember that a letter can only be used as many times in a single word as it appears at the left (if you can form a word with three letter A’s that have a connection to Oracle Corp., go for it.).

–

–

–

–

–

–

–

–

–

–

–

–

–

–

Added August 7, 2011:

When I put together this challenge I did not think that it was possible to complete Part 4 Extra Credit using just SQL. I was fairly certain that there were some interesting techniques to retrieve HTML content with the help of PL/SQL, but I had not worked out a solution that utilized that technique. As I write this, Radoslav Golian in the comments section appears to have both a PL/SQL and a SQL solution that uses the dictionary.reference.com website to validate the words (only 6 words to avoid a denial of service type attack on the dictionary.reference.com website). One of the approaches that I considered, but did not develop, is something similar to how Radoslav verified the words, but I would use a VBS script to submit the request and check the result as is demonstrated in these two articles: Submit Input to an ASP Web Page and Retrieve the Result using VBS and Use VBS to Search for Oracle Books using Google’s Book Library.

The solution that I put together for Part 4 Extra Credit started with an Excel macro that I posted in another blog article, which was then converted to PL/SQL. I then transformed the PL/SQL for use in this article, and generated a new Excel macro from the PL/SQL code. The Excel macro (along with the calling code looks like this:

Sub StartBoggle()
Call Boggle("ESOIMEFOALEUSAYE", 6, 3)
End Sub
Sub Boggle(strCharacters As String, intMaxWordLength As Integer, intMinWordLength As Integer)
Dim i As Integer
Dim strCharacter(20) As String
Dim intCharacterIndex(20) As Integer
Dim intCharacters As Integer
Dim intCharactersMax As Integer
Dim intCharactersMin As Integer
Dim intNumberOfSuppliedCharacters As Integer
Dim intAdjustmentPosition As Integer
Dim intFlag As Integer
Dim strOutput As String
Dim strWords(10000) As String
Dim intWordCount As Integer
Dim intFilenum As Integer
intFilenum = FreeFile
Open "C:\Words " & strCharacters & ".txt" For Output As #intFilenum
If intMaxWordLength = 0 Then
intCharactersMax = Len(strCharacters)
Else
If intMaxWordLength <= Len(strCharacters) Then
intCharactersMax = intMaxWordLength
Else
intCharactersMax = Len(strCharacters)
End If
End If
If intMinWordLength = 0 Then
intCharactersMin = 3
Else
If intMaxWordLength < intMinWordLength Then
intCharactersMin = intCharactersMax
Else
intCharactersMin = intMinWordLength
End If
End If
intNumberOfSuppliedCharacters = Len(strCharacters)
For i = 1 To intNumberOfSuppliedCharacters
strCharacter(i) = Mid(strCharacters, i, 1)
Next i
intCharacters = intCharactersMin - 1
intWordCount = 0
Do While intCharacters < intCharactersMax
intCharacters = intCharacters + 1
intAdjustmentPosition = 1
For i = 1 To intCharacters
intCharacterIndex(i) = i
Next i
Do While intAdjustmentPosition > 0
intFlag = 0
For i = 1 To intAdjustmentPosition - 1
If intCharacterIndex(i) = intCharacterIndex(intAdjustmentPosition) Then
' Found a duplicate index position in the other values to the left
intFlag = 1
Exit For
End If
Next i
If intFlag = 1 Then
' Try the next index position in this element
intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
Else
If intAdjustmentPosition = intCharacters Then
' Output
strOutput = ""
For i = 1 To intCharacters
strOutput = strOutput & strCharacter(intCharacterIndex(i))
Next i
intFlag = 0
For i = intWordCount To 1 Step -1
If strOutput = strWords(i) Then
intFlag = 1
Exit For
End If
Next i
If intFlag = 0 Then
If Application.CheckSpelling(Word:=UCase(strOutput)) <> 0 Then
intWordCount = intWordCount + 1
strWords(intWordCount) = strOutput
Print #intFilenum, strOutput
Debug.Print strOutput
End If
End If
If intCharacterIndex(intAdjustmentPosition) = intNumberOfSuppliedCharacters Then
' No more available values in the last position
intCharacterIndex(intAdjustmentPosition) = 1
intAdjustmentPosition = intAdjustmentPosition - 1
If intAdjustmentPosition > 0 Then
intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
End If
Else
intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
End If
Else
' No duplicate so prepare to check the next position
intAdjustmentPosition = intAdjustmentPosition + 1
End If
End If
Do While (intAdjustmentPosition > 0) And (intCharacterIndex(intAdjustmentPosition) > intNumberOfSuppliedCharacters)
' Roll back one index position as many times as necessary
intCharacterIndex(intAdjustmentPosition) = 1
intAdjustmentPosition = intAdjustmentPosition - 1
If intAdjustmentPosition > 0 Then
intCharacterIndex(intAdjustmentPosition) = intCharacterIndex(intAdjustmentPosition) + 1
End If
Loop ' (intAdjustmentPosition > 0) And
Loop 'intAdjustmentPosition > 0
Loop 'intCharacters < intCharactersMax
Close #intFilenum
End Sub

The Excel macro builds letter combinations that are between the minimum and maximum length, and then tests those letter combinations using the built-in dictionary that is in Excel. I had a little bit of difficulty coming up with a way to generate the letter combinations of variable length, so I settled on a custom developed technique – I would simply keep track of the original character positions, manipulate those original character positions, and then output the corresponding characters. The challenge is then how does one verify that the same character position is not used more than once in a single word?

The method that I came up with is as follows, which assumes that we are trying to build four letter words from the supplied 16 letters. We can start with the seed combination 1,2,3,4. The idea is to work from left to right, and then back to the left. Every time to make it to the right, we output a word, when we make it all the way back to the left (just before the number 1 in the above), we are done. The rules are simple:

Increment the number in a position, and if that number does not appear in a position to the left, move one position to the right.

When the maximum character number (16 in this example) is exceeded in a position, reset the number to 1, move one position to the left, and increment the value in the new position by 1.

In the last position the character number should be incremented as many times as necessary to reach the maximum character number – each time a potential new combination will be generated.

But there is a problem with this approach – it does not use Oracle Database!

–

Let’s go back to the PL/SQL function from which I created the Excel function (I have not worked much with pipelined functions – so there may be one or two errors):

Remember that there are more than a half million character combinations for just the 3, 4, and 5 letter combinations – the above will as for 6,336,960 letter combinations to be generated. But there is a problem with this approach – it does not verify that the letter combinations are actual words!

For fun, let’s see how many possible combinations will result if we allow 3, 4, 5, 6, 7, and 8 letter combinations:

Len

Combinations

8

16

15

14

13

12

11

10

9

518,918,400

= 16! / 8!

7

16

15

14

13

12

11

10

57,657,600

= 16! / 9!

6

16

15

14

13

12

11

5,765,760

= 16! / 10!

5

16

15

14

13

12

524,160

= 16! / 11!

4

16

15

14

13

43,680

= 16! / 12!

3

16

15

14

3,360

= 16! / 13!

582,912,960

582,912,960

That is more than a half billion combinations! Warning, significant database server CPU consumption will result when generating all combinations.

Let’s take a look at the final solution that I created for Part 4 Extra, Extra Credit. The solution is an Excel macro that calls the PL/SQL function through a SQL statement:

The words found appear to depend on the version of Excel – Excel 2010 seems to find more words than Excel 2007.

The 799 word list from Excel 2007 for word lengths between 3 and 8 characters, including the timing information to show when the SQL statement was submitted, when the first 5,000 combinations were retrieved from the database, and when the Excel spell check finished. Words Oracle_ESOIMEFOALEUSAYE.txt

The 2,179 word list from Excel 2007 for word lengths between 3 and 8 characters, including the timing information to show when the SQL statement was submitted, when the first 5,000 combinations were retrieved from the database, and when the Excel spell check finished. Words Oracle_OSERIEFAARLNCAYL.txt

Excel found Ellison in the second word list. For Part 5 Extra, Extra Credit, what other words connected to Oracle Corporation were found?

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:
<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpretted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: