And we want to write a query that will return the ID and the highest NON-NULL value in each row (return NULL only if all column values, COL1 through COL5, are NULL, as for ID = 4).

Factored subquery to replicate the table is provided below, and statements to create a much bigger table, for testing, are provided further down in the message.

One may ask, why is the data in that structure to begin with. (It would be better, perhaps, if the table had just three columns: ID, COL_NAME, and VALUE.) For the sake of this exercise, let's accept that the base table just is in this structure.

If no NULLs were present, we could simply use GREATEST(COL1, ... , COL5). Alas, GREATEST and LEAST return NULL if at least one argument is NULL, and that's not what our problem requires.

MIN and MAX work the way we want (they ignore NULLs) but they only work on columns, not on rows. It is easy to write a query that unpivots the base table, then groups by ID and takes the MAX. But this has two disadvantages. One is that, in the original arrangement, each set of five values was in effect already "grouped" by ID (by the fact that the values were all in one row). This is something that we can use to our advantage - for example if a function like GREATEST IGNORE NULLS (COL1, ... , COL5) existed. Unfortunately it doesn't. And the second disadvantage, demonstrated recently in another thread by Solomon Yakobson, is that UNPIVOT, unfortunately, does a UNION ALL behind the scenes - in this case, the table is read five times. (So UNPIVOT will do the same job as "manual unpivoting", which is just a UNION ALL of SELECT ID, COL1 FROM TEST_DATA UNION ALL SELECT ID, COL2 FROM... etc.)

So - in order to avoid un-grouping only to group again just so we can use MAX, we may instead work a little harder on the original idea - use GREATEST, but find a way to handle NULLs. Here is one way:

select id,

greatest ( coalesce(col1, col2, col3, col4, col5),

coalesce(col2, col3, col4, col5, col1),

coalesce(col3, col4, col5, col1, col2),

coalesce(col4, col5, col1, col2, col3),

coalesce(col5, col1, col2, col3, col4)

) as high_val

from test_data

;

There are other ways to write this, but they are all ugly (and become uglier with the number of columns involved). However, testing - described later - shows that it's efficient.

There are various ways to write solutions with correlated subqueries, but they don't seem efficient. In Oracle 12 and above, we can use the LATERAL clause (see below). I was hoping this may be more efficient, but I couldn't figure out how to use it efficiently (or perhaps there is no "good" way to use it). Here is what I came up with.

select t.id, s.high_val

from test_data t,

lateral

( select max(y.val) as high_val

from (

select t.id as id

from dual

) x,

lateral (

select id, val

from test_data

unpivot ( val for source in (col1, col2, col3, col4, col5) )

where id = x.id

) y

) s

order by id

;

(There is also a more concise way to write this, with only one LATERAL clause, but that version - in effect - correlates two levels deep; as Solomon Yakobson pointed out in another thread today, that only works in some versions of Oracle, and only because Oracle didn't get around to prevent it yet. And performance testing shows there is no improvement; the query is shorter but the execution time is about the same.)

My question in this thread is: Are there better ways than the ones already discussed, to take advantage of the way the values are already presented in rows?

Statements to create a bigger TEST_DATA table: (NOTE - I didn't bother to add NULLs since that is unlikely to affect performance much, as long as I don't change the solutions.)

On this data, the solution with GREATEST, rigged to handle NULLs, runs in a little over 3 seconds on my laptop. The UNPIVOT and MAX solution takes about 8 seconds, and the LATERAL solution about 7 seconds (with some variation from one test to the next; I would say there isn't much of a difference between UNPIVOT/MAX and LATERAL).

what do you think about case statement approach rather than greatest function ...

case
when col1>=col2 and col1>=col3 and col1>=col4 and col1>=col5 then col1
when col2>=col3 and col2>=col4 and col2>=col5 then col2
when col3>=col4 and col3>=col5 then col3
when col4>=col5 then col4
else col5
end

Let's say just for the sake of the argument that -1000 is the least number (read "negative number of greatest absolute value") possible in the NUMBER data type. (We all know it's not -1000, but let's just say it is). Then I could do NVL(COLn, -1000) and take the GREATEST over the results. Now if the result is -1000, all I have learnt is that all values are EITHER NULL OR -1000, and I don't know more than that. The correct answer is either NULL or -1000 and I can't tell from what I've done so far (although I can, indeed, check if at least one value is non-NULL).

My question, though, wasn't so much "how do we solve this problem using GREATEST and NVL", but rather "what other techniques/features are available for this type of situation", which may prove helpful in broader classes of similar problems.

OK - suppose GREATEST didn't exist, we were writing our own version, and we didn't worry about NULLs. One way to write the function would be with a CASE like you showed.

I don't think that would be the most efficient way; too many comparisons, on average. Some sort of recursive definition, like GREATEST(COL1, ... , COL5) = GREATEST(GREATEST(COL1, ... , COL4), COL5) requires only 4 comparisons in all cases.

I just tested the CASE solution on the big data (1 million rows) and it completed in about 4 seconds. A bit longer than the GREATEST solution, even with all the COALESCE calls.

And the second disadvantage, demonstrated recently in another thread by Solomon Yakobson, is that UNPIVOT, unfortunately, does a UNION ALL behind the scenes - in this case, the table is read five times. (So UNPIVOT will do the same job as "manual unpivoting", which is just a UNION ALL of SELECT ID, COL1 FROM TEST_DATA UNION ALL SELECT ID, COL2 FROM... etc.)

...

In Oracle 12 and above, we can use the LATERAL clause (see below). I was hoping this may be more efficient, but I couldn't figure out how to use it efficiently (or perhaps there is no "good" way to use it).

...

I recall Solomon saying the UNPIVOT isn't always efficient. He usually suggests using a cross-join instead.

Try using LATERAL or CROSS APPLY without UNPIVOT. Sorry, I'm not at an Oracle 12 database now, so I can't test it, but It goes something like this:

Let's say just for the sake of the argument that -1000 is the least number (read "negative number of greatest absolute value") possible in the NUMBER data type. (We all know it's not -1000, but let's just say it is). Then I could do NVL(COLn, -1000) and take the GREATEST over the results. Now if the result is -1000, all I have learnt is that all values are EITHER NULL OR -1000, and I don't know more than that. The correct answer is either NULL or -1000 and I can't tell from what I've done so far (although I can, indeed, check if at least one value is non-NULL).

...

The point of replies #1 and #7 is, when you know a lower bound to the possible values, use GREATEST and NVL with some value lower than (not equal to) that lower bound. So, if you know that the columns might contain values as low as -1000 (but no lower), then use something lower than -1000 (like -1001 or, as Manik did, -1E66) as the 2nd argument to NVL, like this:

where "The most efficient way to unpivot is to do a cross join between the [relevant] table and a correlated VALUES of as many rows as columns that need to be unpivoted"

Suggesting you might want to start with something similar to

SELECT T.id, U.clmn, U.result

FROM test_data AS T,

LATERAL (VALUES (1, T.col1),

(2, T.col2),

(3, T.col3),

(4, T.col4),

(5, T.col5)) AS U (clmn, result);

Now of course, the same set of tests can be carried out in Oracle. You mentioned you were after a more efficient method, and this appears to be a more efficient method of unpivoting than your UNION ALL.

This thread includes some dissing of UNPIVOT. mathguy says "[as] demonstrated recently in another thread by Solomon Yakobson ... UNPIVOT, unfortunately, does a UNION ALL behind the scenes - in this case, the table is read five times."

select id, high_val
from test_data t, lateral (
select max(
case level when 1 then col1
when 2 then col2
when 3 then col3
when 4 then col4
when 5 then col5
end
) high_val
from dual connect by level <= 5
);

But it doesn't perform any better than UNPIVOT.

[UPDATE: sorry, just realized Frank already spoke of this in reply 8.]

On my laptop running the Oracle Developer Days VM for 12.2, the GREATEST(COALESCE(... solution runs in less than half a second.