Pages

Monday, 10 August 2009

Subquery Factoring Clause - WITH

Through general experience I've found the WITH statement either underutilised or people haven't been aware of it's existence, although that seems to be changing - it has been around since 9i.

For me, two typical examples come to mind.1) I was demonstrating recently an example of using SUBSTR & INSTR together to extract certain parts of a string. I built up the expressions step by step, then to test the example with a slightly different string, I would normally be forced to do a search and replace. Depending on the example, this can be annoying.

Other times recently I have used the WITH statement to essentially encapsulate a unit of work. I had extracted a SQL statement from some PL/SQL and wanted to run it with my own restrictions, but leave the original untouched - this allowed me to test without worrying if I've disrupted the original join behaviour.

WITH original as ({copy of embedded sql})select *from originalwhere {my own restrictions}

These are ad hoc reasons, there is also a potential performance benefit from using the WITH clause.

I'm sure I've seen an example where using the WITH clause provided a performance benefit, but memory is fallible so I thought I'd see what the documentation had to say.It says it will give the optimiser more choice about what to do, and that we can improve the query by using the WITH syntax.After comparing explain plans from the example straight out of the documentation, I wasn't convinced. So I ran timings over different versions of the database - 1000 iterations each. I was surprised at the results (sounds like tabloid journalism, doesn't it?):

Not only has it not performed better, its comparative performance has deteriorated as versions go up. So while we have removed duplicate code and enhanced the readability of our code, this example shows it's not necessarily more efficient.

It may just be a poor example to illustrate the performance benefit of this feature. It shows while a documented effect may be true in some circumstances, you must always test your particular scenario on your framework to get accurate comparisons. For further comment on this paradigm, I highly recommend reading this. Twice.

1. I used this in an ETL process to great benefit. The SELECT in the WITH clause was against a small table (maybe 30 or 40 rows) and the bulk was against a table with 40 to 50 million rows.

2. Sometimes I'll add the WITH statement just to show off (not really...kind of) to the non SQL types. I did it recently and performance was not improved what-so-ever. In fact, I saw numbers like you mentioned. In that particular instance there was a bigger gain in another part of the code (removed the cursor loop DELETE).