Data Science, Machine Learning, & diverse IT stuff

Month: April 2015

Some time ago on oracle-l, someone asked for opinions regarding this book: Oracle SQL Tricks and Workarounds, by Zahar Hilkevich. As far as I know, no one knew the book (or felt like answering), and as I tend to think I know about the “books to read” in Oracle domains I’m interested in, I got curious. Plus, as a database administrator, I always have this bad conscience about not being fluent enough in SQL (What way round did it work again, connect by? How was the syntax for first_value again? … And I’m still grateful to another book – Advanced Oracle SQL Programming by Laurent Schneider – where I picked up such useful stuff as sys.odcivarchar2list…).
So I thought, I might as well get the book and write a review.

Unfortunately for me, as I do not like to criticize other people, this review will not be positive, and I was quite unsure should I really do it. Now that you read this, it’s clear how I decided, and that was for two reasons:
First, to offer people my opinion in case they’d like some buying advice, and second, more importantly, because what I think is not good about the book can be best illustrated by explaining how other approaches are better, and why, and how they do it, – which is probably the most interesting point here.

Now let me start with something positive: I do like the (intended) focus on creativity. (The question is, does the author’s method indeed result in more creative work).
Jumping right in, then, the first chapter is called “Terms and Definitions”, and here already things get strange – to put it politely. One terminological thing that bothers me may be a subjective reaction, possibly even related to my not being a native English or American speaker. The author explains how he had to choose a name for “the set of alternative solutions to a specific problem” and decided

“We call the family of distinct solutions for a given problem the workarounds for that problem.”

To me, workaround implies the existence of an obstacle, something I’ve had to work around… but well, this does not really matter, it is just a terminology thing.

Now the author has to come up with a criterion of what makes for a distinct solution, and here the decision does matter: Astonishingly, he chooses the execution plan (the plan hash value, to be precise):

“When we come up with a candidate solution, we examine its execution plan against the other solutions found so far. If it is different, we consider what we have found a new, distinct workaround; otherwise, we have only found a workaround equivalent to one of the other workarounds, which we also refer to as a synonymous workaround.”

Synonymous workaround (or solution)? In other words, things mean the same because they are executed the same way? This is a severe category error and must be incredibly confusing (and frustrating!) to SQL developers who lack any optimizer-related background knowledge. Say I rewrite my query completely, and still I get the same execution plan, – really? And then I execute it on another system, where a different Oracle version is running, and the plan is totally different? What’s the difference to trial and error? Nothing, as far as this book is concerned.

Let’s say I obtain a sizeable set of phrasings for my query, how do I decide which one is the best? In the predictive, explain plan style execution plans displayed in the book, all I may consider is the cost – may I rely on that? But choosing among different plans is not addressed by the author (guess I just execute and time the queries in my development environment ;-)). All that is addressed is how to enlarge my set of equivalent formulations.

Now, had the author just been content with showing different ways of writing SQL queries, not mixing in execution plans and pseudo-theoretic jargon, all would have been better or at least less confusing. (I also prefer – and am used to, from the “Oracle literature” in general – a much more modest style of writing, but I will not focus on that here). But given the author does show the explain plan for (just about) every rephrasing, at least he should have made some global comments about how these plans differ in general. But in the book, plans where a scan of some table is executed for every row of another table are not treated any different from plans where every table is scanned once – no ranking of plans, no comment, no nothing.

So we’re back to where I started, the “approach thing”. In a nutshell what this book does wrong is, in my opinion, to promote some sort of mechanical creativity while not providing insight. How do I get a query to run faster? I better start off with a lot of knowledge: How Oracle works, how the query optimizer works, what Oracle does with the query I’m passing in, how queries get transformed and why, how access paths are chosen and why… This is one kind of story (next to, e.g., the reverse engineering adventures) I always love reading on Jonathan Lewis’ blog: He knows what kind of execution plan, what access paths etc. he wants and then he rewrites and/or hints the query until he is there :-).

So optimizer and performance related knowledge is key, and for that, I totally recommend studying another book: Chris Antognini’s Troubleshooting Oracle Performance. Most readers will surely know the book, so I won’t add much more here, apart from it definitely being one of my “Oracle books to take on a lone island” if I were to choose 😉

Secondly, as we’re not talking about Oracle Performance in general, but SQL performance specifically, it certainly makes sense to work on specific techniques and methods (like analytic functions, set-based thinking, model clause etc) – and in general, to keep up with what’s going on, sql-wise, as new Oracle versions get out. In this domain, one book I liked and found instructive was Pro Oracle SQL.

Thirdly, a methodological approach may work, if it is based on criteria that really play a role in execution plans. Although I have no personal experience with it, it may be interesting to look into SQL Tuning, by Dan Tow, who bases his methods on relevant (single-table and join) selectivities.

Of course, this book list is not exhaustive, – the main point is, we have some wonderful, insight-providing books around in the Oracle Performance domain, there’s no need for trial-and-error if we want to be creative :-).