The link above is for the poster, and for those of you who know what I am doing at work, it will be quite obvious, why I liked it so much!

The authors address the topic of the unstable execution plans. Index access works well with highly selective conditions. The sequential scan works well with low selective conditions. What is the optimizer would judge incorrectly which access path to choose? What is statistics are out of date?

The authors worked with Postgres (which was also very appealing to me) and what they did – they developed a new access method, which is a mix between the index scan and the table scan. This method does not take into account any table statistics at all, it just analyzes the execution “so far”, and if it turns out, for example, that the index scan appears to be slow (i.e., “too many records” are being retrieves, and each of them requires two disk accesses), the smooth scan will switch to one of the reading strategies, which are “morphing” to the full table scan. The least aggressive will just analyze all the records from the page, which was just read into the main memory, the more aggressive will read one adjacent page from the heap, and then increase the number of pages it would read. There are several different ways to drive the policy switch, and the paper describes different policies, and also provides statistics on the real data (and real queries:)) how the smooth scan approach helps to stabilize performance.

One of the questions, which is not covered in the paper, and which several people, including me, asked, was the following: with the different stats it may happen that not the access path for the particular table has to be changed, but the order of join has to be changed. That is, if this specific selection criteria from this specific table becomes less selective, may be we should not change the access to the semi-table scan but rather choose another table to start from, which will have more selective criteria.

I hope that the authors will continue to work in this direction, because this whole research looks very promising, and I am very happy that it received the best paper award!

2 responses to “One of the ICDE presentations I really liked”

Wow, that sound great. I can’t really grasp the specific instances where this would help, but I just need to consider it more.

What I’m more curious about is how this gets into postgres. Is this a pull request and then it goes out in some future release? Or can it be a standalone optimizer strategy that can be plugged into an existing postgres instance?

Right now it’s not a postgres contribution. It’s just a research project, so it’s only available for them. I should have asked, whether they are planning to make it a contrib, but I did’t. In any case it will take a while, so we won’t be able to use this technology in an observable future.