On 17 April 2012 13:19, Greg Stark <stark(at)mit(dot)edu> wrote:
> All in all I think it's handier to have a stable ORDER BY sort than an
> unstable one though. So I'm not necessarily opposed to it even if it
> means we're stuck using a stable sort indefinitely.
I think it might be useful to disguard the stability property if it's
not a real requirement, as I think doing so gives us additional leeway
to compose descending runs (pre-sorted subsets) that are not in what
is termed "strictly descending" order (that is, they can be a[0] >=
a[1] >= a[2] >= ... and not just a[0] > a[1] > a[2] > ...).
I'll share some preliminary findings on timsort performance (and,
indeed, quicksort performance). I decided on two intial tests - one
that tested performance of sorting against master for a reasonable
case, and the other for a rather sympathetic case.
The first test was for a pgbench run of a query against the dellstore
database, "select * from products order by actor offset 10001",
pgbench -T 300. Here, actors is a text column. I didn't look at scalar
types, because presumably quicksort is going to do much better there.
After running analyze on the table, the pg_stats.correlation is
-0.00493428. There is at best a very weak physical to logical
correlation for the column, as far as the random sampling of analyze
can tell, though there may well still be many individual "runs" of
ordered subsets - I should be able to instrument timsort to determine
how pre-sorted data already is in a well-principled fashion, but not
today.
master:
tps = 43.985745 (including connections establishing)
tps = 43.986028 (excluding connections establishing)
timsort:
tps = 39.181766 (including connections establishing)
tps = 39.182130 (excluding connections establishing)
Here, quicksort benefits somewhat from my earlier work (though there
is no SortSupport entry in any of the tests performed), as a full
tuplesort specialisation is used. timsort_arg is simply a drop-in
replacement for qsort_arg. It's fairly close, but quicksort does win
out here, which is perhaps not hugely surprising. Timsort does of
course have to maintain state like pending runs in a way that
quicksort does not, and quicksort is expected to take advantage of
memory locality to a greater degree. However, if we measure the
expense of the sorts in pure terms of number of comparisons, an
altogether different picture emerges:
timsort: 119,943
quicksort: 136,071
I'd like to see what happens when timsort_arg is further specialised
(not that I envisage multiple specialisations of it or anything - just
a single tuplesort one).
I contrived a very sympathetic test-case too. Namely, I created a new
numeric column for the dellstore database's orderlines table, with a
default value of "nextval('some_seq'), resulting in a
pg_stat.correlation of 1. Then, I changed the actual value of the very
last tuple in the table, with the intention of tripping up our
quicksort "check for pre-sorted array in a single bubblesort iteration
first" optimisation.
Now, I'll grant you that that was a very carefully placed banana skin,
but the results were even still quite surprising and interesting. With
the query "select * from orderlines order by test offset 60351", a
rather large gap in the performance of quicksort and timsort emerges:
Master:
=====
(first run)
number of transactions actually processed: 1891
tps = 6.301991 (including connections establishing)
tps = 6.302041 (excluding connections establishing)
(second run)
number of transactions actually processed: 1878
tps = 6.259945 (including connections establishing)
tps = 6.260839 (excluding connections establishing)
timsort:
=====
(first run)
number of transactions actually processed: 19997
tps = 66.655289 (including connections establishing)
tps = 66.655820 (excluding connections establishing)
(second run)
number of transactions actually processed: 19880
tps = 66.266234 (including connections establishing)
tps = 66.266810 (excluding connections establishing)
I can reproduce this result consistently - these two sets of figures
were produced hours apart.
Number of comparisons for single execution of this more sympathetic query:
Timsort: 60,351
Quicksort: 2,247,314 (yes, really)
The difference here is very big, and cannot be adequately explained by
the mere wastage of a single "bubble sort iteration to check if it's
presorted". I have heard a few stories about weird quicksort
edgecases, and you probably have too, but I don't know for sure which
one this might be right now. My theory is that we're paying a high
price for "going quadratic in the face of many duplicates" protection,
by following Sedgewick's advice and doing a partition swap in response
to the pivot and element being equal (which they pretty much always
are here). This normally isn't so obvious, because of the "check for
pre-sorted input" thing usually takes care of this instead.
My next step is to actually try out a benchmark with a more realistic
dataset, that still reasonably showcases timsort's strengths. As Tim
Peters himself put it in a document that describes the algorithm, "in
a database supplied by Kevin Altis, a sort on its highly skewed "on
which stock exchange does this company's stock trade?" field ran over
twice as fast under timsort", so I'll try and find something along
those lines that is relatively easy to recreate, and relatively
realistic. Certainly, highly skewed data is not at all uncommon.
Just for the record, I don't have any strong opinions on:
1. What we should be doing with timsort, if anything. It is one
thing to demonstrate that it's a useful algorithm under certain
artificial conditions, but quite another to argue for its inclusion in
Postgres, or for it being selectively used at points where that is
likely to be a win, based on some criteria or another like known
cardinality, physical/logical correlation or assumed costs of
comparisons for each type. At the very least, it is an interesting
algorithm, but without integration that makes it actually serve user
needs, that's all it will ever be to us. Deciding if and when it
should be used is a rather nuanced process, and I'm certainly not
about to declare that we should get rid of quicksort. It does appear
to be a fairly good fit to some of our requirements though.
2. Whether we should attempt to patch-up quicksort to prevent this
problem. I lean towards no, since the cure may well be worse than the
disease.
Thoughts?
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services