a light that never goes out


Monthly Archives: February 2009

The Tour of California Stage 2 started in Sausalito, went over the Golden Gate Bridge, then started for real and went to Santa Cruz. Since the start was a few blocks away, I got up and wandered down the street to check it out. Fairly exciting, and a much larger crowd than I would expect for such an early hour on a holiday in crappy rainy weather. I saw Lance Armstrong, which was cool; hopefully his comeback works out well for him. All the riders are in such stupendous shape it's pretty crazy to even think about. Though this time, with the weather, I was happy to go back home.

I love that Postgres supports partitions, even if they aren't the cleanest things to set up and maintain. However, I have run into some issues with the query planner and partitioned tables. The gist of it is that when tables are partitioned, the query planner can get quite confused and take an endless amount of time to run what should be a 'simple' query. That isn't a problem in and of itself; query planners notoriously pick less-than-great query plans. The problem is that in most cases you can explicitly tell the planner to do things 'the right way' (well, your way, anyway), and with partitioned tables you really can't.
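To make the setup concrete, here is a minimal sketch of the era's inheritance-based partitioning; the table and column names are hypothetical stand-ins for mine. The planner's constraint exclusion can skip child tables, but only when the WHERE clause constrains the partition key itself:

```sql
-- Hypothetical key/value table, partitioned by group_id via
-- inheritance plus CHECK constraints (the only scheme available
-- in 8.x-era Postgres).
CREATE TABLE kv (
    group_id bigint NOT NULL,
    key      text   NOT NULL,
    value    text
);

CREATE TABLE kv_p0 (CHECK (group_id >= 0       AND group_id < 1000000)) INHERITS (kv);
CREATE TABLE kv_p1 (CHECK (group_id >= 1000000 AND group_id < 2000000)) INHERITS (kv);

SET constraint_exclusion = on;

-- Pruned: the WHERE clause matches the CHECK constraints,
-- so the planner can rule out kv_p1 entirely.
EXPLAIN SELECT * FROM kv WHERE group_id = 42;

-- Not pruned: filtering on key tells the planner nothing about
-- group_id, so every child table gets scanned.
EXPLAIN SELECT * FROM kv WHERE key = 'color';
```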
I have a key/value table organized by group, and I have it partitioned by group_id in order to limit the total number of rows in any particular table (see Postgres Partitioning and Hibernate for one way to do that). My goal is to denormalize this table to use as the basis for a set of data warehouse tables. In order to denormalize the key/value table I have to do various subselects against the keys I'm interested in for the data warehouse. However, the table is partitioned by group_id (the group the keys belong to, maybe 10 keys per group on average) and the denormalized table will use group_id, so searching by key results in large table scans across all the child tables for each subselect. Ideally there would be a single scan gathering up the data one cares about (a la GROUPING SETS), or in my case, a UNION of the results of my query run against each child table would work too.
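The denormalization might look something like the following sketch (key names and structure are hypothetical, and it assumes one value per key per group). The trouble is that each subselect filters on key, and the correlated group_id is not a plan-time constant, so constraint exclusion can't prune anything; every subselect fans out over every partition:

```sql
-- One subselect per key of interest, pivoting the key/value rows
-- into one wide row per group for the data warehouse table.
SELECT g.group_id,
       (SELECT value FROM kv
         WHERE group_id = g.group_id AND key = 'color')  AS color,
       (SELECT value FROM kv
         WHERE group_id = g.group_id AND key = 'size')   AS size,
       (SELECT value FROM kv
         WHERE group_id = g.group_id AND key = 'weight') AS weight
  FROM (SELECT DISTINCT group_id FROM kv) g;
```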

As you can see, the partitioned table's cost estimate is significantly worse. As you add more rows and more partitions, the difference goes off the charts. In my real-world use case I have 50 tables with a million rows each and 9 keys I'm pulling out, and the cost estimate's lower bound was a 50-digit number. Talking to the folks on the #postgres IRC channel, there doesn't seem to be any reasonable way to deal with this, so what I ended up doing was running my query against each child table individually from my Java code, which was very fast. There is talk of GROUPING SETS coming in 8.4 or 8.5, which will be nice.
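In SQL terms, the workaround amounts to issuing one small query per child table and stitching the results back together in the application (the child names below are hypothetical; in my case the loop lived in Java). Each statement names a single concrete table, so the planner never has to reason about the partition set at all:

```sql
-- One statement per child table; results are combined in
-- application code rather than by the planner.
SELECT group_id, key, value FROM kv_p0 WHERE key IN ('color', 'size', 'weight');
SELECT group_id, key, value FROM kv_p1 WHERE key IN ('color', 'size', 'weight');
-- ...and so on for each remaining child.
```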