Partitioning enhancements in SQL Server 2008

Horizontal partitioning can be a great benefit to data warehouses, as it greatly facilitates the management of very large databases. Of course, querying on partitions can also have performance benefits, especially when only a small percentage of the database is queried and partition elimination can occur. SQL Server 2005, however, can have some unexpected behavior when querying few partitions.

In SQL Server 2005, one thread is allocated per partition when multiple partitions are queried. These threads then work in parallel to retrieve the rows to satisfy the query. If only one partition is queried, however, SQL Server 2005 allocates as many threads as are available to the one partition. As you can quickly see, there is a vulnerability when it comes to queries that query very few, but more than 1 partition.

The common scenario is when only 2 partitions are queried. For example, let’s say you have a data warehouse database that is partitioned on a sales date by month. You write a query to compare last month’s sales to this month’s sales. Now, let’s say this runs on a 32-processor server. How many threads get allocated to this query? The answer is 2, one per queried partition.

This last scenario can become even more complicated when a window of time causes the query to fluctuate between 1 and multiple partitions. Let’s say we compare today’s sales to the same day of the week last week. If both weeks are in the same month, we get full thread allocation. If the weeks span two months (= 2 partitions), however, we only get 2 threads allocated and the query can appear to slow down considerably, for no apparent reason (at least, probably not to the issuer of the query).

Note: All of these scenarios assume that the queries are able to eliminate partitions due to a direct search on the partitioning key. Be sure partition elimination is occurring in order to gain the best partitioned table query performance.

SQL Server 2008 behavior changes the way in which threads are allocated. It allocates all available threads to each partition relevant to the query in a round robin fashion. This results in fast and consistent query execution, no matter how many partitions are queried.

Note that this makes it all the more important to allocate partitions to filegroups that are spread across many disk spindles, allowing the query on a given partition to be as efficient as possible.

But wait, there’s another improvement to partition query behavior in SQL Server 2008. SQL Server 2005 only allows lock escalation to the table level. Let’s say your query eliminates 75% of a table’s partitions, but scans all of the remaining partitions (1/4 of the table). If SQL Server decides to escalate the lock, all other queries will be locked out even if they are querying completely different partitions. SQL Server 2008 provides a table option to override this default behavior. Note that the default is still to escalate to table locks (at least for now) so this option will need to be changed to take advantage of partition-level lock escalation. Find out more about this, like I did, on über blogger Paul Randall’s post.

Another step in the right direction for partitioning. SQL 2000 partitioning performance was horrible. SQL 2005 still not as good (because of the multi-core procs as mentioned in the article). Looks like these 2008 features might finally give DW architects some more options.

It would follow that with 2005, you would tend to make smaller partitions to minimize the size of a partition a single thread would have to process. In 2008, larger partitions are fine, as there will be more threads working? Any guidelines for deterimining partition size?