Grouping Sets

2008/11/28

I have given a talk about new data types and T-SQL features to three local groups. I am presenting it again at the St. Louis Day of .Net on December 13th (http://stlouisdayofdotnet.com/). Since Itzik Ben-Gan was giving a similar talk at PASS last week, I decided to attend his talk to get some additional insight on the topic.

Itzik’s favorite new T-SQL feature is the grouping sets. Before his talk, I didn’t really see any benefit over using UNION queries (examples below) which will give the identical results. Itzik said that the new grouping sets performed better than UNION queries. When I had experimented with the new feature, I had found identical performance. Rather than embarrass myself by questioning the T-SQL guru, I made a note to revisit this later.

Today I decided to play with the new feature to see for myself. What I found was that if no WHERE clause was used, the performance was the same between the two techniques. The execution plans were almost identical with both queries performing two clustered index scans. Adding a WHERE clause with a filter on the clustered indexed column changed things considerably, and now the grouping set query performed better. It had one clustered index seek while the UNION query had two. I found that adding any WHERE clause, even one that resulted in a scan, improved the performance of the grouping set query.

My examples had only had two grouping levels, and I suspected that adding more grouping levels would show an even more dramatic difference in performance. My hunch was correct. The performance was always better with the grouping set query as long as a WHERE clause was included. The UNION queries had to access the table once for each level while the grouping set query would access the table just once.

Here are the queries I experimented with using the SQL Server 2008 AdventureWorks sample database.

TEST ONE, identical performance, each has a clustered index scan:

select salesorderid,SUM(unitPrice),ProductID
from Sales.SalesOrderDetail
group by grouping sets(SalesOrderID,ProductID)
go
select null as salesorderid,SUM(unitprice),productid
from Sales.SalesOrderDetail
group by ProductID
union
select salesorderid,SUM(unitprice),null
from Sales.SalesOrderDetail
group by SalesOrderID
-----------------------------------------------------------------------

TEST Two: Add a WHERE clause. Now the grouping set query has one clustered index seek while the UNION query has two:

select salesorderid,SUM(unitPrice),ProductIDfrom Sales.SalesOrderDetail where SalesOrderID between 40000 and 50000group by grouping sets(SalesOrderID,ProductID)goselect null as salesorderid,SUM(unitprice),productidfrom Sales.SalesOrderDetailwhere SalesOrderID between 40000 and 50000group by ProductID unionselect salesorderid,SUM(unitprice),nullfrom Sales.SalesOrderDetailwhere SalesOrderID between 40000 and 50000group by SalesOrderID -------------------------------------------------------------------------------

TEST Three: Better performance even when filtering on a non-indexed column. Still one clustered index scan instead of two

select salesorderid,SUM(unitPrice),ProductIDfrom Sales.SalesOrderDetail where UnitPrice between 10 and 20group by grouping sets(SalesOrderID,ProductID)goselect null as salesorderid,SUM(unitprice),productidfrom Sales.SalesOrderDetailwhere UnitPrice between 10 and 20group by ProductID unionselect salesorderid,SUM(unitprice),nullfrom Sales.SalesOrderDetailwhere UnitPrice between 10 and 20group by SalesOrderID-----------------------------------------------------------------------------

select salesorderid,SUM(unitPrice),ProductID,SpecialOfferIDfrom Sales.SalesOrderDetail where SalesOrderID between 40000 and 50000group by grouping sets(SalesOrderID,ProductID,SpecialOfferID)goselect null as salesorderid,SUM(unitprice),productid, null as SpecialOfferIDfrom Sales.SalesOrderDetailwhere SalesOrderID between 40000 and 50000group by ProductID unionselect salesorderid,SUM(unitprice),null,null from Sales.SalesOrderDetailwhere SalesOrderID between 40000 and 50000group by SalesOrderID Unionselect null,SUM(unitprice),null,SpecialOfferIDfrom Sales.SalesOrderDetail where SalesOrderID between 40000 and 50000 group by SpecialOfferID