/*********************************************************************************************************
Before starting, let’s take a look at the data that exists in the table
**********************************************************************************************************/

SELECT * FROM TEST_TBL
GO

COL1 COL2 COL3

----------- ---------- -----------

1 A 10

1 A 20

1 A 30

1 B 90

2 A 30

2 A 100

3 C 110

3 C 120

/*********************************************************************************************************
Simple Example of a GROUP BY Clause. The grouping is being done over COL1 and COL2
and the aggregate functions are used to display the total, the average, the max and min values, etc.
**********************************************************************************************************/

/*********************************************************************************************************
Now, let us take the same SQL and use the RollUP() function in addition to the GROUP BY clause:
**********************************************************************************************************/
SELECT COL1, COL2, SUM(COL3) AS TOTAL_VAL
FROM TEST_TBL
GROUP BY ROLLUP (COL1, COL2)
GO

Usage of the ROLLUP() function generates the GROUP BY aggregate rows PLUS super-aggregate (cumulative) rows and a final grand total row as well. If you see below, one row with a sub-total is generated for each unique combination of values of (COL1, COL2), and (COL1). As should be clear from the preceding statement, the order of the columns in the ROLLUP() function can change the output as well as the number of rows in the final result set.

COL1 COL2 TOTAL_VAL

----------- ---------- -----------

1 A 60 ==> Same as in the representation above

1 B 90 ==> Same as in the representation above

1 NULL 150 ==> Aggregation of the records from above

2 A 130 ==> Same as in the representation above

2 NULL 130 ==> Aggregation of the record from above

3 C 230 ==> Same as in the representation above

3 NULL 230 ==> Aggregation of the record from above

NULL NULL 510 ==> The grand total

The above SQL can also be written as:

SELECT COL1, COL2, SUM(COL3) AS TOTAL_VAL
FROM TEST_TBL
GROUP BY COL1, COL2 WITH ROLLUP

/*********************************************************************************************************
Now, let us take the same SQL and use the CUBE() function in addition to the GROUP BY clause:
**********************************************************************************************************/
SELECT COL1, COL2, SUM(COL3) AS TOTAL_VAL
FROM TEST_TBL
GROUP BY CUBE (COL1, COL2)
GO

A CUBE() as the name suggests generates data for the grouping of all permutations of expressions i.e. since we have 2 columns in our example that we are using the CUBE() function on, we have 2^2 which means 4 grouping sets:

COL1, COL2
COL2, COL1
COL1
COL2

Thus, one row will be produced for each unique grouping set from above and in addition, a sub-total row is generated for each row and an aggregated grand total row is produced with NULL values in all the other columns. You can see the output below:

COL1 COL2 TOTAL_VAL

----------- ---------- -----------

1 A 60

2 A 130

NULL A 190

1 B 90

NULL B 90

3 C 230

NULL C 230

NULL NULL 510

1 NULL 150

2 NULL 130

3 NULL 230

And as is obvious from the output, the order of the columns has no bearing on the output in the case of the CUBE() function.

The above SQL can also be written as:

SELECT COL1, COL2, SUM(COL3) AS TOTAL_VAL
FROM TEST_TBL
GROUP BY COL1, COL2 WITH CUBE

In order to distinguish the NULL values that are returned because of the usage of the ROLLUP() and/or CUBE() functions versus actual NULL values in the tables, one can make use of the GROUPING function. Example:

As seen from above, whereever the GRP_VALUE is marked as 1, those are the records that were generated because of the ROLLUP() function usage. Since the GROUPING() function takes in only one argument, we made usage of the outermost column in order to filter out all those records where the NULL value was being generated because of the usage of the function. In the case of the CUBE() function, you can use the GROUPING() function multiple times to filter out those records.

Another function to be aware of is the GROUPING_ID() function. This function can be used to compute the level of grouping. So, if there are two columns like we have in our example in this post, the GROUPING_ID() will be computed as:

In the next post, we will talk about the new feature of GROUPING SETS that has been introduced in SQL Server 2008. Using the GROUPING SETS, one can aggregate only the specified groups instead of the full set of aggregations as generated by the CUBE() or ROLLUP() functions.

Share this:

Like this:

LikeLoading...

Related

This entry was posted on November 2, 2007 at 12:01 am and is filed under SQL Server.
You can follow any responses to this entry through the RSS 2.0 feed.
Responses are currently closed, but you can trackback from your own site.

2 Responses to “GROUP BY and CUBE(), ROLLUP(), GROUPING() and GROUPING_ID() functions”