Compression in SQL Server 2008

SQL Server 2008 allows for data compression in tables, indexes, and partitions, which can save disk space and allow more data to fit into RAM, increasing query performance. Microsoft has found that in data warehousing scenarios, the new compression techniques have brought savings that have reduced the size of tables to 15 to 50 percent of their original size. Data compression in SQL Server 2008 comes in two flavors: row compression and page compression.

Row Compression

The vardecimal storage format, which debuted in SQL Server 2005 SP2, is the conceptual ancestor of SQL Server 2008 row compression. Before SQL Server 2005 SP2, the decimal type was always stored as fixed data. Depending on the number of the value’s precision, each decimal value would require between five and 17 bytes to store. Vardecimal stores decimal values in a variable-length storage format. This format reduces the number of bytes required to store a decimal value by eliminating the leading and trailing zeros.

Server 2008 data compression extends this feature to all fixed-length data types such as integer, char and float. Rather than storing data of these types with a fixed number of bytes, this data is stored with the minimum required bytes, without having to alter the types themselves. You simply enable row compression, and SQL Server 2008 Enterprise Edition takes care of the rest.

Row compression doesn’t work on XML, BLOB and MAX data types. From the developer’s perspective row and page compression is transparent and it will not be necessary for you to alter existing applications if you want to take advantage of this feature.

Page Compression

Page compression uses both column-prefix and dictionary compression. Column-prefix compression applies to variable-length columns and works by storing a prefix token and a suffix value in table rows. The value of the prefix token is stored in the page header. Prefix tokens are generated only for common prefix values within the page. Dictionary compression, which also stores the token value in the page header, stores a token in the table row for common values within a page for use with non– variable-length columns.

Enabling Compression

To enable row compression on a specific table, execute a statement with syntax similar to

CREATETABLE Alpha (col1 INT, col2

CHAR(1000))

WITH(DATA_COMPRESSION =ROW)

You can enable page compression on specific partitions by issuing a statement similar to

CREATETABLE Alpha_Partition (col1 INT,

col2

CHAR(1000))

ON PartitionScheme (col 1)

WITH

(DATA_COMPRESSION = PAGE ON

PARTITIONS (1-3)

To alter an existing index so that it uses compression, use one of the following statements:

ALTERINDEX ColIndx ON Alpha REBUILD

WITH(DATA_COMPRESSION=PAGE)

ALTERINDEX ColIndx ON Alpha REBUILD

partition =2

WITH(DATA_COMPRESSION=PAGE)

Space Saved

The new stored procedure, sp_estimate_data_ compression_savings, estimates the amount of space that applying data compression will save. Your savings will depend on the type of data stored.

From the Blogs

Duplicate records clutter databases and render the data within them unclear. This kind of problem is very common, and it’s the main reason that deduping software exists. But there’s another benefit to deduplication software: the ability to infer connections between individual records from various data sets....More

Companies looking to grow and extract value from their data are increasingly turning to Chief Data Officers (CDOs) to execute their data strategy. The role is new, and a playbook is necessary to address the many challenges CDOs face....More

After spending 20 years building analytics, BI and database solutions, I've focused on Cloud data solutions over the past 2 years. I've chosen 5 common challenges that I face every day with Cloud migrations and that you'll face in your Cloud BI projects....More