Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community.

I want to handle a large dataset (> 1 billion rows) in SQL Server 2008 R2. I heard that if you set up "sharding" or "horizontal partitioning", it makes it quicker to handle large datasets as it breaks the tables up into multiple files.

Have you used sharding or horizontal partitioning before? If so, could you suggest any patterns that, in your experience, achieve better results than a single massive table?

1 Answer

First, note that 1 billion rows can be handled effectively with a partitioned architecture on ordinary commodity server hardware. Exotic shared nothing architectures are not necessary at this data volume; however, you will probably get significant benefits from table partitioning.

Sharding is different from horizontal partitioning: it implies a 'shared nothing' architecture, which most versions of SQL Server do not support.1

SQL Server does support horizontal partitioning, and a shared disk architecture will be adequate for ~1 billion rows.

In SQL Server, you create a partition function that selects a partition based on values or ranges of values in a column of a table, e.g.
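A minimal sketch of such a function (the name, data type, and boundary values below are illustrative; `RANGE RIGHT` on a date column is a common but not mandatory choice):

```sql
-- Illustrative partition function: 5 boundary values produce 6 partitions,
-- one per calendar quarter plus the ranges before and after
CREATE PARTITION FUNCTION PartFunc (datetime)
AS RANGE RIGHT FOR VALUES
    ('2010-01-01', '2010-04-01', '2010-07-01', '2010-10-01', '2011-01-01');
```

With `RANGE RIGHT`, each boundary value belongs to the partition on its right, which is usually what you want for date boundaries.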

Then create one or more filegroups to allocate the partitions to. For a large data set, these filegroups could be set up on different physical volumes. Note that direct attach storage will be much faster than a SAN for this in almost all cases. In the example below, we would have created 6 filegroups called PartVol1-PartVol6.
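A sketch of the filegroup setup, assuming a database called `MyDB` and one file per filegroup on separate volumes (database name, paths, and sizes are all illustrative):

```sql
-- Illustrative: add 6 filegroups, each backed by a file on its own volume
ALTER DATABASE MyDB ADD FILEGROUP PartVol1;
ALTER DATABASE MyDB ADD FILE
    (NAME = PartVol1_Data, FILENAME = 'E:\Data\PartVol1.ndf', SIZE = 10GB)
    TO FILEGROUP PartVol1;
-- ...repeat for PartVol2 through PartVol6 on the remaining volumes
```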

One or more partition schemes can be created to allocate table partitions to filegroups based on the value of the partition function, e.g.
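For example, assuming a partition function named `PartFunc` (as sketched above) and the six filegroups PartVol1-PartVol6; the scheme, table, and column names here are invented for illustration:

```sql
-- Illustrative partition scheme: maps the 6 partitions produced by
-- PartFunc onto the 6 filegroups, in order
CREATE PARTITION SCHEME PartScheme
AS PARTITION PartFunc
TO (PartVol1, PartVol2, PartVol3, PartVol4, PartVol5, PartVol6);

-- The table is then created ON the scheme rather than on a filegroup,
-- keyed on the partition column
CREATE TABLE dbo.FactTransactions
(
    TransactionID   bigint   NOT NULL,
    TransactionDate datetime NOT NULL,
    Amount          money    NOT NULL
)
ON PartScheme (TransactionDate);
```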

Note that the table is created on the partition scheme instead of on a specified filegroup, and the ON clause specifies the column to be used as the partition key. Based on the partition key, rows in the table will be allocated to one of the filegroups in the partition scheme.

Note: A rule of thumb for designing a partitioning scheme is that each partition should have a row count in the low tens of millions, say between 10 and 50 million depending on the width of the rows. The disk volume that the partition sits on should be fast enough to scan at least a single partition in a few seconds.
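As a sanity check on that rule of thumb, a quick back-of-envelope calculation (the 200-byte average row width and ~1 GB/sec sequential read rate are assumptions, not figures from measurement):

```python
rows_total = 1_000_000_000       # ~1 billion rows
rows_per_partition = 25_000_000  # middle of the 10-50 million range
row_width_bytes = 200            # assumed average row width

partitions = rows_total // rows_per_partition
partition_size_gb = rows_per_partition * row_width_bytes / 1024**3
scan_seconds = partition_size_gb / 1.0  # at ~1 GB/sec sequential read

print(partitions)                   # 40 partitions
print(round(partition_size_gb, 1))  # ~4.7 GB per partition
print(round(scan_seconds, 1))       # ~4.7 seconds per partition scan
```

So roughly 40 partitions of ~5 GB each, each scannable in a few seconds on a volume of the class discussed below.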

Partitioning, Sharding and Shared Nothing systems

A bit of terminology seems in order here to disambiguate some of the discussion on this topic.

A 'shared nothing' system is a parallel system where the nodes do not have shared SAN storage, but use storage local to the node. The classic example of this type of architecture is Teradata. Shared nothing systems scale well to very large data sets as they have no central I/O bottlenecks. I/O throughput scales with the number of nodes in the system.

A 'shared disk' system is one where one or more database servers share a single disk storage subsystem. The database can be a single server with either local storage or attached to a SAN, or a cluster of servers attached to a shared SAN. Systems of this type are constrained by the throughput available from the storage subsystem.

'Sharding' is a term used to describe splitting a database amongst multiple physical servers in a shared nothing architecture. Various platforms will have greater or lesser support for sharded databases. In Teradata circles, the term is not used because Teradata presents a transparent single system image to the clients, even though the physical architecture is a shared nothing type.

Older versions of SQL Server have limited support for sharding through distributed partitioned views. Microsoft now makes a version of SQL Server 2008 R2 that supports a shared nothing architecture with a single system image, but this version is only available to OEMs and can only be purchased in a hardware bundle.
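For reference, a distributed partitioned view stitches together member tables held on separate linked servers, each constrained to a key range, behind a single view. A sketch, with server, database, and table names invented for illustration:

```sql
-- Illustrative DPV: each member table lives on a different linked server
-- and carries a CHECK constraint on its key range
CREATE VIEW dbo.AllTransactions
AS
SELECT * FROM Server1.SalesDB.dbo.Transactions_2010H1
UNION ALL
SELECT * FROM Server2.SalesDB.dbo.Transactions_2010H2;
```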

For 1 billion rows

For 1 billion rows (unless the individual rows are extremely wide) a shared nothing or sharded architecture is comfortably in the realms of overkill. This type of volume can be handled on a single server of reasonable spec if it has an adequately fast disk subsystem.

Local direct attach disk is by far the most cost-effective option in terms of price for performance. A single SAS RAID controller can take several arrays, and multiple controllers can be installed in a server. Depending on the configuration, a modern 24-25 slot SAS array can put through thousands of IOPS or 1GB+/sec in streaming performance; a server with multiple PCI-e buses and multiple controllers can theoretically handle more.

The type of performance necessary to work with a 1 billion row database can be achieved quite easily and cheaply with commodity server hardware and direct attach storage of this type. A SAN could also be used, but you might need multiple SAN controllers to get equivalent performance, and the hardware is likely to be an order of magnitude more expensive.

As a general recommendation, use direct attach storage for applications with heavy I/O requirements unless you need really good uptime. Configuration and change control errors are a far bigger source of unscheduled downtime than hardware failure in modern data centre ops.

SANs can give you a more manageable storage platform if you have a large portfolio of applications, as they provide a range of centralised storage management facilities. However, this comes at a steep price, and getting high performance out of SAN-based infrastructure is difficult and expensive.

1 Microsoft do make a parallel version of SQL Server, but it is only available through OEM channels bundled with hardware. The versions available off the shelf do not support this capability.