
I've never worked with SQL Server partitioning, but I'm currently designing a database whose volumes probably warrant it. The system is for coupons. Coupons are issued periodically, usually every six weeks, although there will also be ad-hoc issuances, e.g. for a special event. There are 15 million customers and, for each issuance event, every customer will receive 6 different coupon types, giving a total of 90 million coupon instances. We need to track the coupon instance redemption data and keep it for 6 months, although typically a coupon is only valid for six weeks. A redemption request for an invalid coupon will never reach the database, because it will be validated by the POS till.

Over a six-month period we'll need to store up to 360 million rows in the Coupon Instance table and up to 72 million (assuming a maximum 20% redemption rate) in the Redemption table. I have a feeling these numbers are too big for a single partition.

My question is: what should I use as the partition key? One obvious candidate would be the issuance event, giving approximately 6 partitions. But I suspect even that would give a partition size too large for optimal performance. Would it be possible to partition by two keys, e.g. issuance event plus the last digit of the customer id? The logic would be:

If issuance event = 1 and last digit of customer id < 5 then
    store in partition 1
Else if issuance event = 1 and last digit of customer id > 4 then
    store in partition 2
Else if issuance event = 2 and last digit of customer id < 5 then
    store in partition 3
Else if issuance event = 2 and last digit of customer id > 4 then
    store in partition 4
Etc...

Also, I'm not sure of the spec of the database server we'll need. Will 16 GB of RAM and 8 CPUs be enough? The db needs to return a result from the coupon instance table, keyed on a numeric barcode value, in less than half a second. Transaction requests for validate (select) and redeem (insert) are expected to peak at approximately 3,500 per minute.

The SQL Server 2008 R2 64-bit db server will be provisioned as a VM on a very powerful host with access to a high-performance, large-capacity SAN.

I'd be very grateful for any advice from those that have deployed a SQL Server solution to manage similar volumes.



Your tables are still small - no NEED for partitions. I have a table with a couple of billion rows without partitions and it works. Partitions are nice for FAST DROP, though.
–
TomTom Nov 29 '11 at 21:52

Nonsense @TomTom, partitions can be of benefit at row counts a fraction of this. Granted the partition scheme has to be of benefit to the access patterns to realise a performance gain but a blanket "no NEED" at this size is plain wrong.
–
Mark Storey-Smith Nov 29 '11 at 22:41

Nope, it is correct. NEED != benefit. NEED is when you run into problems doing queries without partitions.
–
TomTom Nov 30 '11 at 5:42

Hey @TomTom, I think you need a small break, buddy; that's a little strong, even if not actually offensive. I concur with Mark Storey-Smith that a blanket "no NEED" is plain wrong, although your assertion that it's probably not needed is correct. I imagine it's a matter of indexing. I also know that Mark knows what you mean by need vs benefit. Cut us all a little slack and let up on the caffeine, k? (And trust me, I'm known to have very little patience some days, especially days like today where I'm on pain meds for my back.)
–
jcolebrand♦ Nov 30 '11 at 19:01

4 Answers

The server spec questions should be directed to either Serverfault or DBA.SE.

For the partitioning question, I don't think you necessarily need to partition for this.

360m rows is a lot but it's not too unwieldy.

Do NOT, under any circumstances, try to partition based on the last digit of a field. I'm not sure it would even work, but it certainly isn't SARGable, which makes it untenable.

If you only need to do a single row seek based on a numeric key, partitioning probably will not help.

If you do decide to pursue the partitioning route, bear in mind that, to be effective, all your queries need to include your partition key(s) so the engine knows which partition to check. Otherwise it will check them all and you will actually hurt performance.
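To illustrate the point, here's a minimal sketch. The table and column names (dbo.CouponInstance, IssuanceEventId, Barcode) are hypothetical, assuming the table were partitioned on an issuance-event column:

```sql
-- Assumed: dbo.CouponInstance is partitioned on IssuanceEventId.

-- Partition key present in the predicate: the optimizer can apply
-- partition elimination and touch only one partition.
SELECT Barcode, CustomerId
FROM dbo.CouponInstance
WHERE IssuanceEventId = 3
  AND Barcode = 1234567890;

-- Partition key absent: every partition must be checked,
-- which can make the query slower than on an unpartitioned table.
SELECT Barcode, CustomerId
FROM dbo.CouponInstance
WHERE Barcode = 1234567890;
```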

I also concur. Sometimes you just need better indexes.
–
jcolebrand♦ Nov 29 '11 at 21:43

I disagree @JNK. A single row seek based on a numeric key that benefits from partition elimination is reducing IO. If patterns of access are such that frequently accessed partitions remain in the buffer pool over infrequently accessed partitions, you have further performance benefits. And we've not even touched on my favourite feature that partitioning gives you, partial availability.
–
Mark Storey-Smith Nov 29 '11 at 22:45

For the record, on your other points I agree wholeheartedly :)
–
Mark Storey-Smith Nov 29 '11 at 23:08

@MarkStorey-Smith - It's going to depend on his key. As currently defined in the OP the partition wouldn't add any value. It also sounds like he won't be able to use a two-part key with a date field or a "normal" partition scheme.
–
JNK♦ Nov 30 '11 at 13:07

You CAN partition on multiple keys if you use a persisted computed column; as others have said, however, partitioning doesn't work for every situation. I'm not sure that I understand your scenario enough to give you specific advice, but here are some general guidelines:

Partitioning is useful when reading data if the partitioning key is part of the SQL statement, which allows the optimizer to invoke partition elimination. You need to be sure that the key you choose is useful for most queries.

One benefit of a good partitioning strategy is aging out data; for example, if your partition key is date-based (e.g. the day of the year) and you want to remove all data older than a certain date, it's very easy to SWITCH those partitions out to an empty table and truncate it.
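As a sketch of the multi-key idea via a persisted computed column: all names here (pfCoupon, dbo.CouponInstance, IssuanceEventId, CustomerId) are illustrative, and the boundary values assume a small number of issuance events. The computed column folds the two "keys" into one deterministic value, which is what SQL Server actually partitions on:

```sql
-- Hypothetical partition function: one value per (event, customer-id half).
-- PartKey = IssuanceEventId * 10 + (0 if last digit < 5, else 1).
CREATE PARTITION FUNCTION pfCoupon (int)
    AS RANGE LEFT FOR VALUES (10, 11, 20, 21, 30, 31);

CREATE PARTITION SCHEME psCoupon
    AS PARTITION pfCoupon ALL TO ([PRIMARY]);  -- one filegroup for brevity

CREATE TABLE dbo.CouponInstance
(
    CouponId        bigint NOT NULL,
    IssuanceEventId int    NOT NULL,
    CustomerId      int    NOT NULL,
    -- Deterministic and PERSISTED, so it can be a partitioning column:
    PartKey AS (IssuanceEventId * 10 + (CustomerId % 10) / 5) PERSISTED NOT NULL,
    CONSTRAINT PK_CouponInstance
        PRIMARY KEY CLUSTERED (PartKey, CouponId)
) ON psCoupon (PartKey);
```

Note that queries would then need to filter on PartKey (or on expressions the optimizer can map to it) to get partition elimination, which is part of why this scheme is questionable for the access pattern described.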

You really need to define your requirements a little more clearly. You mention that you will have approximately 360 million rows in 6 months. How about in 2 years' time? Will you keep growing only at the current rate, or is there a chance you will experience exponential growth? Do you want to keep data in this table forever, or would you want to archive it on a regular basis?

Partitioning can be used for data archiving. See sliding window scenario. See this whitepaper and this one.

Partitioning can also be used to manage index fragmentation. You can rebuild/reorganize particular partitions.

You should also consider partitioned views as opposed to partitioned tables. Partitioned views don't require a SQL Server Enterprise license, and they also enable you to perform online index rebuilds on a particular "partition".
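A minimal sketch of a local partitioned view, with illustrative names and one table per issuance event; the CHECK constraints are what let the optimizer prune branches that can't match the predicate:

```sql
-- One base table per "partition"; the CHECK constraint defines its range.
CREATE TABLE dbo.Coupon_Event1
(
    CouponId        bigint NOT NULL,
    IssuanceEventId int    NOT NULL CHECK (IssuanceEventId = 1),
    CONSTRAINT PK_Coupon_Event1 PRIMARY KEY (IssuanceEventId, CouponId)
);
CREATE TABLE dbo.Coupon_Event2
(
    CouponId        bigint NOT NULL,
    IssuanceEventId int    NOT NULL CHECK (IssuanceEventId = 2),
    CONSTRAINT PK_Coupon_Event2 PRIMARY KEY (IssuanceEventId, CouponId)
);
GO
CREATE VIEW dbo.CouponInstanceAll
AS
SELECT CouponId, IssuanceEventId FROM dbo.Coupon_Event1
UNION ALL
SELECT CouponId, IssuanceEventId FROM dbo.Coupon_Event2;
```

Because each event lives in its own table, you can rebuild the index on one table online while the others stay untouched, and dropping an expired event is just a DROP TABLE.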

Partitioning can also be considered in your disaster recovery planning; it can be used for partial database recovery. For example, you can place your old partitions on a different filegroup than the main/current partitions. When recovering, you restore the primary filegroup first, then the filegroup holding the current partitions, and lastly the filegroups holding the old partitions. This can reduce the amount of time your application has to be down.

We only need to keep the data for six months. Each week, we would run a housekeeping job to delete any coupons issued more than six months previously.
–
Rob Bowman Nov 30 '11 at 10:05


So basically you would have to delete approximately 15 million rows every week. How wide is the table? I would suggest you partition the table by a date column. That way the weekly delete becomes a simple metadata operation: you SWITCH the oldest partition out of the main partitioned table into a staging table, then drop the staging table. This is called the sliding window scenario. Look up the first white paper I posted on how to do this.
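The weekly purge could look something like this sketch; the table, function name (pfCouponDate) and boundary dates are all hypothetical, and the staging table must match the partitioned table's columns, indexes and filegroup:

```sql
-- Staging table: identical shape to dbo.CouponInstance (abbreviated here).
CREATE TABLE dbo.CouponInstance_Stage
(
    CouponId  bigint NOT NULL,
    IssueDate date   NOT NULL
    -- ...same remaining columns, constraints and indexes as the main table
);

-- Metadata-only: moves the oldest partition's rows out near-instantly.
ALTER TABLE dbo.CouponInstance
    SWITCH PARTITION 1 TO dbo.CouponInstance_Stage;

DROP TABLE dbo.CouponInstance_Stage;  -- or TRUNCATE and reuse next week

-- Slide the window: merge the now-empty range, designate the next
-- filegroup (required before SPLIT), then split at the leading edge.
ALTER PARTITION FUNCTION pfCouponDate() MERGE RANGE ('20110601');
ALTER PARTITION SCHEME psCouponDate NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pfCouponDate() SPLIT RANGE ('20111201');
```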
–
D K Nov 30 '11 at 18:34