Each user will insert around 3 entries per day. There will be around 1'000'000 users. This means 3'000'000 new records in this table every single day. Once a record is older than 1 month, we delete it.

Most of the queries have the following WHERE clause:

WHERE userId = @userId AND setOn > @setOn

Most queries return no more than 3 rows, except for one which returns all rows inserted within this month (which is at most 90 rows).

The date and userId cannot be changed, once the record is inserted.

Now my question is - how to best arrange the index for this table? I am stuck with the two alternatives:

Clustered index on (userId, setOn) - this will give me fast searches, but I am worried about excessive page splits, because we will insert a lot of intermediary values (same userId but different date).

Non-Clustered indexes on (userId) and on (setOn) - this will also cause page splits on (userId) index (but is it as expensive as in the first option?). The search is slowed down because we are using NC index.

Clustered index on an additional column (id) and Non-clustered index on (userId, setOn) - this will eliminate page splits of the data table, but will still cause some on the NC index. This option is also not optimal for search, since we search using NC index.

What are your suggestions? Any other options?

PS - Thanks for your time.

After 2 days of pondering, I came up with a different solution to this problem.

Basically I grouped 31 days into a single row. This means I insert a new record only once a month per user. This reduces page splits from 3 times a day per user to once a month per user. Obviously there are drawbacks, here are some of them

Row size is huge - however in 99.999%
of the time I query only a single row
from MonthlyDiaries.

I am potentially using more space
than I need, because some days might
have no entries. Not a big deal.

To find an entry for a specific day
will require an extra index seek on
DiaryEntries. I believe it is not going
to be such a great cost, since I am
retrieving not more than 90 rows and
in 80% of the cases I retrieve only 1 row.

Overall, I think it is a good trade-off: reducing from 3 page splits/day/user to just 1 page split/month/user, but in return paying a small price by making my searches slightly slower. What do you think?

7 Answers
7

Fragmentation is mostly an issue for scans, less so for seeks. Fragmentation has a big impact on read-aheads, and seeks don't use nor need read aheads. An unfragmented index with poor column choice will perform always worse than a 99% fragment index with good, usable, columns. If you'd had described DW reporting style queries that scan the tables then I would had recommend focusing on eliminating fragmentation, but for the load you describe it makes more sense to focus on efficient (covering) seeks and (small) range scans.

Given that youre access pattern is always driven by @userId, this must be the leftmost column in the clustered index. I would also add setOn as the second column in the clustered index because it adds some marginal value on most queries (I say marginal because the @userId is so selective, at worst is 90 records from 90 mil., that the extra filtering added by @setOn is not critical). I woudldn't add any non clustered index, from the queries you describe there is no need for any.

The only problem is the deletion of old records (the 30 days retention). I would recommend against using a secondary NC index to satisfy this. I would rather deploy a weekly partitioning scheme with sliding window, see How to Implement an Automatic Sliding Window in a Partitioned Table on SQL Server 2005. With this solution, the old records are removed by a partition switch, which is just the most efficient way possible. A daily partitioning scheme would satisfy the 30 days retention requirement more exactly and perhaps is worth trying and testing. I hesitate to recommend 30 partitions directly because you describe some queries that have potential to seek a particular @userId records in each partition, and 31 partitions may create performance problems under heavy load. Test and measure both better.

Hi, Remus. It is true what you say - "Fragmentation is mostly an issue for scans, less so for seeks." But what about the cost of the actual page split, when an intermediary record is inserted? I understand that this cost is very high. Considering, that most of my INSERTs are going to cause a page split, isn't this a major problem? (Please forgive me if I'm misunderstanding something, coz I'm still a beginner in this field).
–
niaherNov 24 '09 at 2:57

1

'Very high' is relative. Higher than insert at the end of the table, sure. But we're only talking 3 mil. inserts a day, that is 35 per second. Even with a 10x spike is not exactly a high rate. Page splits and other fine points become relevant at rates of thousands of inserts per second.
–
Remus RusanuNov 24 '09 at 3:18

Wow, thank you. This is probably the most useful/encouraging piece of information I've got in days. However as my final decision, I came up with a completely different design. Will post it here soon.
–
niaherNov 24 '09 at 3:34

First add a default constraint on your table. Second, add a partitioning scheme. Third rewrite your most common query.

The clustered index should be set to setOn, user Id. This removes the possibility of the index becoming fragmented. You should use table partitioning to split the table so each month is stored in a separate file. This will reduce maintenance. You can look online for a partition sliding window script that you can run every month to create a new table for the next month, delete the oldest month, and adjust the partitioning scheme. You can also move the really old months to an archive table if storage is of no concern to you.

Your queries where clause should be in the form:

WHERE setOn > @setOn AND userId = @userId

Or when you are returning the whole month:

WHERE setOn BETWEEN @setOnBegin AND @setOnEnd AND userId = @userId

Your new schema design, without the partitioning would look like this:

After you get that working, you have to add partitioning. For that, start with This blog post for some theory. Then start reading this MSDN whitepaper. The white paper is written for 2005, and there were partition improvements in 2008 that I have not investigated, so the solution might be simpler in 2008.

Hi Justin. Thank you for your answer. I took a look at the links you gave me - very helpful. As much as I like your advice on partitioning, I see a problem with having an index on (setOn, userId). The thing is that since we do not specify exact setOn value, but rather a range, the resultset from the (setOn > @setOn) will be huge. Finding correct userId from this resultset will be very expensive, since there is no special order to how userId values are positioned (unless of course the setOn values are identical). Please correct me if I am wrong.
–
niaherNov 22 '09 at 14:08

1

Niaher, The best way to tell for sure is to load up some test data and look at the estimated and actual query plans. If it just selects the partitions and does an clustered index seek, your query is about as optimal as it can get. That is of course unless there is a more optimal partitioning/clustering plan. You might want to experiment with a conditional index (index with a where clause) on just userid for the date range most of your queries are on.
–
Justin DearingNov 22 '09 at 14:42

You want the new rows to be inserted at the end of the table's physical file, like a journal file, because there are so many rows being inserted each day.

Therefore the rows should be ordered in chronological order

Therefore setOn should be the first part of the primary key.
- or, ideally, add a 'postId' column, which is just an integer that autoincrements itself

If you don't want a postId column, then the primary key will be (setOn, userId), otherwise it can simply be postId.

Thus, we have obtained fast insert times. Now we want fast retrieval times when selecting on a per-userId basis.

For this we need to add a secondary index to the table, which should be on useId. Since there are only 90 records per user, this is sufficient for the rdbms to be able to retrieve quickly all rows for that user (all 90 of them, since only one month's rows at a time), and then table scan those 90 rows, which will be blindingly fast.

The index can be any standard b-tree, red-black tree, index, whatever comes with your database.

Insertion will be slightly slowed by insertion into the index, but not too much. Tree structures are pretty good at handling random insertions.

Since the UserId index is based on the set of UserIds, which is a stable set, the tree should be fairly stable and not need too much rebalancing: just the leaf nodes at the end will change as journal entries are added and purged, which will not really change the shape of the tree too much.

I'm not a fan of your new solution. It is just going to introduce new problems, the largest being that UPDATES are (normally) slower than INSERTS, and create a larger risk for blocking when the updates are taking place.

If you are worried about page splits, all you should have to do is adjust the "FillFactor" for the clustered index. FillFactor defines how much of each page is left blank (by default) to allow for changes or inserts.

Setting a reasonable FillFactor means that inserts shouldn't cause (as many) page splits, and your purges of old records mean that more space should be freed in those pages, keeping a (somewhat) consistent free space per page.

Unfortunately, the SQL default is usually 0 (which means the same as 100), which means that all pages are completely full, which causes lots of page splits. Many people recommend a value of 90 (10% free space in each data page). I can't tell you what the ideal for your table would be, but if your ultra-paranoid about page splits, try 75 or even less, if you can spare the extra disk space. There are some perfmon counters you can monitor to watch for page splits, or you could run queries to tell you the percent free space in each data page.

Regarding the specifics of the indexes on your table (the original version), I would recommend a clustered index on ([userId], [setOn]), for the reasons that Remus mentioned.

You will also need a nonclustered index on ([setOn]) so that your "delete old records" query doesn't have to do a full table scan to find all old records.

I'm also not a fan of GUIDs for simple identifiers, most of the time, but I imagine that's probably a little late to change.

Edit: Some preliminary calculations on an estimated fillfactor for this table.

For each user, 3 new entries per day, kept 30 days, so ~90 total entries. Assuming you do a daily purge of all records older than 30 days (as opposed to only purging every 30 days), then you are only adding/deleting less than 5% of records daily.

So a fillfactor of 90 (10% free space on each page) should be more than sufficient.

If you are only purging monthly, then you'd be letting nearly 60 days stack up before you delete the oldest 30, which means you'd need something like 50% fillfactor.

I'd highly recommend a daily purge.

Edit 2: After further consideration, a non-clustered index on [setOn] may not be selective enough to be used by your purge query (a single day is 1/30 or 3.3% of rows, which is right on the edge of "useful"). It might just do a clustered index scan anyway, even if the index exists. Probably worth testing both with and without this additional index.