Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

Yet, when I select the sensor value valid for a specific time the execution plan tells me it is doing a sort. Why is that?

I would have thought that since I store the values sorted by the Date column, the sorting would not occure. Or is it because the index isn't solely sorted by the Date column, i.e. it can't assume that the result set is sorted?

SELECT TOP 1 SensorValue
FROM SensorValues
WHERE SensorId = 53
AND DeviceId = 3819
AND Date < 1339225010
ORDER BY Date DESC

Edit: Can I do this instead?

Since the table is sorted DeviceId, SensorId, Date and I do a SELECT specifying only one DeviceId and one SensorId, the output set should already be sorted by Date DESC. So I wonder if the following question would yield the same result in all cases?

According to @Catcall below, the sort order is not the same as the storage order. I.e. we can't assume that the returned values are already in a sorted order.

Edit: I've tried this CROSS APPLY solution, no luck

@Martin Smith suggested I'd try to OUTER APPLY my result against the partitions. I found a blog post (Aligned non-clustered indexes on partitioned table) describing this similar problem and tried the somewhat similar solution to what Smith suggested. However, no luck here, the execution time is on par with my original solution.

the query optimizer is extended so that a seek or scan operation with
one condition can be done on PartitionID (as the logical leading
column) and possibly other index key columns, and then a second-level
seek, with a different condition, can be done on one or more
additional columns, for each distinct value that meets the
qualification for the first-level seek operation.

This plan is a serial plan and so for the specific query you have it seems that if SQL Server ensured that it processed the partitions in descending order of date that the original plan with the TOP would still work and it could stop processing after the first matching row was found rather than continuing on and outputting the remaining 499 matches.

In fact the plan on 2005 looks like it does take that approach

I'm not sure if it is straight forward to get the same plan on 2008 or maybe it would need an OUTER APPLY on sys.partition_range_values to simulate it.

Your idea seems logical, but I feel I'm over my head when it comes to the OUTER APPLY and sys.partition_range_values. Can you give me any hints or links to other sites that describes this approach? (and...I'm using sql server 2012 if that makes any difference)
–
m__Jun 13 '12 at 5:45

@m__ I won't have time to look at it now but the idea would be to have a nested loops plan (probably just use INNER LOOP JOIN rather than OUTER APPLY) with the driving table being two columns (left boundary, right boundary retrievable from sys.partition_range_values) and the inner side of the join just accessing one partition per execution.
–
Martin SmithJun 13 '12 at 7:22

And then change the query slightly to TOP 1 ... ORDER BY driving_table.left_boundary DESC, DATE desc with the idea of getting a plan where it processes each partition individually in descending date order and stops when the first row is found, similar to the 2005 plan
–
Martin SmithJun 13 '12 at 7:29

I found a blog post that used CROSS APPLY to fix the same problem, but it did not improve the execution time at all (see updated question). I will try to look inte your OUTER APPLY suggestion...
–
m__Jun 13 '12 at 8:11

@m__ - What plan do you get for that query you added to the question on 2012? When I try it on 2008 R2 with my example data I see it only pushing 1 row round rather than the 500 previously. It does look as though it is doing an unnecessary second join on SensorValues though. Why not just SELECT * FROM Top1?
–
Martin SmithJun 13 '12 at 9:25

Not what they are asking. They are asking why the plan requires a sort operator to get the TOP 1 ordered by date and what index would avoid that I think.
–
Martin SmithJun 12 '12 at 14:12

The OP said, "Since the table is sorted DeviceId, SensorId, Date and I do a SELECT specifying only one DeviceId and one SensorId, the output set should already be sorted by Date DESC."
–
Mike Sherrill 'Cat Recall'Jun 12 '12 at 14:19

Ah sorry I see you were specifically addressing the "Can I do this instead?".
–
Martin SmithJun 12 '12 at 14:34

1

Well, earlier, the OP also said, "I would have thought that since I store the values sorted by the Date column, the sorting would not occure [sic]." So at least part of the problem is that misconception about what a clustered index does. I think it's good to straighten that out anyway.
–
Mike Sherrill 'Cat Recall'Jun 12 '12 at 14:41

Maybe I'm just being stubborn (so please forgive me ;-)). Anyhow, I've read the blog post by Hugo Kornelis and it is pretty straight forward. However, in his example he is using one clustered index and one non-clustered, the non-clustered index is smaller in size and is thereby being used in the execution plan. In my case I have only one clustered index, can sql server still return the values in wrong order (it has no smaller index to use and full table scans are way too slow)?
–
m__Jun 13 '12 at 9:28

Seems possible. When I run the OP's CREATE TABLE and the SELECT query on an empty table I get a serial plan with no sort operator.
–
Martin SmithJun 12 '12 at 14:27

@MartinSmith: Thanks Martin. Trying to find some reference to back this up too...
–
gbnJun 12 '12 at 14:29

Although I hadn't bothered setting up a partition function on date before. Now I have and seems to be partitioning is the culprit with 2005 possibly behaving better for this particular query.
–
Martin SmithJun 13 '12 at 0:00

Basically you are right - since the primary key is in the order "DeviceId, SensorId, Date", the data in the key is not sorted by date, so can't be used. If your key was in a different order "Date, DeviceId, SensorId", then the data in the key would be sorted by date, so could be used...

Changing the sort order gives me performance-issues because it has a hard time fetching values for the specific device. Would a non-clustered index, i.e. "CREATE NONCLUSTERED INDEX NewIndex ON SensorValues ( Date ASC )", do the trick?
–
m__Jun 12 '12 at 8:36

1

Well, it did remove the sorting but instead added an Index Seek and Key Lookup. The resulting execution time is on par with my earlier solution.
–
m__Jun 12 '12 at 9:36

Sorry, I didn't mean to suggest you change your key, I was just illustrating the point! It really depends what you do most often - if you regularly read ordered by solely date, more often than the key access, then you could change your original key to be non clustered, and add a clustered index ordered solely by Date. It depends on what else you do with the table really
–
Fergus BownJun 12 '12 at 9:40

Or you could just add a separate non clustered index, which should help, but won't help as much as a clustered index...
–
Fergus BownJun 12 '12 at 9:42

Actually, I've just reread the question, and realised that I'm being dim - with the where clause you have specified, the change in index won't make any difference I don't think
–
Fergus BownJun 12 '12 at 9:46