Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

Can't you test it? I think your question is much more useful if phrased as "how I can test which of these queries is more efficient?"
–
Aaron Bertrand♦Aug 13 '12 at 18:56

How can I test it? I mean they give the same result and both are accurate. But which one is better from performance point of view.
–
JackofallAug 13 '12 at 18:57

Definitively stick to the first option. The second option needs another scan / seek to the table depending on your index to find the MAX value which is unnecessary.
–
MiNTAug 13 '12 at 18:57

5

You might be better off if you only ask interview questions about technologies you are proficient with based on your real life experience. Otherwise you are very likely to have some false positives and false negatives.
–
A-KAug 13 '12 at 19:05

2

Has anyone ever asked you how to handle ties? Have you considered the following: select top(1) with ties * from STUDENT order by EnrollmentDate Desc
–
A-KAug 13 '12 at 19:38

2 Answers
2

The second query does not necessarily do this, depending on the data type of the EnrolmentDate column and how granular the data is. If this column does not contain a time component, the query will return all students registered on the last day a student registered, which does not satisfy the question. If there is a time component, it's possible (but much less likely) that there will be multiple rows returned.

(Edit: Alex Kuznetsov correctly pointed out in the comments that the first query doesn't necessarily return the last enrolled student either when there are ties. It is, however, guaranteed to return a record in that event, instead of all records, which is normally satisfactory. I think my point was more that comparing the two queries is comparing apples to oranges, so to speak.)

In any event, if we assume all enrollment dates/times are unique, from what's given, the answer to the question isn't necessarily clear cut either. You would need to qualify for me what you mean by more efficient.

The first query will only scan once, but could potentially incur an expensive sort (you didn't say which indexes exist on the table, so I assume none). The latter query will do a scan to find the maximum, then do another scan to find all matching rows, which would possibly use less CPU, but more logical I/Os. It's entirely possible the second query would be less expensive overall (again, with no indexes available).

Having said all that, if I was to start doing performance tuning on this business operation, I would most certainly start with query (a).

I agree with you and I would say that if all he is getting as an answer is one of the two options without the types of caveats you have here, he is clearly only interviewing people who have a poor understanding of databases.
–
HLGEMAug 13 '12 at 22:06

@HLGEM: "he is clearly only interviewing people who have a poor understanding of databases" -- I'm not sure I can agree with that: there really isn't enough information given to say. The OP stated that "one candidate answered (b)" but didn't say how many others answered with (a), or what kind of discussion ensued. As an interviewer, it would be much more informative to probe deeper to ask why, regardless of which answer was chosen -- I think that would be much more revealing than just a simple multiple-choice question.
–
Jon SeigelAug 14 '12 at 2:27

1

(...) In that respect, it's actually quite an interesting question to ask, even if it wasn't intended that way.
–
Jon SeigelAug 14 '12 at 2:27

I would expect that given the order implicit by the StudentID field that perhaps

a) SELECT TOP (1) * FROM STUDENT ORDER BY StudentID DESC;

would in fact be the bar-none quickest way. However, if the assumption that students enroll in the same order that they are entered in the database is not correct, then I would expect that B would be quicker since a scan over the EnrollmentDate field looking for the max value would be quicker than having to order the entire field.

You are wrong my friend, he needs the last student that rolled in a class, so that has nothing to do with StudentID (for that matter it could have been that StudentID was not an auto generated sequential number). So the correct way is to sort by enrollment date descending.
–
MiNTAug 13 '12 at 19:04

Since there is nothing about classes listed in the original question, I fail to see how my answer is incorrect. In fact, I directly state in the answer that if I am wrong about the sequential nature of the key field, then my statement would not return the correct result.
–
Max VernonAug 13 '12 at 19:12

someone might enter the data wrong and later correct it, which can break the sequence. Suppose you are entering 10 students at the same time, if you made error in enrollment date and fixed it later, can be break the sequence. I got your point btw, just wanted to explain, id may be fast but not a fool proof way
–
JackofallAug 13 '12 at 19:18

My assumption is that the Enrollment field would be machine entered, i.e. the record would be created automatically by the student actually enrolling through an online form. However, you're correct in that this is an assumption that is completely unsafe.
–
Max VernonAug 13 '12 at 19:26