Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I also understand that the "correct" way of sorting a view is by putting an ORDER BY around the SELECT statement querying the view.

But being relatively new to practical SQL and the usages of views, I would like to understand why this is done so by design. If I've followed the history correctly, this was once possible and was explicitly removed from SQL Server 2008 and so on (don't quote me on the exact version).

However, the best reason I can come up with as to why Microsoft removed this feature is because "a view is an unsorted collection of data".

I am assuming there is a good, logical reason as to why a View should be unsorted. Why can't a view just be a flattened out collection of data? Why specifically un-sorted? It doesn't seem that hard to come up with situations where (at least to me / IMHO) it seems perfectly intuitive to have a sorted view.

5 Answers
5

A view is not materialized - the data isn't stored, so how could it be sorted? A view is kind of like a stored procedure that just contains a SELECT with no parameters... it doesn't hold data, it just holds the definition of the query. Since different references to the view could need data sorted in different ways, the way that you do this - just like selecting from a table, which is also an unsorted collection of rows, by definition - is to include the order by on the outer query.

Also to give a little insight into the history. You could never put ORDER BY in a view, without also including TOP. And in this case the ORDER BY dictated which rows were included by TOP, not how they would be presented. It just so happened that in SQL Server 2000, if TOP was 100 PERCENT or {some number >= number of rows in the table}, the optimizer was fairly simplistic and it ended up producing a plan with a sort that matched the TOP/ORDER BY. But this behavior was never guaranteed or documented - it was just relied upon based on observation, which is a bad habit. When SQL Server 2005 came out, this behavior started "breaking" because of changes in the optimizer that led to different plans and operators being used - among other things, the TOP / ORDER BY would be ignored completely if it was TOP 100 PERCENT. Some customers complained about this so loudly that Microsoft issued a trace flag to reinstate the old behavior. I'm not going to tell you what the flag is because I don't want you to use it and I want to make sure that the intent is correct - if you want a predictable sort order, use ORDER BY on the outer query.

To summarize and just as much to clarify a point you made: Microsoft didn't remove anything. They made the product better, and as a side effect this undocumented, non-guaranteed behavior became less reliable. Overall, I think the product is better for it.

Here's mentioned(in a comment) that the TOP performs a cursor operation over the result set. Therefor it can have an explicit order.
–
Tim SchmelterJul 24 '12 at 20:46

@TimSchmelter that doesn't help when the optimizer sees TOP 100 PERCENT / ORDER BY and removes it from the plan completely. Try it out.
–
Aaron Bertrand♦Jul 24 '12 at 20:47

It was just a sidenote. The rest of the comment is as follows anyway: "the SELECT TOP x trick is scheduled to be deprecated in a future version of SQL Server, so it would be best to not to use it at all."
–
Tim SchmelterJul 24 '12 at 20:51

@TimSchmelter it's already a no-op. I don't think it will stop working in any new version any time soon, because it will break too much existing code.
–
Aaron Bertrand♦Jul 24 '12 at 20:52

2

@TimSchmelter there order is not talking about the order of the rows, it is talking about the order of the columns within a row. In SQL Server we generally don't talk about a row being a tuple because the physical implementation of a row is so obvious to us (the table lists the columns in the order you defined them).
–
Aaron Bertrand♦Jul 24 '12 at 21:07

Tables... are just bags of rows - and then viwes are just virtual bags of virtual rows - views "don't exist" - e.g. there's no data stored for them at all - they're just "stored definitions of a query to be executed", basically.
–
marc_sJul 24 '12 at 20:36

1

+1 The equivalency of tabels and views seems to me the main reason why views should not contain an "order by"
–
miracle173Jul 25 '12 at 7:34

"Views behave like tables". I would say that "Views behave like base tables. Views are tables."
–
ypercubeJul 31 '12 at 0:49

one possibility is to avoid conflicting sorts- if the view is sorting by one order and the select on that view is sorting by another order (not being aware of the view sort), there may be performance hit. So it is safer to leave the sorting requirement to the user.

another reason, sort comes with a performance cost, so why penalizing all users of the view, when only some users need the sort..

I am not entirely sure if this is true, but it seems that if there was ordering in the view it would have to execute that first regardless of any additional ordering in the statement invoking the view.

no, that does not make any sense. It would be possible to allow a sort order in a view definition. if a user selects data from this view using another sort order each optimizer could ignore the sort order of the viewdefinition because it does not have any influence on the result set.
–
miracle173Jul 25 '12 at 12:22

One answer not yet given is that it can interfere with predicate pushing which can greatly affect performance.

An example is rolling up a large set of data into a 10 line summary and then add a top N with order by:

select * from TheView

For that case performance should not be significantly affected by including a top/order within the view. However for a case with a predicate:

select * from TheView where <condition>

The engine may not always be able to "push" the condition to a position earlier in the pipeline because of the top/order by. This may cause unneeded rows to be processed and the plan will be similar to the one without a predicate.

Whether you want a push to happen before the order by is really a developer choice and can change the answer. Most often the push is the logical intent.

Using "top 100 percent" does logically allow predicate pushing however the implementation unfortunately ignores "order by" for this case and defeats the intent.

For real use cases, a nice compromise would be for the engine to perform "order by pulling." This would allow an "order by" to be specified within the view (with top 100 percent or no top) and that order is only used if the view is selected from directly, without an explicit order by, no joins, no aggregates, no view chaining etc. Both cases listed above could be supported by this simple model.