Passing Different int Values to NTILE()

Passing 1 to NTILE() is worthless; it gives us 1 for each
entry. But what if we pass a larger number to NTILE() than the number of result
set rows? In our test table, we have four entries in the squash category. Lets
try passing different values to NTILE(), starting with 2.

Query:

SELECT
...
NTILE(2) OVER (
ORDER BY Weight DESC
) AS [Ntile]
FROM dbo.ContestResults
WHERE Category = 'Squash'

Note that two entries have the exact same weight (679) but
are split between tile 1 and 2. I tried rebuilding the table and putting Dan
Gardners entry in first and, as you might expect, the tiles were swapped. So,
for entries with equal weight, the order of the base records determines the tiling.

Another pattern is that if you have n records in your
result set and n-1 as your ntile integer, the first two entries always
have a tile of 1.

Dealing Cards

The distribution is actually very simple; its just like
dealing cards. Think of the tiles like card players. If you dealt five cards to
four players, the first player would have an extra card. If you dealt one more
card, it would go to the second player. If you think of the distribution of
tiles like that, it will match the results.

Now we have row numbers starting at 1 for each category.
This will allow us, for instance, to pull the top three entries from each
category. This is really the best choice to fulfill our original task of
picking who receives prizes.

Using ROW_NUMBER() with PARTITION BY and ORDER BY with a Common Table
Expression

Just like RANK() and DENSE_RANK(), you cant do this:

SELECT
...
ROW_NUMBER() OVER (
PARTITION BY Category
ORDER BY Weight DESC
) AS RowNumber
FROM dbo.ContestResults
WHERE RowNumber <= 3

You get Invalid column name 'RowNumber'. So, we turn again to Common Table
Expressions (CTEs) or table variables. I think the CTEs are cleaner, so I will
only illustrate that and not table variables.

Common Table Expression:

WITH t
AS (
SELECT
...
ROW_NUMBER() OVER (
PARTITION BY Category
ORDER BY Weight DESC
) AS RowNumber
FROM dbo.ContestResults
)
SELECT
...
RowNumber
FROM t
WHERE RowNumber <= 3
ORDER BY Category, RowNumber

A Real-World ROW_NUMBER() Example: Most Recent Entry by Person

I have had to use ROW_NUMBER() multiple times recently for a
specific requirement. We have a user action that happens multiple times, and we
have to pull only the most recent entry for each person. ROW_NUMBER() is
a huge winner here.

Say were storing data for a pediatrician and need to pull
information for the most recent visits by patients. This is a simple example,
but you can see the power of using ROW_NUMBER() here.