Description

In some ways, window functions are similar to aggregate functions in that they perform calculations across a set of rows. However, unlike aggregate functions, the output is not grouped into a single row.

Window function queries are characterised by the OVER keyword, following which the set of rows used for the calculation is specified. By default, the set of rows used for the calculation (the "window) is the entire dataset, which can be ordered with the ORDER BY clause. The PARTITION BY clause is used to reduce the window to a particular group within the dataset.

Each department has a separate sequence of ranks due to the PARTITION BY clause. This particular sequence of values for rank() is given by the ORDER BY clause inside the window function’s OVER clause. Finally, to get our results in a readable format we order the data by dept and the newly generated ranking column.

Now, we need to reduce the results to find only the top 5 per department. Here is a common mistake:

Trying to filter only the first 5 values per department by putting a where clause in the statement does not work, due to the way window functions are computed. The computation of window functions happens after all WHERE, GROUP BY and HAVING clauses have been completed, right before ORDER BY, so the WHERE clause has no idea that the ranking column exists. It is only present after we have filtered and grouped all the rows.

To counteract this problem, we need to wrap our query into a derived table. We can then attach a where clause to it: