Today, we continue to explore all the powerful and fun things you can do with SQL. SQL is a very expressive language, and when it comes to analyzing your data, there isn't a better option. You can see the evidence of SQL's power in all the attempts made by NoSQL databases to recreate the capabilities of SQL. So why not just start with a SQL database that scales? (Like my favorites, Postgres and Citus.)

Today, in the latest post in our "Fun with SQL" series (earlier blog posts were about recursive CTEs, generate_series, and relocating shards on a Citus database cluster), we're going to look at window functions in PostgreSQL. Window functions are key in various analytic and reporting use cases where you want to compare and contrast data. Window functions allow you to compare values between rows that are somehow related to the current row. Some practical uses of window functions can be:

Finding the first time all users performed some action

Finding how much each users' bill increased or decreased from the previous month

Find where all users ranked for some sub-grouping

The Basic Structure of a Window Function in Postgres

Window functions within PostgreSQL have a built-in set of operators and perform their action across some specific key, but they can have two different syntaxes that express the same thing. Let's take a look at a simple window function expressed two different ways:

The first format:

SELECT last_name,
salary,
department,
rank() OVER (
PARTITION BY department
ORDER BY salary
DESC)
FROM employees;

The second format:

SELECT last_name,
salary,
department,
rank() over w
FROM employees
WINDOW w as (partition by department order by salary).

With the first query, we can see the window function is inlined, whereas the second, it is broken out separately. Both of the above queries produce the same results:

Both of these show the last name of employees, their salary, their department — and then rank where they fall in terms of salary in their department. You could easily combine this with a CTE to then find only the highest paying ( where rank = 1) or second highest paying ( where rank = 2) in each department.

What Can You Do With Window Functions in Postgres?

Within Postgres, there are a number of window functions that each perform a different operation. You can check the PostgreSQL docs for the full list, but for now, we'll walk through a few that are particularly interesting:

rank — As we saw in the earlier example, rank will show where the row ranks in order of the window order.

percent_rank — Want to compute the percent where the row falls within your window order? percent_rank will give you the percentage ranking based on your window think of it as ((rank - 1) / (total rows - 1))

lag — Want to do your own operation between rows? Lag will give you the row value xrows before your current row. Want to the value for future rows? You can use lead for that. A great example of this could be computing month over month growth

ntile — Want to compute what percentile values fall in? ntile allows you to specify a percentile to group buckets into. For 4 quartiles you would use ntile(4), for percentile of each row you would use ntile(100).

Hopefully, you'll find window functions as useful. If you have questions about using them, the PostgreSQL docs are a great resource. Or, feel free to jump into our Slack channel.