Making a case for SQL Case Statements

The CASE statement is SQL’s method of handling your standard IF/THEN logic in Excel. You begin with CASE, followed by at least one pair of WHEN and THEN statements, and then you END it. You can be exhaustive in your WHEN/THEN statements, accounting for every possibility, or you can also use the ‘else’ functionality, and create a default value if the column you are looking at does not match any of your WHEN statements.

What’s the use case, no pun intended, for using case statements? I’m glad you asked. Anyone who works with data knows that it often needs a bit of cleaning-up or standardization before it is ready for use, and that’s where case statements can really come in handy.

Take this dataset for example. It breaks down revenue and bids for each campaign and device.

If I needed to pull mobile and desktop revenue data for each campaign (assuming in this situation you consider tablets and iphones are both considered mobile) I could pull this into excel, filter, do some sumifs, etc, and add all of my iphone & tablet data points to get total mobile revenue. Or with a simple case statement, I could easily re-classify iphones and tablets as mobile, and have SQL do all the addition for me. Much easier, much quicker, and no room for user error.

Assuming iphone, tablet, and desktop are my only 3 device options, we essentially need a case statement that looks at the device column and says “if I find “iphone” or “tablet” in device, put ‘mobile’ in my new column labeled “mob_device”, and if it’s not one of those, just copy device over to the new column”. There’s more than one way to accomplish this, but here’s what that could look like:

You can see in the new column created by the case statement that both iphones and tablets are now labeled “mobile”. When the boolean expression evaluates to true, the new value ‘mobile’ is placed in the mob_device column. If it evaluates to false, the else statement takes care of it and places whatever value it already has in the device column into the mob_device column. I included both columns in my select statement, but there’s really no need to include the device column at this point. To find out exactly what my revenue is for each mobile or desktop campaign, all I need to do is a simple sum().

There are more simplistic methods for using a case statement, but evaluating a boolean expression (in this case “device = ‘tablet'” as an example) in the WHEN gives you a lot of free reign. You aren’t just limited to testing equality, does A = b ?, you can compare using operators like greater than (>) or less than (<). Is A > B, is A < B, etc.

Using this same data source as an example, if I wanted to increase bids for all campaigns where revenue was greater than $100, my case statement might look something like this.

So as you can see, there’s really nothing too fancy about case statements, but when properly implemented they can be a huge time-saver and well worth your while. Next time you’re facing a messy dataset, breath deeply and consider bringing in a few case statements to help get your data in line.