Postgres 9.3 min/max for the cidr data type

16 Mar 2015

I've become a big fan of PostgreSQL (aka PostgresDB or just Postgres, depending on where you look). Over the years its developers have added some of the best database technology around, to the point where if I do need a relational database, I will use PostgreSQL without a second thought.

I also believe that you should use a relational database for data that has relationships - you know, when you might want to compare two pieces of data against each other in some way or other.

I've found that all data is relational... ;)

I digress. This post is about the CIDR data type that PostgreSQL makes available, and how (in 9.3 and below) there are no built-in min or max functions for it. That may be coming with 9.5 I think, but it's not what we are using today.

CIDR, or Classless Inter-Domain Routing if you want to be all formal about it, is a common way of writing out IP subnet ranges. Sadly, PostgreSQL in version 9.3 and below does not have a built-in method for determining the minimum and maximum IP addresses within a CIDR range. Using a 'trick' it's actually quite easy to determine this for yourself, but it's a) non-obvious, and b) not documented in a straightforward way that I've seen. Hence this post...

So on to my use-case. MaxMind helpfully provide a set of GeoIP databases relating specific IP addresses to countries and/or cities. GeoIP identification is a common thing to use in an application, and the free country geoIP db from MaxMind specifically is pretty accurate (higher than 90%, I've seen said on the Internets, so it must be true!).

Their geoIP listing is pretty good - for example if you take the first few rows of the 'blocks' data, you get the following:
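Trimmed to the columns that matter for this post (the geoname_id for 1.0.0.0/24 comes up again further down; the other columns are elided here), the blocks data looks something like:

```
network      geoname_id  ...
1.0.0.0/24   2077456     ...
...
```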

geoname_id and the like relate to the locations data (country, region, that sort of thing). The key focus here in the blocks table is the network column.

1.0.0.0/24 of course actually means the 256 IP addresses between 1.0.0.0 and 1.0.0.255 inclusive. In the networking world the 1.0.0.255 address would actually be used for broadcast purposes, but it's still an IP of note - it's just unlikely to ever be seen by a service you are running and wanting to know the geoIP location for.

If you can use PostgreSQL natively, this entire article is pretty irrelevant, as you can use some of the more interesting functions to figure out if an IP address of 1.0.0.57 is within a CIDR range. For example:

SELECT geoname_id FROM geoip_blocks WHERE inet '1.0.0.57' <<= network;

This will return the geoname_id of 2077456.

Not an issue, right? Except if you are using the Java Persistence API (JPA), and therefore the Java Persistence Query Language (JPQL), you will quickly find that <<= (which means 'is contained within or equals') and the like are not supported. PostgreSQL is just too advanced...

To get around the lack of advanced query operators, you have to fall back to the older BETWEEN query functionality. Except that by default you can't: with no MIN or MAX, it's really hard to use BETWEEN on a single value that actually represents a range of IP addresses.

Ideally you would want to use the MIN or MAX functions to be able to return the minimum or maximum IP address. In PostgreSQL 9.3 and below they do not do this. So you are left with wanting to figure out the MIN or MAX for yourself based on the network CIDR data.

Roll in a little-known (or at least not very well documented) trick, and you can do this very easily. Knowing the trick is the majority of the battle. It turns out that if you subtract the address 0.0.0.0 from a CIDR value, PostgreSQL returns the number of IP addresses that come before the start of the range - effectively the network address as an integer. In this case:

SELECT CIDR '1.0.0.0/24' - INET '0.0.0.0';
returns:
16777216

Well, that was the easy part, right? Getting the minimum is really straightforward. So is the maximum when you use the same technique, but the thinking process you need to go through isn't quite as obvious. If you have worked in the networking field for a while, you will know that the /24 of a CIDR range is the same as the subnet mask 255.255.255.0. You have probably seen this in the ipconfig output of your own network device. What the subnet mask really means is that the first three octets (the 255.255.255. part) are fixed, narrowing things down to a single network. The last octet is 0, which means any number can go there - so 256 total possibilities (0 through 255).

This almost helps us, right? You could create a CIDR lookup table and do some math alongside the minimum value to figure all this out. Thankfully PostgreSQL also provides a function called hostmask, which returns the inverse of the subnet mask - 0.0.0.255 for a /24. This shows us which octets have what number of possible values.

This is helpful because, using the trick above, you can do the following to get the maximum IP address number within a single CIDR range:
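A minimal sketch of the idea: convert both the network address and its hostmask to numbers, and add them together.

```sql
-- network number (16777216) plus hostmask number (255)
SELECT (cidr '1.0.0.0/24' - inet '0.0.0.0')
     + (hostmask(cidr '1.0.0.0/24') - inet '0.0.0.0');
```

which returns 16777471 - the number of the last address in the range, 1.0.0.255.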

So now we have a way of getting the minimum or maximum of a specific range, and by translating a single IP address into a number, we can use that information with the BETWEEN clause in a SELECT statement to figure out which network CIDR range an IP address is located within. As an example, our test value of 1.0.0.57 just happens to be IP address number 16,777,273 in the IPv4 world (counting 0.0.0.0 as address zero).

SELECT INET '1.0.0.57' - INET '0.0.0.0';
returns:
16777273

16777273 of course is in between 16777216 and 16777471. This is good enough for JPQL, as long as the minimum and maximum data is in a table already. Given we have to create and then update the geoIP data tables from time to time (monthly for the free tables, more often for the paid-for subscription from MaxMind), we can easily add a couple of extra columns and update them with the minimum and maximum values.
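As a sketch, assuming the stored values live in columns I'll call ip_min and ip_max (the names are mine, not MaxMind's), the lookup becomes a plain BETWEEN that JPQL is happy with:

```sql
-- 16777273 would be computed application-side from 1.0.0.57
SELECT geoname_id FROM geoip_blocks
WHERE 16777273 BETWEEN ip_min AND ip_max;
```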

It's all well and good using commands that don't really touch data in tables - so an example of running an update across an entire table to populate the MIN/MAX fields would actually look like this:
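A sketch of that update, using hypothetical ip_min and ip_max columns (the column names are mine):

```sql
ALTER TABLE geoip_blocks ADD COLUMN ip_min bigint, ADD COLUMN ip_max bigint;

UPDATE geoip_blocks
SET ip_min = network - inet '0.0.0.0',
    ip_max = (network - inet '0.0.0.0') + (hostmask(network) - inet '0.0.0.0');
```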

Yes, there are 177,615 different network blocks as I write this in the MaxMind Country geoIP database.

So, how best to do this from start to finish? Why, with Ansible of course! The following is the general way I write an Ansible script, and thankfully MaxMind provide an MD5 checksum of their data files, so I use that to see if I need to download data or not.

First in my process, I start with what I need to accomplish in pseudo code:

keep copy last md5 checksum for comparison
get latest md5 checksum
compare the md5 files
if different delete old data files
if different download new data files
if different extract new data files
if different import into database

This really isn't something I feel a role is good for (it's an infrequent set of commands that we do not need to run very often - maybe once a day at most), so this becomes its very own playbook. I'm using Ansible 1.7.x in this example, as there are still a few backwards-incompatibility bugs in 1.8.x as I write this, so just be aware this probably does not work in 1.8.x or higher (specifically, using a variable in an include path). I also don't use some of the core modules (using wget directly is probably a sin in the eyes of the creator of Ansible ;) ) for good reasons - I don't always need to be idempotent, and some of the core modules do bizarre things in my OS of choice.

Then I go through and expand each line into either multiple commands or just one, depending on how good my pseudo code skills are on the day. As an example, take the 'get latest md5 checksum' line.
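That pseudo code line might expand into a raw task along these lines (the URL and file paths here are illustrative placeholders, not the real MaxMind ones):

```yaml
- name: get latest md5 checksum
  raw: wget -q -O /tmp/GeoIP-Country.zip.md5 http://example.com/GeoIP-Country.zip.md5 && cat /tmp/GeoIP-Country.zip.md5
  register: md5new
```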

Yes, I used raw - it gives me a consistent result every time I run it across Ansible version updates (something I've had issues with more than once), and in this case I know I have to run the command - it really can't be idempotent at this point of the script.

Don't worry if the below is confusing - I will eventually explain it all in future posts...

What I ended up with of course was something slightly longer-winded (some lines sanitised of course!):
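A rough sketch of the shape it took (hosts, paths and URLs below are placeholders standing in for the sanitised originals):

```yaml
- hosts: db-servers
  tasks:
    - name: keep copy of last md5 checksum for comparison
      raw: cat /opt/geoip/GeoIP-Country.zip.md5 || true
      register: md5old

    - name: get latest md5 checksum
      raw: wget -q -O /tmp/GeoIP-Country.zip.md5 http://example.com/GeoIP-Country.zip.md5 && cat /tmp/GeoIP-Country.zip.md5
      register: md5new

    - name: download new data files
      raw: wget -q -O /opt/geoip/GeoIP-Country.zip http://example.com/GeoIP-Country.zip
      when: md5new.stdout != md5old.stdout

    - name: extract new data files
      raw: unzip -o /opt/geoip/GeoIP-Country.zip -d /opt/geoip
      when: md5new.stdout != md5old.stdout

    - name: import into database
      raw: psql -h dbhost -U geoip -d geoip -f /opt/geoip/geoip_import.sql
      when: md5new.stdout != md5old.stdout

    - name: store new md5 checksum for next run
      raw: cp /tmp/GeoIP-Country.zip.md5 /opt/geoip/GeoIP-Country.zip.md5
      when: md5new.stdout != md5old.stdout
```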

If you are used to reading Ansible scripts, you will have already seen that my conditional on whether I should download the bigger data zip is based on a when clause:

when: md5new.stdout != md5old.stdout

It would be great to do that with a simple 'set a variable so that I can compare the md5 files', but that's not the direction Ansible went, so I improvised and used cat to accomplish the same concept.

cat as a command is pretty useful for doing that comparison - there are many other ways too, I'm sure! set_fact is not one of those ways, or at least wasn't in any of the attempts I've made to use that particular module ( /rant!).

You will also have noticed that there is a SQL file referenced. And for the very astute, no db password :) I use another trick for storing db passwords (a .pgpass file in the home directory of the user running the SQL commands), as I hate them being listed in Ansible directly, for what are hopefully obvious reasons.

The SQL file is a bit more involved, as we have to do a few things (back to pseudo code!):
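The rough shape, in my own wording:

create temporary versions of the geoip tables
import the new CSV data into the temporary tables
populate the min/max columns using the subtraction trick
drop the old tables
rename the temporary tables into place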

To make sure everything works as intended, and to minimise production interruptions, all of this is run in such a way that if it fails, it fails before the table switching occurs. The SQL could also easily be broken up into two consecutive commands: one to create the temporary data, then one that only runs the table switch and renames if all is good with the create. That would be sensible for data sources where you can't trust that the data will be in the same format (or consistent in quality!) each time. MaxMind in my experience give consistently high quality data, so I didn't feel it was necessary to go to that extreme in this example (my testing did, but that's because I didn't trust my own SQL writing skills!).

The entire script takes ~5 seconds to run, but the really important piece is when the tables are switched, and by changing the method to drop/rename rather than drop/create, our total outage time is less than 0.1 seconds. If desired this can also all be encapsulated within a non-locking commit, but if your application engineers are good, they will be able to handle a 0.1 second outage within their code. I would never ask them to handle a 5 second outage though - that's just asking for trouble :P
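The switch itself boils down to just a couple of statements, along these lines (the temporary table name is illustrative):

```sql
BEGIN;
DROP TABLE geoip_blocks;
ALTER TABLE geoip_blocks_new RENAME TO geoip_blocks;
COMMIT;
```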

It's also possible to run the above as a full schema swap (i.e. tables/indexes/constraints keep their names; all you do is drop the old schema and rename the temp schema). From a timing perspective this takes about the same time, and in some of my testing it was actually 10 to 20 milliseconds slower. Of course some of the time it was slightly faster, so I'll write that off as networking variances within AWS.

Well, there you have it - a real world example of creating MIN/MAX for JPQL usage, while keeping all the awesome functionality of the CIDR type within PostgreSQL for when it can actually be used natively. One day we'll get MIN/MAX as functions within PostgreSQL - one day soon, I think - but we needed geoIP queries via JPQL today, and given the above, we already have that.