Using Amazon DynamoDb for IP and co-ordinate based geo-location services part 7: querying the IPv4 range table

In the previous post we loaded the limited IP range records into DynamoDb. As we’re only talking about roughly 250 records, we could have added them in code one by one. However, that strategy would never work for the full MaxMind IP data set of 10 million records. So instead we looked at the built-in Import/Export functionality in DynamoDb. You’ll be able to go through the same process when you’re ready to import the full data set.

In this post we’ll see how to query the IP range database to extract the ID of the nearest geolocation. We’ll get to use the AWS Java SDK.

Querying DynamoDb

We won’t go into the details of querying DynamoDb in general. You can read about it in the AWS documentation pages if you want to know more. We’ll concentrate on the specific query to get one data record from our test IP range table.

First off we’ll need the AWS Java SDK in our project. You can either download it from here or from the Maven repository if you have a Maven project:

An alternative is to save the credentials in a file called “credentials” with no file extension. On Windows I had to save this in my user directory in a folder called “.aws”. Note the ‘.’ in the folder name. So in my case I saved the file in the c:\users\andras.nemes\.aws folder. The file contents follow a special format to save the AWS keys:

Again, make sure you set the correct DynamoDb region endpoint, otherwise the code won’t find your table. This page lists the region endpoints for every Amazon service that you can use in your code.

The following code will show you a way to find the geoname ID belonging to a single IP address. I selected an IP that exists in the limited IP range table so that I know for sure that we’ll get a valid result. We’ll go through the code in words afterwards:

We first declare the IP to be searched for and the name of the DynamoDb table. Next we find the head element and decimal form of the IP the same way as we did before. Here comes a reminder of the convertIpToDecimalValue method:
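As a reminder of what the conversion does, here is a minimal sketch. It assumes plain dotted IPv4 notation and that the head element is simply the first octet of the address. The class name and the extractHeadElement helper are hypothetical names used for illustration only; only convertIpToDecimalValue is named in this series.

```java
// A sketch only: IpConversionSketch and extractHeadElement are hypothetical names.
class IpConversionSketch {

    // Converts a dotted IPv4 string such as "1.0.0.5" to its decimal value
    // by treating the four octets as digits of a base-256 number.
    static long convertIpToDecimalValue(String ip) {
        String[] octets = ip.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long result = 0;
        for (String octet : octets) {
            int value = Integer.parseInt(octet);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("Octet out of range: " + octet);
            }
            result = result * 256 + value;
        }
        return result;
    }

    // Assumption: the head element is the first octet of the address.
    static int extractHeadElement(String ip) {
        return Integer.parseInt(ip.trim().split("\\.")[0]);
    }

    public static void main(String[] args) {
        String ip = "1.0.0.5";
        System.out.println(extractHeadElement(ip));      // 1
        System.out.println(convertIpToDecimalValue(ip)); // 16777221
    }
}
```

For “1.0.0.5” the calculation is 1 × 256³ + 0 × 256² + 0 × 256 + 5 = 16777221.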

Then comes a series of query declarations. We want to use as many parameters as possible to narrow down the range of records that must be scanned: the head element, the lower limit and the upper limit of the IP range. If the search isn’t narrow enough, especially if we leave out the head element, then the query will need to scan all records until it finds a match. If the matching record is located towards the end of the full IP range table with its millions of rows, then our query will almost certainly fail with an exception: either the read throughput or the upper limit of scanned rows, which is slightly above 12k, will be exceeded.

In the above case we want to find the record where the decimal value of the IP we’re looking for lies between the upper and lower limit of an IP range where the IP range head element is equal to the head element of the IP.
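To make the matching rule concrete, here is a small in-memory sketch of the same condition. It does not call DynamoDb at all. The IpRange class, its field names and the range bounds in main are hypothetical illustration values, not the actual table schema or data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// A sketch only: an in-memory stand-in for the DynamoDb query condition.
class RangeMatchSketch {

    // Hypothetical representation of one row of the IP range table.
    static final class IpRange {
        final int head;
        final long lower;
        final long upper;
        final long geonameId;

        IpRange(int head, long lower, long upper, long geonameId) {
            this.head = head;
            this.lower = lower;
            this.upper = upper;
            this.geonameId = geonameId;
        }
    }

    // Head elements must be equal and the decimal IP must lie within [lower, upper].
    static boolean matches(IpRange range, int head, long decimalIp) {
        return range.head == head
                && range.lower <= decimalIp
                && decimalIp <= range.upper;
    }

    // Returns the first range satisfying the condition, if any.
    static Optional<IpRange> find(List<IpRange> ranges, int head, long decimalIp) {
        return ranges.stream()
                .filter(r -> matches(r, head, decimalIp))
                .findFirst();
    }

    public static void main(String[] args) {
        // Illustrative bounds only; they are not copied from the real table.
        List<IpRange> ranges = Arrays.asList(
                new IpRange(1, 16777216L, 16777471L, 2077456L));
        find(ranges, 1, 16777221L)
                .ifPresent(r -> System.out.println(r.geonameId));
    }
}
```

The head element check comes first in the condition for the same reason it matters in the real query: it discards most candidates before the range comparison is evaluated.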

The query will return a collection of items. We’re only expecting one so we extract it using iterator.next().
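Calling next() directly is fine here because we picked an IP that we know is in the table. For arbitrary input it may be safer to check hasNext() first, since a standard java.util.Iterator throws NoSuchElementException when there is nothing left. A minimal sketch, assuming the query result exposes a regular Iterator; firstItem is a hypothetical helper name:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.Optional;

// A sketch only: FirstItemSketch and firstItem are hypothetical names.
class FirstItemSketch {

    // Returns the first element if there is one, otherwise an empty Optional.
    static <T> Optional<T> firstItem(Iterator<T> iterator) {
        return iterator.hasNext() ? Optional.of(iterator.next()) : Optional.empty();
    }

    public static void main(String[] args) {
        Iterator<String> found = Arrays.asList("2077456").iterator();
        Iterator<String> nothing = Collections.<String>emptyList().iterator();

        System.out.println(firstItem(found).orElse("no match"));   // 2077456
        System.out.println(firstItem(nothing).orElse("no match")); // no match
    }
}
```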

If you run this code then the geoname ID should be 2077456. IP “1.0.0.5” gives a decimal value of 16777221 which lies between the upper and lower limit of the following data record in DynamoDb:

We’ll go through a similar process for the longitude-latitude ranges starting from the next post.