Improving IP Geolocation using Query Logs

IP geolocation databases map IP addresses to their geographical
locations. These databases are important for several applications such
as local search engine relevance, credit card fraud protection,
geotargeted advertising, and online content delivery. While they are
the most popular method of geolocation, they can have low accuracy at
the city level. In this paper we evaluate and improve IP geolocation
databases using data collected from search engine logs. We generate a
large ground-truth dataset using real-time global positioning data
extracted from search engine logs. We show that incorrect geolocation
information can have a negative impact on implicit user metrics. Using
this dataset, we measure the accuracy of three state-of-the-art commercial
IP geolocation databases. We then introduce a technique to improve
existing geolocation databases by mining explicit locations from query
logs. We show significant accuracy gains in 44 to 49 out of the top 50
countries, depending on the IP geolocation database. Finally, we
validate the approach with a large-scale A/B experiment that shows
improvements in several user metrics.
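
As an illustration of how a geolocation database might be scored against
GPS-derived ground truth, the following is a minimal sketch and not the
evaluation procedure used in this paper. It assumes each IP address maps
to a single (latitude, longitude) pair in both the ground truth and the
database, and it counts a prediction as correct when it falls within a
fixed radius of the true position. The function names, the dictionary
layout, and the 50 km default radius are hypothetical choices made for
the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_within(ground_truth, database, radius_km=50.0):
    """Fraction of ground-truth IPs that the database places within
    radius_km of the true position. IPs missing from the database are
    skipped (an assumption; they could also be counted as errors)."""
    hits = total = 0
    for ip, (lat, lon) in ground_truth.items():
        predicted = database.get(ip)
        if predicted is None:
            continue
        total += 1
        if haversine_km(lat, lon, predicted[0], predicted[1]) <= radius_km:
            hits += 1
    return hits / total if total else 0.0


# Toy data using documentation-reserved IP addresses.
ground_truth = {
    "192.0.2.1": (47.6062, -122.3321),  # Seattle
    "192.0.2.2": (47.6062, -122.3321),  # Seattle
}
database = {
    "192.0.2.1": (47.6100, -122.3300),  # close to the true position
    "192.0.2.2": (45.5152, -122.6784),  # Portland, far from Seattle
}
score = accuracy_within(ground_truth, database)
```

A radius-based score is only one possible definition of city-level
accuracy; matching on city names or reporting median error distance
are equally plausible alternatives.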