This page presents an annotated bibliography of papers and datasets
related to the field of Internet Protocol (IP) address geolocation.
Many applications require the association of Internet resources with an
accurate geographic label at some granularity. For some applications
knowing the country of origin might be sufficient; for others a more
precise indication at state, city or zip code granularity, or even a specific
latitude/longitude is needed. Below we provide an overview of published
literature related to geolocation in an attempt to describe the current
state of the art. We conducted this literature search as part of our
efforts to
compare geolocation tools.

Introduction

IP address geolocation recalls the classic bumper sticker,
"think globally, act locally." In today's far-reaching Internet,
organizations and institutions of all kinds, from corporations to
governments, want exactly that: the ability to communicate with the
entire world and, at the same time, to develop applications that
help them target, limit, and customize their messages, balance
resources, and coordinate responses based on the location of the receiver.
Organizations accomplish this by using tools and services that
translate an IP address or prefix range into a geographic location
(country, state, city, zip, geographic latitude/longitude) associated
with the address(es). Simple, right?

However, which method(s) work best? Which sources of geolocation
services and information return the most reliable locations and at what
cost? What is the geographic resolution? Further, if a source
provides the geographic location of the owner of an IP address, is
this location the same as the location where the device is actually
sending and receiving packets? And, if different, can the
difference be quantified?

What constitutes a "good" geolocation result? Some numbers: with a total land area of
1.5×10⁸ km² and 195 countries, the average country size on Earth is about
7.7×10⁵ km², or a linear size of 880 km. The surface area of the US is about
10⁷ km². With 50 states, over 3,000 counties, and on the order of 43,000 zip codes, the
average linear size of a state, county or zip code is about 450, 55 and 15 km,
respectively. Looking at another big country, China (about the same size as
the USA) has 33 provinces, 333 prefectures, about 3,000 counties, and about
42,000 townships, giving sizes of 550, 170, 60 and 18 km, respectively.
To begin to be useful, a geolocation method would at the very least need to
pinpoint the correct country and, in large countries like the USA or China,
the correct state or province. Given the numbers above, this requires geolocation
errors of at most a few hundred kilometers. To be effective at a truly local level
(county or zip code), accuracy measured in tens of kilometers is required.
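
For concreteness, the back-of-the-envelope arithmetic above is easy to reproduce.
A minimal Python sketch (the areas and region counts are the approximate figures
quoted above, treating each region as a square):

    import math

    def linear_size_km(total_area_km2, num_regions):
        """Average linear size (km) of a region, modeled as a square."""
        return math.sqrt(total_area_km2 / num_regions)

    EARTH_LAND_KM2 = 1.5e8   # total land area of Earth
    USA_KM2 = 1.0e7          # approximate surface area of the USA

    print(round(linear_size_km(EARTH_LAND_KM2, 195)))   # country: ~880 km
    print(round(linear_size_km(USA_KM2, 50)))           # US state: ~450 km
    print(round(linear_size_km(USA_KM2, 3000)))         # US county: ~58 km
    print(round(linear_size_km(USA_KM2, 43000)))        # US zip code: ~15 km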

Useful Definitions

A number of concepts are commonly encountered in the geolocation literature.
We define the main ones here.

A Vantage Point (VP) is a measurement infrastructure node with a known geographic location.

A Landmark is a responsive Internet identifier with a known location, to which a VP
launches measurements that serve to calibrate other measurements toward potentially
unknown geographic locations. Some papers use the term Active Landmarks for points
that act as both landmark and vantage point; these are often part of an
infrastructure platform like PlanetLab.

A Target is an Internet identifier whose location is to be inferred by a given method.
Typically some targets have known geographic locations (ground truth), which researchers can use to evaluate the accuracy of their geolocation methodology.

A Location is a geographic place that geolocation techniques attempt to infer for a given target. Examples include cities
and ISP Points of Presence (PoPs).

Not all terms are used in all papers.

Geolocation Papers

The tables below contain annotations for papers on the topic of geolocation.
We have collected and reviewed papers published between 1996 and 2010,
starting with papers from
peer-reviewed academic research conferences, and then including
papers cited by this initial set, as well as follow-up papers
written by the same authors. We provide a flexible interactive
table
that supports selection of relevant attributes from these papers.

The first table emphasizes papers that directly address geolocation
methodology, introducing new methods, extensions to previous methods,
performance analysis, etc. The second table includes papers that
address other geolocation-related issues, including applications of
geolocation, and coordinate-based methods for modeling network delays.

Alongside author and publication information, the tables include a
number of additional columns.

Data describes the type of data on which the results claimed in the paper
are based. We mention here if the paper describes "ground truth" (authoritative mappings
between IP addresses and geographic locations) used to validate geolocation
results.

Findings gives a brief description of the main results claimed in the paper.

Probes gives an indication of the experimental setup (probes, landmarks, targets)
used in a geolocation experiment (where appropriate).

Click on a checkbox below to show that attribute for each paper in a separate
column. Enter text in the 'Filter' field to limit the listing. Below
this table is a similar table of attributes of the data sets analyzed
in this set of papers.

The Findings annotations below, extracted from the table, give a flavor of the level of detail:

- GeoCluster most promising, with median errors of 28 km (well-connected hosts) to a few hundred km. Median errors for geolocation on the same set of university hosts: 28 km for GeoCluster; 102 km for GeoTrack; 382 km for GeoPing.

- Explores several issues related to implementation of GeoPing-type geolocation: correlation between RTT and geographic distance; optimal placement of landmarks and probes; methods for evaluating similarities between delay patterns.

- Overview of geolocation methods with general discussion of limitations; discussion of ways adversaries can avoid geolocation; mentions extraction of the IP address using a Java applet, and RTT measurement by HTTP refreshes.

- No geolocation method is robust (i.e., works for all IP addresses, network configurations, and against adversarial users); those trying to evade geolocation can complicate the task for locators, but geographic information can leak in many ways.

- Combines an active measurement approach with an active web-mining technique. Uses CBG for "coarse" geolocation; refines the location using "relative network distance" in combination with a large number of landmarks located via the web-mining technique.

- Databases are strongly biased toward popular countries; database IP blocks follow official advertisements of ISPs; while some of the ISP address space is geolocated decently (e.g., 20% of MaxMind entries within tens of km of ground truth), in most cases databases are off by hundreds to thousands of km.

- Method to generate PoP-level geographic maps from an IP-level graph based on DIMES traceroutes. PoP identification is based on structure ('motifs') and a partitioning algorithm that assigns nodes to PoPs; geographic locations are assigned to PoPs from geolocation databases.

- Statistical analysis of geographic properties of IP prefixes in the context of implications for routing policies. Uses undns for geolocation (i.e., IP-to-geographic-location mapping based on geographic information in DNS names). Data: 170,000 IP prefixes from RouteViews (27-Feb-2005); traceroutes to CoralCDN clients and servers; traceroutes from PlanetLab hosts to 4 IPs per prefix. Findings: discontiguous prefixes announced by an AS from a single location are usually due to fragmented allocation by registries; announcement of contiguous prefixes by an AS from different geographic locations limits opportunities for prefix aggregation.

Measurement Infrastructure

The IP geolocation bibliography above references several active measurement
infrastructures, either because they are used directly in a geolocation
experiment, or because datasets obtained with the infrastructure are
analyzed. These resources are listed here with references back to the papers.

Current geolocation techniques can be broadly divided into two categories: database-driven
(or registry-based; P-25) and measurement-based.
This categorization mirrors a similar division in the types of geographic information
available for IP geolocation: qualitative data and numerical (quantitative) data. Both
have been present in geolocation efforts from the outset.

The class of quantitative data includes the workhorse of measurement-based geolocation
methods: delay measurements from probes to landmarks and targets. A number of
publications establish the relationship between Internet delay and geographic distance
(P-06,
P-09,
P-10,
P-25)
in the presence of obfuscating factors such as circuitous routing, buffering, and other
delays (P-11, P-21).
Also included in this class is network topology information, typically derived from traceroute
measurements. Topology information can be an integral part of a geolocation
algorithm (e.g., when intermediate routers to an end target are geolocated alongside
the target itself in a global optimization; P-10),
but is also used in simpler arguments that relate topological proximity to geographic
proximity (e.g., when geolocating the last intermediate router when the real target
is unreachable). Hop counts (also derived from traceroutes) are explored in a recent
paper (P-24) as another quantitative measure
of geographic distance.
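
As a concrete illustration of how delay constrains distance, the following sketch
converts a measured RTT into an upper bound on the great-circle distance between
probe and target. (The 2/3-of-c fiber speed is a commonly used rule of thumb, not
a value taken from any specific paper.)

    SPEED_OF_LIGHT_KM_PER_S = 299_792   # speed of light in vacuum
    FIBER_FACTOR = 2.0 / 3.0            # rule of thumb: signals in fiber travel at ~2/3 c

    def max_distance_km(rtt_ms):
        """Upper bound on probe-target distance implied by a round-trip time.

        The one-way delay is at most rtt/2; queuing, processing, and
        circuitous routing can only make the true geographic distance
        smaller than this bound, never larger.
        """
        one_way_s = (rtt_ms / 1000.0) / 2.0
        return one_way_s * SPEED_OF_LIGHT_KM_PER_S * FIBER_FACTOR

    # A 30 ms RTT confines the target to within ~3000 km of the probe.
    print(round(max_distance_km(30.0)))

CBG-style methods intersect such constraint circles from many vantage points;
because the obfuscating factors above inflate measured delays, the bound is loose
in practice, which motivates the calibration refinements discussed below.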

The class of qualitative data includes the usual suspects (WHOIS registry, DNS LOC
records, DNS names, BGP routing tables), but also databases built from information
gathered from the Internet community (either directly through user input, or indirectly,
e.g., by parsing large quantities of URLs; P-17).
This probably also includes the various types of proprietary databases used in
commercial geolocation products. All of these contain geographic
information (directly as in DNS LOC records, or indirectly by linking to an organization
or AS number) that, if correctly interpreted, provides clues about the geographic
location of an IP address, or IP address block.

The earliest geolocation attempts, GTrace
(P-02; constructed around
NetGeo)
and GeoTrack and GeoCluster
(P-04), emphasize qualitative data
(primarily WHOIS records and DNS names), but delay (RTT) measurements
are already incorporated. GTrace uses RTT data to validate results using "speed-of-light"
arguments; GeoPing (P-04) is purely RTT-based.
From these early attempts a number of measurement-based algorithms have appeared
in the academic literature.
The table below provides an overview of the accuracy achieved by the various techniques.
In this list only the first three are database-driven; all others (starting with GeoPing)
are measurement-based.

GeoPing (P-04)
uses similarities between "fingerprints" (based on delay measurements
from a set of probes) of the target and of landmarks, and selects the location of
the landmark with the most similar fingerprint as the target location. As the first
measurement-based geolocation method, it appears to be mostly of historical
significance at this point. Constraint-based geolocation
(CBG; P-07)
uses deterministic geometric constraints derived from delay measurements to bound
the probable location of a target; it has set the stage for subsequent development,
and is the most common benchmark against which more recent methods are compared.
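
A minimal sketch of the GeoPing idea as described above (illustrative only; the
fingerprint format and the Euclidean similarity metric are simplifying assumptions,
not the exact method of P-04):

    import math

    def geoping_locate(target_fp, landmarks):
        """Return the location of the landmark whose delay fingerprint
        is most similar to the target's.

        target_fp: list of RTTs (ms), one per probe.
        landmarks: list of (location, fingerprint) pairs, with
                   fingerprints measured from the same probes.
        """
        best_location, _ = min(landmarks,
                               key=lambda lm: math.dist(target_fp, lm[1]))
        return best_location

    landmarks = [
        ("San Diego", [12.0, 55.0, 80.0]),   # hypothetical fingerprints
        ("New York",  [70.0, 15.0, 20.0]),
    ]
    print(geoping_locate([65.0, 18.0, 25.0], landmarks))   # -> New York

CBG replaces this nearest-neighbor selection with the intersection of distance
constraints of the kind sketched in the previous code fragment.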

Subsequent geolocation methods show an increasing sophistication in extracting geographic
information, either by supplementing delay measurements with additional data, or by
applying more complex algorithms. Topology-based geolocation
(TBG; P-10)
introduces topology measurements to simultaneously geolocate intermediate routers and targets.
Further refinements include an improved analysis of delay measurements (separating
the distance-sensitive propagation delays from other processing delays;
P-11, P-21),
incorporating database-driven approaches to improve geolocation accuracy
(P-10), and integrating hop counts into
the geolocation algorithm
(P-24).
Algorithms are also evolving. The most recent models favor probabilistic approaches,
which seem to be a better match to the essentially statistical nature of the relation
between geographic distance and delay measurements. GeoWeight
(P-20)
marks a transition by combining deterministic constraints, similar to CBG, with
probability assignments;
P-18,
P-22,
P-24 and
P-25 describe delay measurements using probability density functions, and
use various statistical methods to build a geolocation algorithm.
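
To make the probabilistic flavor concrete, here is a toy maximum-likelihood sketch.
It is not any specific paper's algorithm: the linear 0.02 ms/km delay-vs-distance
baseline, the Gaussian residual, and the grid bounds are illustrative assumptions.

    import math

    def approx_distance_km(p, q):
        """Crude equirectangular distance between (lat, lon) points;
        adequate for a toy example."""
        (lat1, lon1), (lat2, lon2) = p, q
        kx = 111.3 * math.cos(math.radians((lat1 + lat2) / 2))
        return math.hypot((lat1 - lat2) * 111.3, (lon1 - lon2) * kx)

    def log_likelihood(candidate, observations, sigma_ms=10.0):
        """observations: list of ((lat, lon), rtt_ms), one per probe.
        Assumes RTT is Gaussian around 0.02 ms/km of distance."""
        total = 0.0
        for probe_loc, rtt in observations:
            expected = 0.02 * approx_distance_km(candidate, probe_loc)
            total -= (rtt - expected) ** 2 / (2.0 * sigma_ms ** 2)
        return total

    def locate(observations):
        """Grid search over a 1-degree lattice covering the continental US."""
        candidates = [(lat, lon) for lat in range(25, 50)
                                 for lon in range(-125, -66)]
        return max(candidates, key=lambda c: log_likelihood(c, observations))

Normalizing such scores over the candidate grid yields a probability surface rather
than a single point, which is roughly the form in which the probabilistic papers
cited above reason about target location.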

Few detailed descriptions of database-driven techniques exist in the literature. The
exceptions are NetGeo (P-03)
and Structon (P-20). Not surprisingly,
published literature contains little concrete information about algorithms employed
in commercial geolocation products. Whether the qualitative
input data are web pages, WHOIS registry records, or DNS names, a database-driven
geolocation algorithm tends to be a collage of various heuristic arguments, approximations,
and intelligent guesswork.

Error Distance Matrix

The table below compiles numbers from geolocation experiments described in the above
publications for measurement-based techniques. The column headers indicate a range
of median errors in geolocation distance reported in the papers; the values in the
columns are the number of experiments that report median errors in the indicated range.
Even though direct comparison of these numbers is tricky
due to wide variations in experiment characteristics (different types of targets,
different sets of landmarks, etc.), the picture that emerges is that state-of-the-art
measurement-based techniques can comfortably geolocate targets with median errors
of < 250 km, while some techniques under favorable conditions approach
an accuracy of < 100 km. To put this in context: 1000 km can roughly be viewed
as country granularity, while 50 km approaches city or zip code granularity.

A direct comparison between measurement-based and database-driven approaches, or even
just between measurement-based algorithms, is tricky at best. A systematic comparison
would require a reliable "ground truth" database of IP addresses at known geographic
locations, which is difficult to find. In practice, the pool of potential test targets
at known locations is limited: most recent published experiments select their ground
truth from hosts in measurement infrastructures like PlanetLab in North America or
Europe. So, even though hard to quantify, the ground truth in different published
experiments is probably similar.
In some papers the same ground truth is used to compare different algorithms
(typically CBG is used as a benchmark, which explains the high number of entries for
CBG in the above table), providing some insight into comparative performance.
Obvious questions remain, though. How representative are results based on a
limited number of PlanetLab targets for the Internet as a whole? How much does
the accuracy of a method vary from well-connected hosts (routers) to a
heterogeneous collection of end hosts? Looking at the above table, the median
errors for CBG experiments vary from better than 50 km to more than 500 km (one
order of magnitude), presumably reflecting wide variation in experiment
characteristics.

In an average sense the performance of the best geolocation techniques can be
quantified reasonably well: the best measurement-based methods have median errors
of at most a few hundred km (well within country granularity), with the best
results perhaps approaching 50 km (city or zip code level). Similarly, database-driven
techniques appear to do quite well at the country level, but start running
out of steam at the city level. Whether database-driven or measurement-based,
all techniques suffer from what might be called an outlier syndrome: they are
plagued by outliers with location errors well exceeding 1000 km (i.e., beyond
country granularity). For any potential application of geolocation, the key question
to ask is whether being right most of the time is good enough. If the answer is yes,
a secondary question is whether the average accuracy of a selected algorithm
is satisfactory.