Abstract:
Local aspects of Web search --- associating Web content and
queries with geography --- are a topic of growing interest.
However, the underlying question of how spatial variation
is manifested in search queries is still not well understood.
Here we develop a probabilistic framework for quantifying
such spatial variation; applied to complete Yahoo! query logs, our model
can localize large classes of queries to within a few miles
of their natural centers based only on the distribution
of activity for the query.
Our model provides not only an estimate of a query's geographic
center, but also a measure of its spatial dispersion, indicating
whether it has highly local interest or broader regional or national appeal.
We also show how variations on our model can track geographically shifting
topics over time, annotate a map with each location's "distinctive
queries," and delineate the "spheres of influence"
for competing queries in the same general domain.