Abstract:
Sponsored search is one of the enabling technologies for today's Web search
engines. It consists of matching ads to the user's query and showing them
on the search engine results page. Users are likely to
click on topically related ads, and advertisers pay only when a
user clicks on their ad. Hence, it is
important to predict whether an ad is likely to be clicked and thereby to
maximize the number of clicks. We investigate the sponsored search problem from a machine
learning perspective with respect to three main sub-problems: how to use click
data for training and evaluation, which learning framework is most
suitable for the task, and which features are useful for existing
models. We perform a large-scale evaluation based on data
from a commercial Web search engine. Results show that it is
possible to learn and evaluate directly and exclusively on click data
encoding pairwise preferences following simple and conservative
assumptions. We find that online multilayer perceptron learning, based
on a small set of features representing content similarity of
different kinds, significantly outperforms an information retrieval
baseline and other learning models, providing a suitable framework for
the sponsored search task.
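
As an illustration of how click data might encode pairwise preferences, the following minimal sketch assumes one common conservative rule: a clicked ad is preferred over every unclicked ad displayed above it. The abstract does not state which assumptions are used, so this rule, the function names, and the pairwise-accuracy measure are all hypothetical choices made for the example.

```python
from typing import Dict, List, Set, Tuple


def pairwise_preferences(ranked_ads: List[str],
                         clicked: Set[str]) -> List[Tuple[str, str]]:
    """Return (preferred, less_preferred) ad pairs for one query impression.

    Assumption (not taken from the paper): a clicked ad is preferred over
    each unclicked ad that was displayed above it.
    """
    pairs = []
    for pos, ad in enumerate(ranked_ads):
        if ad not in clicked:
            continue
        # Every unclicked ad shown above the clicked one was skipped.
        for skipped in ranked_ads[:pos]:
            if skipped not in clicked:
                pairs.append((ad, skipped))
    return pairs


def pairwise_accuracy(scores: Dict[str, float],
                      pairs: List[Tuple[str, str]]) -> float:
    """Fraction of preference pairs that a scoring model orders correctly."""
    if not pairs:
        return 0.0
    correct = sum(1 for better, worse in pairs if scores[better] > scores[worse])
    return correct / len(pairs)


# Hypothetical impression: four ads shown in order, only "c" was clicked.
pairs = pairwise_preferences(["a", "b", "c", "d"], {"c"})
accuracy = pairwise_accuracy({"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.0}, pairs)
```

Under this rule no preference is inferred for ads displayed below the click, which keeps the extracted pairs conservative. The same pairs could serve both as training examples for a ranking model and as an evaluation set.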