Abstract

Citizen scientists play an increasing role in helping collect, process, and analyze data used to study a variety of scientific phenomena. We address the problem of identifying tasks that are rewarding to citizen scientists; rewarding tasks encourage greater participation, which in turn yields more data and better models. We apply our methodology to eBird, whose participants are avid birders interested in observing different species while contributing to science. To improve the birders' chances of meeting their goals, we consider the following probabilistic maximum coverage problem: given a set of locations, select a subset of size k that maximizes the expected number of species the birders observe by visiting those locations. We also consider a secondary objective that gives preference to birding sites not previously visited. We consider two variants of the probabilistic maximum coverage problem, provide a theoretical analysis, describe several algorithms with provable approximation guarantees as well as heuristic approaches, and report empirical results on eBird data. Our algorithms are fast and provide high-quality recommendations.
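
To make the problem statement concrete, the following is a minimal illustrative sketch, not the method evaluated in this paper. It assumes that species j is observed at location i independently with probability probs[i][j], so that the expected number of species observed over a chosen set S is the sum over species of 1 - prod_{i in S} (1 - probs[i][j]). The function names, the independence assumption, and the plain greedy rule are all introduced here for illustration only. Under this assumed model the objective is monotone and submodular, which is the setting in which a greedy rule of this kind is commonly analyzed.

```python
from math import prod


def expected_species(probs, chosen):
    """Expected number of distinct species observed at the chosen locations.

    probs[i][j] is the assumed probability of observing species j at
    location i; observations are assumed independent across locations.
    """
    if not probs:
        return 0.0
    n_species = len(probs[0])
    # A species is missed only if it is missed at every chosen location.
    return sum(
        1.0 - prod(1.0 - probs[i][j] for i in chosen)
        for j in range(n_species)
    )


def greedy_select(probs, k):
    """Pick up to k locations, each time adding the largest marginal gain."""
    chosen = []
    remaining = set(range(len(probs)))
    for _ in range(min(k, len(probs))):
        base = expected_species(probs, chosen)
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(
            sorted(remaining),
            key=lambda i: expected_species(probs, chosen + [i]) - base,
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    # Hypothetical probabilities: 3 locations (rows) by 2 species (columns).
    example = [[0.9, 0.1], [0.8, 0.1], [0.1, 0.7]]
    picked = greedy_select(example, k=2)
    print(picked, expected_species(example, picked))
```

The sketch recomputes the objective from scratch for every candidate, which keeps it short; a practical implementation would more likely maintain the per-species miss probabilities incrementally.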