A coding-theoretic framework for query learning

Clay Scott

About the Event

In query learning, the goal is to identify an unknown object while minimizing the number of “yes” or “no” questions (queries) posed about that object. Our motivating application is toxic chemical identification, where the goal is to identify a toxic chemical while testing as few symptoms as possible. This talk considers a common query learning algorithm called generalized binary search (GBS). We show that GBS may be viewed as an extension of Shannon-Fano coding, a well-known algorithm from source coding theory. We then leverage this coding-theoretic framework to generalize GBS in three directions that are often important in practical applications: (1) robustness to uncertainty in the prior distribution over objects, (2) robustness to query noise, and (3) identification of only the group to which an object belongs, such as the intervention associated with a toxic chemical. This is joint work with Gowtham Bellala and Suresh Bhavnani.
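For readers unfamiliar with GBS, the following is a minimal sketch of the standard greedy rule in its noiseless form, not the speakers' implementation: at each step, ask the query that splits the remaining prior probability mass most evenly, then discard the objects inconsistent with the answer. The function name `gbs_identify`, the data layout, and the toy objects and queries are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple


def gbs_identify(
    prior: Dict[str, float],
    responses: Dict[str, Dict[str, bool]],
    oracle: Callable[[str], bool],
) -> Tuple[Set[str], List[str]]:
    """Greedy generalized binary search (illustrative, noiseless setting).

    prior:     object -> prior probability
    responses: query -> (object -> True for "yes", False for "no")
    oracle:    returns the true answer to a query for the unknown object
    """
    candidates = set(prior)
    asked: List[str] = []
    while len(candidates) > 1:
        total = sum(prior[o] for o in candidates)
        best_query, best_imbalance = None, None
        for query, table in responses.items():
            yes_set = {o for o in candidates if table[o]}
            if not yes_set or yes_set == candidates:
                continue  # this query cannot separate the remaining objects
            yes_mass = sum(prior[o] for o in yes_set)
            # |yes mass - no mass|: zero means a perfectly even split
            imbalance = abs(2 * yes_mass - total)
            if best_imbalance is None or imbalance < best_imbalance:
                best_query, best_imbalance = query, imbalance
        if best_query is None:
            break  # remaining objects are indistinguishable by any query
        answer = oracle(best_query)
        asked.append(best_query)
        table = responses[best_query]
        candidates = {o for o in candidates if table[o] == answer}
    return candidates, asked


# Toy data (hypothetical): four objects, uniform prior, three queries.
prior = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
responses = {
    "q1": {"A": True, "B": True, "C": False, "D": False},
    "q2": {"A": True, "B": False, "C": True, "D": False},
    "q3": {"A": True, "B": False, "C": False, "D": False},
}
true_object = "C"
identified, asked = gbs_identify(
    prior, responses, lambda q: responses[q][true_object]
)
print(identified, asked)
```

In this toy run the even-split queries are preferred over `q3`, which would isolate only a quarter of the prior mass, mirroring how Shannon-Fano coding repeatedly divides the symbols into two groups of nearly equal probability.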