Abstract

To make intelligent decisions, robots often use models of the stochastic effects of their actions on the world. Unfortunately, in complex environments, it is often infeasible to create models that are accurate in every plausible situation, which can lead to suboptimal performance. This thesis enables robots to reason about model inaccuracies to improve their performance. The thesis focuses on model inaccuracies that are subtle (i.e., they cannot be detected from a single observation) and context-dependent (i.e., they affect particular regions of the robot’s state-action space). Furthermore, this work enables robots to react to model inaccuracies from sparse execution data. Our approach consists of enabling robots to explicitly reason about parametric Regions of Inaccurate Modeling (RIMs) in their state-action space. We enable robots to detect these RIMs from sparse execution data, to correct their models given these detections, and to plan accounting for uncertainty with respect to these RIMs.

To detect and correct RIMs, we first develop algorithms that work effectively online in low-dimensional domains. An execution monitor compares outcome predictions made by a stochastic nominal model to outcome observations gathered during execution. The results of these comparisons are then used to detect RIMs of state-action space in which outcome observations deviate from the nominal model by a statistically significant margin. Our detection algorithm is based on an explicit search for the parametric region of state-action space that maximizes an anomaly measure; once the maximum-anomaly region is found, a statistical test determines whether the outcomes deviate significantly from the model. To correct detected RIMs, our algorithms apply corrections on top of the nominal model, only in the detected RIMs, treating them as newly discovered behavioral modes of the domain.

To extend this approach to high-dimensional domains, we develop a search-based Feature Selection algorithm.
Based on the assumption that RIMs are intrinsically low-dimensional, but embedded in a high-dimensional space, this best-first search starts from the zero-dimensional projection of all the execution data, and searches by adding the single most promising feature to the boundary of the search tree. Our low-dimensional algorithms can then be applied to the resulting low-dimensional space to find RIMs in the robot’s planning model.

We also enable robots to make plans that account for their uncertainty about the accuracy of their models. To do this, we first enable robots to represent distributions over possible RIMs in their planning models. With this representation, robots can plan accounting for the probability that their models are inaccurate at particular points in state-action space. Using this approach, we enable robots to effectively trade off actions that are known to produce reward with those that refine their models, potentially leading to higher future reward.

We evaluate our approach on various complex robot domains. Our approach enables the CoBot mobile service robots to autonomously detect inaccuracies in their motion models, despite their high-dimensional state-action space: the CoBots detect that they are not moving correctly in particular areas of the building, and that their wheels are starting to fail when making turns. Our approach enables the CMDragons soccer robots to improve their passing and shooting models online in the presence of opponents with unknown weaknesses a