Layering Learning on top of the Task Control Architecture
Reid Simmons, March 8th at 3:30, Weh 7220.
The TCA philosophy is to begin with a nominal plan that works most of the
time, and layer on reactive behaviors (monitors and exception handlers) to
handle exceptional situations. To date, this methodology has been implemented
by hand: we see where the system breaks, analyze the failure, and determine
how to fix it. While this strategy has been mostly successful, it is not very
appealing, since the onus is on us, the developers, to make the robot more
reliable.
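To make the layering idea concrete, here is a minimal sketch (not the actual TCA API; all names are hypothetical) of a nominal plan step with a hand-written monitor and exception handler layered on top:

```python
# Hypothetical sketch of the TCA layering philosophy: the nominal plan
# runs unchanged, while reactive monitors watch the state and exception
# handlers fire when a monitor detects trouble.

def nominal_drive(state):
    # Nominal plan step: works most of the time.
    state["position"] += state["velocity"]
    return state

def bump_monitor(state):
    # Hand-written reactive monitor: detects trouble after the fact.
    return state.get("bump_panel_engaged", False)

def stop_handler(state):
    # Hand-written exception handler: when bumped, stop!
    state["velocity"] = 0
    return state

def execute_step(state, monitors_and_handlers):
    # Run the nominal plan, then give each layered-on monitor a chance
    # to trigger its exception handler.
    state = nominal_drive(state)
    for monitor, handler in monitors_and_handlers:
        if monitor(state):
            state = handler(state)
    return state

state = {"position": 0, "velocity": 1, "bump_panel_engaged": True}
state = execute_step(state, [(bump_monitor, stop_handler)])
# The bump monitor triggered, so the handler zeroed the velocity.
```

The point of the sketch is that the monitor/handler pair is bolted on by hand after a failure is observed, which is exactly the step one would like to automate.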
An alternative is to use machine learning techniques to learn new monitors and
exception handlers. While this is appealing, there are many unanswered
questions: If the robot does not already have the monitors for a given
situation, how can it tell that it is in trouble (and therefore, needs to
learn something)? What types of monitors should the robot learn? What
information is needed to learn reasonable exception handlers? And so on.
I would like to spend the hour discussing various options, but here is a
strawman: The robot has some very general monitors and exception handlers
(e.g., when the bump panels are engaged, stop!) that detect when the robot has
gotten itself into trouble. These monitors, however, generally cannot predict
when the robot *will* get into trouble. When one of these monitors triggers, it
signals an opportunity to learn. The robot then uses its past history
of actions and sensor readings to learn a monitor that is able to reliably
predict that the undesirable situation is coming (this should be perfect
for EBNN!). The monitor can then easily be incorporated into the plan via
TCA. How the robot then learns to *recover* from the impending situation
remains to be determined (that's why they call it research!)
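The strawman loop can be sketched in toy form. This is only a stand-in (EBNN itself and real TCA integration are out of scope, and every name below is hypothetical): a general monitor detects trouble only after the fact; we then label the sensor history that preceded the failure and fit a trivially simple predictor that can warn *before* the next occurrence.

```python
# Toy strawman: after-the-fact failure detection -> label the preceding
# sensor history -> learn a monitor that predicts the failure early.

HORIZON = 3  # hypothetical: how many steps ahead we want to predict

def label_run(run, horizon):
    # Split a trace at the first failure: the `horizon` readings just
    # before it are "pre-failure", everything earlier is "nominal".
    fail = next(i for i, r in enumerate(run) if r["bumped"])
    start = max(0, fail - horizon)
    return run[start:fail], run[:start]

def learn_predictive_monitor(pre_failure, nominal):
    # Stand-in for EBNN: pick a threshold on sonar range separating
    # readings seen shortly before failures from nominal readings.
    cutoff = (min(r["sonar"] for r in nominal) +
              max(r["sonar"] for r in pre_failure)) / 2
    return lambda reading: reading["sonar"] <= cutoff

# Simulated run: sonar range shrinks until the robot bumps a wall,
# which is when the general (after-the-fact) bump monitor triggers.
run = [{"sonar": s, "bumped": s == 0} for s in (9, 8, 7, 3, 2, 1, 0)]
pre_failure, nominal = label_run(run, HORIZON)
predict_trouble = learn_predictive_monitor(pre_failure, nominal)
# The learned monitor now fires several steps before the bump occurs,
# and could be layered into the plan as a new TCA monitor.
```

The threshold learner is of course a placeholder; the actual proposal is to hand the labeled pre-failure history to something like EBNN, with the recovery step still an open question.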