While learning in the current ARL system implementation is a batch
process, it is possible to extend it into an online version. The
system could then accumulate more training data and learn while it is
operating in interaction mode. The CEM algorithm could, in principle,
run as a background process while new data is acquired and while the
system synthesizes interactions. Using some straightforward
reformulations, an online CEM would update its mixture of conditional
models dynamically as it obtains new samples. Recall that, in
interaction mode, the system is interacting with only a single
human. However, fundamentally, the same type of training data can
still be recovered: stimulus-response pairs. Even though some
components of each pair are synthetic, half are real, resulting from
the human who is engaging the system. Significant learning is possible
with this data as the system acquires a model of what the human user
does in response to its synthetic actions. Of course, the system has
to be initially trained offline with human-human interactions to
bootstrap some behaviour. Eventually, though, the system can be placed
in an interactive learning mode, in which it would
continue learning by dynamically acquiring new responses to stimuli
and including them in its dictionary of possible reactions. This would
make it adaptive, and its behaviour would be further tuned by
engagement with the single remaining user. This mode of operation is
currently under investigation.
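
The online update described above can be sketched as a recursive EM step over decayed sufficient statistics. The following is only an illustrative simplification, not the ARL system's CEM: it updates an unconditional one-dimensional Gaussian mixture one sample at a time, rather than a mixture of conditional models, and the class name, decay factor, and variance floor are all assumptions introduced for the example.

```python
import math

class OnlineMixture:
    """Illustrative online EM for a 1-D Gaussian mixture.

    Sufficient statistics are exponentially decayed so that the
    mixture tracks recent samples, sketching how an online CEM
    could fold in new data while the system keeps operating.
    """

    def __init__(self, means, variances, weights, decay=0.95):
        self.mu = list(means)
        self.var = list(variances)
        self.w = list(weights)
        self.decay = decay  # forgetting factor (assumed, tunable)
        k = len(self.mu)
        # Per-component accumulators: soft counts, sum x, sum x^2.
        self.n = [1.0] * k
        self.s1 = [n * m for n, m in zip(self.n, self.mu)]
        self.s2 = [n * (v + m * m)
                   for n, v, m in zip(self.n, self.var, self.mu)]

    def _responsibilities(self, x):
        # E-step for one sample: posterior over components.
        p = [w * math.exp(-0.5 * (x - m) ** 2 / v)
             / math.sqrt(2 * math.pi * v)
             for w, m, v in zip(self.w, self.mu, self.var)]
        z = sum(p)
        return [pi / z for pi in p]

    def update(self, x):
        # Decay old statistics, fold in the new sample, then
        # re-estimate the parameters (stochastic M-step).
        r = self._responsibilities(x)
        for k, rk in enumerate(r):
            self.n[k] = self.decay * self.n[k] + rk
            self.s1[k] = self.decay * self.s1[k] + rk * x
            self.s2[k] = self.decay * self.s2[k] + rk * x * x
        total = sum(self.n)
        for k in range(len(self.mu)):
            self.w[k] = self.n[k] / total
            self.mu[k] = self.s1[k] / self.n[k]
            # Floor the variance to avoid degenerate components.
            self.var[k] = max(self.s2[k] / self.n[k]
                              - self.mu[k] ** 2, 1e-6)
```

A true online CEM would apply the same decay-and-update pattern to the sufficient statistics of conditional models, so the mixture adapts as the interacting human supplies new stimulus-response data.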