The following is an introduction to the Action-Reaction Learning
(ARL) architecture. The system's function is to passively observe
interaction between multiple humans, learn from it, and then utilize
this knowledge to interact with a single human.

Figure 2.1: Offline: Learning from Human Interaction

The system is depicted in Figure 2.1. Three
different types of processes exist: perceptual, synthesis, and learning
engines, interlinked in real time via asynchronous RPC data
paths. Figure 2.1 shows the system being presented
with a series of interactions between two individuals in a constrained
context (e.g. a simple children's game). The system collects live
perceptual measurements using a vision subsystem for each of the
humans. The temporal sequences obtained are then analyzed by a machine
learning subsystem to determine predictive mappings and associations
between pieces of the sequences and their consequences.
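To make the data flow concrete, the sketch below wires such engines
together, with in-process queues standing in for the asynchronous RPC
data paths; the class names and stub measurements are illustrative
assumptions, not the system's actual interfaces.

```python
# A minimal sketch of the engine pipeline. Queues stand in for the
# asynchronous RPC data paths; PerceptionEngine/LearningEngine and the
# stub frames are hypothetical, not the actual ARL interfaces.
import queue
import threading
import time

class PerceptionEngine(threading.Thread):
    """Emits perceptual measurements (stub for the vision subsystem)."""
    def __init__(self, out_q):
        super().__init__(daemon=True)
        self.out_q = out_q

    def run(self):
        for t in range(100):
            measurement = {"time": t, "features": [0.0]}  # stub frame
            self.out_q.put(measurement)                   # async hand-off
            time.sleep(0.01)

class LearningEngine(threading.Thread):
    """Accumulates incoming measurements into a time series."""
    def __init__(self, in_q):
        super().__init__(daemon=True)
        self.in_q = in_q
        self.time_series = []

    def run(self):
        while True:
            self.time_series.append(self.in_q.get())

channel = queue.Queue()
perception = PerceptionEngine(channel)
learner = LearningEngine(channel)
perception.start()
learner.start()
perception.join()                    # let the demo run to completion
print(len(learner.time_series))      # most or all of the 100 frames
```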

On the left of the figure, a human user (represented as a black
figure) is being monitored by a perceptual system. The perceptual
system feeds a learning system with measurements, which are stored
within it as a time series. Simultaneously, these measurements drive a
virtual character in a one-to-one sense (the gray figure), which
mirrors the left human's actions as graphical output for the human
user on the right. A similar input and output is generated in parallel
from the activity of the human on the right. Thus, the users interact
with each other through the vision-to-graphics interface and use this
virtual channel to visualize and constrain their
interaction. Meanwhile, the learning system is 'spying' on the
interaction and forming a time series of the measurements. This time
series is training data for the system, which attempts to learn
about the ongoing interaction in the hope of modeling and synthesizing
similar behaviour itself.
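As a rough illustration of how such a time series becomes training
data, the sketch below pairs short windows of the joint two-person
sequence with the frames that immediately follow them; the windowing
scheme and feature dimensions are illustrative choices, not the exact
formulation used here.

```python
# Illustrative construction of (past window -> consequence) training
# pairs from synchronized measurements of the two interacting humans.
import numpy as np

def joint_series(frames_a, frames_b):
    """Stack both users' synchronized feature vectors into one sequence."""
    return np.hstack([np.asarray(frames_a), np.asarray(frames_b)])

def training_pairs(series, window=10):
    """Map each short history of the interaction to its consequence,
    i.e. the joint frame that immediately follows the window."""
    X, y = [], []
    for t in range(window, len(series)):
        X.append(series[t - window:t].ravel())  # recent past of both users
        y.append(series[t])                     # what happened next
    return np.array(X), np.array(y)

# Example with synthetic one-dimensional features per user:
a = np.random.randn(200, 1)                     # user A's measurements
b = np.random.randn(200, 1)                     # user B's measurements
X, y = training_pairs(joint_series(a, b))
print(X.shape, y.shape)                         # (190, 20) (190, 2)
```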

Figure 2.2: Online: Interaction with Single User

In Figure 2.2, the system has collected and
assimilated the data. At this point it can computationally infer
appropriate responses to the single remaining human user. Here, the
perceptual system only needs to track the activity of the one human
(the black figure on the left) to stimulate the learning or estimation
system for real-time interaction purposes (as opposed to the
interaction learning described above). The learning system performs an
estimation and generates the most likely response to the user's
behaviour. This response is manifested by animating a computer graphics
character (the gray figure) in the synthesis subsystem. This is the
main output of the ARL engine. It is also fed back recursively into
the learning subsystem so that the system can remember its own actions
and generate self-consistent behaviour. This feedback is indicated by
the arrow depicting flow from the reaction synthesis to the learning +
estimation stage. Thus, there is a continuous loop of self-observation
in the learning system, which can recall its own actions. In addition,
the system determines the most likely action of the remaining user and
transmits it as a prior to assist tracking in the vision
subsystem. This flow from the learning system to the perception system
(the eye) contains behavioural and dynamic predictions of the single
user who is being observed and should help improve perception.
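A possible shape for this online loop is sketched below, assuming a
regressor fitted on pairs like those in the offline sketch; the split
of the predicted joint frame into a user prior and a character
reaction mirrors the two feedback arrows of Figure 2.2, and all names
here are hypothetical.

```python
# An illustrative online interaction cycle, assuming `model.predict`
# maps a flattened window of the joint (user, character) series to the
# next joint frame, as in the offline sketch above.
import numpy as np

def interaction_step(model, history, user_frame, dim, window=10):
    """One cycle: estimate the most likely reaction to the user, remember
    the system's own action, and return a prior for the vision tracker."""
    past = np.concatenate(history[-window:])[None, :]
    next_joint = model.predict(past)[0]   # joint next-frame estimate
    user_prior = next_joint[:dim]         # predicted user action -> tracker
    reaction = next_joint[dim:]           # synthesized character response
    # Self-observation: store the observed user frame together with the
    # system's own reaction so future estimates stay self-consistent.
    history.append(np.concatenate([user_frame, reaction]))
    return reaction, user_prior

# Usage with any regressor exposing .predict on (1, window * 2 * dim)
# arrays, e.g. a model fitted on (X, y) from the offline sketch, once
# `history` holds at least `window` joint frames:
#   reaction, prior = interaction_step(model, history, latest_frame, dim=1)
```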