As robots enter human environments and come into contact with inexperienced
users, they need to interact with those users in a multimodal fashion:
keyboard and mouse are no longer acceptable as the only input modalities.
In this paper
we introduce a novel approach for programming robots interactively through
a multimodal interface. The key characteristic of this approach is that
the user can provide feedback interactively at any time, during both
the programming and the execution phases. The framework takes a three-step
approach to the problem: multimodal recognition, intention interpretation,
and prioritized task execution. The multimodal recognition module translates
hand gestures and spontaneous speech into a structured symbolic data stream
without abstracting away the user’s intent. The intention interpretation
module selects the appropriate primitives to generate a task based on
the user's input, the system’s current state, and robot sensor data.
Finally, the prioritized task execution module selects and executes skill
primitives based on the system’s current state, sensor inputs, and
prior tasks. The framework is demonstrated by interactively controlling
and programming a vacuum-cleaning robot. The demonstrations exemplify
the interactive programming and plan recognition aspects of the research.
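For readers who want a concrete picture of the three-stage pipeline, the sketch below outlines one way its modules could be wired together in Python. It is purely illustrative and not the paper's implementation: the function names (recognize, interpret, execute), the Symbol and Task types, and the skill primitives (go_to, vacuum, dock_and_charge) are assumptions invented for this example.

```python
# Illustrative sketch of the three-stage pipeline described above.
# All names and primitives here are hypothetical, not from the paper.
import heapq
from dataclasses import dataclass, field
from typing import List


@dataclass
class Symbol:
    """Structured symbolic output of multimodal recognition."""
    modality: str  # "gesture" or "speech"
    token: str     # e.g. "point_at:kitchen", "clean"


def recognize(gesture: str, speech: str) -> List[Symbol]:
    # Stage 1: translate hand gestures and spontaneous speech into a
    # structured symbolic data stream, preserving the user's intent.
    return [Symbol("gesture", gesture), Symbol("speech", speech)]


@dataclass(order=True)
class Task:
    priority: int                                # lower value = higher priority
    primitives: List[str] = field(compare=False)  # skill primitives to run


def interpret(symbols: List[Symbol], state: dict) -> Task:
    # Stage 2: select appropriate primitives to generate a task based on
    # the user's input and the system's current state (sensor data stubbed).
    if state.get("battery", 1.0) < 0.2:
        return Task(priority=0, primitives=["dock_and_charge()"])
    if any(s.token == "clean" for s in symbols):
        target = next((s.token.split(":")[1] for s in symbols
                       if s.token.startswith("point_at")), "here")
        return Task(priority=1, primitives=[f"go_to({target})", "vacuum()"])
    return Task(priority=5, primitives=["idle()"])


def execute(queue: List[Task]) -> None:
    # Stage 3: prioritized task execution; tasks are popped in priority
    # order and their skill primitives run in sequence.
    while queue:
        task = heapq.heappop(queue)
        for prim in task.primitives:
            print(f"executing primitive: {prim}")


if __name__ == "__main__":
    queue: List[Task] = []
    symbols = recognize("point_at:kitchen", "clean")
    heapq.heappush(queue, interpret(symbols, state={"battery": 0.8}))
    execute(queue)
```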