Scenario One: MapSkills

MapSkills has been developed for teaching geography to 10-15 year old children with the assistance of the EMOTE Empathic Robot Tutor. It was designed and developed in collaboration with learners, teachers and stakeholders. In MapSkills, the empathic robot tutor tailors its assistance and feedback to the affective state of the student, whether directly perceived via its sensors, inferred from the student's activity in the task, or both.

The robot’s learning space provides a novel, contemporary, exciting approach to learning mapping skills, using an interactive touch table. The touch table makes a good display medium for maps and offers advantages over the paper maps often still used in the classroom.

Through discussions with teachers and children the learning experience for mapping skills was transformed into a Treasure Hunt, with the aim of making map skill learning more engaging and enjoyable.

The learning experience begins with the empathic robot tutor providing learners with instructions required to perform the task. Learners do not require any training to engage with the EMOTE empathic robot tutor and quickly learn how to interact. The learners engage in tasks related to compass reading, map symbol knowledge, and distance measuring.

The robot assists the learner in completing tasks, drawing on a range of possible pedagogical actions (e.g. hinting, prompting, splicing, engaging in small talk, telling jokes, etc.). The pedagogical action is selected by the robot assessing the attempted answer, the learner's skill levels and their affective state. For instance, if the learner is frustrated, low-skilled and having difficulty with a task, the tutor will present the answer, explain it and move on. If they are happy or calm and highly skilled, the tutor will give them plenty of time to figure the answer out themselves. If the learner is successful, the empathic robot tutor will empathise with that and be ‘happy’ for the student. Using rules like these, the tutor adapts its behaviour to the learner during the task.
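The kind of rule-based selection described above can be sketched as a small decision function. This is a minimal illustration only; the function name, affect labels, skill scale and thresholds are assumptions, not the actual EMOTE implementation.

```python
# Hypothetical sketch of rule-based pedagogical action selection.
# Labels and thresholds are illustrative, not the EMOTE API.

def select_action(affect: str, skill: float, answer_correct: bool) -> str:
    """Pick a pedagogical action from an affect label ('frustrated',
    'happy', 'calm', ...) and an estimated skill level in [0, 1]."""
    if answer_correct:
        return "praise"                      # empathise with the learner's success
    if affect == "frustrated" and skill < 0.4:
        return "present_answer_and_explain"  # give the answer, explain, move on
    if affect in ("happy", "calm") and skill >= 0.7:
        return "wait"                        # let them work it out themselves
    return "hint"                            # default scaffolding step

print(select_action("frustrated", 0.2, False))  # present_answer_and_explain
```

In the real system the rules would be richer and the skill and affect estimates come from the learner model and perception pipeline described later.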

Future Directions

We envisage that our system can be customised for other educational domains. The EMOTE empathic robot tutor is relevant not just to map skills, but to a wide variety of domains.

The empathic robot tutor is currently embodied in a NAO robot; however, the computer architecture has been developed to enable it to work with a variety of robots. This offers considerable potential for schools that have already invested in, or are thinking about investing in, robots, providing them with ways to improve and enhance the learning experiences children achieve with robots.

The EMOTE Empathic Robot Tutor is an exciting and novel advance in social and educational robotics. It not only provides pedagogical strategies, domain knowledge and skills; it also tailors the experience to the learner, with the script and lesson plan adapting and refining to meet the learner's requirements, just as in one-to-one teaching. Our aim is not to replace a real teacher with a robot, but to use this system as a tool to motivate and encourage pupils in schools and enhance their learning experience.

Figure 1 provides photos of the hardware set up for the EMOTE Empathic Robot Tutor.

Figure 1 – Hardware Set Up for the EMOTE Empathic Robot Tutor

As figure 1 shows, the EMOTE Empathic Robot Tutor requires only the NAO robot torso, significantly reducing cost and the risk of hardware damage. The rationale for torso-only interaction is that empathic responses are typically achieved through upper-torso and head movements, such as gaze behaviour and gestures. Further, the task of teaching mapping skills is not facilitated by a mobile robot, as the touch table provides the learning space.

As can be seen, only a small amount of physical space is needed for learners to engage with the EMOTE Empathic Robot Tutor. The user is able to interact with the robot across a table. Thus, the setup and hardware would fit well into a classroom, school or even a home, taking up only a small amount of space.

To run the EMOTE Empathic Robot Tutor we need a power socket and a desk measuring at least 0.6 m wide x 0.4 m deep x 0.7 m high. We will bring the hardware, including the robot and 18” touch tablet, along with the relevant power adapters.

Power

The EMOTE Empathic Robot Tutor prototype is powered by a 240 V, 5 A UK/Europe power supply, with mains power required for the robot, tablet and sensors (Kinect 2).

The NAO robot is equipped with an internal battery that can last up to 30 minutes.

The Touch Tablet and Laptop can run for up to 4 hours on battery.

Mains power is required for the Kinect sensor and the Ethernet router.

For sessions longer than the battery lives above, the overall setup requires mains power throughout.

Sensors

Sensing Learner Behaviour

The EMOTE Empathic Robot Tutor senses learner behaviour through:

Microsoft Kinect sensor V2

Body position tracking X, Y, Z

Lean position

Facial features extraction

Pointing detection

Face direction detection

The Kinect module uses the Microsoft Kinect sensor in real-time to extract head gaze information, depth and facial action units.

Web camera with OKAO vision

Body position tracking X, Y

Facial expressions extraction + smile

Face and eyes direction detection

The OKAO module works in real time analysing images from the frontal web camera, passing the information to the Perception module. The extracted information relates to the position of the learner in relation to the camera, their head rotation, eye gaze information, facial expressions and smile estimation.

Affectiva Q sensor

Electrodermal activity

Skin temperature

Accelerometer values X, Y, Z

The Q Sensor module provides physiological information derived from the learner's skin, such as skin temperature and skin conductance, along with accelerometer readings on the X, Y and Z axes.

Making Sense of Sensory Data

Firstly, the Perception module (see figure 3) interprets the sensory data from the sensors with which the Tutor platform is equipped. Sensors include OKAO vision, which contributes information about facial expressions; the Q-sensor, which outputs electrodermal data; and the Kinect 2 sensor, used mostly for skeleton data.

The Affect Perception elements of the module then process these into Valence/Arousal dimensions of affect, giving Positive, Neutral and Negative outputs. Additionally, the module calculates the gaze of the user using angle information from the face and eyes. Lastly, it stores all the incoming information, along with an HD video stream, in a synchronised manner for offline analysis.
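The mapping from a continuous valence estimate to the Positive/Neutral/Negative outputs can be illustrated with a simple thresholding sketch. The function name, the [-1, 1] scale and the threshold value are assumptions for illustration, not the module's actual parameters.

```python
def classify_valence(valence: float, threshold: float = 0.2) -> str:
    """Map a continuous valence estimate in [-1, 1] to one of the
    discrete affect outputs. The threshold is an illustrative choice."""
    if valence > threshold:
        return "Positive"
    if valence < -threshold:
        return "Negative"
    return "Neutral"

print(classify_valence(0.5))   # Positive
print(classify_valence(0.05))  # Neutral
```

The same pattern applies to the arousal dimension; in practice both dimensions would be fused before a label is chosen.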

Figure 3 – Perception module and its functions

In order to utilise this information and execute the proper behaviours on the robot we developed a middle-layer module called Skene. This module translates head position information into vectors, allowing the robot's head to move accordingly and track the user's face. It also receives coordinate information from the map application, so that it can instruct the robot to gaze, point or wave towards specific points on the screen.
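The coordinate-to-head-movement translation described above amounts to simple geometry: given a target point on the table surface and the head's position, compute the yaw and pitch needed to face it. The sketch below is a hypothetical illustration; the coordinate frame, the head position and the function name are all assumptions, not Skene's actual interface.

```python
import math

def gaze_angles(target_x, target_y, head_pos=(0.0, -0.3, 0.5)):
    """Return (yaw, pitch) in radians for the robot head to gaze at a
    point (target_x, target_y) on the table surface (z = 0), given an
    assumed head position in table coordinates."""
    hx, hy, hz = head_pos
    dx, dy, dz = target_x - hx, target_y - hy, 0.0 - hz
    yaw = math.atan2(dx, dy)                     # left/right rotation
    pitch = math.atan2(-dz, math.hypot(dx, dy))  # downward tilt towards the table
    return yaw, pitch

yaw, pitch = gaze_angles(0.0, 0.0)
print(round(yaw, 3), round(pitch, 3))  # straight ahead, tilted down
```

Pointing and waving towards a screen location follow the same idea, with arm joint angles computed instead of head angles.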

The main instructions that guide the robot's behaviour come from the Interaction Manager, which tells the robot what to perform and when. The Interaction Manager utilises the affective state of the learner to select an appropriate pedagogical strategy that corresponds to the learner's state, and sends the behaviour to Skene for execution.

Thus, for example, if the user is bored, the sensors will pick this up and the affect perception module will calculate the Valence and Arousal values accordingly. This information will then be fed to the Interaction Manager, which in turn will select an appropriate strategy; in this case it might be a joke. The Interaction Manager will send the behaviour to Skene for execution.
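This Perception → Interaction Manager → Skene flow can be wired together in miniature. All class names, method signatures and the boredom heuristic below are illustrative assumptions, not the actual EMOTE module interfaces.

```python
# Illustrative end-to-end flow for the "bored learner" example.

class Perception:
    def read(self, sensor_frame):
        # In the real system, valence/arousal come from fusing the
        # Kinect, OKAO and Q-sensor streams; here they are hard-coded.
        return {"valence": -0.1, "arousal": -0.6}   # low arousal ~ boredom

class InteractionManager:
    def decide(self, affect):
        if affect["arousal"] < -0.5:                # learner seems bored
            return {"action": "tell_joke"}
        return {"action": "continue_task"}

class Skene:
    def execute(self, behaviour):
        return f"executing {behaviour['action']} on robot"

affect = Perception().read(sensor_frame=None)
behaviour = InteractionManager().decide(affect)
print(Skene().execute(behaviour))                   # executing tell_joke on robot
```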

Control

The EMOTE Empathic Robot is a fully autonomous system. The prototype targets geography teaching (map skills) for 10-15 year olds. The robot autonomously responds to the learner’s affective state and map events, scaffolding and supporting learning.

Autonomy is supported through a priming interaction. The EMOTE Empathic Robot Tutor opens the learning experience by giving learners every instruction required to perform the task. Thus, learners can engage with the EMOTE Empathic Robot Tutor with little or no experience and quickly learn how to interact.

The overall architecture that supports this autonomy is detailed in figure 3 above, with relevant publications detailed below. Generic modules were developed and configured for the EMOTE Empathic Robot, including the following:

The learner module includes an open learner model that works by checking a learner's answers against a set of relevant constraints: if an answer violates no constraint, it is correct. The model for the mapping skills application tracks the competencies of compass reading, map symbol knowledge and distance measuring. It provides the tutor with an indication of current skill levels, calculated using a weighted average so that recent information counts for more than old information.
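The two mechanisms just described, constraint checking and a recency-weighted skill estimate, can be sketched as follows. The function names, the representation of constraints as predicates and the decay factor are illustrative assumptions, not the EMOTE learner model's actual implementation.

```python
# Sketch of the open learner model: an answer is correct if it violates
# none of the relevant constraints, and skill is a recency-weighted average.

def answer_correct(answer, constraints):
    """Constraints are predicates returning True when violated."""
    return all(not violates(answer) for violates in constraints)

def skill_estimate(outcomes, decay=0.8):
    """Weighted average over 1/0 attempt outcomes (oldest first); recent
    attempts count more than old ones via exponential decay."""
    weights = [decay ** (len(outcomes) - 1 - i) for i in range(len(outcomes))]
    return sum(w * o for w, o in zip(weights, outcomes)) / sum(weights)

# Two early failures followed by two successes: the estimate leans
# towards the recent successes rather than the plain mean of 0.5.
print(round(skill_estimate([0, 0, 1, 1]), 2))  # 0.61
```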

The Interaction Module controls the high-level interaction decisions of the tutor, making decisions based on input from the learner module and the perception module. It infers affective states based on how the task is progressing, determining which pedagogical action to execute.

A range of pedagogical actions is possible: pumping, hinting, prompting, splicing, engaging in small talk, telling jokes, etc. The action to execute is decided based on diagnostics of the attempted answer, learner skill levels and the learner's affective state. For instance, if the learner is bored, the tutor will try to re-engage them by telling a joke. If they are frustrated and low-skilled, the tutor will present the answer and move on. If they are happy or calm and highly skilled, the tutor will give them plenty of time to figure the answer out themselves. Using rules like these, the tutor adapts its behaviour to the learner during the task. Once the dialogue action is identified, it is sent to the Skene module for further processing.

The Skene module takes as input the dialogue action and its parameters and creates a natural language utterance embedded with embodiment-specific gesture mark-ups. For the NAO robot, Skene creates utterances with arm gestures and so on. The utterance with gesture mark-ups is then sent to the robot, which synthesises the utterance and speaks it, displaying the arm, face and head gestures embedded within it. Skene also supports semi-automated gaze behaviour, such as gazing at an active speaker or joint gaze at an object of interest.
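To make the idea of an utterance with embedded gesture mark-up concrete, the sketch below separates the words to be synthesised from the tags to be executed. The tag syntax shown is an invented example for illustration; it is not Skene's actual mark-up format.

```python
import re

# Hypothetical marked-up utterance; the <name=value> tag syntax is an
# assumption, not the real Skene mark-up.
utterance = "Well done! <gesture=wave> Now look at <gaze=screen:120,80> the compass."

def split_markup(text):
    """Separate the speech to synthesise from the embodiment tags."""
    tags = re.findall(r"<(\w+)=([^>]+)>", text)
    speech = re.sub(r"\s*<[^>]+>\s*", " ", text).strip()
    return speech, tags

speech, tags = split_markup(utterance)
print(speech)  # Well done! Now look at the compass.
print(tags)    # [('gesture', 'wave'), ('gaze', 'screen:120,80')]
```

In the running system the speech part would go to the robot's text-to-speech engine while the tags are scheduled as gestures alongside it.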