An interview with Roger Azevedo, a professor in the Department of Psychology at North Carolina State University, about his work studying the effects of intelligent virtual humans (IVHs) on learners’ self-regulatory processes and other learning outcomes in undergraduate biology.

What is the big idea of your project?

Generally, we’re focusing on a triad of ideas. First, let’s look at learners: what are the cognitive, metacognitive, affective, and motivational processes when they are working individually or collaboratively with advanced learning technologies? Some of that is around tangible computing that we’re doing with geospatial colleagues, so it’s no longer even screen based — it’s embodied cognition. If we have students who are instrumented, and the technology is instrumented, we can study the learning phenomena and how they relate to scientific reasoning through the multimodal data we are collecting, which includes eye tracking, physiology, log file, screen recording, and hand movement data.

Second, we want to train data scientists. Now, imagine having all the data I mentioned above fed to a data scientist who is also instrumented. That data scientist could be an undergraduate student in psychology, a graduate student in engineering, someone working at SAS, or any of us as scientists. These learning phenomena are so complex, so what I also want to study is: How do data scientists make inferences with this multimodal, multichannel data? What inferences are they making? What are these inferences based on, and are they accurate? So we can think about training future scientists in different sectors, including academia.

And third, because we know that human scientists are biologically limited, why not have a virtual human connected to the data scientists? Imagine a partnership where the virtual human is watching both the learners interacting with the technology and the data scientist as he or she observes and makes inferences about the multimodal, multichannel data in real time. So now we’re bringing in AI, deep learning, and computer vision so we can get to the point where the virtual human can meta-reason about what it’s learning, how it’s learning, what it knows, what it doesn’t know, and what the scientists know and don’t know. You can learn more about this idea in the following video (from the Cyberlearning 2017 Shark Tank session).

How might the human and virtual human interact?

Discourse through natural language processing would be great, so we get to a point where there is artificial and real human collaboration to make inferences about this data. The virtual human could be embodied and living in your lab, in your computer, or ubiquitous, like Siri is.

We are also working with a colleague in electrical engineering on a visual analytics cluster. How do we get the human to query the virtual human’s mind; what would that look like? Is it graphic? Imagine the virtual human, while watching the data scientist, projecting a visible 3D neural net so that the human scientist can turn around and see how the virtual human is making inferences about what it is observing.

We’ve been talking to design colleagues about the best way to represent the neural network of the virtual being, and to be able to interrogate it. What’s happening now with a lot of deep learning is that it’s a black box, and nobody really knows how decisions are made. We need to open that up so humans can understand and query it. So data visualizations come in, and what would those visualizations look like — not only the representation of the human data, but also representations of how the virtual human understands what the learner and data scientist are doing. There are a lot of meta-level things going on.

We just finished an IES grant working with 8th grade students, studying 100 kids a day for 3 weeks, where we took a hydrosphere unit and turned it into software. That was in collaboration with Gautam Biswas at Vanderbilt. Collaborating with colleagues in computer science and STEM education, our new EHR CORE grant, MetaDash, brings MetaTutor into the high school biology classroom. We’re collecting learning data and presenting the teacher with a dashboard that has information about students’ cognitive, affective, and motivational processes. The question is: Given all the multimodal data that we’re collecting from instrumented students, how can we repackage it to give teachers a dashboard that presents useful information about the students? We ask teachers what data they would want to see, how they would want to see it, and when they would want to see it, and we pilot test that. We can also instrument the teachers with portable eye trackers and physiological bracelets, and collect verbalizations. Now suppose that at some point during class, data presented on the dashboard says that 70% of the students are confused. Does the teacher pay attention to that data? What inferences does he or she make? How does that change instructional decision making?
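The dashboard logic described above could be sketched roughly as follows. This is only an illustration of the idea, not the actual MetaDash pipeline: the affect labels, threshold, and function names are all made up for the example.

```python
from collections import Counter

# Hypothetical per-student affect labels inferred upstream from multimodal
# data (the real system's inference pipeline is far richer than this).
affect_labels = {
    "s01": "confused", "s02": "engaged", "s03": "confused",
    "s04": "confused", "s05": "frustrated", "s06": "confused",
    "s07": "confused", "s08": "engaged", "s09": "confused",
    "s10": "confused",
}

def dashboard_summary(labels, alert_state="confused", threshold=0.5):
    """Summarize class-level affect and flag states exceeding a threshold."""
    counts = Counter(labels.values())
    n = len(labels)
    fractions = {state: count / n for state, count in counts.items()}
    alerts = [state for state, frac in fractions.items()
              if state == alert_state and frac >= threshold]
    return fractions, alerts

fractions, alerts = dashboard_summary(affect_labels)
print(f"{fractions['confused']:.0%} of students appear confused")  # 70%
if alerts:
    print("Dashboard alert: a majority of the class may need re-teaching")
```

The interesting research question the interview raises sits after this computation: whether and how the teacher acts on the flagged state.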

MetaTutor has been funded for close to 9 years now, going all the way back to an NSF REESE grant. It’s an intelligent tutoring system that has 4 pedagogical agents to train college students to use cognitive and metacognitive strategies. During learning, if a student does something overt, like spend too much time on a particular diagram that we know is not relevant to their current goal, Mary the Monitor, an agent, will pop up and prompt a metacognitive judgment by saying something like: “Hey, do you think this diagram is relevant to your current learning goal?” Depending on how the student responds, she’ll agree, or ask the student to explain why if they say that the diagram actually is relevant. We have a Strategizer agent who models strategies for the students, like summarization and note taking. The system is really trying to provide adaptive scaffolding of self-regulated learning, focused specifically on cognitive and metacognitive processes.
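The kind of trigger Mary the Monitor responds to can be sketched as a simple rule over dwell-time data. The agent name comes from the interview, but the threshold, subgoal names, and relevance table below are illustrative placeholders, not MetaTutor’s actual values or logic.

```python
# Hypothetical rule: prompt a metacognitive judgment when a student dwells
# too long on content that is irrelevant to their current subgoal.
DWELL_THRESHOLD_S = 45  # illustrative cutoff, not the system's real value

# Which diagrams are relevant to which subgoals (made-up mapping).
relevant_diagrams = {
    "circulatory_paths": {"heart_anatomy", "blood_flow"},
}

def monitor_prompt(subgoal, diagram, dwell_seconds):
    """Return the agent's prompt if the relevance rule fires, else None."""
    relevant = diagram in relevant_diagrams.get(subgoal, set())
    if not relevant and dwell_seconds > DWELL_THRESHOLD_S:
        return ("Hey, do you think this diagram is relevant "
                "to your current learning goal?")
    return None

# A student lingers on an off-goal diagram, so the agent intervenes.
prompt = monitor_prompt("circulatory_paths", "cell_division", dwell_seconds=60)
```

In the actual system the follow-up is adaptive: the agent’s next move depends on whether the student defends the diagram’s relevance or agrees it is off-goal.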

The Hydrosphere IES grant with Gautam involved taking Betty’s Brain and MetaTutor and essentially mashing them together. The idea was that we had two agents: Rachael, who was in charge of science content, and Brad, who was in charge of regulatory processes. The kids thought that having more than two agents would be confusing, so we kept it to two. It was a 3-week curriculum in a hypermedia environment, with a skills diary. Gautam collected data in Nashville and we collected data in Raleigh. At the beginning of each class, students were given an assignment by their teacher. They then looked at several multimedia resources, including videos created by the teacher and others, and indicated what their cognitive strategies were going to be. So for today, or for this particular learning goal, they might say: “I’m going to read a lot.” The skills diary had sliders, and they could set them, begin their work, and then go back to their diary and change the sliders if they changed their mind. They could also indicate affective states with the sliders, like “Today, I’m probably going to be very confused” or “very frustrated”. The idea was to make them aware of the self-regulatory strategies that they were about to use in the next 50-minute period, let them access the diary at any time, and have them recalibrate their perceptions of their self-regulatory skills.

The kids were not instrumented; Wake County doesn’t allow the instrumentation of children with physiological sensors. We did capture some facial expressions, but in the classroom, that’s messy; you end up losing most of the facial expression data when you’re talking about 100 kids a day over 3 weeks. That’s another big issue: collecting real data in a messy classroom. A major implication is the need to find, test, and implement valid and reliable methods to collect emotions data in the classroom and other authentic contexts outside the lab.

Where do you see your work going in the future?

What was interesting is that the students didn’t do much in terms of metacognition; they didn’t understand that very well. Most of what they used were the cognitive strategies. These were 8th graders, and they were not keenly aware of fluctuations in their motivation and emotions. We did a pretest and posttest, and all kinds of embedded quizzes, but we never got to the point where the system was intelligent enough to provide kids with individualized instruction around these self-regulatory processes. That’s where we would like to go next.

And at a high level, as researchers, we’re all experiencing a lack of theories that are comprehensive enough. For example, cognitive theories typically don’t address emotions or motivation. We have theoretical issues, conceptual issues, measurement issues, and of course the applied issues for education and training. I talked about some of this in my CIRCL perspective. Lastly, our collaborative work with researchers from several disciplines, which focuses on collecting multimodal data from students of all ages, across different contexts, and with different types of advanced learning technologies, promises to advance interdisciplinary models, theories, frameworks, methods, and analytical techniques.

This material is based upon work supported by the National Science Foundation under grants 1837463, 1233722, 1441631, and 1556486. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.