LEAP: Analytics for LEarners As People

Research Overview

Watch a recent seminar by Dr. Mihalcea, this research team’s Principal Investigator.

Students come to school not as blank slates, but with diverse values and goals, a wide range of family and social backgrounds, personal histories and current challenges, interests and hobbies, psychological traits, and so on. Both intuition and research maintain that personality traits and past experience play key roles in academic performance. For example, perseverance is a critical psychological trait for success; healthy mental states and positive attitudes are associated with higher productivity and performance; gender or race can sometimes give rise to bias that influences performance; hobbies from early childhood can foreshadow the choice of college major. However, current learning analytics research has not paid enough attention to these “person” elements. With advances in data science and the availability of vast amounts of academic performance and personal data, it is critical to leverage these resources to understand the role of learners’ personal attributes, to provide educators with data-based interventions for students with different personal characteristics, and to help students build insight about themselves so that they can adopt the learning strategies that work for each of them individually. As with precision medicine, data science is now making personalized education more feasible and effective than ever before.

The goal of this project is to build a new generation of learning analytics tools that integrate our understanding of the “student as a person” and the “student as a learner”. The research team will develop language, speech, and sensor-based machine learning tools that translate input data (academic performance, social media streams, WiFi access data, and survey data from 1,000 students) into attributes that form a student profile. The tools will explicitly link academic performance and mental health with the personal attributes of the students, including values, beliefs, interests, behaviors, background, and emotional state. Such tools will help address questions such as: What are the characteristics of a student who excels in writing? What is the typical profile of a student who struggles in Physics? What are the signs of depression in a student? The tools will give educators new insights for identifying group and individualized interventions, and give the students themselves richer means for introspection and the ability to make decisions that work for them in the learning process. The team will also implement an app as a pilot intervention tool.

The team has compiled a large collection of social media (Twitter) data covering 2,000 to 3,000 students per university at 100 universities, which can be used to compare findings on their research questions across universities.

The team has developed Natural Language Processing techniques to: extract student academic interest data; infer the level of grit as a personal trait based on crowdsourced data; and predict students’ grades and anxiety levels from student forum data.

The team is developing Machine Learning models to measure the informativeness of texts for particular concepts that students need to grasp, and to predict students’ level of expertise and how it evolves over time.

The team is using statistical models to predict when students are engaged with specific learning tasks and the likelihood of long-term retention of learned content.

The team has received multiple research grants to expand and continue their work, including a $1.5M grant from the Defense Advanced Research Projects Agency titled “Multimodal Semantic Mapping of Human Activities through Deep Graph Generation and Reasoning.”

The team coordinated the outreach/diversity event “Ada Lovelace Opera: A Celebration of Women in Computing”, which featured a performance of the Enchantress opera and a showcase of research by women in Computer Science.

July 2017

The team has started to collect data from 100 students. They have also ported the StudentLife app to their internal framework, and will start data collection with StudentLife in the fall. The team has developed methods to: (1) infer students’ values, behaviors, and sentiment from social media; (2) make cross-group comparisons using textual datasets; (3) extract linguistic features from classroom forums (Piazza, Echo360) in order to predict academic performance.

In the past year, the team has presented its work related to the MIDAS-funded project at multiple conferences and workshops:

EMNLP workshop on Natural Language Processing and Computational Social Science;

COLING Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES);

8th International Conference on Social Informatics (SocInfo);

55th Annual Meeting of the Association for Computational Linguistics (ACL);

Conference of the European Chapter of the Association for Computational Linguistics (EACL);

International Conference on Computational Linguistics (COLING).

The research team is also exploring funding opportunities from federal agencies, in a collaborative proposal with the University of Texas.