Text classification methods for tasks like factoid question answering typically use manually defined string matching rules or bag of words representations. These methods are ineffective when question text contains very few individual words (e.g., named entities) that are indicative of the answer. We introduce a recursive neural network (RNN) model that can reason over such input by modeling textual compositionality. We apply our model, QANTA, to a dataset of questions from a trivia competition called quiz bowl. Unlike previous RNN models, QANTA learns word and phrase-level representations that combine across sentences to reason about entities. The model outperforms multiple baselines and, when combined with information retrieval methods, rivals the best human players.
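
To make the idea of modeling textual compositionality concrete, here is a minimal sketch of how a recursive neural network can build a phrase vector from word vectors along a tree. It is a generic binary-tree composition, not QANTA's actual architecture; the embedding size, the random parameters, the tree shape, and all function names are assumptions chosen only for illustration.

```python
import math
import random

DIM = 4  # hypothetical embedding size, chosen only for illustration


def make_params(dim, seed=0):
    """Return a random composition matrix W (dim x 2*dim) and a zero bias b."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in range(2 * dim)] for _ in range(dim)]
    b = [0.0] * dim
    return W, b


def compose(left, right, W, b):
    """Parent vector p = tanh(W [left; right] + b)."""
    children = left + right  # concatenate the two child vectors
    return [
        math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
        for row, bias in zip(W, b)
    ]


def encode(tree, embeddings, W, b):
    """Encode a tree that is either a word (str) or a (left, right) pair."""
    if isinstance(tree, str):
        return embeddings[tree]
    left, right = tree
    return compose(
        encode(left, embeddings, W, b),
        encode(right, embeddings, W, b),
        W,
        b,
    )


# Toy usage with untrained, random word vectors (assumed values).
rng = random.Random(1)
embeddings = {
    word: [rng.uniform(-1, 1) for _ in range(DIM)]
    for word in ["the", "first", "president"]
}
W, b = make_params(DIM)
phrase_vector = encode(("the", ("first", "president")), embeddings, W, b)
```

In a trained model, W, b, and the embeddings would be learned so that the vector at the root of the tree is predictive of the answer; here they are random, so only the shape of the computation is meaningful.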

We improve "learning to search" approaches to structured prediction in two ways. First, we show that the search space can be defined by an arbitrary imperative program, reducing the number of lines of code required to develop new structured prediction tasks by orders of magnitude. Second, we make structured prediction orders of magnitude faster through various algorithmic improvements.
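
As a rough illustration of defining a search space with an imperative program, the sketch below writes a sequence tagger as an ordinary loop that asks a search object for each decision. The `GreedySearch` class, its `predict` and `loss` methods, the tag set, and the toy policy are all hypothetical stand-ins and do not reproduce the interface of any real library.

```python
class GreedySearch:
    """Hypothetical stand-in for a learning-to-search library object."""

    def __init__(self, policy):
        self.policy = policy
        self.total_loss = 0.0

    def predict(self, features, allowed):
        # A real system would consult (and train) a learned policy here.
        return self.policy(features, allowed)

    def loss(self, value):
        # Report the loss of the complete sequence of decisions.
        self.total_loss += value


def tag_sentence(search, words, gold=None):
    """The imperative program: a left-to-right loop defines the search space."""
    tags = []
    for word in words:
        prev = tags[-1] if tags else "<s>"
        tag = search.predict({"word": word, "prev": prev}, allowed=["D", "N", "V"])
        tags.append(tag)
    if gold is not None:
        # Hamming loss between predicted and gold tags.
        search.loss(sum(t != g for t, g in zip(tags, gold)))
    return tags


def toy_policy(features, allowed):
    """Assumed lookup-table policy, used only to make the sketch runnable."""
    lookup = {"the": "D", "dog": "N", "barks": "V"}
    return lookup.get(features["word"], allowed[0])


search = GreedySearch(toy_policy)
tags = tag_sentence(search, ["the", "dog", "barks"], gold=["D", "N", "N"])
```

The point of the design is that the task author writes only `tag_sentence`; each call to `predict` marks a decision point, so the sequence of calls implicitly defines the states and actions of the search space.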

Maintaining and cultivating student engagement is a prerequisite for MOOCs to have broad educational impact. Understanding student engagement as a course progresses helps characterize student learning patterns and can aid in minimizing dropout rates and initiating instructor intervention. In this paper, we construct a probabilistic model connecting student behavior and class performance, formulating student engagement types as latent variables. We show that our model identifies course success indicators that instructors can use to initiate interventions and assist students.

Instructor intervention in student discussion forums is a vital component of Massive Open Online Courses (MOOCs), where personalized interaction is limited. This paper introduces the problem of predicting instructor interventions in MOOC forums. We propose several prediction models designed to capture unique aspects of MOOCs, combining course information, forum structure, and post content. Our models abstract the content of individual posts in a thread using latent categories, learned jointly with the binary intervention prediction problem. Experiments over data from two Coursera MOOCs demonstrate that incorporating the structure of threads into the learning problem leads to better predictive performance.

Maintaining and cultivating student engagement is critical for learning. Understanding factors affecting student engagement will help in designing better courses and improving student retention. The large number of participants in massive open online courses (MOOCs) and the data collected from their interaction with the MOOC open up avenues for studying student engagement at scale. In this work, we develop a framework for modeling and understanding student engagement in online courses based on student behavioral cues. Our first contribution is the abstraction of student engagement using latent representations. We use that abstraction in a probabilistic model to connect student behavior with course completion. We demonstrate that the latent formulation for engagement helps in predicting student survival across three MOOCs. Next, to enable better instructor interventions, student survival must be predicted early in the course; we demonstrate that the latent model makes such early predictions reliably. Finally, we perform a closer quantitative analysis of user interaction with the MOOC and identify student activities that are good indicators for survival at different points in the course.
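
One way to picture a latent formulation of engagement is as a hidden student type that is inferred from observed behavior and then used to predict survival. The sketch below is a toy naive-Bayes-style calculation and is not the model from the paper; the engagement types, behaviors, and every probability in it are invented purely for illustration.

```python
def posterior_engagement(prior, likelihood, observed):
    """P(type | observed), proportional to P(type) * product of P(behavior | type)."""
    scores = {}
    for etype, p_type in prior.items():
        score = p_type
        for behavior, present in observed.items():
            p_b = likelihood[etype][behavior]
            score *= p_b if present else (1.0 - p_b)
        scores[etype] = score
    total = sum(scores.values())
    return {etype: s / total for etype, s in scores.items()}


def survival_probability(posterior, survival_given_type):
    """Marginalize the latent type: sum over types of P(type | obs) * P(survive | type)."""
    return sum(posterior[t] * survival_given_type[t] for t in posterior)


# All numbers below are made-up illustrative values, not estimates from data.
prior = {"active": 0.3, "passive": 0.4, "disengaged": 0.3}
likelihood = {
    "active": {"posts_in_forum": 0.6, "views_lectures": 0.9},
    "passive": {"posts_in_forum": 0.1, "views_lectures": 0.8},
    "disengaged": {"posts_in_forum": 0.05, "views_lectures": 0.2},
}
survival_given_type = {"active": 0.8, "passive": 0.5, "disengaged": 0.1}

observed = {"posts_in_forum": True, "views_lectures": True}
posterior = posterior_engagement(prior, likelihood, observed)
p_survive = survival_probability(posterior, survival_given_type)
```

Because the latent type is never observed directly, a real model would have to learn these distributions from behavioral data rather than take them as given.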