Matthias Schonlau

Research interests

Professor Schonlau's research interests include applied survey sampling and survey methodology, statistical machine learning from text data (such as answers to open-ended questions), and software implementation.

Text data from open-ended questions in surveys are frequently ignored in the practice of survey research. Yet open-ended questions are important because they do not constrain respondents’ answer choices. Where open-ended questions are necessary, sometimes multiple human coders hand-code answers into one of several categories. Automated algorithms do not achieve an overall accuracy high enough to entirely replace humans. We classify answers to open-ended questions automatically using text mining for easy-to-classify answers and rely on human coders for the remainder. Expected accuracies guide the choice of a threshold that separates “easy” from “hard” answers.
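
The routing idea described above can be illustrated with a minimal Python sketch. It is not Professor Schonlau's implementation. It assumes that some text classifier has already produced a probability for each category of each answer, and that the highest probability can serve as the confidence score. The function name route_answers, the example probabilities, and the threshold of 0.8 are all hypothetical.

```python
from typing import Dict, List, Tuple


def route_answers(
    probabilities: List[Dict[str, float]], threshold: float
) -> Tuple[List[Tuple[int, str]], List[int]]:
    """Split answers into automatically coded ones and ones left for humans.

    probabilities: one dict per answer, mapping category -> predicted probability
                   (assumed to come from an already trained text classifier).
    threshold:     minimum top probability for an answer to count as "easy".
    """
    auto_coded: List[Tuple[int, str]] = []
    for_humans: List[int] = []
    for index, probs in enumerate(probabilities):
        # Take the most probable category and its probability as the confidence.
        label, confidence = max(probs.items(), key=lambda item: item[1])
        if confidence >= threshold:
            auto_coded.append((index, label))  # "easy": accept the prediction
        else:
            for_humans.append(index)  # "hard": send to a human coder
    return auto_coded, for_humans


# Hypothetical classifier output for three answers.
example_probs = [
    {"positive": 0.95, "negative": 0.05},
    {"positive": 0.55, "negative": 0.45},
    {"positive": 0.10, "negative": 0.90},
]
auto_coded, for_humans = route_answers(example_probs, threshold=0.8)
```

Raising the threshold sends more answers to human coders and should raise the accuracy of the automatically coded part. Lowering it saves coding effort at the cost of more automatic errors.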

This approach has spawned a variety of related projects, including: algorithms for automatic occupation coding (categorizing answers to the question “What is your job?” in official surveys); classification of open-ended questions that can take more than one label (equivalent to all-that-apply questions); an algorithm for semi-automatic classification of all-that-apply type open-ended questions; training a learning algorithm on double-coded data, when such codes are available; and deciding whether to purposely double-code the training data when there is a fixed budget for human coders.

Earlier work includes selectivity in web surveys, cross-sectional weighting in household surveys, the effect of following rules in household panels, and respondent-driven sampling.

Education/biography

Professor Schonlau joined the faculty in 2011. From 1999 to 2011 he was a statistician at the RAND Corporation and head of the RAND Statistical Consulting Service. He was initially located at RAND's Santa Monica (Los Angeles) headquarters and then moved to RAND's Pittsburgh office. Professor Schonlau spent the 2009/2010 academic year on sabbatical at the German Institute for Economic Research (DIW) in Berlin, Germany. The sabbatical was made possible in cooperation with the Max Planck Institute for Human Development (MPIB). From 1997 to 1999 Professor Schonlau held a joint appointment with the National Institute of Statistical Sciences and AT&T Labs - Research. He obtained his PhD from the University of Waterloo in 1997 and his master's degree from Queen's University in 1993. Professor Schonlau grew up in Germany.