TagHelper supports analysis of corpora in English, German, Spanish, and Chinese, and has been used as an instructional tool in two university courses, namely Machine Learning in Practice and Computer Supported Collaborative Learning.

If you publish work using TagHelper tools, please cite the following paper:

Email Carolyn Rose at cprose@cs.cmu.edu if you would be interested in taking an on-line course related to applied machine learning and text processing.

An exciting development in the past year, building on the early success of the TagHelper project, has been two successful evaluations of fully automatic adaptive collaborative learning support interventions. These interventions "listen in" on student conversational behavior using text processing technology developed on the TagHelper project, decide based on that behavior when to intervene, and offer support to make the learning experience more successful. In both studies, the automatic collaborative learning support led to significant increases in learning gains in comparison to a no-support control condition.

Women in CS is hosting a workshop related to TagHelper tools on Saturday, October 6.

There was a TagHelper Tools Tutorial at AI in Education 2007. Please send email to Carolyn Rose at cprose@cs.cmu.edu to get more information (or to report any difficulties you might have with the code). You can download my slides from the tutorial here.

There was an all-day TagHelper tools tutorial on June 19 at Carnegie Mellon University. Here are the slides from lectures 1, 2, 3, and 4.

The goal of our research is to develop text classification technology that addresses concerns specific to classifying sentences using coding schemes developed for behavioral research. A wide range of behavioral researchers, including social scientists, psychologists, learning scientists, and education researchers, collect, code, and analyze large quantities of natural language corpus data as an important part of their research. A particular focus of our work is developing text classification technology that performs well on highly skewed data sets, which is an active area of machine learning research.

Arguello, J. and Rose, C. P. (2006). Museli: A Multi-source Evidence Integration Approach to Topic Segmentation of Spontaneous Dialogue. Proceedings of the North American Chapter of the Association for Computational Linguistics.