Training

Introduction to an R Package for Text Analysis: stm

Instructor

Brandon Stewart is an Assistant Professor of Sociology at Princeton University, where he is also affiliated with the Politics Department, the Office of Population Research and the Center for the Digital Humanities. He develops new quantitative statistical methods for applications across computational social science, and is an author of several R packages, including stm, an R package that provides text analysis tools for working within the general framework defined by the Structural Topic Model. Brandon Stewart earned his PhD in Government at Harvard in 2015 and a Master’s degree in Statistics in 2014, also at Harvard.

Time/Place

5/03/2016 from 9:30 AM to 12:00 PM ~ Wallace 300

Description

The Structural Topic Model is a general framework for topic modeling with document-level covariate information. The covariates can improve inference and qualitative interpretability, and are allowed to affect topical prevalence, topical content or both. The stm software package implements the estimation algorithms for the model and also includes tools for every stage of a standard workflow, from reading in and processing raw text through making publication-quality figures. The workshop will provide a hands-on introduction to using the stm package, which currently includes functionality to:

ingest and manipulate text data

estimate Structural Topic Models

calculate covariate effects on latent topics with uncertainty

estimate a graph of topic correlations

compute model diagnostics and summary measures

create the plots used in various papers about stm
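
For orientation, the capabilities listed above can be sketched roughly as follows. This is an illustrative outline and not the workshop's own material. It assumes the gadarian example dataset that ships with stm, an arbitrary choice of K = 3 topics, and that the tm and SnowballC packages (used by textProcessor) are installed. The object names (processed, out, fit, eff, corr) are hypothetical.

```r
library(stm)

# Assumption: use the small "gadarian" survey dataset bundled with stm
data(gadarian)

# Ingest and manipulate text data (textProcessor relies on the tm package)
processed <- textProcessor(gadarian$open.ended.response, metadata = gadarian)
out <- prepDocuments(processed$documents, processed$vocab, processed$meta)

# Estimate a Structural Topic Model; K = 3 is an arbitrary illustrative choice
fit <- stm(out$documents, out$vocab, K = 3,
           prevalence = ~ treatment + s(pid_rep),
           data = out$meta, max.em.its = 25, seed = 1, verbose = FALSE)

# Calculate covariate effects on latent topics with uncertainty
eff <- estimateEffect(1:3 ~ treatment + s(pid_rep), fit,
                      metadata = out$meta, uncertainty = "Global")
summary(eff)

# Estimate a graph of topic correlations
corr <- topicCorr(fit)

# Compute model diagnostics and summary measures
labelTopics(fit)
semanticCoherence(fit, out$documents)
exclusivity(fit)

# Create plots
plot(fit, type = "summary")
plot(eff, covariate = "treatment", topics = 1:3, method = "difference",
     cov.value1 = 1, cov.value2 = 0)
# Plotting the correlation graph needs the igraph package
if (requireNamespace("igraph", quietly = TRUE)) plot(corr)
```

The number of topics and the prevalence formula here are placeholders; in practice both depend on the corpus and the research question.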

Audience

Attendees should have previous R experience.

Format

Lecture, discussion and hands-on exercises.

Requirements

Attendees should bring a laptop with R and the R package stm already installed. The stm package is available on CRAN and can be installed using: install.packages("stm")
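
To confirm the setup before the session, something like the following could be run in an R console. It is a minimal sketch: the CRAN mirror URL is an assumption, and any mirror can be substituted.

```r
# Install stm from CRAN only if it is not already available.
# The mirror below is an assumed choice; any CRAN mirror works.
if (!requireNamespace("stm", quietly = TRUE)) {
  install.packages("stm", repos = "https://cloud.r-project.org")
}

# Load the package and report the installed version
library(stm)
packageVersion("stm")
```

If library(stm) loads without an error, the installation is ready for the exercises.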