Possibly slightly better text analysis with lme4

lme4 and its cousin arm are extremely useful for a huge variety of modeling applications (see Gelman and Hill’s book), but today we’re going to do something a little frivolous with them. Namely, we’re going to extend our Denver Debate analysis to include some sense of error.

Instead of the term-frequency scatter plot seen in the previous post, this code fits the most basic possible partially-pooled model predicting which of the two candidates, Obama or Romney, spoke a given term. This allows us to get a slightly better idea of which candidate “owned” a term on the night, and simultaneously accounts for volume of usage (evidenced by narrower confidence intervals).

Anyway, we will almost certainly return to lmer() at some point in the future, but this code offers some ideas as to how best translate a model object into a data frame amenable to plotting.