Chronicling Hoosier: Using Big Data to Answer Old Questions

Librarians at the IUPUI University Library Center for Digital Scholarship have used the Library of Congress's Chronicling America API to take another look at how the word hoosier has been used through time and across geographies. Chronicling Hoosier began as a technology-driven pursuit of the long-sought origin story of hoosier. It has since transformed, in part because of the data collected, into a data-driven examination of who used the term, in what context, and toward what end. The first iteration of the data looks primarily at the number of times the word is used by year and within each geographic region. The numbers are interesting in themselves: the long-held connection between hoosier and early US river culture is reflected in this visualization, which shows a year-by-year progression, from 1836 to 1922, of the word's appearance in newspapers. River-adjacent cities along the Ohio, Mississippi, and Missouri are highly represented in newspaper mentions of hoosier.
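As a rough illustration of the year-and-region tally described above, here is a minimal Python sketch. It is not the project's actual code. It assumes each newspaper page has already been retrieved and reduced to a record with a `date` string in YYYYMMDD form, a list of `state` names, and the page `text`; those field names and the sample records themselves are invented for illustration.

```python
from collections import Counter

# Made-up sample records for illustration only; real records would come from
# newspaper page search results. Field names here are assumptions.
sample_pages = [
    {"date": "18360412", "state": ["Indiana"],
     "text": "The Hoosier boatmen arrived. A hoosier's welcome."},
    {"date": "18360903", "state": ["Ohio"],
     "text": "A Hoosier passed through Cincinnati."},
    {"date": "18520117", "state": ["Missouri"],
     "text": "No mention here."},
    {"date": "18520620", "state": ["Indiana"],
     "text": "Hoosier corn and Hoosier hogs."},
]


def count_mentions(pages, term="hoosier"):
    """Tally case-insensitive occurrences of `term` by year and by (year, state)."""
    by_year = Counter()
    by_year_state = Counter()
    for page in pages:
        hits = page["text"].lower().count(term)
        if hits == 0:
            continue  # skip pages that never use the word
        year = int(page["date"][:4])
        by_year[year] += hits
        for state in page["state"]:
            by_year_state[(year, state)] += hits
    return by_year, by_year_state


by_year, by_year_state = count_mentions(sample_pages)
for year in sorted(by_year):
    print(year, by_year[year])
```

Counts keyed by `(year, state)` are the kind of table a year-by-year map animation could be drawn from. Simple substring matching will also count possessives and plurals, which may or may not be desirable.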

Preliminary word-in-context data was also summarized through word clouds by decade. The clouds reveal additional connections to past theories about the word's development. A strong Indiana affiliation is obvious early on, as is the word's relatedness to other state-dweller nicknames of the mid-1800s, such as Suckers for Illinois residents and Buckeyes for Ohioans. Also visualized is hoosier's entrée into the advertising world as companies, most prominently the Hoosier Cabinet Manufacturers, capitalized on the word, which may suggest a positive affiliation. The next step of analysis will include a deeper dive into the word's usage in context. Of particular interest is sentiment analysis. Is the word used in a positive or negative context, or something else entirely? Did its usage change significantly over time or by geography?
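One way to produce the per-decade word counts that feed a word cloud is to collect the words appearing near each occurrence of hoosier. The Python sketch below shows that idea under stated assumptions: the record shape, the three-word window, the tiny stopword list, and the sample sentences are all hypothetical and are not drawn from the project's data or methods.

```python
import re
from collections import Counter, defaultdict

# A deliberately tiny stopword list, assumed for illustration.
STOPWORDS = {"the", "a", "an", "and", "of", "for", "in", "to", "is", "as"}

# Made-up sample records; field names are assumptions.
sample_pages = [
    {"date": "18450301", "text": "The Hoosier and the Sucker met a Buckeye."},
    {"date": "19050610", "text": "Buy the Hoosier kitchen cabinet today."},
]


def context_words_by_decade(pages, term="hoosier", window=3):
    """Count words within `window` tokens of `term`, grouped by decade."""
    counts = defaultdict(Counter)
    for page in pages:
        decade = int(page["date"][:4]) // 10 * 10
        tokens = re.findall(r"[a-z]+", page["text"].lower())
        for i, token in enumerate(tokens):
            if token != term:
                continue
            # Words just before and just after this occurrence of the term.
            neighbors = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts[decade].update(
                w for w in neighbors if w not in STOPWORDS and w != term
            )
    return counts


context_counts = context_words_by_decade(sample_pages)
for decade in sorted(context_counts):
    print(decade, context_counts[decade].most_common(5))
```

The resulting per-decade counters could be passed to any word cloud tool as word frequencies. The same context windows would also be a natural input for a later sentiment analysis step.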

The image below, from a non-newspaper source of 1852, depicts a particularly negative hoosier folk type.