
Instructors:

Ben (www.gebre-medhin.com) is a PhD Candidate in sociology. His interests are in the subfields of organization, the professions, and higher education. His dissertation focuses on elite universities and the MOOC movement. Part of that project uses topic models and text analysis tools to document changes in discourse within an organizational or professional field over time.

In this workshop we will cover two main supervised text analysis methods: the dictionary method and supervised classification. We will use list comprehensions to implement the dictionary method, with sentiment analysis as our example. Using the Python library scikit-learn, we will also implement a few supervised classification techniques, including Naive Bayes and Support Vector Machines. Specific skills covered include a) measuring themes in text using dictionaries, b) feature selection, c) Support Vector Machines, d) Naive Bayes, e) cross-validation, and f) feature importance.
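As a rough preview of the dictionary method, the sketch below scores sentiment with list comprehensions. It is an illustration under stated assumptions, not the workshop's actual material: the word lists, the `tokenize` and `sentiment_score` helpers, and the sample sentences are all hypothetical, and a real analysis would likely rely on a published sentiment lexicon.

```python
import re

# Hypothetical toy dictionaries, invented for this illustration only.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return (positive hits - negative hits) / token count, or 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    # List comprehensions keep only the tokens found in each dictionary.
    pos_hits = [t for t in tokens if t in POSITIVE]
    neg_hits = [t for t in tokens if t in NEGATIVE]
    return (len(pos_hits) - len(neg_hits)) / len(tokens)


# Made-up example documents.
docs = [
    "The lectures were great and I love the format",
    "Terrible pacing and poor audio",
]
scores = [sentiment_score(d) for d in docs]
print(scores)
```

Dividing by the token count keeps long and short documents comparable. The trade-off is that a simple dictionary ignores negation and context, which is one reason to compare it against a supervised classifier.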

Prior knowledge: Basic familiarity with Python is required if you wish to follow along with the tutorial. Completion of D-Lab's Python FUN!damentals workshop series will be sufficient.

This workshop is one of a four-part series that will prepare participants to move forward with text analysis research, with a special focus on humanities and social science applications. Please register for each workshop separately. The other workshops in the series are listed below: