Causal Inference with Big Data

Introduction

In the 21st century, information is created and stored at unprecedented rates. Access to large, high-dimensional datasets (“Big Data”) has opened up new possibilities for business analytics and economic research. Massive datasets alone, however, are insufficient to answer fundamental questions in business and economics. Using the potential outcome framework, we explore methods for causal inference in the Big Data era. We discuss the promise and pitfalls of large-scale experimentation and consider empirical applications relevant to business and policy analysis.

Course content

The course covers the following topics:

What is big data?

The potential outcome framework

Regression and matching

Large-scale experimentation

Treatment effect heterogeneity

False positives and p-hacking

Publication bias

Regression discontinuity designs

Supplementary analysis

Data visualization

Feature engineering and feature learning

Introduction to image analysis

Introduction to text analysis

Learning outcome: knowledge

After completing this course, students should be familiar with the potential outcome framework and with microeconometric methods for answering “what if” questions using Big Data. Students also learn to distinguish causal models from predictive models.