How do we make data relevant to what seems like a non-binary, non-enumerable story? We learn from the famous doctor who saved countless newborn babies by thinking of them as numbers. Even something as valuable as a baby’s life can be reduced to a number from 0 to 10.

A reporter was curious about how Trump the presidential candidate compared to Trump the private billionaire when it came to charity. By systematically double-checking Trump’s claims, he found an even better story.

How do we turn a narrow topic from a local story into a year-long project? Take that topic on a state-by-state tour and compare how the laws, data, history, and stories differ across 50 comparable states. This page collects real-world examples of how to find focus in a larger issue and build a big project from the state-by-state picture of the U.S.

Read an investigative story, then document all of its sources with a spreadsheet. You should get an idea of how many people and organizations were contacted, as well as the document trail that was followed.

For an investigative story that you've read, use LexisNexis to run 5 queries and find 5 more stories of at least 500 words each, from 5 different years and publications, that offer insights or context not found in the investigative story you read.

This is a collection of public datasets conveniently packaged as SQLite databases to practice on. You don’t have to worry about the data cleaning and import process: just download the SQLite database files and query them from your favorite SQLite client.

The U.S. Census records such an incredible, intimidating volume of data points about American life that it is hard to know where to begin. This guide aims to explain how Census data is organized, how to find important and interesting data points, and how the data is distilled and displayed by journalism outlets.

How the SQLite database is used in billions of real-world applications today is of little relevance to us in this class. But the web browser is an easy-to-understand scenario of how a database gets created and filled.

A lot of powerful journalism is simply looking at one list and seeing which of its names are on another list. The JOIN clause is the clearest way to express that concept, and to execute it in the blink of an eye. It is the main reason why we learn SQL instead of trying to hack around the usually versatile spreadsheet.
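A minimal sketch of the "which names on one list are on another list" idea, using Python's built-in `sqlite3` module and an in-memory database. The `donors` and `contractors` tables and every name in them are made up for illustration. Real lists rarely match this cleanly on exact names.

```python
import sqlite3

# Hypothetical data: both lists and every name in them are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE donors (name TEXT, amount INTEGER);
    CREATE TABLE contractors (name TEXT, contract TEXT);
    INSERT INTO donors VALUES
        ('Alice Adams', 500), ('Bob Brown', 1000), ('Carol Clark', 250);
    INSERT INTO contractors VALUES
        ('Bob Brown', 'Road repair'),
        ('Dan Davis', 'School lunches'),
        ('Carol Clark', 'IT services');
""")

# INNER JOIN keeps only the rows whose name appears in both tables.
matches = conn.execute("""
    SELECT donors.name, donors.amount, contractors.contract
    FROM donors
    INNER JOIN contractors ON donors.name = contractors.name
    ORDER BY donors.name
""").fetchall()
conn.close()

for row in matches:
    print(row)
```

Only the two names that appear on both lists come back. Doing the same in a spreadsheet usually means lookup formulas copied down every row, which is the hack the JOIN clause replaces.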

Log in to LexisNexis again and find 10 stories from the past, 5 each focusing specifically on Rep. Mike Honda and his challenger for the CA-17 seat, Ro Khanna. From five of the 10 stories, come up with a question that you think is worth asking.

Election Day is near, so what better SQL practice is there than the FEC datasets? They are not only comprehensive and fairly well-documented, but large enough that knowing SQL becomes a huge advantage over spreadsheets and their limitations.

Ryan Shapiro is a Ph.D. candidate at MIT and a research affiliate at the Berkman Center for Internet & Society at Harvard University. He is a historian of national security specializing in governmental transparency and the policing of dissent. Politico has referred to Shapiro as “a FOIA guru at the Massachusetts Institute of Technology”, while the FBI has declared Shapiro’s FOIA research methodologies themselves to be a threat to national security.
Shapiro will speak to us about his research and his journey from animal rights activist to FOIA-powered scholar and transparency activist.

Shapiro is a Ph.D. candidate at MIT and a prolific user of FOIA laws. He’ll talk about how FOIA became relevant to his research and activism, how he got better at making requests, and how to argue with the FBI.