Print: Applying Quantitative Analysis to Classic Lit

Illustration: Rodrigo Fuenzalida

If Google has its way, all of English literature will one day exist as searchable digital text. Franco Moretti, a Stanford English professor, wants to be ready for the deluge with new kinds of questions and new tools to answer them — things like computational linguistics, data mining, computer modeling, and network theory. Moretti is already famous in bookish circles for his data-centric approach to novels, which he graphs, maps, and charts. Until recently, though, he’s been able to crunch only a few novels at a time, doing all that quantitative stuff by hand. Now he’s going digital, building searchable databases of old books, working to write software that can mine for patterns. Instead of diving deep into a few beloved titles, Moretti aims to zip across the creative output of entire eras. He calls it distant reading, and if his new methods catch on, they could change the way we look at literary history.

Take one experiment. Moretti decided to test the idea that Victorian writers, through their choice of adjectives, might reveal their belief that moral qualities were indivisible from reality itself and that physical traits reflected a person’s virtue. So he assembled a database of 250 novels and sent the file to computer scientists at IBM’s Visual Communication Lab, who turned the books into a series of word clouds. “Boom! There were exactly the adjectives I had hoped would pop up!” he says. “Adjectives like strong, bright, fair, in which the physical and the moral blend.”
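A word cloud is, underneath, a frequency count. The article does not describe the software Moretti or the IBM team used, so the following is only a minimal sketch of the general idea in Python: tally how often a watch list of adjectives turns up across a set of texts. The function name, the watch list, and the two sample sentences are invented for illustration, and the sketch matches words by spelling alone, so it cannot tell "fair" the adjective from "fair" the noun.

```python
import re
from collections import Counter

# Hypothetical watch list; these three words are the ones quoted in the article.
ADJECTIVES = {"strong", "bright", "fair"}


def adjective_counts(texts, adjectives=ADJECTIVES):
    """Count how often each watched word occurs across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and split it into alphabetic tokens.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(token for token in tokens if token in adjectives)
    return counts


# Invented sample sentences, standing in for the full text of novels.
sample = [
    "She was fair and strong, with a bright, strong will.",
    "A fair day; a bright morning.",
]
counts = adjective_counts(sample)
```

Scaled up to hundreds of novels, counts like these are the raw material a word cloud draws on: the more frequent the word, the larger it appears.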

For another project, he looked at the titles of 7,000 books in 18th- and 19th-century England and discovered a correlation between shorter titles and the growth of the book publishing industry. (Moretti theorizes that more concise titles made books easier to promote in a crowded marketplace.) He is also working with a programmer to test new software that can “read” terabytes of obscure, mostly unread fiction and classify the books by genre.
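The title study comes down to a simple measurement repeated thousands of times. As a rough sketch, assuming each book is recorded as a (year, title) pair, the Python below averages title length in words for each decade. The function name and the four records are made up for illustration and are not taken from Moretti's data or method.

```python
from collections import defaultdict


def mean_title_length_by_decade(records):
    """Take (year, title) pairs and return {decade: mean title length in words}."""
    totals = defaultdict(lambda: [0, 0])  # decade -> [total words, number of titles]
    for year, title in records:
        decade = (year // 10) * 10
        totals[decade][0] += len(title.split())
        totals[decade][1] += 1
    return {decade: words / n for decade, (words, n) in sorted(totals.items())}


# Invented toy records, not real bibliographic data.
records = [
    (1741, "The Long and Eventful History of a Made Up Heroine"),
    (1745, "A Full Account of Some Invented Travels"),
    (1852, "Made Up Title"),
    (1858, "Another"),
]
by_decade = mean_title_length_by_decade(records)
```

Set beside the number of books published in each decade, averages like these would show whether titles shrank as the market grew.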

“In 19th-century Britain, maybe 30,000 novels were published,” Moretti says. He is dying to analyze them all. It will be like peering through the first telescope, he says — surveying more literature at a glance than he could read in a lifetime. “We will get a sense,” he says, “of a much wider universe.”