Data-Mining a City's Visual Identity

Paris looks like, well, Paris, and like nowhere else on earth, which is a large part of the charm of the French capital. As a tourist, you don’t even have to visit the Eiffel Tower to know you’ve landed in La Ville-Lumière. Wander down any side street in a residential neighborhood, and the city simply has a distinctive look and feel, the result of myriad small distinctions from the way Parisian balconies are constructed to the style of the city’s streetlights.

When presented with random images of Paris, people who have been there are surprisingly good at identifying the place (as opposed to, say, Barcelona). A delightful research project, from academics at Carnegie Mellon University and INRIA/Ecole Normale Supérieure in Paris, tried exactly this. The researchers showed subjects a sampling of images of Paris, as well as decoys from 11 other cities around the world. Subjects correctly nailed Paris 79 percent of the time.

"What this suggests is that people are remarkably sensitive to the geographically informative features within the visual environment," the researchers write. "But what are those features?"

People are much less good at answering that question, at pinning down with some kind of scientific accuracy exactly what makes Paris look like Paris. But the researchers suspected that a computer might be able to do this. Computer scientists these days are data-mining everything: the language in tweets, the numerical statistics in vast data sets, the geographic locations in Foursquare check-ins. Training software to data-mine visual information, though, is a little trickier.

Any given photo of a streetscape contains hundreds of separate elements (individual doorways, street signs, curb heights, building details). Only a handful of them actually reflect the unique fingerprint of a city. In the case of Paris, some obvious choices are those famous street signs embedded on the sides of buildings, cast-iron balconies, and decorative lampposts. The researchers wanted to see, though, if they could use an algorithm to identify these geographically specific features automatically (and more accurately).

They were looking not for the big, obvious landmarks like the Eiffel Tower, but for "the visual minutiae of daily urban life." These are the patterns of smaller features in building design and street life that repeat themselves all over a city – but that are, at the same time, subtly different from the patterns in cities elsewhere.

A lot of visual data-mining work has been done with Flickr, but those images are heavily skewed toward famous landmarks. Instead, the researchers pulled about 10,000 images from Google Street View for Paris and for each of 11 other comparison cities. As an added bonus, all of those images are taken from the same vantage point, that of a Google Street View car, which makes it much easier to compare streetscapes and building facades across neighborhoods and cities. Together, those images yielded tens of millions of individual "patches," small cropped regions of the larger photos.
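To make the idea of a "telling" patch concrete, here is a minimal sketch in Python. It is not the researchers' code, and it rests on several assumptions: that every patch has already been boiled down to a numeric descriptor vector, that a patch counts as characteristic of a city when the patches most similar to it also come from that city, and that a brute-force nearest-neighbor search is good enough for a toy example. The function name `rank_patches`, the parameter `k`, and the sample data are all hypothetical.

```python
import numpy as np


def rank_patches(features, is_target, k=3):
    """Rank target-city patches by how city-specific their look-alikes are.

    features  : (n, d) array-like, one descriptor vector per patch (assumed given)
    is_target : (n,) booleans, True if the patch comes from the city of interest
    k         : number of nearest neighbours to inspect (illustrative default)

    Returns (order, purity): indices of target-city patches, best first, and
    for every patch the fraction of its k nearest neighbours that come from
    the target city.
    """
    features = np.asarray(features, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)

    # Brute-force pairwise squared Euclidean distances. Fine for a toy set,
    # far too slow and memory-hungry for tens of millions of patches.
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    np.fill_diagonal(dists, np.inf)  # a patch is not its own neighbour

    # Indices of the k closest patches for every patch.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Share of those neighbours that belong to the target city.
    purity = is_target[neighbours].mean(axis=1)

    # Keep only target-city patches, highest purity first.
    target_idx = np.flatnonzero(is_target)
    order = target_idx[np.argsort(-purity[target_idx], kind="stable")]
    return order, purity


# Toy data: four "balcony-like" patches seen only in the target city,
# plus generic "asphalt-like" patches that show up in both places.
features = [
    [10.0, 10.0], [10.1, 10.0], [10.0, 10.1], [10.1, 10.1],  # target, distinctive
    [0.0, 0.0], [0.2, 0.0],                                  # target, generic
    [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.2, 0.1],          # other cities
]
is_target = [True] * 6 + [False] * 4

order, purity = rank_patches(features, is_target, k=3)
print(order)          # target-city patch indices, most distinctive first
print(purity[order])  # matching purity scores
```

In this toy set the four distinctive patches come out on top because all of their nearest neighbors are from the target city, while the generic patches sink to the bottom because their closest matches come from elsewhere. A real system would need an image descriptor, an approximate nearest-neighbor index, and some further refinement step, none of which is shown here.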

From here, we’ll let the researchers explain how they trained their algorithm to narrow in on the most telling features:

The algorithm turned out to be pretty good at distilling Paris’ visual identity. At left, we see images of the city randomly captured from Google Street View. The patches at the right represent the top-ranked visual elements of the city found by the algorithm.

Similarly, here is a comparison for Boston:

And San Francisco:

The researchers note that their algorithm had a much harder time with American cities, where some of the most commonly identified elements were car brands and road features. "This might be explained by the relative lack of stylistic coherence and uniqueness in American cities (with its melting pot of styles and influences)," they write, "as well as the supreme reign of the automobile on American streets."

Their ultimate goal was to "provide a stylistic narrative for a visual experience of a place," and such narratives could be told at the neighborhood level, the city level, or even regionally. At the broadest scale, this technique could help identify how different cultures and regions have influenced each other. You could, perhaps, follow the historic trail of the Ottoman Empire across Eastern Europe, the Middle East and North Africa simply by looking at balconies and doorways along the way. If anyone is interested in carrying this research forward in that direction, the authors even offer a suggested name for this new field they’ve spawned: "computational geo-cultural modeling."

About the Author

Emily Badger is a former staff writer at CityLab. Her work has previously appeared in Pacific Standard, GOOD, The Christian Science Monitor, and The New York Times. She lives in the Washington, D.C. area.