Maps of various name suffix densities in the USA and Slovenia

This post and its maps were inspired by Moritz Stefaner’s -ach, -ingen, -zell. I firmly believe in giving credit where it is due, so there it is.

That said, I embarked on a similar adventure, starting with Slovenia. The etymology of Slovenian towns and other populated places may differ a little from that of German ones, so I was naturally curious what it would look like on a map. I had several geo files for Slovenia lying around, as well as a comprehensive list of all populated places with coordinates, which made this a relatively short endeavour.

In addition to common suffixes, I also extracted common prefixes. This is because many Slovenian place names begin with “Gornja” (Upper) or “Velika” (Great), so I wanted to see whether these names have meaningful spatial distributions. It turns out that they do.

For example, this one. By columns: “gornja” (a variant of “upper”) vs. “dolnja” (a variant of “lower”), “zgornja” and “dolnja” (another pair of variations on the same dichotomy), then “velika” and “mala” (“great” and “small”). It’s apparent that places with those prefixes have characteristic spatial distributions. Why, I don’t know. Dialects of the Slovenian language vary wildly, to the point that some of them are virtually incomprehensible to me.

To see the interactive version with more maps, click here, or click the image. Switch between prefixes and suffixes using links in the upper left square.

Distribution of places with some common prefixes

With the code written and the geonames.org database downloaded, it was just a matter of changing a few things to produce a map of the same kind of distribution in the USA. I colored it a little differently, but it’s basically the same thing.

Again, click here or the image for the interactive version. Note that you can click the little link above each map to display the list of place names.

Then a friend and coworker of mine said that he had always wondered about the distribution of U.S. towns with names borrowed from European places. That would effectively show the patterns of immigration in the early history of the USA, with the exception of Spanish names, which tend to sit near the Mexican border for historical reasons, and some random noise from the string matching.

Check out the maps! Some technical details: the maps were drawn with d3, and the hexagons were produced with the hex-binning plugin.

Name matching was not a big challenge, but I did want to find unique suffixes. So I wrote some software to first isolate the most frequently occurring seven-character suffixes, then I gradually shortened them until a big dropoff, say, of more than 50 places, occurred. That way I prevented near duplicates from being included, for example “-ville”, “-ille”, “-lle”, which have approximately the same distribution, so only one of them gets a place on the map.
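To make that concrete, here is a minimal sketch of one way the shortening could work. It is not the code behind the maps. It reads the "dropoff" rule as: keep chopping the first letter off a suffix while each chop still gains more than `minGain` places, and stop when the count stops growing much. The function names, the `top` and `minGain` parameters, and that stopping rule are all assumptions made for illustration.

```javascript
// Count how many names end with the given suffix.
function countSuffix(names, suffix) {
  return names.filter((n) => n.endsWith(suffix)).length;
}

// Hypothetical helper: drop the leading character of a suffix while each
// shortening still gains more than `minGain` matching places.
function shortenSuffix(names, suffix, minGain) {
  let current = suffix;
  let count = countSuffix(names, current);
  while (current.length > 1) {
    const shorter = current.slice(1);
    const shorterCount = countSuffix(names, shorter);
    if (shorterCount - count <= minGain) break; // plateau: keep the longer form
    current = shorter;
    count = shorterCount;
  }
  return { suffix: current, count };
}

// Assumed pipeline: tally all suffixes of a fixed length, take the most
// frequent ones as seeds, shorten each, and merge seeds that converge.
function commonSuffixes(placeNames, { length = 7, top = 20, minGain = 50 } = {}) {
  const names = placeNames.map((n) => n.toLowerCase());
  const tally = new Map();
  for (const n of names) {
    if (n.length < length) continue;
    const s = n.slice(-length);
    tally.set(s, (tally.get(s) || 0) + 1);
  }
  const seeds = [...tally.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, top)
    .map(([s]) => s);
  const result = new Map(); // suffix -> number of matching places
  for (const seed of seeds) {
    const { suffix, count } = shortenSuffix(names, seed, minGain);
    result.set(suffix, count);
  }
  return result;
}
```

Because several seven-character seeds can shrink to the same suffix, collecting the results in a `Map` keeps only one entry for each. With this stopping rule, "-ville" would be kept over "-ille" and "-lle" as long as those shorter forms add few extra places.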

The biggest challenge was in fact generating a hex grid within the borders and then fitting the data into it. That’s the reason the pages need some time to load. I brute-forced it by generating points inside the bounding box, checking with turf.js whether each one falls within the polygon in question, then setting all hexagon lengths to zero, and finally filling them with real data.
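The brute-force idea can be sketched in plain JavaScript roughly like this. It is not the code used for the maps, and a few things are swapped in as assumptions: a hand-written ray-casting test stands in for the turf.js check, the border is a single ring of `[x, y]` pairs in already-projected coordinates, and each place is assigned to the nearest hexagon centre instead of going through the hex-binning plugin. All function names are made up for the example.

```javascript
// Ray-casting point-in-polygon test; a plain stand-in for the turf.js check.
function pointInPolygon([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Lay hexagon centres over the bounding box and keep those inside the
// border. Every kept cell starts empty (count 0).
function hexGrid(ring, radius) {
  const xs = ring.map((p) => p[0]);
  const ys = ring.map((p) => p[1]);
  const minX = Math.min(...xs), maxX = Math.max(...xs);
  const minY = Math.min(...ys), maxY = Math.max(...ys);
  const dx = radius * Math.sqrt(3); // horizontal spacing, pointy-top hexagons
  const dy = radius * 1.5;          // vertical spacing between rows
  const cells = [];
  for (let row = 0, y = minY; y <= maxY; row++, y += dy) {
    const offset = row % 2 ? dx / 2 : 0; // odd rows are shifted half a step
    for (let x = minX + offset; x <= maxX; x += dx) {
      if (pointInPolygon([x, y], ring)) cells.push({ x, y, count: 0 });
    }
  }
  return cells;
}

// Fill the empty grid: each place increments its nearest cell centre.
function fillGrid(cells, places) {
  for (const [px, py] of places) {
    let best = null;
    let bestDist = Infinity;
    for (const c of cells) {
      const d = (c.x - px) ** 2 + (c.y - py) ** 2;
      if (d < bestDist) {
        bestDist = d;
        best = c;
      }
    }
    if (best) best.count += 1;
  }
  return cells;
}
```

Starting every cell at zero means hexagons with no matching places still get drawn, so the country outline stays visible on the map. The nearest-centre loop is quadratic, which is the kind of cost that makes a page slow to load.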