Explanation

The results on this page are based on a comparison of letter trigram frequencies in the
given languages. That is, we took the text of 262 language editions of Wikipedia,
counted how often each sequence of three consecutive letters appears, and compared the
resulting counts between languages to estimate how similar the languages are in this one respect.
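
The counting and comparison steps can be sketched roughly as follows. This is only an illustration, not the code used for this page: the page does not say which similarity measure was used, so cosine similarity is an assumption here, as are the function names, the lowercasing, and the choice to count trigrams inside words only.

```python
from collections import Counter
from math import sqrt


def trigram_counts(text: str) -> Counter:
    """Count every run of three consecutive letters inside each word.

    Assumption: text is lowercased, non-letters are dropped, and
    trigrams do not span word boundaries.
    """
    counts = Counter()
    for word in text.lower().split():
        letters = "".join(ch for ch in word if ch.isalpha())
        for i in range(len(letters) - 2):
            counts[letters[i:i + 3]] += 1
    return counts


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two trigram count vectors (assumed metric)."""
    # Only trigrams present in both profiles contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With these helpers, two languages would be compared by building one trigram profile per Wikipedia edition and calling cosine_similarity on the pair. Identical profiles score close to 1, and profiles with no trigram in common score 0.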

The alphabets have not been normalized, which leads to large differences between some
languages where you would not expect them, for example between Serbian and Croatian.
Chinese and Japanese have been skipped because of their huge number of n-grams
(for the raw data, see the letter frequency corpus).
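
As a small illustration of the normalization issue, the sketch below compares the trigrams of one word written in Cyrillic and in Latin letters. The example word and the helper name trigram_set are chosen here for illustration and are not taken from the corpus. It assumes the texts being compared are written in different scripts, in which case the letters are different characters to a computer even when they look or sound alike.

```python
def trigram_set(word: str) -> set:
    """Return the distinct three-letter sequences in a single word."""
    return {word[i:i + 3] for i in range(len(word) - 2)}


# "jezik" ("language") in Cyrillic and in Latin script; example word only.
cyrillic = trigram_set("језик")
latin = trigram_set("jezik")

# No trigram is shared, because every letter is a different code point.
shared = cyrillic & latin
print(len(cyrillic), len(latin), len(shared))
```

Without transliterating one script into the other first, such texts contribute almost nothing to each other's similarity score.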

Note that a high similarity score here does not mean that the languages are similar in
any sense other than their letter trigram frequencies.
This page makes no direct historical, cultural, or political statement.