Explanation

Wikipedia exists in more than 280 languages. Often, two articles in two different
language editions of Wikipedia are about the same thing; in that case they are connected
by a so-called language link. These language links are written in the source
wikitext of the article. The English article on Berlin has corresponding articles in
more than a hundred languages, and if you take a look at the end of the wikitext for
Berlin, you will find the language links to all these other articles on the other
Wikipedias. These links produce the list of languages that you see on the left-hand
side of many Wikipedia articles.

If you now go to any of the other language versions of this article, you will find
basically the same list again and again, merely with the link to its own version
replaced by a link to the English version. There are therefore more than a hundred
articles that all include basically the same list of links to each other. To keep these
links up to date, bots crawl the Wikipedias and try to keep the links synchronized.
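
To make the scale of this redundancy concrete, here is a minimal sketch of the arithmetic, assuming the idealized case in which every language edition links to every other one. The function name `total_language_links` is hypothetical and used only for illustration.

```python
def total_language_links(editions: int) -> int:
    """Number of language links stored for one topic when each of the
    `editions` articles links to all the other editions (idealized case)."""
    return editions * (editions - 1)


# With 100 language editions, each article carries 99 links,
# so the same topic stores 100 * 99 links in total.
links_for_100 = total_language_links(100)

# Adding one more edition means one new article with 100 links,
# plus one new link in each of the 100 existing articles.
added_by_one_more = total_language_links(101) - links_for_100
```

The total grows quadratically with the number of editions, which is why keeping the lists synchronized by hand is impractical.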

On smaller Wikipedias this means that in some stub articles a huge part of the actual
content of the article consists of nothing but the language links (see this example).

Here is an explanation of the columns:

Pages: the number of pages in the main namespace, including redirects.

Lang. links: the number of language links on these pages.

Double links: the number of language links that point to a Wikipedia
language edition that already has a link on the same page. The number links to a list
of these double links so that a language edition can check them, as they are often errors.
Also, any analysis of the language links based merely on the SQL language links table
will not catch these.

Text size: the number of characters in the wikitext of these pages.
Note: characters, not bytes. Care has to be taken when comparing sizes across
languages written in non-alphabetic scripts.

Lang. links size: the number of characters devoted to representing
language links in the wikitext of these pages.

Ratio: the language link size as a percentage of the
overall text size.
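
The column definitions above can be sketched in code for a single page. This is a minimal illustration, not the script that produced the numbers on this page: the function name `language_link_stats`, the regular expression, and the caller-supplied set of language codes are all assumptions, and the sketch counts only the bracketed link markup itself towards the language link size.

```python
import re
from collections import Counter

# Candidate links of the form [[prefix:Title]] with a lowercase prefix.
# Whether the prefix really is a language code is checked separately.
LINK_PATTERN = re.compile(r"\[\[([a-z-]+):([^\[\]|]*)\]\]")


def language_link_stats(wikitext: str, language_codes: set) -> dict:
    """Compute the per-page figures described in the column list above."""
    links = [m for m in LINK_PATTERN.finditer(wikitext)
             if m.group(1) in language_codes]

    # A double link is any link to an edition that already has a link
    # on this page, so each language contributes (count - 1).
    per_language = Counter(m.group(1) for m in links)
    double_links = sum(count - 1 for count in per_language.values())

    # len() on a Python str counts characters (code points), not bytes.
    text_size = len(wikitext)
    link_size = sum(len(m.group(0)) for m in links)
    ratio = 100.0 * link_size / text_size if text_size else 0.0

    return {
        "lang_links": len(links),
        "double_links": double_links,
        "text_size": text_size,
        "lang_links_size": link_size,
        "ratio": ratio,
    }


sample = (
    "Berlin is a city.\n"
    "[[de:Berlin]]\n"
    "[[fr:Berlin]]\n"
    "[[de:Berlin (Stadt)]]\n"
    "[[Category:Cities]]"
)
stats = language_link_stats(sample, {"de", "en", "fr"})
```

In this made-up sample the second `de` link is a double link, and the category link is ignored because `Category` is not in the set of language codes. Summing these figures over all pages of an edition would give one row of the table.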

The dumps used were the most recent ones as downloaded on June 22, 2012.

The whole idea of this page is to give a feeling for the effect of the first phase of
Wikidata, the project I am currently working on. Wikidata aims to centralize most of the
language links in one repository and thus remove them from the wikitext of the individual
language versions.