How Wikipedia Works/Chapter 15

So far we've concentrated on the English-language version of Wikipedia, but Wikipedias have been created in over 280 languages, each representing its own individual community and unique collection of content. A common assumption is that articles in the other Wikipedias are basically translated from English, but this couldn't be more misleading: These sites all create their own content with translations only playing a minor role. Taken as a whole, the Wikimedia projects count as one of the most comprehensively multilingual and global projects on the Internet today.[32]

The English-language Wikipedia is the largest site, but other Wikipedias are also impressively large: seven of the other-language editions of Wikipedia have over one million articles. These very active sites often have high growth rates and are technically innovative. If you visit http://wikipedia.org/ (Figure 15.1, “Wikipedia.org portal page, showing all the languages”), you'll see that it serves as the gateway to the other language editions of Wikipedia.

In this chapter, we'll explore what being global means for Wikipedia, by now a truly international and connected project. What are other-language Wikipedias like, and how can you get involved in them? We'll also talk about language issues as they relate to the English-language Wikipedia, including displaying foreign-language characters, writing about topics from a global perspective, and adding links to other-language versions of Wikipedia.

[32] Byte Level Research publishes an annual globalization report card that regularly ranks Wikipedia second in the world after Google for "how successfully companies developed web sites for international markets." See http://bytelevel.com/news/reportcard2008.html.


A very early goal of the project was to make Wikipedia multilingual; Jimmy Wales first proposed a German-language version of Wikipedia in early 2001. By May 2001, within months of the English-language Wikipedia's founding, Wikipedias had been started in Catalan, Chinese, German, French, Hebrew, Italian, Spanish, Japanese, Russian, Portuguese, and Esperanto.

New language editions continue to be added, as described in "The Long Tail of Languages" below. As of mid-2013, the largest Wikipedias were in English (4.2 million articles), Dutch and German (1.6 million articles each), French (1.4 million articles), Swedish (1.2 million articles), and Italian, Spanish, and Russian (1 million articles each). Size alone should not be taken as the only criterion of prominence, however. For instance, the Chinese-language Wikipedia has often attracted media attention, in part because the Chinese government continues to partially limit access to the site within China (as part of the so-called Great Firewall of China). Despite this, the Chinese-language Wikipedia has more than 710,000 articles, written in large part by the many Chinese editors in Taiwan, Hong Kong, and outside East Asia.

Wikipedia has at least 46 language editions with over 100,000 articles and 119 with over 10,000. By the time a Wikipedia reaches 10,000 articles, it usually has a consistent approach, a self-regenerating community, and a basic policy structure in place. The remaining sites are just getting started, with a handful of articles and active contributors, as the next section explains.

In the generally optimistic Wikipedia way, many language versions of Wikipedia have been started but, at this time, have only a few hundred articles. What function do these sites serve? No one could call them a comprehensive encyclopedic resource yet. The truth is that they are just beginning wikis—much like the English-language Wikipedia in 2001 or 2002. If you do speak one of these languages fluently enough to contribute, working on a smaller Wikipedia can be a great deal of fun. You'll find the culture of a small site with few users is very different from the giant English-language Wikipedia, which has so many customs, rules, and (obviously) so many more articles already written. Even on the small Wikipedias, articles are, for the most part, not translations but instead newly written pieces in that particular language.

Figure 15.1. Wikipedia.org portal page, showing all the languages

Sometimes other-language Wikipedias are small because they are very new or because not many people speak the language, and thus, the potential contributor base is small. Alternatively, Wikipedias may exist in widely spoken languages that do not have a strong presence on the Internet, such as Telugu, the third-most spoken language on the Indian subcontinent and one of the top fifteen spoken languages in the world (Figure 15.2, “The front page of the Telugu edition of Wikipedia at , from April 2008” shows the front page of the Telugu Wikipedia). These languages may not have a strong written tradition, or perhaps Internet access is limited in the areas where most native speakers live. Some of these Wikipedias quite possibly constitute the largest online corpus for their language; in any case, they represent the language online in a place where others can easily find it.

The range of languages represented by Wikipedia is very large. Wikipedias exist in constructed languages (Esperanto [eo] and Volapük [vo] with their internationalist aim) and significant dead languages (Latin [la] and Old Church Slavonic [cu]), which have no native speakers.[33] (The two-letter codes are language identifier codes, explained in Section 2, “Links Between Languages”.) Preserving smaller languages that have few living native speakers is not an explicit Wikimedia Foundation goal. Providing free information to all people in the world, regardless of their language, certainly is, however, and many of the smaller Wikipedias represent the only online reference works in that language. In some cases, Wikipedia may be the only encyclopedia in a particular language! Despite this diversity, the 250+ languages already supported by Wikimedia come nowhere close to representing all of the world's active languages. SIL, creators of the Ethnologue, a standard reference work for languages and one of the maintainers of the ISO standard for identifying languages, suggests the total count of world languages is closer to 7,000.

Therefore, new language editions are still being proposed and started. How does this work? The key requirement is that you can provide evidence of a potential active user base. Active volunteers for the new Wikipedia will be needed to provide the content and watch the wiki for spam and vandalism. Wikipedia has a procedure for beginning a new language project, and all new requests must be approved by the site developers, who can create the project. Meta, the Wikimedia wiki discussed in Chapter 17, The Foundation and Project Coordination, has a special page for making these requests. Once a request has been submitted, a committee on new language editions, or langcom, reviews the request. Someone fluent in the language must commit to translating the MediaWiki interface (including the text of tabs, buttons, and key pages) for the new project. You can see (and participate in) some new projects in the translation process at http://incubator.wikimedia.org/.

The Klingon Wars

A Wikipedia existed in Klingon, the language used by the Klingon race in the fictional Star Trek universe, from 2004 to 2005. After some debate, Jimmy Wales decided to close the site, and the decision was implemented on the spot at Wikimania 2005. As the History of the Klingon Wikipedia page on Meta tells it, "The existence of the Klingon project was divisive and led to entrenched debates over fairness and parity with other languages, and particularly with other constructed languages […] Work was limited by the fact that the Klingon vocabulary is closed and incomplete." The content was ultimately moved to a new site hosted by Wikia at klingon.wikia.com in December 2006, and it had 161 articles as of July 2008.

The challenge of editing on another language Wikipedia can be interesting and worthwhile, even if you only have a minimal knowledge of the language in question. All the Wikipedias use the same MediaWiki engine, so buttons, navigation links, and icons have familiar functions, regardless of the labels on them.

One way to help out is to watch content on a small wiki. Simply remember to check Recent Changes every so often on a slow-growing wiki, and you can help keep spam and poor contributions to a minimum. To adopt a wiki, you really need only be familiar enough with Wikipedia's standards to recognize definitely unhelpful changes. Seeing the fresh edits will help you to direct new editors to multilingual meta-pages and to identify good new editors. The wikipedia-L mailing list is for discussions of general Wikipedia-related issues in any language.

At this time, you must create an account for each new language project you wish to work on. This is changing with the introduction in mid-2008 of single-user login, sometimes called unified login, which users can use to link their existing accounts across all Wikimedia projects (see "Project Accounts and Single-User Login" on Section 1, “Wikimedia Commons” for more). All Wikipedias should allow anonymous editing, however, which may be easier if you just want to make a few changes. If you edit when not logged in, watch out for compulsory previews when you try to save: Click what you suspect must be the Show Preview button.

What about adding an edit summary in another language? Projects may vary on this. For instance, on the Polish Wikipedia an edit summary is compulsory for anonymous edits: unless you're logged in, you won't be able to save without one. On the Portuguese Wikipedia, if you're not logged in you must fill in a CAPTCHA box before saving your edit. If you're prepared for these occasional extra formalities, editing Wikipedias in other languages is actually very easy.

Remember that policies, guidelines, and community practices may vary a great deal between different language communities. Although some basic principles—NPOV, civility, and the GFDL license, for instance—are fundamental to all Wikimedia projects, how procedures are carried out is decided by the project community. You'll often find that a smaller project has fewer rules and guidelines and debate tends to be more thoughtful than on larger projects that receive more outside attention.

With a full range of languages comes a full range of writing systems: Greek, Cyrillic, Arabic, Hebrew, ideograms, and other less-familiar ones. Even languages that use the basic Roman alphabet may use accents and other diacritics. Scripts of all kinds are also used and integrated into the English-language version of Wikipedia, for instance, to give original forms of proper names. Figure 15.3, “The first paragraph of the English-language Wikipedia article on Mahatma Gandhi, which uses three languages with different scripts (English, Sanskrit, and Gujarati), as well as IPA symbols” shows an example in the article w:Mohandas Karamchand Gandhi, which uses Gujarati and Sanskrit scripts in the lead section as well as IPA pronunciation symbols.

Embassies

You can find a list of people who speak various languages and participate in the English-language Wikipedia and are willing to help with questions in those other languages at Wikipedia:Local Embassy. This page forms part of the Embassy system, a special place on each Wikipedia for visitors speaking other languages. The particular language Wikipedia is described, and visitors can ask questions or request help. You can find a list of all Embassies at http://meta.wikimedia.org/wiki/Wikimedia_Embassy. This list contains links to each embassy, along with the names of contributors on that wiki who speak other languages and are willing to help out.

Any one of these scripts may fail to display properly in your web browser if you don't have the necessary fonts installed. If you're viewing text that you don't have font support for, you may see small boxes or question marks instead of the correct characters. If this is the case, you need to download and install the proper font. w:Help:Multilingual support collects information and some advice about font support. This page has a chart where you can compare images of some of the common problematic fonts (such as East Asian character sets) to what you see on your computer. The Firefox web browser provides relatively good multilingual support, as do most newer operating systems, including Windows Vista.

Figure 15.3. The first paragraph of the English-language Wikipedia article on Mahatma Gandhi, which uses three languages with different scripts (English, Sanskrit, and Gujarati), as well as IPA symbols

Language support for operating systems is certainly still driven by demand in the developed world, and this means that many of the less widely used scripts, such as those for some Indic languages, will not typically be supported natively by your browser or operating system. Character sets that usually need to be downloaded include those for native languages. To find these fonts, the Wikipedia edition in that language can be a good resource; Wikipedias that use non-Latin scripts often have a help page, linked from their main page, about where to get the fonts needed for viewing them. For instance, to see the proper rendering of Cherokee in native script in the Cherokee article, you must download a special font; the help pages on the Cherokee-language Wikipedia at http://chr.wikipedia.org/ give details on how to find the appropriate fonts.

When composing articles, if you don't have a keyboard with the characters you need, you'll find that many types of scripts, for example, Cyrillic and Chinese characters, can be copied and saved successfully onto Wikipedia pages from other documents. (This works because of Unicode character encoding, or w:UTF-8.) Most operating systems, including Windows, Mac OS X, and many Linux distributions, also allow you to change your keyboard layout virtually so you can type directly in another language. In Windows XP, for example, you can do this via the Control Panel under Regional and Language options. The w:Help:Multilingual support (Indic) page gives complete directions for inputting characters in Indic languages for several operating systems; these directions are also appropriate for other character sets.
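To see why copying and pasting works, it can help to look at what UTF-8 does with text in different scripts. The following is only a minimal sketch in Python; the sample strings and the helper name utf8_round_trip are chosen here for illustration and are not part of Wikipedia or MediaWiki.

```python
# Sample strings in different scripts; any Unicode text would do.
samples = {
    "Cyrillic": "Википедия",
    "Chinese": "维基百科",
}


def utf8_round_trip(text):
    """Encode text as UTF-8 bytes, decode it back, and check nothing is lost."""
    return text.encode("utf-8").decode("utf-8") == text


for script, text in samples.items():
    encoded = text.encode("utf-8")
    # Characters outside basic Latin take more than one byte each in UTF-8.
    print(f"{script}: {len(text)} characters, {len(encoded)} bytes, "
          f"round trip ok: {utf8_round_trip(text)}")
```

Each script survives the trip to bytes and back unchanged, which is what lets the same page hold text in many writing systems at once.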

Displaying Hieroglyphics

If desired, you can display Egyptian hieroglyphics in a Wikipedia article! See Help:WikiHiero syntax for this special image-based font; to use it, simply enclose the code(s) for the character you want to display in between the <hiero> and </hiero> tags.

Finally, the editing box below the main editing window (described in Section 1.1, “Understanding the Edit Window”) gives easy access to many characters with accents and diacritics, as well as the Greek, Cyrillic, and IPA alphabets. Just click one of these characters to insert it in an article.

Interwiki links or interlanguage links are links to an article on the same topic in another language version of Wikipedia. These special links display in the left-hand sidebar under Languages, as first mentioned in Chapter 3, Finding Wikipedia's Content (Section 2.2, “The Omnipresent Sidebar”). These links to other-language Wikipedias appear under the native spelling of the language (such as Français) and are ordered by the two- or three-letter code for that language (such as fr or ja). Clicking the link takes you to the appropriate article in the other-language Wikipedia.

Any page, not just articles, can be interwikied. For instance, if you have a user page in Russian as well as English, you can add an interwiki link to the Russian version of your page on your English-language user page, and vice versa. Many help and community pages exist in multiple languages and are linked to one another in this way. These links can be very helpful if you want to find an equivalent project or policy in another language; for instance, if you want to find spoken articles in German, simply go to the English Spoken Wikipedia project, which has an interwiki link to the German-language Wikipedia page, WikiProjekt Gesprochene Wikipedia.

Editors must add links to other languages, article by article, for them to show up. The links are created using special language codes. These codes are mostly two letters (a few are three letters) and are based on the international standard ISO 639, which catalogs languages. If no ISO code exists, a special code is developed and used; for instance, the Simple English Wikipedia (a Wikipedia written in simpler English) uses the prefix simple (according to w:Wikipedia:Multilingual coordination). These prefixes also appear in the Wikipedia URL for each edition: So http://en.wikipedia.org/ is the English-language Wikipedia. A table of all existing Wikipedia languages with their corresponding code may be found on the Meta site at meta:List of Wikipedias. These codes are also used informally on the projects to refer to the various language Wikipedias; you may see en:WP or enWP used to mean the English-language Wikipedia, de:WP to mean the German-language Wikipedia, ru:WP to mean the Russian-language Wikipedia, and so on.

Once you have found the two articles you wish to link and know their respective title and language code, creating the links is simple. Edit one article, and scroll to the end of the text. Interwiki links are placed at the very bottom of the article, underneath all article text.

The link takes this form: [[language code:article name in native language]]. For instance, if you're working on the article Cat in English, and you want to link to the article Chat in French, you would add the link
[[fr:chat]]
at the end of the English-language article. After saving the page, a link with the text Français will show up on the left-hand sidebar; if you click it, you'll be taken to the French article at http://fr.wikipedia.org/wiki/Chat. Similarly, to link to the article in German you would type
[[de:Hauskatze]]
which will give you a link to http://de.wikipedia.org/wiki/Hauskatze under Deutsch in the sidebar.
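The relationship between a language code, an article title, and the resulting link can be sketched in a few lines of Python. This is an illustration only: the function names interwiki_link and article_url are hypothetical, and the sketch assumes the usual Wikipedia behavior of capitalizing the first letter of a title and using underscores for spaces in URLs.

```python
from urllib.parse import quote


def interwiki_link(code, title):
    """Return the wikitext for an interlanguage link, such as [[fr:chat]]."""
    return f"[[{code}:{title}]]"


def article_url(code, title):
    """Return the URL such a link points to (hypothetical helper)."""
    # Assumption: the first letter of a title is treated as a capital,
    # and spaces in titles become underscores in the URL.
    normalized = (title[:1].upper() + title[1:]).replace(" ", "_")
    # quote() percent-encodes characters that are not safe in a URL path.
    return f"http://{code}.wikipedia.org/wiki/{quote(normalized)}"


print(interwiki_link("fr", "chat"))    # [[fr:chat]]
print(article_url("fr", "chat"))       # http://fr.wikipedia.org/wiki/Chat
print(article_url("de", "Hauskatze"))  # http://de.wikipedia.org/wiki/Hauskatze
```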

By convention, interwiki links are placed below category tags on pages, each on its own line. The most popular arrangement for ordering interwiki links on a page is alphabetically by code.
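As a rough sketch of this convention, the following Python snippet appends a set of interwiki links to the end of a page's wikitext, one per line and sorted alphabetically by code. The function name append_interwiki_links and the sample article text are made up for this example; real pages and bots may handle spacing and ordering differently.

```python
def append_interwiki_links(article_text, links):
    """Append interwiki links below the existing text, sorted by language code.

    links maps a language code to the article title in that language.
    """
    # sorted() on the dict items orders the pairs by code first.
    lines = [f"[[{code}:{title}]]" for code, title in sorted(links.items())]
    # Leave one blank line between the category tags and the interwiki block.
    return article_text.rstrip("\n") + "\n\n" + "\n".join(lines) + "\n"


article = "Article text about cats.\n\n[[Category:Cats]]\n"
links = {"fr": "chat", "de": "Hauskatze"}
print(append_interwiki_links(article, links))
```

Here the German link ends up above the French one because de sorts before fr, even though the French link was listed first.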

Broken Interwiki Links

Make sure you have the right article when linking. Especially for concepts with more than one meaning, finding an exact equivalency can sometimes be difficult—take care to not link to the wrong concept. Also be careful about linking to disambiguation pages, which may exist in one Wikipedia but not another. Obviously, not all articles exist in all languages; since the English-language Wikipedia is the largest, it often has articles that other languages do not, but you may be surprised at the coverage of smaller Wikipedias. If you have created an interwiki link that appears to lead nowhere, check to make sure you haven't entered the title incorrectly.

To be complete, you should also go to the articles in the other languages to add an interwiki link back to the first article (for instance, the page Hauskatze should also link to the English Cat). When creating interwiki links, add a simple +en: or interwiki as an edit summary. You may also find more interwiki links to that article to add to the original article you were working on. Today bots do much of this missing interwiki linking automatically.
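The idea of checking for links back can be sketched as follows. This is not how any particular bot is implemented; it is a small Python illustration using made-up data, in which each page is identified by a (language code, title) pair and the hypothetical function missing_backlinks reports pages that are linked to but do not link back.

```python
def missing_backlinks(interwiki):
    """Find pages that receive an interwiki link but do not return it.

    interwiki maps a (code, title) page to the set of pages it links to.
    Returns a sorted list of (page_needing_link, page_to_link_back_to).
    """
    missing = []
    for source, targets in interwiki.items():
        for target in targets:
            # The target page should list the source among its own links.
            if source not in interwiki.get(target, set()):
                missing.append((target, source))
    return sorted(missing)


# Sample data: Hauskatze does not yet link back to Cat.
pages = {
    ("en", "Cat"): {("fr", "Chat"), ("de", "Hauskatze")},
    ("fr", "Chat"): {("en", "Cat")},
    ("de", "Hauskatze"): set(),
}
for page, target in missing_backlinks(pages):
    print(f"{page[0]}:{page[1]} should link back to {target[0]}:{target[1]}")
```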

To link to another language page without having it display as an interwiki link, use the same syntax but place a colon in front of the language prefix, as if you were linking to a category name. Typing [[:fr:chat]] on an English-language Wikipedia page produces a light-blue link that reads fr:chat, just as you wrote it, and leads to http://fr.wikipedia.org/wiki/Chat, but the link doesn't appear on the left-hand sidebar. Some important general principles still apply. "Prefer the internal link" means you shouldn't use a link to another-language Wikipedia to replace a redlink to an English article, and "seek outside references" means you shouldn't rely on another Wikipedia to source important facts in an article. Translation isn't by itself sufficient verification, and other Wikipedia pages, no matter the language, are not acceptable as sources.
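The difference between the two link forms comes down to the leading colon, which a short Python sketch can make concrete. The pattern below is a simplification invented for this example: it assumes that any prefix of two or three lowercase letters (or the special prefix simple) is a language code, which a real parser would check against the actual list of Wikipedias.

```python
import re

# Optional leading colon, then a language code, a colon, and the title.
LINK_PATTERN = re.compile(r"\[\[(:?)([a-z]{2,3}|simple):([^\]|]+)\]\]")


def classify_language_links(wikitext):
    """Sort language links into sidebar links and inline (colon-prefixed) links."""
    result = {"sidebar": [], "inline": []}
    for colon, code, title in LINK_PATTERN.findall(wikitext):
        # A leading colon keeps the link in the text instead of the sidebar.
        kind = "inline" if colon else "sidebar"
        result[kind].append((code, title))
    return result


sample = "See the French article [[:fr:chat]] for more.\n[[de:Hauskatze]]\n"
print(classify_language_links(sample))
```

In this sample, the French link is reported as inline and the German link as a sidebar link.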

The English-language Wikipedia has a global community of editors, and as an editor, you'll regularly collaborate with people from many time zones. A typical contributor to the English-language Wikipedia may well be a native speaker in an Anglophone country—the United States, Canada, the United Kingdom, Australia, and others—but many editors are neither native speakers nor in one of those places. Getting to know people from all over the world is one of the benefits of getting involved in Wikipedia. Because editors are relatively anonymous, you'll often have no idea where your on-wiki friends are from or even their nationality. To get around cultural differences, remember the guidelines on interacting politely with others online, and don't rely too heavily on regional slang or Internet jargon, which not everyone will understand.

The diversity of contributors is also reflected in the global breadth of subjects on Wikipedia. Notability is not culture- or language-specific; geographical features, important individuals, and other notable regional topics should clearly be included in Wikipedia, no matter where they are or relate to in the world.

Simple English Wikipedia

The Simple English Wikipedia is a separate project from the English-language Wikipedia. This Wikipedia aims to provide articles in simplified English and is designed for people learning English and children. The whole interface has been rewritten to use simpler language, so that, for example, the Random Page link is the Show Any Page link. Most articles are "translated" from the English-language Wikipedia version into shorter, simpler articles. These articles, in turn, can be a resource for people working in other languages. Simple English is an ideal project for those interested in teaching or learning English as a second language. Simple English lives at http://simple.wikipedia.org/, and as of mid-2008, had around 33,000 articles.

Several WikiProjects also focus on specific areas of the world. An example is WikiProject India, which focuses on writing articles about India, reviewing the existing articles about India, and supporting a community portal for editors interested in India-related topics. A list of WikiProjects that deal with geographical topics can be found at Wikipedia:WikiProject Council/Directory/Geographical.

Other WikiProjects focus on translating useful or interesting articles from other Wikipedias into English (other-language Wikipedias have similar projects that focus on translating articles into their local language). The place to coordinate translations in English is w:Wikipedia:Translation. Translation offers a double challenge: writing good English that is also good Wikipedia content.

Stylistic issues often appear in articles written in English by non-native English speakers. Cleanup work on these articles helps make worthwhile material available. When evaluating an article according to the criteria laid out in Chapter 4, Understanding and Evaluating an Article, or in a deletion debate, take into account that the article may have been written by a non-native speaker with expertise in the topic.

When writing articles in English about topics from non-English speaking parts of the world, sources can be problematic. Finding source material in English can be much more difficult, for instance. Checking interwiki links to find the relevant article on other-language Wikipedias can be helpful for finding sources and more information.

Although citing non-English sources is not ideal, you can do this. You can use special templates to identify sources in other languages; for instance, placing the optional template Template:It icon before a link to an Italian website alerts the reader that the source is in Italian (the language codes are the same standard ISO codes already mentioned). Citing a source that is not in a Wikipedia's native language is better than not citing a source at all. Try to locate English-language sources as well, so readers can verify your facts more easily. (If sources in different languages disagree, this can be useful information to note and include.)