I wonder whether this is on topic? I think it's a valid question for this group.
–
delete Sep 14 '10 at 10:45

Hmm.. problem here is that it's entirely dependent on the content that's actually being conveyed. Some languages will have short expressions for a given idea, some will have longer ones...
–
Billy ONeal Sep 14 '10 at 16:02

2

This seems off-topic. The site is about English usage, not languages in general.
–
JohnFx Sep 16 '10 at 14:03

2

The point is that you have to reserve enough space for the text to double in length when translated, and you also have to make it look good if the translated text ends up half the length.
–
Stein G. Strindhaug Mar 22 '11 at 8:56

1

The last time I was in a hotel there was a Gideon bible in my room which had the New Testament in three languages, English, French and German in descending order of conciseness. Such texts are probably the best examples for comparisons.
–
klypos Jul 24 '12 at 7:58

15 Answers

Speaking as a translator, I can share a few rules of thumb that are popular in our profession:

Hebrew texts are usually shorter than their English equivalents by approximately 1/3. To a large extent, that can be attributed to cheating, what with no vowels and all.

Spanish, Portuguese and French (I guess we can just settle on Romance) texts are longer than their English counterparts by about 1/5 to 1/4.

Scandinavian languages are pretty much on par with English. Swedish is a tiny bit more compact.

Whether or not Russian (and by extension, Ukrainian and Belorussian) is more compact than English is subject to heated debate, and if you ask five people, you'll be presented with six different opinions. However, everybody seems to agree that the difference is just a couple percent, be it this way or the other.
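As a rough illustration, the rules of thumb above can be turned into a small estimator. This is only a sketch: the `EXPANSION` table and the `estimate_length` helper are hypothetical names, the ratios are simply the ones quoted in this answer, and they apply to whole texts on average rather than to any single string.

```python
from fractions import Fraction

# Length ratios relative to English, copied from the rules of thumb above.
# Each entry is a (low, high) multiplier. These are averages for complete
# texts and say nothing reliable about an individual word or label.
EXPANSION = {
    "hebrew": (Fraction(2, 3), Fraction(2, 3)),    # about one third shorter
    "romance": (Fraction(6, 5), Fraction(5, 4)),   # 1/5 to 1/4 longer
    "scandinavian": (Fraction(1), Fraction(1)),    # roughly on par
}


def estimate_length(english_chars: int, family: str) -> tuple[int, int]:
    """Return a (low, high) estimate of the translated length in characters."""
    low, high = EXPANSION[family]
    return round(english_chars * low), round(english_chars * high)


if __name__ == "__main__":
    for family in EXPANSION:
        print(family, estimate_length(1200, family))
```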

Now that's for complete texts, on average, as a rule of thumb. Obviously, when you are working on a GUI, you mostly have to deal with translating individual words, which changes the picture dramatically. I am not aware of any universal research on the subject, but I will go out on a limb and say that it would be worthless to you, precisely because of being universal.

First of all, let's have a look at English itself. A very popular estimate for the average length of English words is 5 letters (or 5.2, or 5.3, or 5.1). I will not expressly address the validity of that estimate here, though I will link to this tiny bit of intriguing research (executive summary: "the larger the dictionary, the longer the words that are contained in it"). Much rather, I will focus on saying that your mileage will always vary.

It all depends on what application you are writing, and for what target audience. You might be writing a text editor for children, a web browser for everyone, or a worst-case execution time analyzer for the aerospace industry. Sometimes, your menu entries will read "Open", "Edit", "Save" and "Quit". Other times, they will read "Crossing reduction" and "Simulated annealing". Add into the equation that "Quit" is not necessarily short in all languages, and "simulated annealing" is not necessarily long, and you've got yourself a complete mess, no matter what the universal research says.

Secondly, there is something to be said about the units in which one measures the average word/text length. Traditional research and urban legends alike focus on the number of characters. But for a GUI designer, that kind of information is rather useless, because he measures the screen real estate in pixels.

As a simple example, in terms of letters, "猫" is 66% shorter than "cat" (which is what it means) and 75% shorter than "neko" (which is its Kun reading). But in terms of pixels, you don't save anywhere near as much space. So, whether or not your menu items in Japanese, Chinese, Arabic, Farsi or Urdu will end up being shorter than their English counterparts depends on how you define "shorter".
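To make the characters-versus-width distinction concrete, here is a minimal sketch that compares the character count with an approximate display width. It assumes a monospaced, terminal-style layout in which East Asian wide characters occupy two cells; real pixel widths depend on the font and the rendering engine, so treat this only as a proxy. The `display_columns` helper is a hypothetical name.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Approximate width in monospaced cells.

    Characters whose East Asian Width property is Wide ("W") or
    Fullwidth ("F") are counted as two cells, everything else as one.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


if __name__ == "__main__":
    for word in ("cat", "neko", "猫"):
        print(word, "characters:", len(word), "cells:", display_columns(word))
```

By character count "猫" is a third the length of "cat", but under this approximation it still needs two of the three cells.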

The tricky part is that to one extent or another, this is true for every pair of languages, even for those that use the same alphabet. You have the English word "illicitly", and you translate it into Phantasese, and you get "mamwowo". Now what? It's two letters shorter, yet it no longer fits. (Unless, of course, you are using monospaced fonts everywhere, which is highly unlikely.)

Lastly, I would like to specifically address the myth that German words are oh-so-long. All those awfully long German words are only that long because they correspond to many words in other languages. "Kontroll­fluß­graph­visualisierungs­software" is no longer than its English counterpart, "control flow graph visualization software", and the famous "Donau­dampf­schiffahrts­elektri­zitäten­haupt­betriebs­werk­bau­unter­beamten­gesell­schaft"
is considerably shorter than its English translation. Yep, you heard that right, that monster of a word actually saves space. German words can be long and succinct at the same time, and English recognizes that by borrowing (kindergarten, wunderkind, doppelganger, wanderlust, zeitgeist, schadenfreude...).

Edit: I want to add that when it comes to GUIs (or news headlines), English loves to cheat by dropping articles. "Export file" rather than "export the file", "import image" rather than "import an image", and so forth. Many languages can't do that because they don't have articles to begin with. If there's any advantage Russian does have over English in normal prose, it's not having "a"s, "an"s, "the"s (and "to"s, while we're at it) scattered all over the place. But when it comes to GUIs, Russian loses that advantage, and an English expression that was longer than its Russian equivalent might suddenly become shorter. German is even better at that game: it has lots of articles to drop, none of them shorter than 3 letters, and quite a few that are 4 or 5 letters long.

A good analysis of the problem. The point about "pixels" is especially relevant. But your point about German being shorter, in some cases, while accurate, is actually misleading depending on how you define shorter :) Those German words can be harder to break into multiple lines. Did you put those soft-hyphens into those long German words? Or did the website software do it for you?
–
Mr. Shiny and New 安宇 Sep 14 '10 at 19:38

12

@Mr. Shiny and New: spot on. There are more &shy;s in the source code of my answer than there are holes in Swiss cheese.
–
RegDwigнt♦ Sep 14 '10 at 20:02
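For readers unfamiliar with the technique mentioned in the comments: `&shy;` is the HTML entity for the soft hyphen, U+00AD, an invisible character that marks a permitted line-break point. The sketch below shows how such hints could be added to a long compound programmatically; the helper names and the choice of break points are illustrative assumptions, not how the answer was actually written.

```python
SOFT_HYPHEN = "\u00ad"  # the character that &shy; stands for in HTML


def with_break_hints(parts: list[str]) -> str:
    """Join the components of a compound word with soft hyphens."""
    return SOFT_HYPHEN.join(parts)


def visible_length(text: str) -> int:
    """Length of the text as displayed when no line break is taken."""
    return len(text.replace(SOFT_HYPHEN, ""))


if __name__ == "__main__":
    word = with_break_hints(
        ["Kontroll", "fluß", "graph", "visualisierungs", "software"]
    )
    # The stored string is longer than what the reader sees on one line.
    print(len(word), visible_length(word))
```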

A point of reference from the website I maintain. The files where we store the translations have the following sizes:

English: 200k

Portuguese: 208k

Spanish: 209k

German: 219k

And the translations are out of date. That is, there are strings in the English file that aren't yet in the other files.

For Chinese, the situation is a bit different because the character encoding comes into play. Chinese text will have shorter strings, because most words are one or two characters, but each character takes 3–4 bytes (for UTF-8 encoding), so each word is 3–12 bytes long on average. So visually the text takes less space but in terms of the information exchanged it uses more space. This Language Log post suggests that if you account for the encoding and remove redundancy in the data using compression you find that English is slightly more efficient than Chinese.
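The difference between visual length and storage size is easy to check. The sketch below compares the number of characters with the number of UTF-8 bytes; the `sizes` helper is a hypothetical name. Commonly used Chinese characters sit in the Basic Multilingual Plane and take 3 bytes each in UTF-8, while rarer ones outside it take 4.

```python
def sizes(text: str) -> tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))


if __name__ == "__main__":
    for word in ("cat", "猫"):
        chars, nbytes = sizes(word)
        print(word, "characters:", chars, "bytes:", nbytes)
```

Here "猫" is one character against three for "cat", yet both occupy three bytes.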

@Ray J: I'm not sure there is such a thing as a perfect translation. But if you had one, then you could theoretically translate it all into English and just compress THAT. :)
–
Mr. Shiny and New 安宇 Jul 28 '11 at 12:41

1

@Ray J.: I'm not sure that a "perfect translation" necessarily compresses to the same number of bytes. Why would it?
–
Mr. Shiny and New 安宇 Jul 29 '11 at 13:20

1

@Xie: Hmmm. Take a truly trivial example: the English word "girl" translates to the Latin word "puella". That's a simple, perfect translation. But the Latin word takes 6 bytes while the English word takes only 4. The length of the equivalent word in other languages surely varies widely. Why would you assume they'd be the same? You could speculate that one could invent some "perfect" language that represents all possible concepts unambiguously in the minimum amount of space, and then all actual human languages could be translated to this. But I doubt it's even theoretically (continued ...)
–
Jay Dec 7 '11 at 18:05

1

(... continued) possible, as one would have to make arbitrary decisions about anticipated frequency. Like, will we make the most commonly-used words use fewer bits than less commonly-used words? But commonly-used in what context? The word "bread" no doubt shows up a lot in discussions about food, not so often in discussions about architecture. Etc.
–
Jay Dec 7 '11 at 18:08

This is of course a big generalisation. I would say it differently: Supporting multiple languages can break the user interface, because for almost any language there will be a string that needs more space than English.

What I mean is, "the average word length" may be close to English, but some particular words or expressions might be surprisingly long when translated, and these are the strings that might get cut off in the GUI. There might be only a few such strings in the whole project, but they will annoy users a lot.

So it's not about languages needing more space, it's about particular translated strings that cannot fit. See the "Hello/Здравствуйте" example above.

From my experience working with an application translated to multiple languages, some languages do need a bit more space in GUIs than English. Comparing English, Swedish, Danish, Norwegian, the differences were not that big most of the time. In Dutch however, some labels did need a lot more space.

Just translate the terms 'Play Now', 'Instant Play', 'Visit Site' or even 'Click Here', all common calls to action on the web, and see the results. Some are much shorter and some are much, much longer.

I am responsible for a large portfolio of multi-language sites, some carrying 29 languages, and although the design side mostly runs smoothly, there are times when stuff simply doesn't fit.

I think one of the major problems is that we always translate English directly into all the other languages, when in fact, if we reworked the messages in our creative to properly fit each market, perhaps things would be a little different: 29 times the work, but perhaps a better level of conversion.

+1 for pointing out the difference between translation and localization. It's not that important in the context of desktop applications, but it is a huge issue on the Web and in print. There is an enormous difference between translating a phrase willy-nilly, translating it really well, and coming up with a set of completely different phrases that are tailored to various nationalities but still convey the same basic idea.
–
RegDwigнt♦ Sep 16 '10 at 18:00

To agree with Kai: verbs in German are longer than in English most of the time, which also agrees with the GUI remark by j-g-faustus :)

For complete texts, I find your question hard to answer, because style and tone have to be considered. When translating an English text into German, I have to use more words most of the time. Otherwise, it wouldn't sound as polite or straightforward, or wouldn't even be as understandable as the English original... I would consider English (in comparison to French or German, the only other languages I know and use quite well) a language that makes it easy to convey emotion or attitude... while usually saving words, compared to German.

Where German uses adjectives and subordinate clauses (a lot!), English comes up with simply structured, well-sounding sentences. Maybe we also need a cultural approach to that: Germans tend to explain and are driven by complexity, Americans really love short and smart sentences. Advertising slogans in the US assure a certain benefit or performance; in Germany, advertisements aim at really subtle emotional seduction or have to explain why the benefit will happen. Otherwise, the ad is not as trustworthy (as it would be / is in Anglo-American culture; I can only compare those two due to personal experience).

"Lerne das neue Twitter.com kennen
Eine einfachere, schnellere, und gehaltreichere Twitter-Erfahrung."
...really is a strange use of language for German users. There surely are slogans in German that are just as short and powerful, but with totally different content. So I cannot compare the two... and return to the beginning of my comment ;-)

I've studied foreign languages for a while, and in my experience Latin languages such as Italian or French have longer words and a more complex way of building sentences. Often the same sentence shows an evident difference in length if you compare Anglo-Saxon and Latin languages.
Furthermore, the writing and speaking styles differ a lot. English is more colloquial and direct, with less complex grammatical structures, irregular verbs, pronouns, etc.
Another big advantage of English (in terms of lightness) is the habit of taking a word and turning it into a verb. For example, if I want to ask you to search for a term on Google for me, I would say "Google this term for me, please". In Italian I would say: "Per piacere, potresti cercare questo termine su Google per me?" or something like "Potresti farmi la cortesia di cercare questo termine per me su Google?".
As you can see, the Italian way is longer and more complex.
By the way, if you are running a blog, I think that 3 or 4 more lines in a post don't change the world. If your goal is to fit a sentence into a box, then yes, it could be a problem.

Yeah, the key here is the "varies wildly". For example, daughter is дочь, exactly the other way round. And hello could also be translated as привет or алло, depending on context. :-)
–
RegDwigнt♦ Sep 15 '10 at 16:15

2

Languages in general follow the rule that the more common the word, the shorter it is. A word as commonly used as hello can't possibly be that long in Russian. To me, it seems that you would rather use привет or алло, as RegDwight Ѭſ道 notes, and Здравствуйте would rather be translated as good afternoon (not noted by RegDwight Ѭſ道).
–
Shathur Aug 22 '11 at 12:48

In German, Hello is "Guten Tag" - but in speech, you often hear "'Tag". You'd still write it in full though.
–
Richard Gadsden May 21 '12 at 10:55

1

I think you're misrepresenting Russian with this simple single example. Also, Google Translate gives привет as a Russian translation for "hello"; it appears you picked the formal variant.
–
bobobobo Dec 17 '12 at 16:47

2

I think that further proves the point that Russian varies wildly.
–
Ryan Tenney Dec 18 '12 at 20:51

It very much depends on what languages you plan to use. I can give a clear comparison, for example, from the GNOME desktop. I am a native Bulgarian speaker and English is my second language.

Here in Bulgaria, many of the programs targeted at government work are translated into Bulgarian (with no option for English), and I can say that Bulgarian menus are nearly double the width of English ones.

For example, the three common tasks Cut, Copy and Paste are translated as Изрежи (Izreji), Копирай (Kopirai), and Постави (Postavi) or Вмъкни (Vmakni), depending on the translator; the translations are generally much longer.

Also, Tools is translated as Инструменти (Instrumenti) and Settings as Настройки (Nastroiki).

So you can see that simple words used in English for common tasks are much longer in Bulgarian, and this is just a single example.

As I said, you can try it on the GNOME desktop or in any other popular open-source application that offers different languages, if you want. You can easily change languages there and compare for yourself.

Semitic languages in common writing don't have written vowels, so even if the words were of the same length as in English, they would take fewer characters. Hebrew, for example, uses letters that are visually no more complicated than Latin letters, and the words look shorter because they omit the vowels. For example, ירושלים contains 7 letters and is read as ye-ru-sha-la-yim: 11 sounds.

Now, when the vowels are written, as in children's books, the words are still shorter visually because the vowels are dots above or below the actual letters. So the previous example will look like יְרוּשָלַיִם
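The relationship between the pointed and unpointed spellings can be checked mechanically, since Hebrew vowel points are encoded as combining marks that attach to the preceding letter. In the sketch below, the `base_letters` helper is a hypothetical name, and the pointed spelling is written out with explicit code-point escapes as an assumption about how the word above is encoded.

```python
import unicodedata


def base_letters(text: str) -> str:
    """Drop nonspacing combining marks (Unicode category Mn)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# "Jerusalem" with vowel points: yod+sheva, resh, vav+dagesh, shin+qamats,
# lamed+patah, yod+hiriq, final mem.
POINTED = (
    "\u05d9\u05b0\u05e8\u05d5\u05bc\u05e9\u05b8"
    "\u05dc\u05b7\u05d9\u05b4\u05dd"
)

if __name__ == "__main__":
    plain = base_letters(POINTED)
    # The points add code points but no extra letter positions on the line.
    print(len(POINTED), len(plain), plain)
```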

From my experience, in GUI, Semitic texts are shorter than English.

This answer is not very closely related to English. I hope it is interesting anyway :)

Probably the most common example of this is the original Hebrew pronunciation of God's name יהוה(YHWH/JHVH) which could be pronounced in a few ways: Yahweh, Yehovah, etc. (commonly transliterated as Jehovah in English) All are longer than the four letter name itself.
–
Armstrongest Sep 15 '10 at 19:11

Japanese needs less space than English, but usually uses larger text or requires higher-resolution displays. Chinese is similar.

An Example:

The fox jumped over the moon.

キツネは月を飛び越えた。

I think I'd rather eat eggs than fish.

魚より卵を食べると思います。

A few reasons for this:

Subjects are often dropped and understood from context

Pronouns are mostly used when context isn't clear

There are no spaces between words

Grammatical particles are one character long を、は、が、の、へ

Shorter text is seen as more "beautiful" (Note: Haiku)

Saying all this, most websites originally written in English are translated into Latin languages. Asian language websites will have their own design anyway, so I think for "practical" purposes, the statement that most languages need more space than English is fairly accurate, in context.

There are better reasons to keep text short. Shorter sentences are much easier to digest on the web. Most users like short concise points on the web. In addition, English language websites are read by a large audience of non-English language speakers, so using shorter, simpler language makes your content more accessible to a wider audience.

Did you write the Japanese sentences yourself? They don't look natural to me.
–
delete Sep 15 '10 at 5:26

Sorry, no, I did not. I just quickly ran them through Google Translate. I'm no Japanese master. Though if I wrote them, they'd likely be even shorter, as Google tends to add words that aren't necessary. The second example, I'd probably write it 魚より卵が好き... but that means more along the lines of "I prefer eggs to fish."
–
Armstrongest Sep 15 '10 at 18:48

I was involved in translating a GUI from English to German, French, Norwegian and Spanish; we had to increase the size of most of the text labels to make the translations fit.

As far as I remember, all the languages had at least a couple of terms that were longer than the corresponding English.

I don't know whether "most languages need more space" holds for general text, but I'm pretty sure it holds for GUIs.

I think part of the reason is that computer and GUI terminology comes from English, and many other languages haven't settled on a local terminology yet and may need multiple words where English has one.

A few examples from Norwegian:

quit: "avslutt" (4 chars vs 7)

upload: "last opp" (6 chars vs 8)

URL: "nettadresse" (lit. "net address")

Plus, of course, GUI layout is constrained by the maximum length of the term in any of the target languages, so it doesn't help if other terms are shorter than the corresponding English.
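That last constraint can be sketched in a few lines: for every label, the layout has to accommodate the longest variant across all target languages. The `TRANSLATIONS` table below reuses the Norwegian examples from this answer; the table layout and the `required_chars` helper are hypothetical, and a real layout would measure rendered pixel widths rather than character counts.

```python
# Label variants per language; "en" and "no" are the only languages shown
# here, using the examples from the answer above.
TRANSLATIONS = {
    "quit": {"en": "quit", "no": "avslutt"},
    "upload": {"en": "upload", "no": "last opp"},
    "url": {"en": "URL", "no": "nettadresse"},
}


def required_chars(labels: dict[str, dict[str, str]]) -> dict[str, int]:
    """For each label, return the length of its longest translation."""
    return {
        key: max(len(text) for text in variants.values())
        for key, variants in labels.items()
    }


if __name__ == "__main__":
    for key, width in required_chars(TRANSLATIONS).items():
        print(key, width)
```

A label that happens to be shorter in one language never reduces the requirement, because the maximum is taken per label.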