Testing Android Apps with Pseudolocalization

Karol Wrótniak, Android Developer

Background

The length of translated text is likely to differ from the original, and depending on the source and target languages the difference may be significant. Moreover, the same text may be shortened to an acronym in some languages and spelled out in full in others. You can find more details in this article on W3C. What’s more, most languages (e.g. English) are written left-to-right, but right-to-left scripts are also in use (e.g. Arabic or Hebrew).

The issues

All these facts make app localization far from trivial. Let’s say we have a piece of English text: Always allow Wi-Fi Roam Scans. It has several translations, including a Polish one. The final effect may look like Figure 1, below.

Figure 1. Switch with English and Polish texts

At first glance, everything looks fine, especially if you don’t speak Polish. However, the Polish text is truncated! It should be Zawsze szukaj Wi-Fi w roamingu. Even if the whole text is visible, it may not look very good, as can be seen in Figure 2.

Figure 2. A button with the longest text in Hungarian

OK, so which language is the best choice for testing? The Hungarian text was the longest in the previous example. However, that is not always the case! As you can see in Figure 3, the Polish translation of the phrase Install is one character longer than the Hungarian one.

Figure 3. A button with the longest text in Polish

Pseudolocalization to the rescue

These kinds of issues are not specific to Android projects. Web apps, desktop apps and virtually all products involving translations can be affected. It seems that someone must have invented some solution a long time ago… Indeed, it is called pseudolocalization or pseudo-localization.

What is pseudolocalization? Basically, we introduce pseudotranslations into pseudolanguages, which have all the desired properties. Note that pseudolocalization may cover not only text length but also, for example, diacritical marks or text direction. This technique is commonly used in QA: for example, Microsoft has used it since Windows Vista.

How to use pseudolocalization in Android projects?

First, you need to enable the generation of pseudolocales in a build type configuration:
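
A minimal sketch of such a configuration, assuming a module-level build.gradle.kts (Gradle Kotlin DSL) and the debug build type. Both are assumptions made for illustration; in the Groovy DSL the same property is written as pseudoLocalesEnabled true.

```kotlin
// build.gradle.kts of the app module (assumed file name and build type)
android {
    buildTypes {
        getByName("debug") {
            // Generate the en-XA and ar-XB pseudolocales for this build type only,
            // so they are not shipped in release builds.
            isPseudoLocalesEnabled = true
        }
    }
}
```

Enabling the flag only for a non-release build type keeps the pseudolocales out of the APK delivered to users.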

Then, you can change the device or AVD language to English (XA). Figure 4 shows how to do it on Android 8.0.

Figure 4. Changing the language to English (XA)

Note that the translatable texts are changed. Now, you can build and launch your app. In the case of the text from the previous example, you should get something similar to what is shown in Figure 5.

Figure 5. A button with the longest text in English (XA)

As you can see, all the letters received extra diacritical marks. Note that letters with diacritics can be taller than their base forms, as in this text: NŃEĘ. With text containing diacritics, you can verify that there is enough vertical space for the text and that there are no overlaps between adjacent lines, or between text and other UI elements. Additionally, an extra word is added for each original one and the whole text is enclosed in square brackets. So, you can check whether there is enough horizontal space for the text, whether it is truncated, and whether the ellipsis is correct.
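
The transformation described above can be illustrated with a simplified sketch. It is not the algorithm used by the Android build tools: the pseudolocalize function, the accent table and the padding words are all hypothetical and only mimic the three visible effects (accented letters, one extra word per original word, enclosing brackets).

```kotlin
// Hypothetical accent table; the real pseudolocalizer covers far more characters.
private val ACCENTED = mapOf(
    'a' to 'á', 'e' to 'é', 'i' to 'í', 'o' to 'ó', 'u' to 'ú',
    'A' to 'Á', 'E' to 'É', 'I' to 'Í', 'O' to 'Ó', 'U' to 'Ú',
    'n' to 'ñ', 'N' to 'Ñ', 'c' to 'ç', 'C' to 'Ç'
)

// Hypothetical padding words used to lengthen the text.
private val PADDING = listOf("one", "two", "three", "four", "five")

fun pseudolocalize(text: String): String {
    // 1. Replace letters with accented variants where the table has one.
    val accented = text.map { ACCENTED[it] ?: it }.joinToString("")
    // 2. Add one padding word per original word.
    val wordCount = text.split(Regex("\\s+")).count { it.isNotEmpty() }
    val padding = (0 until wordCount).joinToString(" ") { PADDING[it % PADDING.size] }
    // 3. Enclose the result in brackets so truncation is easy to spot.
    return if (padding.isEmpty()) "[$accented]" else "[$accented $padding]"
}

fun main() {
    println(pseudolocalize("Install")) // [Íñstáll one]
}
```

If the closing bracket is missing on screen, the text has been cut off somewhere, even when the visible part looks plausible.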

Pseudo-RTL locale

Yet another pseudolocale is Arabic (XB). At first glance, it may look just like English written backwards, but this is not the only feature. Let’s look at Figure 6.

Figure 6. Arabic (XB) in action

The most visible thing is the right-to-left (RTL) text and layout direction. Note that there is an option to force the RTL layout direction in Developer options, but it won’t change the text direction, e.g. English texts will still have their first letters on the left.

Secondly, we have Eastern Arabic numerals, not to be confused with (Hindu-)Arabic numerals, which are the most common in the world. This lets you easily check how an app behaves with non-standard numerals. Usually, texts read by users (especially those displayed in the UI) should use local number formatting. On the other hand, machine-readable strings like HTTP headers should always use (non-Eastern) Arabic numerals. Otherwise, they won’t be interpreted properly.
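
The distinction between the two kinds of strings can be sketched as follows. The function names itemCountLabel and contentLengthHeader are hypothetical. The point is only that user-visible numbers are formatted with the user’s locale, while machine-readable ones are pinned to Locale.ROOT so that a device set to Arabic (XB) cannot change their digits.

```kotlin
import java.text.NumberFormat
import java.util.Locale

// User-visible text: format with the user's locale (the default one unless stated otherwise).
fun itemCountLabel(count: Int, locale: Locale = Locale.getDefault()): String =
    NumberFormat.getIntegerInstance(locale).format(count.toLong())

// Machine-readable text: pin the locale so the digits never depend on device settings.
fun contentLengthHeader(bytes: Long): String =
    "Content-Length: %d".format(Locale.ROOT, bytes)

fun main() {
    println(contentLengthHeader(1234)) // Content-Length: 1234
    // The exact digits printed here depend on the locale data of the runtime.
    println(itemCountLabel(1234, Locale.forLanguageTag("ar-XB")))
}
```

Formatting calls that take no explicit locale fall back to the default one, which is why a header built that way may end up with Eastern Arabic digits when the device language is Arabic (XB).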

Finally, we can see that non-translatable text resources weren’t pseudolocalized. Their direction has not been changed.

Edge cases are everywhere

The pseudoLocalesEnabled DSL documentation says (emphasis mine): “Whether to generate pseudo locale in the APK. If enabled, 2 fake pseudo locales (en-XA and ar-XB) will be added to the APK (…)”

You may think that adding an extra word for each original one, and enclosing the text in square brackets, makes it longer than the longest possible translation. Nothing could be further from the truth! Look at Figures 7 and 8.

Wrap-up

Pseudolocalization can be helpful for QA. You can just change the language and test everything. There is no need to figure out which locale has the longest texts, which uses a lot of diacritics, or which uses Eastern Arabic numerals. However, keep in mind that edge cases are everywhere, and pseudolocales do not cover all of them.