android2po: Managing Android translations

I’ve always liked gettext a lot. Rather than asking you to maintain a database of strings and assign an id to each, it simply uses the original string itself as the string id. To me, it’s a classic example of choosing practicality over purity.

The Android localization system, of course, uses the former approach. Each string is a resource with an id, and each language, essentially, has one or more XML files with the proper localized string mapped to each id.

For my apps, I initially had only the original English version and a German translation, those being the only languages I speak, more or less anyway. Whenever I added a new English string, or changed an existing one, I immediately updated the German version as well – simple enough.

For A World Of Photo, I decided to ask the community for help with translations into more languages. Clearly, things were not so simple anymore.

See, with gettext, when the set of strings an app uses changes as part of a new version, you can simply “merge” the new string catalog into each of the translations. Strings that have been removed from the app are removed from the translation files, new strings are added, and strings that have been changed are flagged as “fuzzy”, at least to the extent that the merge tool detects a change rather than a completely new string. That last part is possible because each translation file contains not only the translations, but also the original string that was translated. Remember, it’s the string that is the database key.
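To make the merge step concrete, here is a minimal Python sketch of the idea. It is not how gettext's own merge tool is implemented: the function name `merge_catalog`, the similarity cutoff, and the use of `difflib` for fuzzy matching are all assumptions made for illustration.

```python
import difflib


def merge_catalog(template_ids, translation, cutoff=0.6):
    """Merge the source strings of a new app version into an old translation.

    template_ids: list of source strings used by the new version.
    translation:  dict mapping old source string -> translated string.
    Returns a dict mapping source string -> (translated string, is_fuzzy).
    Strings that no longer appear in template_ids are simply dropped.
    """
    current = set(template_ids)
    # Old entries whose source string disappeared are candidates for fuzzy reuse.
    unused_old = [s for s in translation if s not in current]
    merged = {}
    for msgid in template_ids:
        if msgid in translation:
            # Unchanged string: keep the existing translation as-is.
            merged[msgid] = (translation[msgid], False)
            continue
        # Hypothetical heuristic: reuse the closest old string and flag it fuzzy.
        close = difflib.get_close_matches(msgid, unused_old, n=1, cutoff=cutoff)
        if close:
            merged[msgid] = (translation[close[0]], True)
        else:
            # Completely new string: needs a fresh translation.
            merged[msgid] = ("", False)
    return merged
```

Called with an old German catalog and the new list of English strings, this returns the unchanged entries untouched, slightly changed ones with their old translation and a fuzzy flag, and new ones with an empty translation.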

As a result, translators simply have to go through the list of new or fuzzy strings, update those, and they’re done.

Now, Android’s system has no equivalent tools. Frankly, I wonder how other people do this. I mean, you surely don’t want to have your localization team go through the full list of strings every time you release a new version. Even if you decide you don’t need the ability to detect strings that have changed (you could simply have a policy of using a new id when such a change is necessary), you still need tools to merge changes in your main strings.xml file into each language’s XML resource, adding new strings and dropping removed ones (do any such tools exist?).
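The id-based merge described above could be sketched roughly like this in Python. The function name `sync_translation` is made up for illustration, and the sketch only looks at plain `<string name="...">` elements; it ignores inline markup, string arrays and everything else a real resource file may contain.

```python
import xml.etree.ElementTree as ET


def _load_strings(xml_text):
    """Map each <string name="..."> to its plain text content."""
    root = ET.fromstring(xml_text)
    return {e.get("name"): e.text or "" for e in root.iter("string")}


def sync_translation(main_xml, translated_xml):
    """Compare a translated strings.xml against the main one, by id.

    Returns (kept, missing, removed):
      kept:    id -> translation, for ids still present in the main file
      missing: ids in the main file that have no translation yet
      removed: ids in the translation that the main file no longer has
    """
    main = _load_strings(main_xml)
    trans = _load_strings(translated_xml)
    kept = {name: trans[name] for name in main if name in trans}
    missing = [name for name in main if name not in trans]
    removed = [name for name in trans if name not in main]
    return kept, missing, removed
```

Note that, unlike the gettext merge, this has no way to tell that the English text behind an existing id has changed, which is exactly the limitation mentioned above.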

I suppose you could also have your translators work off a diff, but that seems inconvenient. There’s this huge ecosystem around gettext, with all kinds of desktop and web apps that could be utilized.

Google seems to use something internally, because Android’s own string resources are marked with msgid= attributes.

So, I decided the best way for me to deal with this would be to simply convert Android’s XML resources to gettext, do the translations, then import the result back to Android. I found out that the OpenIntents project was doing the same, essentially using a generic xml2po tool found somewhere in the depths of gnome-doc-utils. I kinda got it to work, but ran into a lot of little issues; in the end it felt just too hacky.
The final thing that convinced me that writing a special purpose tool might be worth my while was the fact that Android’s XML resource format has a bunch of different escaping rules and peculiarities (which I plan to write a separate post on), which translators shouldn’t really have to deal with.
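To give an idea of what hiding those escaping rules from translators means, here is a small Python sketch. It covers only a simplified subset of the rules (backslash escapes for quotes, newlines, tabs and a leading @ or ?), it is not the tool's actual implementation, and the function names are hypothetical. Whitespace collapsing and double-quoted strings are deliberately left out, and dropping the backslash of an unknown escape is an assumption.

```python
# Backslash escapes handled by this sketch (a simplified subset only).
_ESCAPES = {"n": "\n", "t": "\t", "'": "'", '"': '"',
            "\\": "\\", "@": "@", "?": "?"}


def unescape_android(value):
    """Turn the raw text of a string resource into what a translator should see."""
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            # Assumption: for an unknown escape, just drop the backslash.
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def escape_android(text):
    """Inverse direction: prepare a translated string for the XML resource."""
    text = text.replace("\\", "\\\\")  # escape backslashes first
    text = text.replace("'", "\\'").replace('"', '\\"')
    text = text.replace("\n", "\\n").replace("\t", "\\t")
    if text[:1] in ("@", "?"):
        # A leading @ or ? would otherwise be read as a reference.
        text = "\\" + text
    return text
```

The point is that the .po file would contain the unescaped form, and the escaping is reapplied only when importing back into the XML resources.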

There’s also a README file which explains the basic usage, which is really just a2po init, a2po export and a2po import calls, though at this point there are also various configuration options that should make it really quite flexible.

The biggest thing it doesn’t support yet is the <plurals> tag, mainly because I haven’t needed it myself yet. Apart from that, I do believe it should work just fine for most projects.

First, the simple “string is the database ID” approach is, unfortunately, ambiguous. There are numerous cases where the same English string needs to have different translations depending on its context (e.g. a dialog title, “Search”, might need to be translated in a different way than a button titled “Search”). So, the “practical” approach leads to problems sooner or later.

Second, regarding “going through the full list of strings” every time something changes. Professional translators use professional tools. Look up CAT – Computer Aided Translation. The tool not only automatically translates whatever has been translated before, it also suggests similar translations from the past for new strings. So, if a single word has been changed in a string, the tool offers the previous version of that string and identifies the differences in the source. In addition, it helps with a bunch of other things, e.g. terminology, consistency, spell checking, typos in numbers, …

Joe, gettext does support different contexts for the same string, via a mechanism called, actually, “message context”; it also has fuzzy support to recognize strings that only changed partly; and obviously tools are free to support things like translation memories.

I am of course aware that progressive updates of translations are possible in a resource key based approach as well; to my knowledge, no such tool existed for Android translations back then (now there are various web tools).

While obviously not making a difference in case of Android, as a programmer, I still prefer the convenience of not having to come up with a resource key whenever I’m defining a new string.

You can do great translations of Android apps with the help of this online software: https://poeditor.com/. Translators can work with strings projects in as many languages as they wish and they can also use the automatic translation function which works with Google and Bing.