9 Signs You Need Help with Java Internationalization

Global business, global software. The people who designed Java were forward-looking. They oriented this programming language towards international use from the beginning. Eminently portable, Java lends itself well to internationalization and the adaptation of texts, numbers, dates, currencies, and any other culturally dependent dimension. With better overall Java internationalization comes easier localization for each particular language or country.

Internationalization (like usability, testing, security, and more) is best tackled at the beginning of the application design and development process. The later you leave it, the more expensive it becomes to engineer it in afterwards. But with everything that Java has to offer to help developers go global from the start, internationalization should be a snap, right?

Well, yes and no. Once you’re in the Java internationalization groove, preparing your software for all kinds of exotic languages may seem like second nature to you in no time. On the other hand, unjustified assumptions, suboptimal design decisions, and faulty use of Java functionality can all conspire to make internationalization rather trickier. Here are nine warning signs that your Java internationalization could use a helping hand.

1. Assuming Java Internationalization Comes Directly with Portability

While Java can run practically everywhere, from Android apps to mainframes, internationalization still needs time and effort for each application. If you think that formatting and parsing numbers, dates, and currencies, not to mention postcodes, phone numbers, or weights and measures, is straightforward, it may be because you have only seen them in your own language. Consider that even between US English and UK English, a date written as 01/06/2016 means two different things (January 6, 2016 in the US, but June 1, 2016 in the UK). Internationalization does not come automatically with portability.
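The date ambiguity described above can be reproduced in a few lines. The following is a minimal sketch, not a recommended parsing strategy: the class name `AmbiguousDateSketch` and its helper methods are made up for illustration, and the two patterns are assumptions standing in for the US and UK conventions.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

// Hypothetical illustration class; the names are not from the article.
final class AmbiguousDateSketch {
    // Assumed patterns for the two conventions: month first (US), day first (UK).
    private static final DateTimeFormatter US_PATTERN =
            DateTimeFormatter.ofPattern("MM/dd/yyyy", Locale.ROOT);
    private static final DateTimeFormatter UK_PATTERN =
            DateTimeFormatter.ofPattern("dd/MM/yyyy", Locale.ROOT);

    static LocalDate parseAsUs(String text) {
        return LocalDate.parse(text, US_PATTERN);
    }

    static LocalDate parseAsUk(String text) {
        return LocalDate.parse(text, UK_PATTERN);
    }

    // Lets the JDK's locale data choose the short date layout for a locale.
    static String formatFor(LocalDate date, Locale locale) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT)
                .withLocale(locale)
                .format(date);
    }

    public static void main(String[] args) {
        String text = "01/06/2016";
        System.out.println(parseAsUs(text)); // prints 2016-01-06
        System.out.println(parseAsUk(text)); // prints 2016-06-01

        // The exact short formats depend on the locale data shipped with the JDK.
        LocalDate date = LocalDate.of(2016, 1, 6);
        System.out.println(formatFor(date, Locale.US));
        System.out.println(formatFor(date, Locale.UK));
    }
}
```

The same input string yields two different dates, which is why the locale (or an unambiguous format such as ISO 8601) has to travel with the data.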

2. Using Only the Java Default Locale

Java is designed to offer great flexibility in the use of locales, the objects that identify specific combinations of language and region. Locale-sensitive Java classes return values that vary with the locale in use. The locale itself is only an identifier that triggers the change; the methods of the locale-sensitive classes do the actual work of formatting, detecting text elements, and so on.

Localizing a Java application means working with multiple locales. There is also a default locale for Java applications that have not been built to manage locales explicitly. While relying on the default locale may be acceptable for the occasional application, using it systematically and exclusively for all your applications suggests that internationalization has not been given all the thought it deserves.

Similarly, one locale for a given language may not be enough either. French, for example, has European, Swiss, Belgian, and Canadian variants (among others), so one locale per variant of the language will often be needed.
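As a rough illustration of passing a locale explicitly instead of relying on the default, the sketch below formats the same number for several locales, including the French variants mentioned above. The class name `ExplicitLocaleSketch` and the sample value are assumptions made for this example.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical illustration class; the names are not from the article.
final class ExplicitLocaleSketch {
    // The caller chooses the locale; Locale.getDefault() is never consulted.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        double value = 1234.56;
        Locale[] locales = {
            Locale.US,                      // formats as 1,234.56
            Locale.GERMANY,                 // formats as 1.234,56
            Locale.FRANCE,                  // European French
            Locale.CANADA_FRENCH,           // Canadian French
            Locale.forLanguageTag("fr-CH"), // Swiss French
            Locale.forLanguageTag("fr-BE")  // Belgian French
        };
        // Separators for the French variants depend on the JDK's locale data,
        // so no expected output is claimed for them here.
        for (Locale locale : locales) {
            System.out.println(locale.toLanguageTag() + " -> " + formatNumber(value, locale));
        }
    }
}
```

In a real application the locale would typically come from a user preference or request header rather than a hardcoded array.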

3. Using Hardwired Texts and Labels

It may seem easier at the start for developers under pressure to hardcode text strings in just one language, whether those strings are “Welcome”, “Thanks”, “Click Here”, or any other instruction or language-dependent entity. However, the technical debt mounts rapidly. It is also compounded by bug fixes that only work for the hardcoded version and that will have to be retested and perhaps redone when internationalization starts.

Of course, it is possible that your decision to release in only one language is a deliberate marketing choice. After all, English is still the most widely used language on the Internet. Yet even the total of all English speakers now represents only one quarter of all Internet users. If you want access to the other three-quarters, you will have to locate and replace all those hardcoded texts with keys that reference Java properties files with different language contents.

Those keys are now identifiers that must always remain the same in the code and in the corresponding properties files. If translators modify any of those keys in a properties file, your program will no longer be able to retrieve the specific language content (using ResourceBundle.getString, for instance), because it will no longer find the matching key and the lookup will throw a MissingResourceException.
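To make the key-matching problem concrete, here is a small sketch that loads properties content from in-memory strings so it can run without any files. The class `BundleLookupSketch`, the key `greeting.welcome`, and the fallback text are all invented for illustration; a real application would more likely call `ResourceBundle.getBundle` with a base name and a locale.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical illustration class; the names and keys are not from the article.
final class BundleLookupSketch {
    // Parses properties-file content held in a string (stand-in for a real file).
    static ResourceBundle load(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the localized text, or a visible marker when the key is absent.
    static String lookup(ResourceBundle bundle, String key) {
        try {
            return bundle.getString(key);
        } catch (MissingResourceException e) {
            return "[missing: " + key + "]";
        }
    }

    public static void main(String[] args) {
        // Correct translation: the key is untouched, only the value changes.
        ResourceBundle french = load("greeting.welcome=Bienvenue\n");
        System.out.println(lookup(french, "greeting.welcome")); // prints Bienvenue

        // Broken translation: the key itself was translated by mistake.
        ResourceBundle broken = load("greeting.bienvenue=Bienvenue\n");
        System.out.println(lookup(broken, "greeting.welcome")); // prints [missing: greeting.welcome]
    }
}
```

Returning a visible marker is only one possible design; some teams prefer to let the exception propagate so that missing keys fail loudly in testing.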

4. Compound Messages in Your Java Code

As soon as text messages start to grow in length or sophistication, they become more difficult to render properly by plugging in different language content. This is because the order of the words in the message may change. Statements involving currency are a simple example. Where US English may quote a price as “$10” for example, French will express this as “10 $”, assuming the dollar currency is also used. Thus, a text message in English of the form:

String priceQuote = "The price is $" + priceItem.toString();

will not be in the correct format when translated directly into French.

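Compound messages should be avoided, if possible. If there is no other choice and compound messages must be used, string concatenation is not enough: Java offers other techniques to handle them, such as message patterns with placeholders (the java.text.MessageFormat class), so that each translation can put the variable parts where its own grammar requires.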
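One such technique is `java.text.MessageFormat`, which keeps the whole sentence in a single translatable pattern and lets each language position the variable part. The sketch below is only illustrative: the class name `CompoundMessageSketch` and both patterns are assumptions, and in practice the patterns would live in the properties files rather than in the code.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical illustration class; the names and patterns are not from the article.
final class CompoundMessageSketch {
    // Fills the {0}, {1}, ... placeholders of a pattern using the given locale.
    static String render(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        // Each translation decides where the amount goes relative to the symbol.
        String english = "The price is ${0}";
        String french = "Le prix est de {0} $";

        System.out.println(render(english, Locale.US, 10));    // prints The price is $10
        System.out.println(render(french, Locale.FRANCE, 10)); // prints Le prix est de 10 $
    }
}
```

Note that an apostrophe has a special meaning in `MessageFormat` patterns and must be doubled to appear literally, which is a common source of bugs in French and English translations.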

5. Thinking Java Uses Only Your Alphabet

The English alphabet has 26 letters. However, other alphabets may have more or fewer, or may add accented forms of some letters. French, for instance, can use five different versions of the letter “e”: e, é, è, ê, and ë. If your application needs to check that a character is a letter, then testing only against the English alphabet in a French language context would miss four of those five versions of “e”, and your verification could return an incorrect result.

Java offers Character comparison methods based on the Unicode standard to help return correct results, for example:

char ch = 'é';
if (Character.isLetter(ch)) {
    // ch is a letter, whichever alphabet it belongs to
}
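The difference between an English-only check and the Unicode-based check can be shown side by side. This is a minimal sketch with invented names (`LetterCheckSketch`, `isAsciiLetter`); the accented letters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
// Hypothetical illustration class; the names are not from the article.
final class LetterCheckSketch {
    // Naive check that only knows the 26 unaccented English letters.
    static boolean isAsciiLetter(char ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    }

    static long countAsciiLetters(String text) {
        return text.chars().filter(c -> isAsciiLetter((char) c)).count();
    }

    // Unicode-aware check based on Character.isLetter(int).
    static long countUnicodeLetters(String text) {
        return text.codePoints().filter(Character::isLetter).count();
    }

    public static void main(String[] args) {
        // The five French forms of "e": e, e-acute, e-grave, e-circumflex, e-diaeresis.
        String frenchEs = "e\u00E9\u00E8\u00EA\u00EB";
        System.out.println(countAsciiLetters(frenchEs));   // prints 1
        System.out.println(countUnicodeLetters(frenchEs)); // prints 5
    }
}
```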

However, even Unicode may not cover all the bases. Some corner cases and new symbols may have to be handled in other ways, depending on the languages into which you plan to localize. So if you need to take account of next generation Gujarati slang or Leet speak in your Java application, and you are not proficient in either, it might be a good idea to find a native speaker or another expert who is.

A row of question marks on a user’s screen, or some intriguing but undecipherable combination of non-alphanumeric symbols, often means something was lost in translation. This may be the tip of a larger iceberg, where functionality is lost as well as meaningful screen displays. Methods such as indexOf and compareTo in the String class are not internationalized: they compare raw UTF-16 code units rather than applying the rules of any language. For comparison and sorting, the java.text.Collator class is the locale-sensitive alternative to compareTo. For searching, Java offers no drop-in replacement for indexOf, so the answer may be to build your own matching routines on top of the resources Java does provide (such as the CollationElementIterator class).
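The gap between code-unit comparison and locale-sensitive comparison is easy to see when sorting. The sketch below uses `java.text.Collator`; the class name `CollatorSketch`, the helper names, and the word list are assumptions chosen for the example, with accented letters written as Unicode escapes.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; the names are not from the article.
final class CollatorSketch {
    // Sorts with String.compareTo, i.e. by UTF-16 code unit values.
    static List<String> sortByCodeUnit(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Sorts with the collation rules of the given locale.
    static List<String> sortForLocale(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }

    // PRIMARY strength compares base letters only, ignoring accents and case.
    static boolean primaryEqual(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("zebra", "\u00E9cole", "eclair");

        // Code-unit order puts the accented word last: eclair, zebra, ecole (with e-acute).
        System.out.println(sortByCodeUnit(words));

        // French collation order: eclair, ecole (with e-acute), zebra.
        System.out.println(sortForLocale(words, Locale.FRENCH));

        // "cote" and "c\u00F4t\u00E9" share the same base letters.
        System.out.println(primaryEqual("cote", "c\u00F4t\u00E9", Locale.FRENCH)); // prints true
    }
}
```

Whether accent-insensitive matching is appropriate depends on the language and the feature; it is shown here only to illustrate what collation strength controls.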

7. Strings All Over the Place

Sometimes, the problem is not with the characters themselves, but with the way they are laid out on the screen. Translated strings may be shorter or longer than the original versions, leading to positioning problems or other awkwardness. The issue is the absolute (X,Y) coordinates used for positioning the different components of the display. The use of a Java layout manager allows the positioning to be done in a relative way instead, helping to keep components properly located and allowing developers to better handle expanding or shrinking string lengths.

8. Forgetting about the Underlying Operating System

While many would say that the whole point of Java is to free applications from OS-specific considerations, there are still cases where the operating system provides functionality that localized Java apps need in order to work correctly. The rendering of frame titles in a Java program is one example. For this to work properly for different localized versions, the operating system must support Unicode and offer a suitable font for displaying the text in the title. Otherwise, display will be limited to the languages supported by the OS, possibly defaulting simply to English.

9. Search Routines Fail to Return Correct Results

Because Java strings are based on Unicode, the same visible character can be encoded in more than one way, and searches must take this into account. For example, searching for the family name “Schönberg” in a German language file may not work properly if the search is limited to precisely this version of the name, checking only for the “ö” character.

Instead, the search method should also be able to detect:

“ö” and “oe”, as one of these may be substituted for the other

The presence of the 16-bit Unicode value \u00F6, representing ö

The presence of the two 16-bit Unicode values \u006F (the letter o) and \u0308 (the combining diaeresis), together representing ö.
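The three cases listed above can be handled by reducing every name to a common search key before comparing. The following is a sketch under stated assumptions: `NameSearchSketch`, `toSearchKey`, and the German transliteration table are invented for illustration, while `java.text.Normalizer` is the standard JDK class for Unicode normalization.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical illustration class; the names are not from the article.
final class NameSearchSketch {
    // Builds a comparison key: compose characters, lowercase, then fold umlauts.
    static String toSearchKey(String name) {
        // NFC turns "o" + combining diaeresis (\u006F\u0308) into the single \u00F6.
        String composed = Normalizer.normalize(name, Normalizer.Form.NFC);
        // Assumed German convention: umlauts may be written as the vowel plus "e".
        return composed.toLowerCase(Locale.GERMAN)
                .replace("\u00E4", "ae")
                .replace("\u00F6", "oe")
                .replace("\u00FC", "ue")
                .replace("\u00DF", "ss");
    }

    static boolean matches(String candidate, String query) {
        return toSearchKey(candidate).equals(toSearchKey(query));
    }

    public static void main(String[] args) {
        String precomposed = "Sch\u00F6nberg";  // single code unit \u00F6
        String decomposed = "Scho\u0308nberg";  // o followed by combining diaeresis
        String transliterated = "Schoenberg";   // "oe" substituted for the umlaut

        System.out.println(precomposed.equals(decomposed));       // prints false
        System.out.println(matches(precomposed, decomposed));     // prints true
        System.out.println(matches(precomposed, transliterated)); // prints true
    }
}
```

Folding “ö” to “oe” is deliberately lossy, so it can also match names where “oe” was never an umlaut; whether that trade-off is acceptable depends on the search feature.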

Conclusion

Java’s universality and portability may lull you into a false sense of security when it comes to readying that version in Chinese or Brazilian Portuguese. You will still have to make sure your Java internationalization is done properly so that:

The same code can run anywhere in the world, by plugging in the corresponding localized data

Text elements to be localized are stored separately, and are not hardcoded into your Java application

Culture-dependent data is displayed as people from that culture expect to see it

Localization can be done rapidly and efficiently.

On the other hand, Java comes equipped with tools to make your internationalization work easier, especially if you start it when design and coding start. Knowledgeable consultants and an experienced service provider can also help identify and remediate the blind spots that can be harder to see from the inside, making your Java internationalization an all-round success.