Internationalizing Code

Your programs may need to adapt to different encodings depending on the languages and operating systems of your users.

Note: In widget programming, displaying languages other than English is supported only on Windows.

Terminology

Term

Definition

I18N

The term internationalization is often abbreviated as I18N, where 18 is the number of letters between the initial "i" and the final "n".

Multibyte character sets

Non-Unicode encodings commonly used for non-English languages. An example is the Japanese encoding Shift-JIS.

UTF-8

An encoding of the Unicode character set. A character encoded in UTF-8 is one, two, three, or four bytes long. In this documentation, however, the term multibyte does not refer to UTF-8; it is reserved for non-Unicode encodings such as Shift-JIS. Many web browsers support UTF-8, and many web pages with international characters are displayed using UTF-8.

Wide characters

Unicode characters that can be represented as an unsigned integer value. These can also be described as UTF-16.
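
The distinctions among these terms can be made concrete by encoding a few characters and counting the bytes. The following is a minimal sketch in Python rather than IDL, intended only to illustrate the encodings themselves. The sample characters and the helper names utf8_length and utf16_units are arbitrary choices for this illustration and are not part of IDL.

```python
# Illustration only: Python, not IDL. Sample characters are arbitrary.
samples = [
    "A",           # U+0041, Latin capital A
    "\u00e9",      # U+00E9, Latin small e with acute
    "\u65e5",      # U+65E5, a Japanese kanji
    "\U0001F600",  # U+1F600, an emoji outside the Basic Multilingual Plane
]

def utf8_length(ch):
    """Number of bytes used to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))

def utf16_units(ch):
    """Number of 16-bit code units used to encode one character in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} UTF-8 byte(s), "
          f"{utf16_units(ch)} UTF-16 unit(s)")

# The same character is stored as different bytes in a non-Unicode
# multibyte encoding such as Shift-JIS.
sjis_bytes = "\u65e5".encode("shift_jis")
utf8_bytes = "\u65e5".encode("utf-8")
print("Shift-JIS:", sjis_bytes.hex(), " UTF-8:", utf8_bytes.hex())
```

The four samples occupy one, two, three, and four bytes in UTF-8 respectively, while the kanji occupies two bytes in Shift-JIS and three in UTF-8, which is why a byte sequence cannot be displayed correctly unless its encoding is known.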

How IDL Uses Encodings

Item

Description

Console

Accepts input and displays output in both UTF-8 and the default multibyte encoding for the current system.

Workbench Editor

Uses the UTF-8 encoding by default. The encoding can be changed in the preferences section General > Workspace > Text file encoding.

IDL Graphics PLOT function

Displays strings in both UTF-8 and multibyte encodings.

Object Graphics

Displays strings in both UTF-8 and multibyte encodings.

Widget Programming

Supports strings in the default encoding for the operating system, as described below in I18N Windows Preference Settings. For example, on a Japanese system the default encoding is Shift-JIS, so strings with byte sequences representing Shift-JIS characters can be displayed in IDL Widgets. On the same Japanese system, with Shift-JIS set as the encoding for non-Unicode programs, byte sequences in an encoding for a different language, such as Simplified Chinese, would not display properly.

IDL Widgets cannot properly display strings that contain UTF-8 byte sequences. The I18N_UTF8TOMULTIBYTE routine can convert UTF-8 strings into multibyte strings that IDL Widgets can display.
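
The widget behavior described above comes down to which bytes a string contains and which encoding the system assumes when drawing them. The sketch below uses Python, not IDL, to show the idea. The helper utf8_to_multibyte is a hypothetical stand-in that mimics the kind of conversion I18N_UTF8TOMULTIBYTE is described as performing; it is not that routine's implementation. The label text is an arbitrary example, and GBK is used here as one common Simplified Chinese encoding.

```python
# Illustration only: Python, not IDL. The label text is an arbitrary example.
label = "\u65e5\u672c\u8a9e"  # the word for "Japanese language", in Japanese

utf8_bytes = label.encode("utf-8")      # bytes a UTF-8 source would hold
sjis_bytes = label.encode("shift_jis")  # bytes a Shift-JIS system expects

def utf8_to_multibyte(data, codepage):
    """Hypothetical helper: re-encode UTF-8 bytes into a multibyte encoding."""
    return data.decode("utf-8").encode(codepage)

# Converting the UTF-8 bytes yields exactly the Shift-JIS byte sequence.
converted = utf8_to_multibyte(utf8_bytes, "shift_jis")
print("UTF-8    :", utf8_bytes.hex())
print("Shift-JIS:", converted.hex())

# Reading Shift-JIS bytes as a Simplified Chinese encoding (GBK here)
# produces different characters, which is why such text displays incorrectly.
misread = sjis_bytes.decode("gbk", errors="replace")
print("Same text after misreading:", misread == label)
```

The UTF-8 form is nine bytes and the Shift-JIS form is six, so passing the UTF-8 bytes to a component that expects Shift-JIS cannot produce the intended characters without a conversion step.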

Windows Input Method Manager

The Windows Input Method Manager (IMM) is enabled only on East Asian (Chinese, Japanese, Korean) localized Windows operating systems when an Asian language pack is installed.

Characters entered in the IMM composition window are returned as unsigned integers, each representing a wide character (Unicode value). They are delivered in the Key field of WIDGET_DRAW keyboard events and in the KeyValue argument of WINDOW and WIDGET_WINDOW keyboard handlers. The I18N_WIDECHARTOMULTIBYTE routine can convert these characters to multibyte strings.
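
To show what converting wide characters to a multibyte string involves, here is a minimal sketch in Python, not IDL. The function widechars_to_multibyte and the sample key values are hypothetical; they only mimic the kind of conversion I18N_WIDECHARTOMULTIBYTE is described as performing. The sketch assumes each integer is a complete Unicode code point and does not handle UTF-16 surrogate pairs.

```python
# Illustration only: Python, not IDL. The key values are made-up sample input.
def widechars_to_multibyte(values, codepage="shift_jis"):
    """Hypothetical helper: turn Unicode code point integers into
    a byte string in the given multibyte encoding."""
    text = "".join(chr(v) for v in values)
    return text.encode(codepage)

# Two wide-character values, as a keyboard handler might receive one per event.
key_values = [0x65E5, 0x672C]

result = widechars_to_multibyte(key_values)
print(result.hex())
```

Each unsigned integer maps to one character, and the multibyte result is the byte sequence a Shift-JIS system would expect for the same text.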

I18N Conversion Routines

IDL provides the following routines to convert strings from one encoding to another. See each routine's reference page for more information.