Internationalized Domain Names (IDNs) leverage Unicode to display non-Latin scripts, such as Arabic or Chinese, within computer applications. An encoding syntax called Punycode bidirectionally transforms the Unicode needed to represent these scripts into the limited set of ASCII letters, digits, and hyphens that is used for domain names. This essentially reduces the scripts of the world to a form suitable for processing by applications that have no understanding of Unicode. For example, Punycode transforms the newly minted TLD for Saudi Arabia, ‏السعودية, into xn--mgberp4a5d4ar so that it can be processed like any ASCII-based domain name.
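As a rough illustration of that two-way transformation, the sketch below round-trips the ASCII form quoted above through Python's built-in "idna" codec. The helper names to_unicode and to_ascii are hypothetical, and the built-in codec follows the older IDNA 2003 rules, so treat this as a demonstration rather than a production-grade converter.

```python
# Sketch: round-trip the TLD from the text with Python's built-in "idna" codec.
# Assumption: IDNA 2003 behaviour is good enough for illustration purposes.

ASCII_TLD = "xn--mgberp4a5d4ar"  # ASCII form quoted in the text


def to_unicode(name: str) -> str:
    """Decode an ASCII-compatible (xn--) name into its Unicode form."""
    return name.encode("ascii").decode("idna")


def to_ascii(name: str) -> str:
    """Encode a Unicode name into the ASCII form that DNS carries."""
    return name.encode("idna").decode("ascii")


unicode_tld = to_unicode(ASCII_TLD)

# Print code points instead of glyphs so the output is terminal-independent.
print([f"U+{ord(ch):04X}" for ch in unicode_tld])

# Encoding the Unicode form again yields the original ASCII form.
print(to_ascii(unicode_tld))
```

A name that is already plain ASCII passes through both helpers unchanged, which is why existing domain names are unaffected.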

Punycode has several advantageous characteristics. For example, it encodes the discrete components (labels) of a DNS name individually, making it possible to encode only part of a DNS name. Encoded labels are prefixed with xn--. One such partially encoded DNS name is xn--vckfdb7e3c7hma3m9657c16c.jp which, with one encoded and one unencoded label, represents the Japan Registry Services. This partial encoding has allowed the use of local languages in parts of the world for several years without support for IDNs at the DNS root.
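The label-by-label behavior can be sketched with Python's built-in "punycode" codec. The function names below are hypothetical, and the sketch deliberately skips the normalization and case-folding steps that a full IDNA implementation performs before encoding. The label "bücher" is a commonly used illustrative example and does not come from the text above.

```python
# Sketch: encode and decode a DNS name one label at a time.
# Assumption: input is already normalized; real IDNA processing does more.

PREFIX = "xn--"


def encode_labels(name: str) -> str:
    """Encode only the non-ASCII labels; ASCII labels pass through untouched."""
    out = []
    for label in name.split("."):
        if label.isascii():
            out.append(label)
        else:
            out.append(PREFIX + label.encode("punycode").decode("ascii"))
    return ".".join(out)


def decode_labels(name: str) -> str:
    """Decode only the labels that carry the xn-- prefix."""
    out = []
    for label in name.split("."):
        if label.lower().startswith(PREFIX):
            out.append(label[len(PREFIX):].encode("ascii").decode("punycode"))
        else:
            out.append(label)
    return ".".join(out)


def encoded_flags(name: str) -> list:
    """Report, per label, whether it is Punycode-encoded."""
    return [label.lower().startswith(PREFIX) for label in name.split(".")]


# The partially encoded name from the text: first label encoded, second not.
print(encoded_flags("xn--vckfdb7e3c7hma3m9657c16c.jp"))  # [True, False]

# Round trip with an illustrative label.
print(encode_labels("b\u00fccher.example"))  # xn--bcher-kva.example
```

Because each label is handled independently, a registry can accept encoded second-level labels under an unencoded TLD such as jp.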

Allowing users to connect with one another or online resources without the constraint or burden of Latin characters is certainly a good thing. However, there are security risks to be understood.

What you see may not be what you get. It is possible to represent many of the world’s scripts using Unicode. This makes it possible to present characters from different scripts that appear identical to one another. However, when these characters are compared by a computer they are as different as ‘A’ and ‘z’. This particular risk is not new and already existed within the Latin script where, for example, the digit ‘1’ and the lowercase letter ‘l’ can appear identical in some fonts. Discussion of this topic began in 2002 and ultimately produced demonstrations of two visually similar, albeit fake, domains imitating paypal.com and Microsoft.com.
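The gap between what a person sees and what a computer compares can be shown in a few lines. The sketch below uses a hypothetical spoofed label in which the Latin letter "a" is replaced by the Cyrillic letter U+0430, and flags labels that mix scripts. Using the first word of each character's Unicode name as a stand-in for its script is a rough heuristic assumed here for brevity; it is not the official Unicode Script property.

```python
# Sketch: look-alike characters compare as different, and a crude
# mixed-script check can flag suspicious labels.
import unicodedata

GENUINE = "paypal"
SPOOFED = "p\u0430ypal"  # hypothetical spoof: U+0430 is CYRILLIC SMALL LETTER A


def scripts_in(label: str) -> set:
    """Approximate the scripts in a label from Unicode character names."""
    found = set()
    for ch in label:
        if ch.isalpha():  # ignore digits and hyphens
            name = unicodedata.name(ch, "")
            if name:
                found.add(name.split()[0])  # e.g. "LATIN", "CYRILLIC"
    return found


def is_mixed_script(label: str) -> bool:
    """True when a single label draws letters from more than one script."""
    return len(scripts_in(label)) > 1


print(GENUINE == SPOOFED)               # False
print(sorted(scripts_in(SPOOFED)))      # ['CYRILLIC', 'LATIN']
print(is_mixed_script(GENUINE))         # False
print(is_mixed_script(SPOOFED))         # True
```

A check like this catches only mixed-script labels. A label written entirely in one script that happens to resemble a Latin name would pass, so it is one signal among several rather than a complete defense.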

Foreign languages produce foreign URLs. We have been educating those around us to avoid following links that look like hxxp://tvxwoajfwad.info and to use preview tools when faced with a shortened URL. However, is a URL composed in a foreign language somehow more trustworthy? How can I as an American monoglot discern a legitimate URL in Chinese from the Chinese equivalent of hxxp://tvxwoajfwad.info (one of the pseudo-random domains used by Conficker)? I cannot, and so I must place all IDNs into the category of URLs that I do not trust.

New TLDs create registration activity and opportunity. When new TLDs are deployed there will often be a rush to create desirable domain names within that TLD. It is expected that as the deployment of new TLDs continues, this trend will also continue. It is common for organizations to register domains in multiple TLDs, and this process will become more complex as a result of the disparate scripts involved in the introduction of IDNs. Furthermore, the range of available scripts will present opportunities for phishers looking to capitalize on unclaimed international brands.

Any time there are advancements in technology we should take a minute to understand the associated security risks. Internationalized Domain Names are no different. The mindset that we apply when faced with untrusted URLs and odd URLs from trusted sources should also be applied to IDNs.


Good piece, Tim. While those who are fluent in a Latin-alphabet language may view the introduction of these TLDs as both an operational and a security challenge, those whose language does not use a Latin alphabet have had to address this challenge since the introduction of the internet. From my optic, this is an opportunity to widen the accessibility of the internet, and no doubt utilities and tools will evolve to help those on both sides of the linguistic challenge to successfully cross the chasm.