Posts Tagged ‘non-ASCII email’

With the IDN (Internationalized Domain Names) work bringing characters beyond ASCII to domain names, it is only natural to tackle the problem of internationalized Internet email.

Some smart people have been working in an IETF working group to figure out how non-ASCII email would work, and I encourage people to take a look: http://www.ietf.org/html.charters/eai-charter.html. That page has the charter, a list of the drafts and RFCs that have already been produced, and links to the IMA working group mailing list.

Assuming you’re an ASCII/Latin character user, imagine having to type all your URLs in Chinese or Cyrillic (or, if you know those, imagine typing everything in Klingon). In many cultures, that’s what it’s like to use the web. Some users may not be literate in Latin letters, or may have to do a lot of hunt-and-peck typing. EAI should help address that problem.

How EAI/IMA Works

The basic idea of the EAI working group is to put email in UTF-8 instead of ASCII. UTF-8 works pretty well in many systems, and many mailers already handle 8-bit encodings, so this is a pretty “simple” solution. Unfortunately email touches a lot of places, so there are a lot of protocols that need updates (e.g. SMTP, POP, mailto:, etc.). Additionally, everyone knows that UTF-8 email can’t happen instantly, so there needs to be a way for existing servers to talk to UTF-8 aware ones, which leads to a few more RFCs.
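To make the UTF-8 point concrete, here is a minimal Python sketch. The sample addresses and the `describe` helper are made up for illustration and are not taken from any EAI draft. It shows that an ASCII address is byte-for-byte unchanged in UTF-8, while a non-ASCII one produces bytes above 0x7F, which is why every hop needs to be 8-bit clean.

```python
# Hypothetical sample addresses, chosen only for illustration.
ascii_address = "user@example.com"
utf8_address = "用户@example.com"


def describe(address: str) -> dict:
    """Summarize how an address looks once it is encoded as UTF-8."""
    encoded = address.encode("utf-8")
    return {
        "characters": len(address),
        "bytes": len(encoded),
        # Any byte >= 0x80 cannot travel over a 7-bit ASCII-only path.
        "needs_8bit": any(b >= 0x80 for b in encoded),
    }


# ASCII text is already valid UTF-8, so old addresses keep working as-is.
print(ascii_address.encode("utf-8") == ascii_address.encode("ascii"))
print(describe(ascii_address))
print(describe(utf8_address))
```

The two Chinese characters take three bytes each in UTF-8, so character counts and byte counts stop matching as soon as an address leaves ASCII.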

UTF8SMTP lets each server make its own decisions about the “local” part of the email address, which allows groups to fit their own needs. Backwards compatibility means that users also need ASCII addresses, as they do today. The server would alias from one address to another, so mail to @microsoft.com could map to my normal mailbox, and I’d only have one mailbox. Unfortunately that simple concept means that places that didn’t have to worry about aliasing before may now have to consider aliases and fallback addresses. Contact lists may need to have both forms, etc.
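The aliasing idea can be sketched in a few lines of Python. Everything here is hypothetical: the `ALIASES` table, the `Contact` type, the function names and the sample addresses are invented for illustration, and the sketch says nothing about how UTF8SMTP servers actually negotiate or downgrade. It only shows the bookkeeping: two addresses, one mailbox, and an ASCII fallback for peers that are not UTF-8 aware.

```python
from dataclasses import dataclass

# Hypothetical alias table: the UTF-8 address points at the ASCII mailbox.
ALIASES = {
    "用户@example.com": "user@example.com",
}


@dataclass(frozen=True)
class Contact:
    """A contact entry that carries both forms of the same person's address."""
    utf8_address: str
    ascii_address: str


def resolve_mailbox(address: str) -> str:
    """Return the mailbox an address delivers to (itself if it has no alias)."""
    return ALIASES.get(address, address)


def address_for_peer(contact: Contact, peer_supports_utf8: bool) -> str:
    """Pick the form of the address that the receiving side can handle."""
    return contact.utf8_address if peer_supports_utf8 else contact.ascii_address


contact = Contact(utf8_address="用户@example.com", ascii_address="user@example.com")

# Both forms land in the same mailbox.
print(resolve_mailbox(contact.utf8_address) == resolve_mailbox(contact.ascii_address))
# An ASCII-only peer gets the fallback address.
print(address_for_peer(contact, peer_supports_utf8=False))
```

Even this toy version shows where the extra work comes from: the contact record needs two fields instead of one, and something has to know which peers can take the UTF-8 form.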

Current Status of EAI/IMA

Currently there are several experimental RFCs, and several people have created interoperating systems that work with each other to demonstrate the feasibility of UTF8SMTP. The next step is to move towards a standards track process, which could happen “reasonably quickly”. I’m optimistic that the standards will move quickly, but sometimes these things take a while.

So Who’s Gonna Use It?

There are a lot of markets where ASCII doesn’t work very well for various reasons. Even when people have ASCII aliases, those may seem artificial, and there may be a desire for an email address that reflects them or their country. There are many ISPs in countries like Korea, China, and Japan that are very eager to be able to send email in a native script. Some governments, such as those of Russia and China, are weighing in on the importance of being able to send mail and use the Internet in their own script.

What’s IMA Mean To Me As a Software Developer? (who cares?)

If you are a developer, then you may run into IMA addresses. Even if your app doesn’t explicitly deal with mail, there may be a place for email to sneak into your app. For example, IDN and domain names don’t really have much to do with Word or PowerPoint, yet they often show up in documents and presentations. I could imagine an author address in metadata, such as a photographer contact in a photo’s metadata. Many apps probably will run into IMA addresses whether they realize it or not.
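As a starting point for checking your own app, here is a rough Python sketch that looks at an address string and reports which parts fall outside ASCII. The `inspect_address` helper and the sample addresses are hypothetical, and this is not a validator. The domain conversion uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules, so treat its output as illustrative only.

```python
def inspect_address(address: str) -> dict:
    """Report which parts of an address are non-ASCII (illustrative only)."""
    # Split on the last "@": everything before it is the local part.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return {
        "local_is_ascii": local.isascii(),
        "domain_is_ascii": domain.isascii(),
        # IDN domains have an ASCII ("xn--") form; local parts do not.
        "domain_ascii_form": domain.encode("idna").decode("ascii"),
    }


print(inspect_address("user@example.com"))
print(inspect_address("用户@bücher.example"))
```

The asymmetry is the part worth noticing: a non-ASCII domain can be converted to an ASCII form, but a non-ASCII local part cannot, so any code path that assumes ASCII-only addresses will need a real decision about what to do with them.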

Anyway, I have been thinking about this space for a while and thought I’d share my observations. It’s worth considering what impact IMA will have on your application (and while you’re at it, how does it behave with IDN?).