I'm sure many of you have already heard this news, but I'm posting it for those who haven't. The inevitable is coming... URLs that no longer require Latin-based characters. "When?" Sometime in the middle of next year, URLs in other scripts will become available, and Japanese will be one of them. This technology isn't new; it has been in the works for several years. "...The English dominance over the internet is about to come to an end..."

血まみれ剣術師 wrote:I'm sure many of you have already heard this news, but I'm posting it for those who haven't. The inevitable is coming... URLs that no longer require Latin-based characters. "When?" Sometime in the middle of next year, URLs in other scripts will become available, and Japanese will be one of them. This technology isn't new; it has been in the works for several years. "...The English dominance over the internet is about to come to an end..."

I'll say it's not new. I wrote a paper mentioning it over 5 years ago. The .jp registry has allowed domain names in Japanese for about 6 years.

In fact non-Latin URLs have been MUCH slower taking off than people expected. A lot of that has been the fault of browser developers, who haven't been that friendly in their implementations. Another problem is that once you have a domain name in, e.g., Chinese, it's very difficult for anyone outside China to access it unless they know how to key in Chinese. The experience in Japan was that a lot of companies, etc. dashed in and got Japanese domains, and then on reflection decided not to use them.

Let's not confuse URLs with domain names. URLs with Japanese characters are already available and probably have been for quite a long time (though in some browsers, the characters will be converted to unreadable numeric codes after you actually enter the URL). The Japanese Wikipedia, for example, uses Japanese characters in all of its article titles and therefore the URLs of all of its articles.
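To make the "unreadable numeric codes" concrete, here is a minimal Python sketch of the percent-encoding a browser applies to a Japanese path. The article title 日本語 is just an example I picked for illustration, not something from the posts above.

```python
from urllib.parse import quote, unquote

# Example article title, chosen only for illustration.
title = "日本語"

# Percent-encode the UTF-8 bytes of the title; each byte becomes %XX.
encoded = quote(title)
url = "http://ja.wikipedia.org/wiki/" + encoded

print(url)               # the form some browsers show after you press Enter
print(unquote(encoded))  # decoding gives back the readable title
```

The domain name part is untouched here; only the path after the host is encoded, which is why this already works without any change to the domain name system.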

Allowing non-Latin characters in domain names opens a can of worms beyond just accessibility, because some non-Latin characters have the same letterforms as Latin characters, allowing a form of domain name spoofing where a domain name looks the same as another domain name. For example, Greek capital letter alpha looks exactly the same as the capital letter A, but is encoded differently since it belongs to a different alphabet. I think some browsers already have defense mechanisms against this, but I still don't like giving potential ammunition to scammers for little practical advantage.
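Here is a rough Python sketch of the alpha/A problem. The first part only shows that the two letters are different code points. The `scripts_used` and `looks_mixed` helpers are hypothetical names of my own, and the check is a crude heuristic based on Unicode character names; I am not claiming this is how any browser actually defends against spoofing.

```python
import unicodedata

latin_a = "A"            # U+0041
greek_alpha = "\u0391"   # U+0391, renders like a Latin capital A

# Same glyph on screen, different code points and different names.
for ch in (latin_a, greek_alpha):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

print(latin_a == greek_alpha)  # False: the strings do not compare equal


def scripts_used(label):
    """Collect the first word of each letter's Unicode name (e.g. LATIN, GREEK)."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def looks_mixed(label):
    """Crude heuristic: flag a label whose letters come from more than one script."""
    return len(scripts_used(label)) > 1


# Hypothetical labels for illustration only.
print(looks_mixed("PAYPAL"))        # all Latin letters
print(looks_mixed("P\u0391YPAL"))   # second letter is Greek alpha
```

Note that a check this simple would also flag perfectly legitimate Japanese names that mix kana and kanji, so a real defense needs more care than this.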

My guess is that most people the world over are perfectly comfortable with Latin domain names as it is. It'd probably suck if you're in, say, Greece or Russia, though, since unlike China (which has pinyin) and Japan (which has romaji), I think they simply don't use the Latin alphabet in their day-to-day lives.

- Kef

Last edited by furrykef on Fri 10.30.2009 10:35 am, edited 1 time in total.

jimbreen wrote:I'll say it's not new. I wrote a paper mentioning it over 5 years ago. The .jp registry has allowed domain names in Japanese for about 6 years.

In fact non-Latin URLs have been MUCH slower taking off than people expected. A lot of that has been the fault of browser developers, who haven't been that friendly in their implementations. Another problem is that once you have a domain name in, e.g., Chinese, it's very difficult for anyone outside China to access it unless they know how to key in Chinese. The experience in Japan was that a lot of companies, etc. dashed in and got Japanese domains, and then on reflection decided not to use them.

Jim

Do you know of any examples, Jim? I'd be curious to try some out.

We were discussing this in the chat today... and I got the feeling that such domain names really wouldn't be used outside of that specific country or at least language group, thus really limiting their use and appeal.

I can see some countries doing it for nationalistic purposes (China, North Korea spring to mind, but I'm sure many others would do it for the same reason)... but the reality seems to be that this would only limit exposure and splinter the global nature of the web with a Babel effect.

Like you said, most people outside of a given country or language group aren't going to know how to key in different alphabets, syllabaries, logographies, etc, and thus wouldn't touch those websites.

(Perhaps they would simply be best used as aliases to existing sites, so that you could reach the site by different domain names in different languages, but would still get the same content.. or perhaps you might get your own localized content by going to a website by using your local version of the website's name... it will be interesting to see if/when it takes off.)

Looks like you already had an article up that touched on this as well... just stumbled across it while searching for some other examples. (unfortunately it seems most of the examples in this don't work either)

On a side note, this raises another issue here on the forum... the URL code within the forum software doesn't handle Japanese script... so you can't make actual links out of either domain names such as http://えび田.jp/ or even the Japanese Wikipedia links without first encoding them...
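For the domain name case, "encoding" means converting to the ASCII "xn--" form. A minimal Python sketch, assuming the standard library's built-in `idna` codec (which implements the older IDNA 2003 rules) is good enough for this name:

```python
# Domain taken from the post above.
domain = "えび田.jp"

# The built-in "idna" codec converts each non-ASCII label to its
# ASCII-compatible "xn--" (Punycode) form and leaves "jp" alone.
ascii_domain = domain.encode("idna").decode("ascii")
link = f"http://{ascii_domain}/"
print(link)

# Decoding the ASCII form gives back the Japanese name.
print(ascii_domain.encode("ascii").decode("idna"))
```

The ASCII form is what actually gets looked up in DNS, so a link written that way should get past software that only accepts ASCII URLs.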

phreadom wrote:We were discussing this in the chat today... and I got the feeling that such domain names really wouldn't be used outside of that specific country or at least language group, thus really limiting their use and appeal.

I can see some countries doing it for nationalistic purposes (China, North Korea spring to mind, but I'm sure many others would do it for the same reason)... but the reality seems to be that this would only limit exposure and splinter the global nature of the web with a Babel effect.

Marshall Unger had an interesting view, expressing a certain irony in the situation: "Even today, the vast majority of those who use Japanese script on computers input data in romanization; to that extent, even though they may refuse to read data in romanized form, they already, in a psychologically fundamental way, make use of an alphabetic representation of Japanese words and phrases."

I can see how it would be very helpful in Japan, because there are so many different ways to Romanize Japanese.

If you advertise your site as "しゅうしょく ドット コム", people wouldn't know if it's shuushoku.com, shyuushyoku.com, shūshoku.com, or something else. Shi/si, tsu/tu, zu/du would probably create the most problems. And if you have to explain how to romanize it, the romanized name has no real advantage over something easy to remember like 就職.com (which is now [er, has been] a valid URL).

I don't think the accessibility argument is that big of an issue. It's generally understood that Japanese URLs would only be used for sites that are targeted specifically at Japanese natives (i.e. the people with the hardest time remembering non-Japanese spelling!), and if an international site were needed, a second domain could/would be set up (kind of like the companies that register both a .co.jp and a .com name).

I do think the spoofing problems Furry mentioned would be a problem though... I hadn't thought of that.

hyperconjugated wrote:Marshall Unger had an interesting view, expressing a certain irony in the situation: "Even today, the vast majority of those who use Japanese script on computers input data in romanization; to that extent, even though they may refuse to read data in romanized form, they already, in a psychologically fundamental way, make use of an alphabetic representation of Japanese words and phrases."

Apply a lot of salt to Jim Unger's outpourings, especially on romanization.

You're somewhat correct on the URLs, but the URLs to be released are entirely in foreign characters. The major difference is that you can do everything without Latin characters. That's what makes this slightly different. People will no longer need Latin characters to use email or surf the web. Many thought this idea was impossible, but it's going to happen next year.

Not really. A person will probably have the option of typing all three ways sooner or later. Completely localizing the characters will help the rest of the world connect. I highly doubt that minor obstacle will become a barrier.

Examples (remember that the actual TLDs have not been released yet):
mofa.go.jp
外務省.go.jp
外務省.試験.日本
がいむしょう.しけん.にほん (probably)

hyperconjugated wrote:

phreadom wrote:We were discussing this in the chat today... and I got the feeling that such domain names really wouldn't be used outside of that specific country or at least language group, thus really limiting their use and appeal.

I can see some countries doing it for nationalistic purposes (China, North Korea spring to mind, but I'm sure many others would do it for the same reason)... but the reality seems to be that this would only limit exposure and splinter the global nature of the web with a Babel effect.

Marshall Unger had an interesting view, expressing a certain irony in the situation: "Even today, the vast majority of those who use Japanese script on computers input data in romanization; to that extent, even though they may refuse to read data in romanized form, they already, in a psychologically fundamental way, make use of an alphabetic representation of Japanese words and phrases."