I was recently contacted by the domain holder for
手机应用商店.cn
手机应用商店.com

That is mobileappstore.com in Chinese, which he had just registered ????

How the heck is this going to work???

E.g. I could register the domain for Paypal.com in Hebrew tomorrow, and anyone who uses the Hebrew version of their keyboard would be going to a totally different website that I control, and as long as I'm not phishing there is nothing PayPal can do??

Cheers,
Dean

P.S.
UPDATE: There are much bigger issues than trademark at stake. As you can see from the Facebook conversation below, the phishing problem is worse than you think. It's way worse.

If you cut and paste the following text into your browser bar:

> раyраl.com <

it looks like paypal.com BUT IT'S NOT!!

Some of the letters are Cyrillic and lead to this webpage - http://xn--yl-6kcb1fc.com/
(it's not registered yet - so it will lead to your ISP's default null DNS page).
This is a major screwup by ICANN.
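
For anyone wondering where that xn-- address comes from, here is a minimal Python sketch of the conversion. It uses the standard library's `idna` codec, which implements the older IDNA 2003 rules, so treat it as an approximation of what a given browser does. The Cyrillic letters are written as escapes so the example survives plain-text copy and paste, and the helper name `to_ascii` is just for illustration.

```python
# Cyrillic "er" (U+0440) and "a" (U+0430) stand in for Latin p and a.
LOOKALIKE = "\u0440\u0430y\u0440\u0430l.com"
REAL = "paypal.com"

def to_ascii(domain: str) -> str:
    """Return the ASCII (Punycode) form of a domain, as sent to DNS."""
    return domain.encode("idna").decode("ascii")

print(LOOKALIKE == REAL)    # False: same look, different code points
print(to_ascii(REAL))       # paypal.com (already ASCII, unchanged)
print(to_ascii(LOOKALIKE))  # xn--yl-6kcb1fc.com
```

The two strings look alike on screen, but DNS only ever sees the ASCII form, and that is a completely different name.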

Regards,
Dean Collins

Alastair Bor
Yea and what about > раyраl.com < (I just typed the p and a in Cyrillic characters).

It's a well known issue with the internationalisation of DNS. Wikipedia has a good summary: http://en.wikipedia.org/wiki/IDN_homograph_attack . Personally, I'm not too worried, as it's not particularly secure anyway to rely on DNS registrations for verification of identity. If you want to establish trust for a site, it is better to use a certificate, and even better to combine it with the "Web of Trust" (http://en.wikipedia.org/wiki/Web_of_trust).

I look forward to the (unlikely) day when there is no DNS on the net! That way, people won't be tempted to place their trust in it. For example, Freenet replaces URLs with hashes. I've got memories of reading a Tim Berners-Lee quote, whereby when he invented the WWW he envisaged people clicking on textual hyperlinks and never actually viewing the URL behind it. The trust comes from the digital signature retrieved from the URL, rather than from the URL itself.

Just as a notice, the redirect to http://xn--yl-6kcb1fc.com/ is a fuck-up of your/my browser (I am using Chrome) which just isn't ready for UTF-8 urls yet. But that doesn't change the fact that this is evil.

The fact that this is a surprise to you all is what is really scary. As an example, my bank's website opens a new window the size of the screen without the address bar. Do you see my point? All that is required is that browsers honor IDNs (unlike Firefox, at least) and at the same time unequivocally mark the URL/domain as an IDN. What is required is a culture of responsibility, not some feature that prevents people from understanding what they are doing. English is not everyone's language.

In which way can alternative languages be used without this problem arising? What is required is a convention for marking the language the domain is in and the practice of acknowledging it. Sorry for the harsh tone.
pedro

I totally think that scripts other than Latin should be allowed on the web (actually, until a week ago I didn't realise they couldn't be - I incorrectly assumed Indian domains could always be written in Hindi).

What I'm surprised about is that ICANN hasn't publicised this issue more widely.

I have enough trouble trying to explain phishing emails to my wife as it is; when even I can't tell the difference, this is a bigger problem.

Spread the URL of this blog post so that hopefully someone can explain to us how to solve this problem (an Outlook and browser plugin, maybe?)

@Dean - I got a different take on the information you linked to. What it looks like they are saying, to me anyways, is that they will offer ways to get to the *same* domains that exist now using only international keyboards/characters. For instance, the WSJ post says this:

"The change will allow the suffix -- known as a top-level domain -- to be expressed in about 16 other alphabets."

It doesn't say that new TLDs using international characters will be introduced, but that the current ones can be expressed without the need for a Latin keyboard. It's a translation thing rather than actual new domains.

As for the guy offering you your domain in Chinese, I don't see that as any different from someone offering you your domain in, say, Welsh (symudolcaisstore.com). If you actually go to the address he offered you, it redirects to a domain name that is similar to yours, but not exactly the same.

Re: "It doesn't say that new TLDs using international characters will be introduced, but that the current ones can be expressed without the need for a Latin keyboard. It's a translation thing rather than actual new domains"

I understand what you are saying, but last night I registered hotmail.com with a Cyrillic 'o' on godaddy.com (no, you can't cut and paste this example, as the comments are plain text).

Basically, if I sent you this URL in an email, you would have no way of distinguishing it from one with a regular Latin ‘o’.

(And yes, I understand what you are saying about the Welsh ‘similar translation’ - that was my original concern as well, but now my primary concern is the phishing issue.)

That's old news, and a similar variant was essentially fixed over 4 years ago. The issue a few years back was that a browser URL bar would display a Unicode-encoded paypal.com, but direct the browser to something like "xn--pypal-4ve.com" -- which is the ASCII encoding of the Unicode name.

In any event, that issue -- and this one -- is not an issue with ICANN, but with the browsers and OS.

Also some stuff from Shmoo (a security think tank featuring directors of Apache, PGP, etc.), which first published the paypal example: http://www.shmoo.com/idn/

The Shmoo page contains the IDN (internationalized domain name) advisory papers. The issue dates back to 2001, when "Homograph Attacks" were first identified.

The underlying issue is that homographs look the same, but are not the same, i.e. a Cyrillic c vs an ASCII c.

There have been a number of proposed fixes which haven't been adopted by browsers and OSes, particularly that browsers/OSes should say when a name mixes characters from different scripts, or when a character is in a non-native character set. This was actually part of IETF RFC 3490, which is the base IDN standards RFC. I.e., given A = ASCII and C = Cyrillic:

- warn if CCCCCCC on an ASCII browser
- warn if AAAAAAAA on a Cyrillic browser
- warn if AAAACCCC on any browser
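
Those three rules can be sketched in a few lines of Python. This is only an approximation: the standard library has no real Unicode script property, so `script_of` guesses the script from the first word of the character's Unicode name, and all the function names are made up for illustration. It checks a single label (the part between dots), not a whole domain.

```python
import unicodedata

def script_of(ch: str) -> str:
    """Rough script guess: the first word of the Unicode character name."""
    if ch.isdigit() or ch == "-":
        return "COMMON"  # digits and hyphens are shared by every script
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def label_scripts(label: str) -> set:
    """Scripts used in one domain label, ignoring shared characters."""
    return {script_of(ch) for ch in label} - {"COMMON"}

def should_warn(label: str, browser_script: str) -> bool:
    scripts = label_scripts(label)
    if len(scripts) > 1:
        return True  # AAAACCCC: mixed scripts, warn on any browser
    # CCCCCCC on an ASCII browser, or AAAAAAAA on a Cyrillic browser
    return bool(scripts) and scripts != {browser_script}

fake = "\u0440\u0430y\u0440\u0430l"      # Cyrillic er/a mixed with Latin y, l
print(should_warn(fake, "LATIN"))        # True (mixed)
print(should_warn("paypal", "LATIN"))    # False
print(should_warn("paypal", "CYRILLIC")) # True (non-native script)
```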

The one thing that ICANN did drop the ball on -- and it's kind of unfair saying that, as it would have been very hard to implement equitably -- is that they didn't enforce one of the smarter DNS-level security concepts -- that the possible characters for IDN domain names be locked down by TLD.

In any event, the issue is much less the fault of ICANN than of the browsers and operating systems.

Digital certificates on their own won't fix it. Verisign doesn't care who applies for a certificate. That's why I would combine it with the web of trust. Ultimately the only people you can trust are those who you have a relationship with. The WoT relies on exactly this (friends of friends of friends of...). A dodgy site will get a low trust rating in a WoT, irrespective of what some central "authority" thinks.

PS. Marcel. The URL "http://xn--yl-6kcb1fc.com/" is not an error. Mozilla (and others) intentionally mangle non-Latin URLs to combat precisely the attack Dean has highlighted. What you are looking at is the solution, not an error!
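
To see why showing the raw xn-- form helps, here is a small Python sketch that decodes such a name and lists every non-ASCII character hiding in it. It again relies on the standard library's `idna` codec (IDNA 2003 rules), and `explain` is a hypothetical helper name.

```python
import unicodedata

def explain(ascii_domain: str) -> list:
    """Decode an xn-- domain and name every non-ASCII character in it."""
    decoded = ascii_domain.encode("ascii").decode("idna")
    return [(ch, unicodedata.name(ch, "UNKNOWN"))
            for ch in decoded if ord(ch) > 127]

for ch, name in explain("xn--yl-6kcb1fc.com"):
    print(f"U+{ord(ch):04X} {name}")
# U+0440 CYRILLIC SMALL LETTER ER
# U+0430 CYRILLIC SMALL LETTER A
# U+0440 CYRILLIC SMALL LETTER ER
# U+0430 CYRILLIC SMALL LETTER A
```

A genuine paypal.com returns an empty list, since it contains nothing outside ASCII.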

Yeah, it opens new attack vectors. Practically every non-trivial feature ever added to anything does that. I'm not sure what the solution is, but here are the requirements:

- non-Latin characters must be allowed in domains
- users must have a way to verify the identity of a site/site operator

Some ideas:

- ICANN could disallow mixing char sets in domains
- a landrush period during which Paypal et al could have found and claimed all their sites' homographs would have been nice...
- browsers could alert the user when a domain's char set doesn't match their system char set (somebody else said this already)

Browsers and email clients are already starting to report possible phishing attacks and other scams. The heuristics for detecting a possible homograph attack seem pretty straightforward. Should still be workable.
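
As a rough idea of what such a heuristic might look like, here is a Python sketch that folds lookalike characters back to Latin and compares the result against a watch list. Both the tiny `CONFUSABLES` table and the `PROTECTED` list are hand-picked assumptions for illustration; a real implementation would need a far more complete confusables table.

```python
# Illustrative only: a tiny hand-picked Cyrillic -> Latin lookalike table.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
}
PROTECTED = {"paypal", "hotmail", "twitter"}  # hypothetical watch list

def skeleton(label: str) -> str:
    """Replace each known lookalike with the Latin letter it imitates."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in label.lower())

def looks_like_protected(label: str):
    """Return the protected name this label imitates, or None."""
    skel = skeleton(label)
    if skel in PROTECTED and label.lower() != skel:
        return skel
    return None

print(looks_like_protected("\u0440\u0430y\u0440\u0430l"))  # paypal
print(looks_like_protected("h\u043etmail"))                # hotmail
print(looks_like_protected("paypal"))                      # None (the real one)
```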

Honestly, the biggest problem with this whole thing is the lack of publicity and coordination on the part of ICANN. They should have made a statement saying, "On {DATE > 1 year in future} we'll start allowing non-latin domains to roam freely." This would give companies time to lock down the homographs and browser makers time to implement new security warnings.

Spent several hours with Godaddy level 2 technical support to work out why my registrations for paypal and twitter and Godaddy were not working (where I substituted 1 Cyrillic letter in place of an English letter).

It turns out that Verisign basically already worked out the solution to this - you can't mix letters from multiple languages.

So you can't substitute a Cyrillic 'a' and use English letters for the rest of > paypal <

Very few names can be spelled entirely in Cyrillic letters that match the English ones (lol, ebay is one of them - and it's already registered.....hmmmm).
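
Once mixing is banned, a fake name has to be spelled entirely in lookalikes. A quick Python sketch of that check follows. The `LATIN_TO_CYRILLIC` table is a hand-picked, incomplete assumption (the soft sign for "b" is a loose match at best), so the results only illustrate the idea and are not a statement of what any registry actually accepts.

```python
# Illustrative assumption: Latin letters with a close Cyrillic lookalike.
# Hand-picked and incomplete; "b" -> soft sign (U+044C) is a loose match.
LATIN_TO_CYRILLIC = {
    "a": "\u0430", "b": "\u044c", "c": "\u0441", "e": "\u0435",
    "o": "\u043e", "p": "\u0440", "x": "\u0445", "y": "\u0443",
}

def all_cyrillic_spelling(name: str):
    """Return an all-Cyrillic lookalike of name, or None if a letter lacks one."""
    if all(ch in LATIN_TO_CYRILLIC for ch in name):
        return "".join(LATIN_TO_CYRILLIC[ch] for ch in name)
    return None

for name in ("ebay", "paypal", "twitter", "godaddy"):
    spelled = all_cyrillic_spelling(name)
    print(name, "->", "possible" if spelled else "blocked by the no-mixing rule")
```

With this table only ebay gets through; paypal, twitter and godaddy each contain at least one letter with no lookalike in it.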

....oh and you definitely can't register paypal, not sure where Christina got her 'L' in Cyrillic from :P (http://mashable.com/2010/01/01/idn-phishing/).

So unless anyone has anything to add, I think this issue is a dud.

Considering even L2 support people at the registrar didn't know this was implemented, I don't feel so bad.

"Considering even L2 support people at the registrar didn't know this was implemented i dont feel so bad."

Looks like you found out the hard way. A lot of people are claiming that this is ridiculous, but it has in fact been something ICANN has been working out since 2000. It isn't some "hey, let's get Japanese names out... TOMORROW!" type issue that a lot of people think it is.

There are a lot of security measures in place by ICANN to prevent something like this. For example, the banning of mixed-script domain name registration. That is why you cannot get your "Godaddy" with Cyrillic characters. Hell, they don't even allow certain symbol domains to be registered.

I've spoken to people at ICANN and this is definitely an important issue to them, as it was a concern to me when I first read about this.

The purpose of allowing people to use their languages as domains is to start promoting content build-up and Internet use for many countries.

Imagine countries like Somalia, where they can start using the internet in their own language. I'm pretty sure not many people there speak English since they don't have the resources, but since they speak Arabic, they will be able to start browsing and searching and visiting websites PROPERLY. The example I've used many times is the way Google sounds in Arabic. A friend told me it sounds like "jojol" or something along those lines... so even someone familiar with the language will have a hard time figuring out HOW to spell something like Google. They just don't have letters that match the phonetic sounds of the English letters.

As a native English speaker, having Japanese URLs will not affect me. I already DO NOT visit their Japanese sites nor do I try to read anything in Japanese. Adding this will only benefit them and won't exactly hurt me. In fact, I think this is a great way to start promoting open-mindedness and technological advancements in these types of countries.

Well, I've written a lot and I'm not even sure I made any sense. Hope I've helped someone!