UTF-16 is an encoding that defines how code points (what you call characters) map to bytes, just like UTF-8. UTF-16 encodes data in 16-bit units, while UTF-8 uses 8-bit units. UCS-2 was an encoding where only the first 2^16 code points could be encoded, in the same way that ASCII can only express the first 2^7 code points and ISO Latin-1 only the first 2^8. UCS-2 was an attempt to encode the "most common case" as you describe it. The problem is that, in order to achieve this, Chinese and Japanese characters were crammed together (look up Han unification) and were basically not usable. We are talking about around 1.5 billion people here. The fix was to add back the characters that had been removed, and go above the FFFF line.
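To make the "above the FFFF line" part concrete, here is a minimal Python sketch of how UTF-16 splits a code point beyond U+FFFF into two 16-bit units (a surrogate pair). The function name utf16_units is made up for illustration; the result is cross-checked against Python's built-in codec.

```python
def utf16_units(ch):
    """Return the list of 16-bit code units UTF-16 uses for one character."""
    cp = ord(ch)
    if cp < 0x10000:
        # Fits in a single unit: this is the range UCS-2 could express.
        return [cp]
    # Above FFFF: spread the remaining 20 bits over a surrogate pair.
    cp -= 0x10000
    return [0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]


for ch in ("A", "\u4e2d", "\U0001F600"):
    units = utf16_units(ch)
    # Cross-check against the built-in big-endian UTF-16 codec.
    assert ch.encode("utf-16-be") == b"".join(u.to_bytes(2, "big") for u in units)
    utf8_len = len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X}: UTF-16 units {[hex(u) for u in units]}, UTF-8 bytes: {utf8_len}")
```

The smiley needs two units in UTF-16 and four bytes in UTF-8; the other two characters fit in one UTF-16 unit.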

As to why we need trading cards and smileys in Unicode, the reason is pretty simple: compatibility. The goal is to be able to convert all existing text data into Unicode; this is why DOS-era block-drawing characters are defined as code points. Emoji were added for compatibility with Japanese systems, so that companies like Apple could enter that market with the iPhone. Without them, iPhone users would not have been able to exchange messages with other users.

Remember that at one point in time, ASCII was the extended character set with unnecessary symbols like curly braces; this is why C++ compilers still have trigraph support.
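For anyone who has never met them: trigraphs are three-character sequences starting with ?? that stand in for characters missing from some national character sets. A rough Python sketch of the substitution step (the function name is made up for illustration, and it ignores everything else a real C or C++ preprocessor does):

```python
import re

# The nine trigraphs defined by the C standard, keyed by their third character.
TRIGRAPHS = {
    "=": "#", "(": "[", ")": "]", "<": "{", ">": "}",
    "/": "\\", "'": "^", "!": "|", "-": "~",
}

TRIGRAPH_RE = re.compile(r"\?\?([=()<>/'!\-])")


def replace_trigraphs(source):
    """Replace every trigraph in `source` with the character it stands for."""
    return TRIGRAPH_RE.sub(lambda m: TRIGRAPHS[m.group(1)], source)


print(replace_trigraphs("??=include <stdio.h>"))
print(replace_trigraphs("int main(void) ??< return 0; ??>"))
```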

"For example when faced with the decision to crash into a pedestrian or another vehicle carrying a family, it would be a challenge for a self-driving car to follow the same moral reasoning a human would in the situation."

No, a self-driving car shouldn't get into that situation in the first place. The right thing to do here is to anticipate events and slow down. Self-driving cars have a huge advantage here, in that they don't get tired or lose attention over time.

snowtigger writes: The New York Times reports that on Wednesday, a federal judge unsealed documents in the case (covered here), allowing the tech entrepreneur to speak candidly for the first time about his experiences. Among other things, a court order required him to provide the F.B.I. with “technical assistance,” which agents told him meant handing over the private encryption keys, technically called SSL certificates, that unlock communications for all users.

SSDs are slow in that they rely on old-school disk protocols like SATA. Sure, you'll get better performance than a spinning disk. But if you want screaming fast performance, you should look at flash devices connected through the PCIe bus.

Products from Fusion IO would be an example of this. Apple Mac Pro would be another: "Up to 2.5 times faster than the fastest SATA-based solid-state drive".

I wired my house with cat5e cables, thinking it would future-proof the house. In hindsight, I would have chosen cat6.

10G may not work even if you've chosen the right type of cable, as 10G is much pickier about the terminations. So you can always try and, if it doesn't work well, go for prefabricated cables for the 10G connections.

If you want to play with fast (10G+) networking at home, the smart way is to buy InfiniBand gear on eBay. There's quite a supply from compute clusters being torn down. Older SDR (10G) cards run $30-50, DDR (20G) a bit more, and QDR (40G) a few hundred per card. Buy a cheap copper cable for cross connect and you're done. Or get preterminated fiber cables if you need distance; the cards usually handle that too. Some cards also handle 10G and 40G Ethernet. Need a switch? 36-port QDR switches typically go for $1000. That's 1.4 Tbps worth of bandwidth.
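For the curious, the switch figure is just ports times nominal link rate. A back-of-the-envelope sketch in Python, using only the numbers quoted above (nominal rates, counted in one direction):

```python
PORTS = 36               # ports on the QDR switch
QDR_GBPS = 40            # nominal QDR link rate per port
SWITCH_PRICE_USD = 1000  # typical second-hand price quoted above

total_gbps = PORTS * QDR_GBPS
print(f"{total_gbps} Gbps = {total_gbps / 1000:.2f} Tbps")   # 1440 Gbps = 1.44 Tbps
print(f"${SWITCH_PRICE_USD / PORTS:.2f} per 40G port")       # $27.78 per 40G port
```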

I bought a couple of Mellanox cards that do both 40G Ethernet and FDR (56G) InfiniBand. Between my two Linux servers, I get about 37 Gbps when using 2+ TCP connections. While bandwidth is about the same, InfiniBand latency is about half that of Ethernet, so I run IP over InfiniBand.

Apart from being fun (this is Slashdot after all), why would you want this? Because it removes the network as a bottleneck and changes the way I think about resources. File transfers are limited by disk performance, there's never network congestion, etc. The only thing that could saturate the link would be memory-to-memory copying (think VM migrations). Either way, it will be a long time before I worry about network performance again...

Security researchers Biondi and Desclaux have speculated that Skype may have a back door, since Skype sends traffic even when it is turned off and because Skype has taken extreme measures to obfuscate their traffic and functioning of their program.[26] Several media sources have reported that at a meeting about the "Lawful interception of IP based services" held on 25 June 2008, high-ranking but not named officials at the Austrian interior ministry said that they could listen in on Skype conversations without problems. Austrian public broadcasting service ORF, citing minutes from the meeting, have reported that "the Austrian police are able to listen in on Skype connections".[27][28] Skype declined to comment on the reports.[29]

snowtigger writes: Google’s Public DNS service, behind the well-known 8.8.8.8 and 8.8.4.4 IP addresses, now supports DNSSEC validation. Previously, the service accepted and forwarded DNSSEC-formatted messages but did not perform validation.

Effective deployment of DNSSEC requires action from both DNS resolvers and authoritative name servers. Resolvers, especially those of ISPs and other public resolvers, need to start validating DNS responses. Meanwhile, domain owners have to sign their domains. Today, about 1/3 of top-level domains have been signed, but most second-level domains remain unsigned. From the daily 130 billion DNS queries the service receives, only 7% of queries from the client side are DNSSEC-enabled (about 3% requesting validation and 4% requesting DNSSEC data but no validation) and about 1% of DNS responses from the name server side are signed.
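For readers wondering what a "DNSSEC-enabled" query looks like on the wire, here is a rough Python sketch that builds (but does not send) a DNS query carrying an EDNS0 OPT record with the DO ("DNSSEC OK") bit set, and optionally the CD ("checking disabled") header flag. The function name, the query ID and the 1232-byte UDP payload size are illustrative choices. Mapping "requesting DNSSEC data but no validation" to DO plus CD is my reading of the statistics above, not something the announcement spells out.

```python
import struct


def build_query(name, qtype=1, want_dnssec=True, checking_disabled=False, query_id=0x1234):
    """Build a raw DNS query packet; qtype 1 is an A record."""
    flags = 0x0100                      # RD: recursion desired
    if checking_disabled:
        flags |= 0x0010                 # CD: ask the resolver to skip validation
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (the OPT record)
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 1)

    # Question: length-prefixed labels, terminated by a zero byte, then type and class IN.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)

    # EDNS0 OPT pseudo-record: root name, type 41, class = UDP payload size,
    # TTL field = extended RCODE, version and flags (DO is the top flag bit), no RDATA.
    opt_ttl = 0x00008000 if want_dnssec else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, 1232, opt_ttl, 0)
    return header + question + opt


packet = build_query("example.com")
print(len(packet), packet.hex())
```

Sending the packet over UDP to port 53 of a resolver such as 8.8.8.8 is left out on purpose, so the sketch has no network dependency.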

It is not really one code per country. ISBN started with one code per language zone and switched to countries when they realised it could not scale. So codes 978-0 and 978-1 are for English (this includes the mysterious lands of the United Kingdom and Australia), 978-2 is for French, as is 979-10, and 978-3 is for German; the following 978- prefixes are assigned to various countries. Note that the code is not assigned by the language of the book, but by the dominant language of the country / publisher. So a Swiss publisher can have a 978-2 book in English.
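A small Python sketch of that prefix lookup, using only the prefixes listed above (the real registry is far larger). The function name is made up, and the ISBNs in the example are invented strings whose check digits are not validated.

```python
# Prefix -> language zone, limited to the prefixes mentioned in this comment.
ZONES = {
    "9780": "English",
    "9781": "English",
    "9782": "French",
    "9783": "German",
    "97910": "French",
}


def language_zone(isbn):
    """Return the language zone for an ISBN-13, or None if the prefix is not in the table."""
    digits = isbn.replace("-", "").replace(" ", "")
    # Try the longer prefix first so that 979-10 wins over any 4-digit match.
    for length in (5, 4):
        zone = ZONES.get(digits[:length])
        if zone is not None:
            return zone
    return None


print(language_zone("978-2-0000-0000-0"))   # French, whatever language the book is in
print(language_zone("979-10-000-0000-0"))   # French
print(language_zone("978-88-000-0000-0"))   # None: a per-country prefix not in this table
```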

If prices of ISBN codes were really a problem, people could just publish in France, where ISBNs are free. Anyway, nowadays ISBNs are just a particular class of GTIN/EAN, so I suspect one could just buy an EAN (UPC) code.

If we hit the reset button, can we also fix ASCII? It is by no means the minimal set most English speakers think it is.

Why do we need a character to represent two 'v' one after the other? You could write 'w' with two 'v' and handle the ligature where it should be handled, at display time. There are so few words in English with the sequence vv that it makes no sense to have the special case coded in the encoding.

Also, could we handle the dots on the characters 'i' and 'j' like the diacriticals they are? There should first be the dotless 'i' and 'j', and then some character to add the dots, like all other diacriticals. Also move out the currency symbols ($ and £): they can be represented as text (USD and GBP), so there is no point in having silly symbols in there. Also remove BELL (7): having a symbol for a bell (2407) might be bloated, but having one for the sound of a bell is absurd.
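Funnily enough, Unicode already ships the pieces for this. A quick Python sketch using the standard unicodedata module to look them up (the choice of 'é' as the comparison character is mine):

```python
import unicodedata

# Dotless i, dotless j, the combining dot, and the visible symbol for BELL.
for ch in ("\u0131", "\u0237", "\u0307", "\u2407"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# An accented letter decomposes into base letter + combining mark ...
print(repr(unicodedata.decomposition("\u00e9")))   # '0065 0301'
# ... but plain 'i' is atomic: its dot is not treated as a diacritical.
print(repr(unicodedata.decomposition("i")))        # ''

# The BELL control character itself sits at code 7.
print(ord("\a"))                                   # 7
```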

By the way, why do we need different code points for upper and lower case? They are just variants of each other anyway.

Unicode is certainly messy, but plain ASCII is not much better: the most precious 128 code points of UTF-8 are basically wasted to display 32 characters and a bit of punctuation. That is pretty bloated to me; we are just used to it.
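If you want to see how the one-byte range of UTF-8 is actually spent, here is a quick tally in Python. The grouping into categories is my own, built on the standard string module.

```python
import string

ascii_chars = [chr(i) for i in range(128)]

counts = {
    "control": sum(1 for c in ascii_chars if ord(c) < 32 or ord(c) == 127),
    "space": sum(1 for c in ascii_chars if c == " "),
    "letters": sum(1 for c in ascii_chars if c in string.ascii_letters),
    "digits": sum(1 for c in ascii_chars if c in string.digits),
    "punctuation": sum(1 for c in ascii_chars if c in string.punctuation),
}

for category, n in counts.items():
    print(f"{category:12} {n}")
# control 33, space 1, letters 52, digits 10, punctuation 32

# Every code point lands in exactly one category.
assert sum(counts.values()) == 128
```

So a quarter of the range goes to control codes, and the letters are there twice because of case.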