Saturday, January 07, 2006

UTF-8

So, I have started to use utf-8 as the default character set on my primary laptop.

Why did I change from iso-8859-1 (aka latin 1)?

I don't have much use for the huge number of different characters that I'm able to write with utf-8. Of course it is fun to be able to look at Asian or even Braille characters, but all the languages I know use Western characters, and 8859-1 gives me full coverage for those. (Sweden has not switched to the euro, so I don't have much use for the euro sign (€), and if I did, 8859-15 would have been enough.)
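
To make the coverage point concrete, here is a minimal Python 3 sketch that encodes a few sample characters under the three encodings mentioned above. The helper names (`encode_or_none`, `coverage_table`) and the choice of sample characters are illustrative assumptions, not anything from a particular tool.

```python
# Sample characters, written as escapes so the file itself stays plain ASCII:
#   U+00E5  a with ring above (Swedish)
#   U+00F6  o with diaeresis (Swedish)
#   U+20AC  euro sign
#   U+2813  a Braille pattern
SAMPLES = ["\u00e5", "\u00f6", "\u20ac", "\u2813"]
ENCODINGS = ["iso-8859-1", "iso-8859-15", "utf-8"]


def encode_or_none(char, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent char."""
    try:
        return char.encode(encoding)
    except UnicodeEncodeError:
        return None


def coverage_table(samples=SAMPLES, encodings=ENCODINGS):
    """Map each sample character to its encoded bytes (or None) per encoding."""
    return {c: {e: encode_or_none(c, e) for e in encodings} for c in samples}


if __name__ == "__main__":
    for char, row in coverage_table().items():
        for enc, data in row.items():
            shown = data.hex() if data is not None else "not representable"
            # Print the code point rather than the character itself, so the
            # output is safe even on a terminal that is not using utf-8.
            print(f"U+{ord(char):04X}  {enc:<12} {shown}")
```

The Swedish letters fit in a single byte in both 8859 variants, the euro sign is missing from 8859-1 but present in 8859-15, and only utf-8 can represent the Braille pattern, at the cost of using several bytes per character.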

But I want to contribute to making utf-8 a universal standard character set. Even if it isn't perfect, it actually has the potential to become THE character set. Wouldn't it be nice to never have to care about which encoding is used?

Ever since reading http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt I have appreciated the design of utf-8 (especially the "Criteria for the Transformation Format", which is a must-read). But it took several years before I finally switched. The main reason for the switch was a reinstall of the operating system because of a dying hard drive. Of course, the recode support in the new irssi helped in making the decision too :)

The thing I find most annoying at the moment is that the locale isn't propagated across ssh sessions. So I have to change the .bashrc on every machine I log in to in order to set up the locale correctly (or am I missing something?). Otherwise everything seems to just work.
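
As a rough diagnostic for this, the following Python 3 sketch reports which locale setting governs character encoding in the current session, so it can be run on a remote machine to see whether the .bashrc change took effect. It assumes the usual POSIX precedence (LC_ALL over LC_CTYPE over LANG) and falls back to "C" when nothing is set, which is typical but implementation-defined. The function names are hypothetical helpers, not part of any existing tool.

```python
import os


def effective_ctype(env=None):
    """Return the locale name that governs character encoding.

    Follows the POSIX precedence LC_ALL > LC_CTYPE > LANG. Empty values are
    treated as unset. Falling back to "C" is an assumption that matches
    common systems.
    """
    env = os.environ if env is None else env
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def is_utf8_locale(locale_name):
    """True if the locale name carries a utf-8 codeset, e.g. sv_SE.UTF-8."""
    # The codeset sits after the "." and before an optional "@modifier".
    codeset = locale_name.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"


if __name__ == "__main__":
    name = effective_ctype()
    status = "utf-8" if is_utf8_locale(name) else "not utf-8"
    print(f"{name}: {status}")
```

Running it once locally and once over ssh makes the mismatch visible: the local session might report something like sv_SE.UTF-8 while the remote one still reports a non-utf-8 locale until its .bashrc is fixed.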

Sidenote: My blog appears on planet.xmms.se, but I still haven't written a single word about xmms2 :)