"Why Unicode Won't Work on the Internet:
Linguistic, Political, and Technical Limitations

Summary

Unicode, the commercial equivalent of UCS-2 (ISO 10646-1), is a 16-bit
character definition allowing a theoretical total of 65,536 characters,
and it has been widely assumed to be a comprehensive solution for
electronically mapping all the characters of the world's languages. However,
the complete character sets of the world add up to over 170,000
characters. This paper summarizes the political turmoil and technical
incompatibilities that are beginning to manifest themselves on the
Internet as a consequence of that oversight. (For the more technically
inclined: Unicode 3.1 won't work either.)
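
The arithmetic behind the capacity claim can be checked with a minimal
sketch. It takes the 170,000 figure from the summary above at face value
as a lower bound; the variable names are illustrative only and are not
part of any standard.

```python
# Capacity of a fixed-width 16-bit character code.
BITS = 16
capacity = 2 ** BITS  # number of distinct 16-bit values

# Figure quoted in the summary for the world's complete character sets.
# The paper says "over 170,000", so this is treated as a lower bound.
claimed_total = 170_000

# Characters that could not be assigned a distinct 16-bit value.
shortfall = claimed_total - capacity

# Smallest fixed width, in bits, able to number claimed_total characters.
bits_needed = (claimed_total - 1).bit_length()

print(f"16-bit capacity: {capacity}")
print(f"shortfall:       {shortfall}")
print(f"bits needed:     {bits_needed}")
```

Under that assumption, a fixed 16-bit code falls short by more than
100,000 characters, and at least 18 bits per character would be needed
to give each one a distinct value.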