Re: Emacs 23 character code space

From: Eli Zaretskii
Subject: Re: Emacs 23 character code space
Date: Wed, 26 Nov 2008 22:26:03 +0200

> From: Kenichi Handa <address@hidden>
> CC: address@hidden, address@hidden
> Date: Wed, 26 Nov 2008 13:58:26 +0900
>
> I'll explain it a little bit more. To decode a character
> sequence to a byte sequence, Emacs actually does two kinds
> of decoding as below:
>
>              (1)                                (2)
> characters <-----> (charset code-point) pairs <-----> bytes

Can you give a couple of examples, for some popular charsets, of how
we decode bytes into characters through these (charset, code-point)
pairs?

> For the decoding of (1), Emacs uses information from the coding
> system to decide which charset to use, and then uses
> information from the selected charset to get a code point.
>
> For the decoding of (2), Emacs uses only information from the
> coding system.

Thanks.  What confuses me is that, roughly, there's a charset in Emacs
23 for every coding-system, and the two have almost identical names.
For example, the code point of a-umlaut in the iso-8859-1 charset is
identical to the byte value produced by encoding that character with
the iso-8859-1 coding-system.  So I wonder why we need both in Emacs.
Why can't we, for example, decode bytes directly into Emacs
characters?
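
To make the two stages in the quoted diagram concrete, here is a rough
sketch in Python.  It is not Emacs code and says nothing about how Emacs
implements this.  The helper names (jisx0208_code_point,
to_iso2022jp_bytes, to_eucjp_bytes) are made up for illustration, and the
code point is derived from Python's own euc_jp codec, which is an
assumption of convenience.  It shows one case where the code point and
the bytes coincide (iso-8859-1) and one where a single (charset,
code-point) pair maps to different bytes under two coding systems.

```python
# Stage (1): character <-> (charset, code point)
# Stage (2): (charset, code point) <-> bytes
# Illustration only; the helper names are hypothetical, not Emacs APIs.

def jisx0208_code_point(ch):
    """Return the JIS X 0208 code point of ch (assumes ch is in that charset)."""
    euc = ch.encode("euc_jp")
    # EUC-JP stores each code-point byte with the high bit set.
    return ((euc[0] & 0x7F) << 8) | (euc[1] & 0x7F)

def to_iso2022jp_bytes(cp):
    """ESC $ B designates JIS X 0208; the code-point bytes follow unchanged."""
    return b"\x1b$B" + bytes([cp >> 8, cp & 0xFF])

def to_eucjp_bytes(cp):
    """EUC-JP sets the high bit on each code-point byte."""
    return bytes([(cp >> 8) | 0x80, (cp & 0xFF) | 0x80])

a_umlaut = "\u00e4"    # a-umlaut
hiragana_a = "\u3042"  # HIRAGANA LETTER A

# iso-8859-1: the code point and the encoded byte are the same value, 0xE4.
latin1_bytes = a_umlaut.encode("iso-8859-1")

# JIS X 0208: one code point, two different byte sequences.
cp = jisx0208_code_point(hiragana_a)
iso2022_bytes = to_iso2022jp_bytes(cp)
eucjp_bytes = to_eucjp_bytes(cp)

print(hex(latin1_bytes[0]), hex(cp), iso2022_bytes, eucjp_bytes)
```

In the iso-8859-1 case the two stages collapse into one, which is the
coincidence described above.  In the Japanese case the charset fixes the
code point, while the coding system decides how that code point is laid
out as bytes.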