long-char, kanji

To: Common-Lisp@su-ai.arpa

Subject: long-char, kanji

From: hagiya@kurims.kurims.kyoto-u.junet

Date: Thu, 5 Jun 86 19:17:09+0900

I think that all the discussions on extending the character code
are made with the intent of defining an `international' version
of Common Lisp. The discussions, therefore, seem to keep the
pure version of Common Lisp intact and add some alien features
to it by introducing such (ugly) names as "FAT-...", "PLUMP-...",
etc.; as a result of the extension, the complex type hierarchy
of Common Lisp becomes more complex.

Complexity is sometimes unavoidable. However, I hope the simplest
solution will also be permitted, as Moon argues:

    Another solution that should be permitted by the language is to have
    only one representation for strings, which is fat enough to accommodate
    all characters. In some environments the frequency of thin strings
    might be low enough that the storage savings would not justify the extra
    complexity of optimizing strings that contain only STRING-CHARs.

Common Lisp (or Lisp) is a very flexible language, and in an extreme
case, we can replace all the special forms, macros and functions of
Common Lisp with their Japanese (or Chinese or any language) counterparts
by preparing appropriate macro packages. In that case, the doc-strings
will also be written in Japanese. In such a completely Japanese
environment, it's silly to treat 8-bit characters in a special way.
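
As a rough sketch of such a macro package, assuming an implementation
whose reader accepts kanji in symbol names, a few standard operators
could be given Japanese names as below. The names もし, 関数定義 and
絶対値 are made up for illustration and do not belong to any existing
package.

```lisp
;; Hypothetical Japanese aliases for two standard operators.
;; Assumes the reader accepts kanji and kana in symbol names.

(defmacro もし (test then &optional else)
  "IF の日本語版。"
  `(if ,test ,then ,else))

(defmacro 関数定義 (name lambda-list &body body)
  "DEFUN の日本語版。"
  `(defun ,name ,lambda-list ,@body))

;; A function written entirely with the Japanese names,
;; including a Japanese doc-string.
(関数定義 絶対値 (x)
  "X の絶対値を返す。"
  (もし (minusp x) (- x) x))
```

Nothing in the example uses a non-English string as data; only the
program text itself is in Japanese, which is why the symbol names
matter as much as the strings.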

I think that the discussion should take into account how the extended
code will be used. We can consider several situations:

1. Programs written in one language (other than English) manipulate
data (characters or strings) in that language.
2. Programs written in English manipulate data in one language (other
than English).
3. Programs written in English manipulate data in more than one language
at the same time.

I myself like to program in Japanese even if the program does not
manipulate Japanese characters or strings. It is easier to devise
function names in one's mother tongue than in a foreign language.

By the way, no one seems to stress the importance of extending symbol
names. If an extension, for example, allows simple strings to hold
only 8-bit characters and requires symbol names to be simple strings,
then I completely disagree with it.
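
The dependency can be seen in a small sketch: a symbol's name is just
a string, so whatever the string type cannot hold, a symbol name cannot
hold either. NAME-ROUND-TRIPS-P is a hypothetical helper written for
this illustration, and whether the kanji call succeeds depends entirely
on the implementation's string representation.

```lisp
(defun name-round-trips-p (name)
  "Return true if NAME can serve unchanged as the name of a symbol."
  ;; MAKE-SYMBOL creates an uninterned symbol, so no package is modified.
  (string= name (symbol-name (make-symbol name))))

;; Succeeds only if strings, and therefore symbol names,
;; can hold characters wider than 8 bits.
(name-round-trips-p "漢字")

;; An ordinary 8-bit name, for comparison.
(name-round-trips-p "CAR")
```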
Masami Hagiya