Character Sets

Four types of coding conventions are currently supported in the Korean
Solaris software:

N-byte code - In this single-byte code, each byte represents
a consonant or vowel. These bytes are combined to build Hangul characters.

Johap or Packed code - This two-byte code consists of a leading
bit followed by three 5-bit fields. These three fields contain, in order, the
codes for a leading consonant, a vowel, and a final consonant (if any) of a
Hangul character. This two-byte code is specified in Korean Industry
Standard KS C 5601-1992-3.
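
The bit layout described above can be sketched in a few lines of Python. This is only an illustration, not code from the Korean Solaris software: the function names split_johap and pack_johap are hypothetical, the sample value 0x8861 is chosen for illustration, and the sketch assumes the leading bit is the most significant bit of the 16-bit value with the three fields following from high to low. It does not map the field values to actual consonants or vowels, because those tables are defined by the standard.

```python
def split_johap(code):
    """Split a 16-bit packed code into (lead, initial, vowel, final).

    Assumed layout: 1 leading bit, then three 5-bit fields, most
    significant bit first.
    """
    if not 0 <= code <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    lead = (code >> 15) & 0x1      # leading bit
    initial = (code >> 10) & 0x1F  # leading-consonant field
    vowel = (code >> 5) & 0x1F     # vowel field
    final = code & 0x1F            # final-consonant field
    return lead, initial, vowel, final


def pack_johap(initial, vowel, final, lead=1):
    """Inverse of split_johap: rebuild the 16-bit value from its fields."""
    for field in (initial, vowel, final):
        if not 0 <= field <= 0x1F:
            raise ValueError("each field must fit in 5 bits")
    return ((lead & 0x1) << 15) | (initial << 10) | (vowel << 5) | final


if __name__ == "__main__":
    fields = split_johap(0x8861)
    print(fields)
    print(hex(pack_johap(*fields[1:])))
```

Splitting and repacking are exact inverses, so a round trip through both functions returns the original two-byte value.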

Wansung code - This two-byte code is specified in Korean Industry
Standard KS C 5601-1987 for Hangul, Hanja, and other characters. In the Korean
Solaris software, these KS C 5601-1987 characters are in EUC codeset 1.
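
As a rough illustration of what placement in EUC codeset 1 means at the byte level, the following Python sketch checks that a two-byte sequence has both bytes in the range 0xA1 to 0xFE. The helper name in_euc_codeset1 is hypothetical, and the sketch relies on Python's built-in euc_kr codec, which is assumed here to match the KS C 5601-1987 mapping used by the Korean Solaris software.

```python
def in_euc_codeset1(pair):
    """Return True if pair is two bytes, each in the range 0xA1..0xFE."""
    return len(pair) == 2 and all(0xA1 <= b <= 0xFE for b in pair)


if __name__ == "__main__":
    # Encode one Hangul syllable with Python's euc_kr codec
    # (assumption: equivalent to the Wansung code in EUC codeset 1).
    encoded = "가".encode("euc_kr")
    print(encoded.hex(), in_euc_codeset1(encoded))

    # ASCII bytes are below 0xA1, so they are not in codeset 1.
    print(in_euc_codeset1(b"AB"))
```

A byte-range test like this only classifies the bytes; it does not confirm that the pair is assigned to a character in the standard.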

ko.UTF-8 - Korean Universal Multiple Octet
Coded Character Set (UCS) Transmission Format. ko.UTF-8
supports all the characters of KS C 5601 and 11,172 characters from Johap,
as well as all Korean-related Unicode 2.0 characters and fonts. ko.UTF-8 supports the following subset of Unicode: