Typing and displaying Unicode Coptic texts under X11

1. Displaying Coptic Unicode text in an editor

Suppose you already have a Coptic text, for example in UTF-8 format, and you would like
to display it. Take for example the following sample (right-click with the mouse to
download it, then open it with an editor). Assuming that you work under X11, you must
have:

a Unicode capable editor (like gtk2edit, gedit) or a word processor such as
abiword or OpenOffice

an appropriate Unicode font installed in your system, in which the Coptic glyphs
are included

Since there are several Unicode encoding formats, you must know which one your text uses
(in our case UTF-8) and tell the editor when you open the file. Refer to the section
"installing Unicode Coptic font" for a detailed description of how to install the font.
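If you are not sure whether a file really is UTF-8, a quick check from the shell can
help before you open it. The following is only a minimal sketch: is_utf8 is a
hypothetical helper name, and it assumes that the iconv command is available. Note that
a pure ASCII file also passes, since ASCII is a subset of UTF-8.

```shell
#!/bin/sh
# is_utf8 FILE: succeed if FILE decodes cleanly as UTF-8.
# (is_utf8 is a hypothetical helper name, not part of any package.)
is_utf8() {
    # iconv exits with a non-zero status on the first invalid byte sequence.
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Usage (sample.txt is a placeholder file name):
#   is_utf8 sample.txt && echo "sample.txt looks like UTF-8"
```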

2. Entering Coptic text in an editor

Now suppose you would also like to modify this text, or write your own text. Then you
must be able to switch your current keymap to Coptic. For this you have two
possibilities:

either you define a new X11 keymap,

or you use a graphical virtual keyboard, like xvkbd, to enter your text.

3. Displaying Coptic text in an X-Terminal

If, for example, you would like to display the above sample using "more" from an
X-terminal, then your terminal must support Unicode (like gnome terminal) and of course
you must set the font of the terminal correctly. For a simple command like "more", this
should be sufficient.
But there are many other applications that use the current locale of the glibc, and many
applications also use the ncurses library. In such cases you will further need a locale
that defines the Coptic characters, as described in the section on glibc support for
Coptic.

The key map file "cop"

XFree86 and other X11 implementations define the keyboard layout in a file that normally
has the same name as the abbreviation of the language, for example "de" for German or
"us" for English. On my systems these files reside in the directory:

/usr/lib/X11/xkb/symbols/pc

In the /etc/X11/XF86Config file, there is one section for each input device, which also
describes the keyboard parameters (language, variants, models...). In XFree86 it looks
like:
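As a rough sketch of such a section for a German keyboard: the Identifier, Driver and
XkbModel values below are assumptions chosen for illustration and will differ between
systems (on newer servers the driver may be called "kbd"); only the layout name "de"
is taken from the text.

```
Section "InputDevice"
    # Illustrative values only, not copied from a real configuration
    Identifier "Keyboard0"
    Driver     "keyboard"
    Option     "XkbModel"  "pc105"
    Option     "XkbLayout" "de"
EndSection
```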

It is the task of the file /usr/lib/X11/xkb/symbols/pc/de in the example above to define
a default key mapping for German and, in addition, all the different variants (keyboard
model, variants,...).
I have prepared a similar file for Coptic. It follows the encoding suggested by Logos
Research Systems. You can get a detailed document describing their exact layout from:
http://www.logos.com/support/lbs/fonts/CopticKeyboard.
They also offer a Coptic layout prepared for WindowsXP for downloading. My X11
implementation follows the same layout for the normal, shifted, AltGr and shift+AltGr
states. To install it, proceed as follows:

download the cop file (if you are used to the old CS-Coptic encoding, you can
alternatively download the file cop_CS, though I would recommend sticking to the cop
file that follows the Logos encoding).

as root, copy it to /usr/lib/X11/xkb/symbols/pc (or similar, depending on your
system)

If you would like to try it out in your current X11 session, type (as the same user who
owns the X11 session): setxkbmap -layout "cop"
ATTENTION: once you have typed this, you will not be able to type anything in Latin
anymore! It is better to first type something like setxkbmap -layout "us" in a terminal,
so that you can later use the up-arrow key in your X-terminal to recall that command and
set the mapping back to "us".
Alternatively, you can type: setxkbmap -layout "us,cop" -option "grp:alt_shift_toggle"
which should allow toggling between the "us" and "cop" key maps by pressing the <Alt>
and <Shift> keys simultaneously.

To have this every time you start your X session, as root, add the following lines
in the InputDevice section of your /etc/X11/XF86Config file:
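As a sketch only, assuming the same layout list and toggle option as in the setxkbmap
command above, such lines could look as follows. They are not taken from the author's
own configuration file, and the rest of the InputDevice section stays as it is on your
system.

```
    # Illustrative: mirrors setxkbmap -layout "us,cop" -option "grp:alt_shift_toggle"
    Option "XkbLayout"  "us,cop"
    Option "XkbOptions" "grp:alt_shift_toggle"
```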

Virtual Keyboard

There is an alternative way to enter Unicode text without the above steps. Instead of
hitting the keyboard with your fingers, you click on a virtual keyboard with the mouse.
One such virtual keyboard is xvkbd (written by Tom Sato and distributed under the terms
of the GNU General Public License). The main problem with this tool is that it uses the
Xaw (or Xaw3) widget tool kit (which is by itself very fine), but the Xaw tool kit does
not seem to support Unicode. I have succeeded in implementing a workaround that, at
least in the case of Coptic Unicode, works fine, though with some limitations. So
patching the source code will be necessary.

apply the patch, for example if you change directory one level above xvkbd-2.7a,
type in:

patch -p0 < xvkbd-2.7a-Coptic.patch

follow the steps described in the directory xvkbd-2.7a for making and installing
xvkbd (xmkmf; make install)

copy the file XVkbd-coptic.ad into the app default directory. If it should be accessible
for all users, this is /usr/X11R6/lib/X11/app-defaults. If it is only for you, use
$HOME/app-defaults and define this directory in the rc file of the shell you use, i.e.:
setenv XAPPLRESDIR $HOME/app-defaults for (t)csh, or
export XAPPLRESDIR=$HOME/app-defaults for bash.

Make sure that the file XVkbd in your app default directory includes both lines:

#include "XVkbd-common"
#include "XVkbd-coptic.ad"

If the key labels look very weird when you start xvkbd, then either the font is not
installed correctly, or the patch was not applied. Try "xlsfonts | grep athena"; if this
does not output anything, then the font is not installed.

glibc support for Coptic

The glibc library defines the locales (internationalization files) of your Linux
operating system. A lot of applications depend on it. It defines a set of attributes for
every country or region, like the language character set, the collation (sequence) of
characters, currency, date format,...

Since there is no "Coptic" territory, it does not make sense to define a dedicated locale
for Coptic. It is entirely sufficient to extend your current locale by a few more
capabilities, so that at least the Coptic characters are defined.

If you have the extension UTF-8 at the end of your current locale parameters, then your
current locale already knows UTF-8, and there is almost nothing you have to do. I say
almost, because the UTF definition will probably be lacking the Coptic range, since it is
new (introduced in Unicode 4.1.0). In other words: if you wait until a new version of
glibc is available, you will probably have to do nothing. But if you are impatient, read
on a few lines below.

What if your current locale does not have the UTF-8 extension? Try to list all
available locales by typing:

locale -a

If there is not a single locale ending with .utf8, then you should consider updating
your glibc (which is really very critical because of the dependencies with other
applications). Maybe it would be more convenient to update your whole distribution!
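On systems with many installed locales the output of locale -a can be long. As a small
sketch, the list can be filtered for UTF-8 locales; list_utf8_locales is a hypothetical
helper name, and the pattern assumes that names end in .utf8 or .UTF-8, optionally
followed by an @modifier.

```shell
#!/bin/sh
# list_utf8_locales: read locale names on standard input and print only
# those whose codeset part is UTF-8 (hypothetical helper, for illustration).
list_utf8_locales() {
    grep -iE '\.utf-?8(@.*)?$'
}

# Usage:
#   locale -a | list_utf8_locales
# No output means that no UTF-8 locale is installed.
```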

Otherwise, you need to consider modifying the following files:

/usr/share/i18n/locales/i18n

/usr/share/i18n/charmaps/UTF-8

optionally

/usr/share/i18n/locales/iso14651_t1

Updating the i18n file can be done systematically using the utility "gen-unicode-ctype",
which is included in the tarball of the glibc library. You can also get it directly here.
Compile it with "gcc gen-unicode-ctype.c -o gen-unicode-ctype", then download the latest
Unicode definition file (UnicodeData.txt) from the Unicode.org server. Generate a new
version of i18n with "./gen-unicode-ctype UnicodeData.txt", rename the output file to
i18n and copy it to the directory /usr/share/i18n/locales/. You can also get the version
that I generated this way here.

Updating the file UTF-8 requires more "hand work". I have prepared a version that only
adds the Coptic range; you can get it here.

So far I did not update my iso14651_t1 file.

After updating these files, you have to compile your current locale to reflect these
changes. If, for example, your current locale is en_US.UTF-8, then as root type:

localedef --charmap=UTF-8 --inputfile=en_US en_US.utf8

Make sure that the subdirectory en_US.utf8 under the directory:
/usr/lib/locale is now updated.

If you would like to test whether your modified locale now works, try to compile and run
this test code. It tests the conversion of the upper case Coptic character alpha to
lower case. It should output: 0x2c81

Re-encoding Coptic texts with iconv

GNU libiconv is a library (including the command-line tool iconv) that can convert
between many different encodings and UTF formats; glibc ships its own implementation of
the same interface. The chances that you already have the iconv command on your Linux
system are almost 100%. Try iconv -l to see all encodings that your iconv can deal with.

This is actually the perfect tool for re-encoding your older Coptic texts into Unicode.
I have prepared a patch which extends the encodings of iconv by the Coptic Font Standard
(CS Coptic), as established a few years ago by CopticChurch.Net. You should apply the
patch to libiconv-1.9.2 (you can download it here). After applying the patch, compiling
and installing, you should see the cs_coptic encoding listed when you type: iconv -l

The good news is that patching iconv means many other applications that rely on it will
now be able to understand the cs_coptic encoding. This applies, for example, to the font
creation tool fontforge.
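The re-encoding step itself can be sketched as a pair of small shell helpers. Both
function names (has_encoding, to_utf8) and the file names in the usage lines are
hypothetical. The cs_coptic name is only known to an iconv built with the patch
described above, which is why the sketch checks for it first.

```shell
#!/bin/sh
# has_encoding NAME: succeed if this iconv lists NAME (case-insensitive).
has_encoding() {
    # Split the listing into one name per line, whatever separator is used.
    iconv -l | tr ',/ ' '\n\n\n' | grep -qixF "$1"
}

# to_utf8 FROM_ENCODING: convert standard input from FROM_ENCODING to UTF-8.
to_utf8() {
    iconv -f "$1" -t UTF-8
}

# Usage (old.txt and new.txt are placeholder file names):
#   has_encoding cs_coptic && to_utf8 cs_coptic < old.txt > new.txt
```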