Revision: 19970526 (Currently reflects VITRUS-1.2, to be updated soon)
"Latinized" (ASCII-transliterated) Russian Cyrillic.
---------------------------------------------------
This document contains a sample description and implementation of
Visual International Transliteration for modern Russian Cyrillic.
The generic description, which states the basic VIT principles, can
be found in a separate document that should be read first.
Code name used in this document --
Visual ASCII-Transliterated Reversible Russian Cyrillic.
Code name used within the UNICYR prototype --
ETRC = visual English-Transliterated reversible Russian Cyrillic
The following is a draft description of the current rules.
These rules may change slightly or admit variations,
but all modifications should conform to the generic
principles. For instance, if a different variant
of a digraph proves to be better and more practical,
the table may change.
This document does not give detailed reasons, considerations
and information sources which would explain why each specific
representation of a character was chosen.
Just to mention a few of them:
- Widely used KOI-8 mapping of Russian letters to Latin ones
- Established traditions
- Readability and "writability"
- Relative frequency of letters in Russian texts
- Compactness
- Slavic transliteration methods already used by non-Russian
libraries, such as the Library of Congress (USA) or
the British Library (Great Britain)
- Other Russian transliteration projects
- Transliteration methods used for other languages
- User responses, comments and suggestions
Russian alphabet (VITRUS Version 1.2):
A B V G D E xO W Z I J K L M N O P R
S T U F xK xZ C xS xT xH Y H xE xU Q
There is not much to learn. It shouldn't take more than
15 minutes to read and memorize this brief information.
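Because the alphabet line above follows the standard order of the 33 Russian
letters, a Cyrillic-to-VITRUS table can be built simply by pairing the two
sequences. The Python sketch below illustrates the idea. It is not taken from
UniCyr; the names CYRILLIC, VITRUS_12 and cyrillic_to_etrc are hypothetical,
and it assumes that a capital digraph is written as in the alphabet line
(small x, capital second letter).

```python
# Illustrative sketch only; not the UniCyr implementation.
CYRILLIC = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
VITRUS_12 = ("A B V G D E xO W Z I J K L M N O P R "
             "S T U F xK xZ C xS xT xH Y H xE xU Q").split()

# Both sequences must list the same 33 letters in the same order.
assert len(CYRILLIC) == len(VITRUS_12) == 33

# Capital letters use the tokens exactly as listed (assumption: "xO" style).
UPPER = dict(zip(CYRILLIC.upper(), VITRUS_12))
# Small letters use the same tokens in lower case.
LOWER = {c: t.lower() for c, t in zip(CYRILLIC, VITRUS_12)}


def cyrillic_to_etrc(text):
    """Replace every Russian letter by its VITRUS 1.2 token.

    Characters that are not Russian letters are passed through unchanged.
    """
    return "".join(UPPER.get(ch) or LOWER.get(ch) or ch for ch in text)
```

For example, cyrillic_to_etrc("Всё нормально") gives "Vsxo normalhno".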
1. The monographs
abvgdeijklmnoprstuf are obvious. Most of them are derived
from the most natural and already established transliteration.
Monographs                    (traditionally)
e:  esli (if)                 (e,je,ye)
2. Other monographs:          (traditionally)
z:  zabor (fence)             (s)
y:  mylo (soap)               (i)
w:  wuk (beetle)              (zh)
q:  qdro (nucleus, kernel)    (ja,ya,q)
h:  math (mother)             (',x)
c:  caj (tea)                 (ch)
These letters are traditionally transliterated in several different,
confusing ways. Here are just a few ambiguous examples:
- "e" and "sh" in the words "veshalka", "voshod", "eto"
- "i" and "y" in the words "pil" and "pyl"
- "s" and "z" in the words "sonnyj" and "zonnyj"
...
3. The digraphs are based on a few simple ideas that make them
easier to remember. As is easily noticed, the letter x
is used as an escape character that starts every digraph.
xo: xolka (fir tree), xow (hedgehog)           (jo,yo)
xu: xug (south)                                (ju,yu)
xe: xeto (this)                                (e')
xk: xkrabryj (brave, courageous)               (kh,ch,h)
xz: xzifra (digit)                             (ts,tz)
xs: xskaf (wardrobe), xsnur (rope)             (sh)
xt: xtuka (pike), xti (Russian cabbage soup)   (shch)
xh: obxhxom (volume)                           (")
It's that easy !
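The rules above (25 monographs plus 8 digraphs escaped by x) are enough to
decode VITRUS 1.2 text mechanically. The following Python sketch shows one way
to do it. It is an illustration, not the UniCyr source; the names MONO, DI and
etrc_to_cyrillic are hypothetical. It assumes that a digraph stands for a
capital letter if either of its two characters is capital, and that an x not
followed by a digraph letter is left as it is.

```python
# Illustrative sketch only; not the UniCyr implementation.
# Monographs: one Latin letter per Russian letter.
MONO = dict(zip("abvgdewzijklmnoprstufcyhq",
                "абвгдежзийклмнопрстуфчыья"))
# Digraphs: the letter that follows the escape character x.
DI = dict(zip("okzstheu", "ёхцшщъэю"))


def etrc_to_cyrillic(text):
    """Decode VITRUS 1.2 (ETRC) text into Russian Cyrillic."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in "xX" and nxt.lower() in DI:
            letter = DI[nxt.lower()]
            # Assumption: capital if either character of the digraph is capital.
            if ch.isupper() or nxt.isupper():
                letter = letter.upper()
            out.append(letter)
            i += 2
        elif ch.lower() in MONO:
            letter = MONO[ch.lower()]
            out.append(letter.upper() if ch.isupper() else letter)
            i += 1
        else:
            # Digits, punctuation, spaces and a lone x pass through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, etrc_to_cyrillic("obxhxom") gives "объём".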
To toggle from English to Russian and back,
any special character may be used. Typically,
it's the "backapostrophe" character, which is
located on the same key as ~ (tilde).
All this should be enough to read the following and
to type any mixed English/Russian text in this
format.
`Nacalo russkogo teksta
Esli Vy v sostoqnii procesth xeto, znacit vsxo normalhno,
vizualhnaq standartno-transliterirovannaq obratimaq
russkaq kirillixza (uff !) Vami osvoena i mowno
poprobovath cto-nibudh napisath v takom-we vide.
Nikakogo softvera dlq xetogo srazu ne potrebuetsq.
Dlq nacala dostatocno potratith kakixk-to 15 minut na
znakomstvo s instrukxziej (sm. vyxse) -- i vperxod !
Konexz russkogo teksta`
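The toggle convention used in the sample above can be sketched in Python as
follows. This is only an illustration under stated assumptions: the text is
taken to start in English, each toggle character switches the language, and
the toggle characters themselves are dropped from the output. The names
convert_mixed, PATTERN and _letter are hypothetical and do not come from
UniCyr.

```python
# Illustrative sketch only; not the UniCyr implementation.
import re

MONO = dict(zip("abvgdewzijklmnoprstufcyhq",
                "абвгдежзийклмнопрстуфчыья"))
DI = dict(zip("okzstheu", "ёхцшщъэю"))

# A digraph (x plus one of eight letters) is tried before a single letter.
PATTERN = re.compile(r"[xX][okzstheuOKZSTHEU]|[A-Za-z]")


def _letter(match):
    """Translate one matched token (digraph or monograph) to Cyrillic."""
    tok = match.group(0)
    low = tok.lower()
    cyr = DI[low[1]] if len(low) == 2 else MONO.get(low, tok)
    # Assumption: any capital character in the token makes the letter capital.
    return cyr.upper() if tok != low else cyr


def convert_mixed(text, toggle="`"):
    """Convert the Russian segments of a mixed English/Russian text.

    Segments at even positions (0, 2, ...) are English and are kept as is;
    segments at odd positions are VITRUS 1.2 and are decoded.
    """
    parts = text.split(toggle)
    return "".join(PATTERN.sub(_letter, part) if i % 2 else part
                   for i, part in enumerate(parts))
```

For example, convert_mixed("Hello `Konexz russkogo teksta` bye") gives
"Hello Конец русского текста bye".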
Comment:
A very similar transliteration scheme may be employed for some
other Slavic languages such as Ukrainian and Belorussian.
See later revisions of UniCyr.
UNICYR by Vitaly Blokhin
------
UniCyr is a sample working prototype that performs all
the conversions from/to different Russian encodings including
VITRUS (see VIT.TXT) which currently has a code name ETRC --
English (ASCII)-Transliterated Russian Cyrillic.
It's a simple MS-DOS console application written in ANSI C,
so it should work in MS-DOS and any version of Windows, and it can
be easily ported to other platforms such as UNIX or Mac.
This version of UNICYR uses hardcoded built-in tables.
Unix, Mac and Unicode Cyrillic encodings are not implemented
in this version. In the future, UNICYR will be replaced by a more
generic internationalized piece of software which will use
external scripts written in plain text format and provide
many other features like the support of multilingual
documents etc. This software is already being implemented.
UniCyr is a copyrighted freeware program and can be freely
used and distributed as a package without any fee (at your own risk).
The author is responsible neither for any problems caused by use
of this program nor for its technical support, although any
reasonable questions, comments, suggestions etc. would be appreciated.
(See Vitaly Blokhin's home page at http://www.asanet.com/vblokhin,
or e-mail: vitinfo@asanet.com)
The UniCyr v 1.2 distribution package file list:
VIT.TXT       Generic Visual International Transliteration document
VIT-RUS.TXT   Sample implementation of VIT for the Russian language
              and a description of UNICYR
UNICYR.EXE    MS-DOS executable
UNICYR1.BAT   MS-DOS sample batch file to run UNICYR
ALPHETRC.TXT  Russian alphabet (CAPITALS/smalls) encoded ETRC
ALPHWIND.TXT  Russian alphabet (CAPITALS/smalls) encoded Windows CP 1251
ALPHDOSA.TXT  Russian alphabet (CAPITALS/smalls) encoded MSDOS ALT CP 866
ALPHKOI8.TXT  Russian alphabet (CAPITALS/smalls) encoded KOI-8
TEST-CYR.TXT  A test file for all the encodings above
As an example of UNICYR usage, the following are detailed instructions
on how to convert a text message containing ETRC to Windows encoding.
This file may be used as the source message.
1. Create a directory (for example, work1)
2. Copy unicyr.exe and unicyr1.bat to the work1 directory
3. Extract the entire text of the source into a text file
(message1.txt) to the work1 directory
4. Open an MS-DOS window (in Windows) and make the work1 directory current.
5. Make sure it contains unicyr.exe, unicyr1.bat and message1.txt
using the dir command.
6. In that MS-DOS window, run the attached batch file unicyr1.bat
by simply typing its name and pressing Enter.
7. As a result, you'll have a Windows-encoded English/Russian text in
messwind.txt
8. Open messwind.txt with MS Word, WordPad or WinWrite
9. Highlight the entire text and change its font to any Windows Cyrillic font
you have (1251, not KOI-8 !) -- you'll be able to see and print normal Cyrillic.
The contents of unicyr1.bat:
unicyr etrc wind single_backapostrophe_character <message1.txt >messwind.txt
where single_backapostrophe_character is literally just one character located
on the same key as ~ (tilde).
Running unicyr with no arguments will bring up a help screen which will show
what arguments to use for other situations. Of course, it can be used for any
conversions like Windows to/from DOS-ALT to/from KOI-8 etc. but its main point
is a practical implementation of VIT-RUS (ETRC).
These are the contents of the UniCyr help screen:
Universal Cyrillic convertor for the Russian language
UniCyr 1.2 19961229 Copyright (C) 1996 Vitaly Blokhin
Usage:
UniCyr coding_inp coding_out toggle <inp.txt >out.txt
toggle=English/Russian toggle character(BackApostrophe is recommended)
coding={etrc,erac,dosa,wind,koi8,unic,unix,maci,dosb}
etrc - English-Transliterated reversible visual Russian Cyrillic
Russian alphabet:
A B V G D E XO W Z I J K L M N O P R S T U F XK XZ C XS XT XH Y H XE XU Q
(an implementation of Visual International Transliteration VIT-RUS)
etrc10, etrc11 implemented in old UniCyr are still supported
erac - English/Russian mixed Alphabetic Cyrillic code (only for output test)
dosa - Dos Alternative CP 866
wind - Windows CP1251
koi8 - KOI-8
unic - Unicode (not implemented yet)
unix - UNIX (not implemented yet)
maci - Macintosh (not implemented yet)
dosb - Dos Base (not implemented yet)
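The conversions between the byte-level encodings named on the help screen
(wind, dosa, koi8) can be approximated in Python with the standard codecs,
as in the sketch below. This is not how UniCyr itself works (it uses built-in
tables). The mapping of UniCyr coding names to Python codec names, in
particular treating KOI-8 as KOI8-R, is an assumption, and the names CODECS
and recode are hypothetical.

```python
# Illustrative sketch only; not the UniCyr implementation.
# Assumed correspondence between UniCyr coding names and Python codecs.
CODECS = {
    "wind": "cp1251",   # Windows CP 1251
    "dosa": "cp866",    # DOS Alternative CP 866
    "koi8": "koi8_r",   # KOI-8 (assumed here to mean KOI8-R)
}


def recode(data, coding_inp, coding_out):
    """Convert bytes from one Cyrillic encoding to another.

    The bytes are decoded to Unicode text first and then encoded again,
    so ASCII (English) characters pass through unchanged.
    """
    text = data.decode(CODECS[coding_inp])
    return text.encode(CODECS[coding_out])
```

For example, recode(b"\xe1", "koi8", "wind") returns b"\xc0": the capital
Russian letter A in both encodings.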
Appendix 1. FYI -- Historical versions of VITRUS.
Version 1.1
A B V G D E qO W Z I J K L M N O P R
S T U F X C qC qS qW qH Y H qE qU qA
Version 1.0
A B V G D qE qO W Z I J K L M N O P R
S T U F H C qC qH qS qX Y X E qU qA
Appendix 2. "Evolutionary" versions of VITRUS (to be released soon).
Version 1.5 (Apr 1999)
A B V[=W] G D E JO ZH Z I J=JH K L M N O P R
S T U F X[=KH] C CH SH TH QH Y Q EH JU JA
Notes:
1. Square brackets [] show a variant that may be optionally
accepted for input but is not used for output.
Equal sign = is used to show aliases which are used
only to resolve ambiguities.
2. A digraph is considered 'capital' if at least one of
its letters is capital, for instance:
ja is a small letter
JA = Ja = jA is a capital letter
3. Q is a 'soft sign', QH is a 'hard sign', TH is 'shch'.
4. J is used in all the cases except these rare situations:
JHO = J + O, JHU = J + U, JHA = J + A where the alias JH
is used to distinguish from digraphic letters JO, JU, JA.
A few funny examples:
jo-jho: rajhon, strojhotrjad, Nqju-JHork
ju-jhu: strojhupravlenie
ja-jha: rajhapteka, rajharxitektproekt
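The JH rule from note 4 can be sketched in Python as follows. This is an
illustrative encoder for the Version 1.5 output alphabet only, not released
code; the names V15 and cyrillic_to_v15 are hypothetical. It assumes that a
capital letter is written with its whole token in capitals (as in JHork),
which is one of the spellings allowed by note 2.

```python
# Illustrative sketch only; not part of any released VITRUS software.
V15 = dict(zip(
    "абвгдеёжзийклмнопрстуфхцчшщъыьэюя",
    ("a b v g d e jo zh z i j k l m n o p r "
     "s t u f x c ch sh th qh y q eh ju ja").split()))


def cyrillic_to_v15(text):
    """Encode Russian Cyrillic text in the VITRUS 1.5 output alphabet."""
    out = []
    for i, ch in enumerate(text):
        low = ch.lower()
        tok = V15.get(low)
        if tok is None:
            # Not a Russian letter: keep the character unchanged.
            out.append(ch)
            continue
        nxt = text[i + 1].lower() if i + 1 < len(text) else ""
        # Note 4: J before O, U or A would read as the letters JO, JU, JA,
        # so the alias JH is used in exactly these cases.
        if low == "й" and nxt in ("о", "у", "а"):
            tok = "jh"
        # Assumption: a capital letter is written fully in capitals.
        out.append(tok.upper() if ch != low else tok)
    return "".join(out)
```

For example, cyrillic_to_v15("Нью-Йорк") gives "Nqju-JHork", matching the
example above.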
Version 1.4
A B V G D E JO ZH Z I J=JH K L M N O P R
S T U F KH C CH SH TH XH Y X EH JU JA
Version 1.3
A B V G D E YO ZH Z I J K L M N O P R
S T U F KH C CH SH TH XH Y=YH X EH YU YA
Appendix 3. Interactive demo prototype for VIT (under development).
VIT Editor *** Java should be enabled in your browser. May not work with firewalls. ***