Mapping Files

Lorna Evans, 2014-09-29

Note:

We have moved all the TECkit mapping files to a GitHub repository. That repo is found here: https://github.com/silnrsi/wsresources. You are welcome to browse and find mapping files or to submit your own mapping files.

The mapping files on this page continue to be available. However, they now come directly from the GitHub repo rather than a static download.

These mapping files were created for converting data to Unicode. Some, but not all, will also convert data from Unicode back to the legacy encoding. They are for use with TECkit-based applications: you can run the conversions with the TECkit program itself or with SILConverters 4.0.
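As a rough sketch of how a compiled mapping might be applied from a script, the Python snippet below assembles a command line for `txtconv`, the command-line converter distributed with TECkit. The file names are hypothetical, and the flags (`-i`, `-o`, `-t`, `-r`, `-if`, `-of`) are assumed from the tool's usage message; run `txtconv` without arguments on your own system to confirm them before relying on this.

```python
def build_txtconv_command(tec_file, in_file, out_file, to_legacy=False):
    """Return the argument list for a txtconv run (nothing is executed here).

    Assumed flags: -i input file, -o output file, -t compiled mapping (.tec),
    -r reverse direction, -if / -of input and output encoding forms.
    """
    cmd = ["txtconv", "-i", in_file, "-o", out_file, "-t", tec_file]
    if to_legacy:
        # Reverse direction: Unicode text in, legacy bytes out.
        # Only some mapping files support this direction.
        cmd += ["-r", "-if", "utf8", "-of", "bytes"]
    else:
        # Forward direction: legacy bytes in, UTF-8 text out.
        cmd += ["-if", "bytes", "-of", "utf8"]
    return cmd


# Hypothetical file names, for illustration only.
print(" ".join(build_txtconv_command("SILIPA93.tec", "legacy.txt", "unicode.txt")))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once `txtconv` is installed and the input file exists.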

Chad

Legacy font encoding: Tchad2000 and Chad95 fonts

Compiled (and uncompiled) conversion tables for each set of fonts are included here. They map Tchad2000 and Chad95 encoded data to Unicode. Although each mapping file is bi-directional, conversion from Unicode back into the legacy encoding isn't recommended, because it would produce some poor diacritic choices in the legacy encoding. Each mapping file is currently intended to support only legacy-to-Unicode conversion.

Other Eastern Congo resources are a Unicode Keyman keyboard found here.

Ethiopia

Latin to Fidel Unicode mapping

Three TECkit mapping files (compiled and uncompiled) are included in this package. They are intended for use where text has been input “phonetically” as a syllable (be, ppii, etc.) and conversion to fidel is desired.

Ge'ez-1/Ge'ez-2/Ge'ez-3

PUA to Unicode 6.0 mapping

If you happened to obtain an early unreleased version of the SIL Abyssinica font, and used some of the Private Use Area (PUA) codepoints in your data, you may wish to use the following mapping file to convert your data from PUA codepoints to Unicode 4.1 (many of the PUA characters were added to Unicode 4.1).

Liberia

Three TECkit mapping files (compiled and uncompiled) are included in this package. SILVai is for use in converting text encoded with the SIL Vai font to Unicode. SILVaiE is for use in converting text encoded with the SIL Vai Extras font to Unicode. SILVaiNTsfm is intended for use where text has been input “phonetically” as a syllable (ba, dle, etc.) and conversion to Unicode Vai characters is desired (U+A552 VAI SYLLABLE BA, U+A514 VAI SYLLABLE DEE, etc.).

Other Vai resources: Two Vai Unicode fonts are available here. Dukor is based on SIL Vai. Windows 7 provides a Unicode font with Vai support called Ebrima. A Windows Keyman keyboard is available here: Known Unicode Keyman Keyboards. A MacOSX keyboard is available here.

Mali

Legacy font encoding: SIL Mali standard fonts

Compiled (and uncompiled) conversion tables for each set of fonts are included here, which map SIL Mali encoded data to Unicode and vice versa. The legacy and fonts will be posted at some future time.

Europe/Middle East

PitchContours

Compiled (and uncompiled) conversion tables for each set of fonts are included here, which map legacy encoded data to "Unicode" (SIL's Corporate PUA) and vice versa.

These are experimental mapping files for the PitchContours fonts. Nine codepoints for the pitches were placed in SIL Corporate PUA at U+F1F1..U+F1F9. Each of these mapping files requires the use of Charis SIL or Doulos SIL version 4.1 or greater. The pitch codepoints will render as contours when used in Graphite applications. It is unlikely that any other fonts will contain these codepoints.

It is possible these mapping files will not work for you, since the fonts were sometimes used inconsistently. At times data from one melody was placed on two lines in order to stretch it out. In those cases, the brackets in the fonts were also made much larger to span several lines.

In order for these mapping files to work, data for a melody must be on one line. It cannot be on two separate lines, or the data will not be converted properly. If you put your data on several lines, we have no way to automatically convert that data for you. It is also impossible to maintain the large size of the brackets in an encoding conversion. If you wish to have large brackets, you will have to go through your document manually and resize the brackets by changing the font size.

Note that several of these fonts contain characters that probably do not exist in any one font. These mapping files have only been minimally tested. Feedback is requested (see the readme file which is part of the download).

SIL Apparatus

Legacy font encoding: SIL Apparatus font

A compiled conversion table called SILApparatus.tec is included here, which maps SIL Apparatus data to Unicode and vice versa.

There are 20 characters from SIL Apparatus that are not in Unicode. We have output U+FFFD REPLACEMENT CHARACTER in front of the base character. If you search for U+FFFD, you can use text markup to make the following character superscript.
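The search-and-markup step could be automated along the following lines. This is only a sketch: it assumes the character to be raised is the single character immediately after each U+FFFD, and the `<sup>` tags are a hypothetical stand-in for whatever markup your document format uses.

```python
import re

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER, emitted by the mapping


def mark_superscripts(text, open_tag="<sup>", close_tag="</sup>"):
    """Replace each U+FFFD plus the next character with marked-up text.

    The tags are hypothetical placeholders. A U+FFFD at the end of a line
    or of the text has no following character and is left untouched.
    """
    pattern = re.compile(REPLACEMENT + r"(.)")
    return pattern.sub(lambda m: open_tag + m.group(1) + close_tag, text)


print(mark_superscripts("text\ufffda more text"))
```

If a base character carries combining marks, the pattern would need to be widened to include them; that case is not handled here.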

SIL Ezra

Legacy font encoding: SIL Ezra font

These mappings (and the included documentation) will help you convert some of your old Hebrew data to the Unicode codepoints, so that you can use the Ezra SIL fonts without re-typing your data. It should be particularly useful to those who have made a significant investment in their data using the SIL Ezra fonts.

SIL Galatia

Two draft conversion programs are included with this package. One is for Consistent Changes and the other is for TECkit. Neither of the Greek mappings has been extensively tested. These are intended for converting texts which have been encoded for the SIL Galatia legacy font to Unicode (Galatia SIL is a Greek Unicode font).

SIL IPA93

A compiled conversion table called SILIPA93.tec is included, which maps SIL IPA93 data to Unicode IPA and vice versa. This mapping has been updated to reflect Unicode 5.1. The only changes were for the downstep and upstep. Unicode 5.1 now supports all characters that were in IPA93, and Doulos SIL and Charis SIL support them.

SIL IPA 1990

A compiled conversion table called SIL-IPA-1990.tec is included, which maps SIL IPA 1990 data to Unicode IPA and vice versa.

The conversion back to legacy will not be a clean round-trip, because there are duplicates of many diacritics in the SIL IPA fonts.
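The following toy example illustrates why duplicated diacritics prevent a clean round-trip. The byte values and the three-entry table are entirely hypothetical and are not taken from SIL-IPA-1990.tec; they only show that once two legacy codes map to the same Unicode character, the reverse mapping can return just one of them.

```python
# Hypothetical legacy table: two byte values that both mean "combining acute".
LEGACY_TO_UNICODE = {
    0x41: "A",
    0x8A: "\u0301",  # hypothetical: acute positioned for one letter shape
    0x8B: "\u0301",  # hypothetical: the same acute at another position
}

# Building the reverse table: the later duplicate overwrites the earlier one,
# so only one legacy byte survives for U+0301.
UNICODE_TO_LEGACY = {char: byte for byte, char in LEGACY_TO_UNICODE.items()}


def to_unicode(data: bytes) -> str:
    return "".join(LEGACY_TO_UNICODE[b] for b in data)


def to_legacy(text: str) -> bytes:
    return bytes(UNICODE_TO_LEGACY[ch] for ch in text)


original = bytes([0x41, 0x8A])
round_tripped = to_legacy(to_unicode(original))
print(original.hex(), "->", round_tripped.hex())
```

The Unicode text is identical in both cases; only the choice of legacy diacritic is lost.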

This mapping has been updated to reflect Unicode 5.1. The only changes were for the downstep and upstep. Unicode 5.1 now supports all characters that were in IPA93, and Doulos SIL and Charis SIL support them.

SIL PUA to Unicode 9.0 Mapping

The Unicode 9.0 standard includes 221 characters (not including Hebrew) that were previously allocated to codepoints in the Private Use Area by SIL’s PUA committee.

All processes (input methods, mappings) that create Unicode data should be revised to generate the proper Unicode values instead of PUA codes.

If you have data that contains these PUA codes, it should be updated by replacing each PUA character with its official Unicode counterpart. This will facilitate data interchange and the use of standard fonts and software.
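Before converting, it can help to check whether a file contains any PUA codes at all. The sketch below counts characters in the Basic Multilingual Plane Private Use Area (U+E000..U+F8FF). It does not know which of those codes SIL assigned or what their Unicode replacements are; the mapping file remains the authority for that. The function names are illustrative.

```python
from collections import Counter


def find_pua(text):
    """Count BMP Private Use Area characters (U+E000..U+F8FF) in text."""
    return Counter(ch for ch in text if 0xE000 <= ord(ch) <= 0xF8FF)


def report(text):
    """Return one 'U+XXXX xN' line per PUA codepoint, in codepoint order."""
    return [f"U+{ord(ch):04X} x{n}" for ch, n in sorted(find_pua(text).items())]


sample = "a\uf1f1b\uf1f1\ue000"  # made-up sample text
for line in report(sample):
    print(line)
```

An empty report suggests the data needs no PUA update; any listed codepoints can then be checked against the mapping file.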

Charis SIL and Doulos SIL fonts currently only support Unicode 8.0. Therefore, unless you have fonts which support the Unicode 9.0 codepoints you may wish to wait to update your PUA mapping until Charis SIL and Doulos SIL are updated.

Translator’s Workplace miscellaneous fonts

Compiled conversion tables are included here, which map data using these fonts to Unicode and vice versa.

Note that several of these fonts contain characters that probably do not exist in any one font. These mapping files have only been minimally tested. Feedback is requested (see the readme file which is part of the download).

Pacific

SIL Papua New Guinea Standard Font to Unicode mapping

Legacy font encoding: SIL Papua New Guinea standard fonts

A compiled (and uncompiled) conversion table is included here, which maps SIL Papua New Guinea encoded data to Unicode and vice versa. All PNG characters are now in standard Unicode 5.1. No more PUA characters are needed!

Symbol-encoded font to codepage 1252 transliterator

As discussed in Display Issues – FAQ, Microsoft Word data that is formatted using a symbol-encoded font such as SIL Galatia or SIL IPA93 is stored using a range of PUA characters, typically U+F020 .. U+F0FF. When using File / Save As... to create an 8-bit plain-text version of the data, such PUA characters are converted to question marks ('?'). One solution is to use this Unicode-to-Unicode mapping file, called a transliterator, to convert the PUA character codes to codepage 1252 before saving to plain text.
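To make the idea concrete, here is a minimal Python approximation of what such a transliterator does. It is not the mapping file itself: it assumes that the low byte of each PUA code in U+F020..U+F0FF is the original 8-bit font code and reinterprets that byte as codepage 1252. The function name is illustrative.

```python
def symbol_pua_to_cp1252(text):
    """Map U+F020..U+F0FF to the character that codepage 1252 assigns
    to the low byte; everything else passes through unchanged."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xF020 <= cp <= 0xF0FF:
            byte = bytes([cp - 0xF000])
            try:
                out.append(byte.decode("cp1252"))
            except UnicodeDecodeError:
                # A few bytes (e.g. 0x81) are undefined in cp1252;
                # keep the PUA character rather than guess.
                out.append(ch)
        else:
            out.append(ch)
    return "".join(out)


converted = symbol_pua_to_cp1252("\uf041\uf062 x")
print(converted.encode("cp1252"))  # now safe to save as 8-bit text
```

After this step the text contains ordinary codepage 1252 characters, so saving as 8-bit plain text no longer turns them into question marks.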