Introduction to Transliteration and Transcription

All human languages have complex histories with individual words often stemming from a bewildering but discernible array of borrowings from other languages in an ongoing process of development. In addition to such complex genealogies of individual words – genealogies quite unknown to most speakers of the language – speakers of any given language will often have occasion to self-consciously utilize another language’s terms within their speech. Obvious examples are personal names, toponyms, products or items unique to the culture in question, and so forth.

In determining the original source “language” of a given term, that is, whether it is an original “Tibetan” term or a rendering of a foreign word, it must be realized that the lines between “etymology,” “loan words,” and simple representations of a foreign word are quite arbitrary. Many words of a given language can be shown historically to stem from other languages, but by now are completely integrated into the language, and the typical language user has no sense at all of the word being “foreign.” In contrast, words typically thought of as “loan” words tend to be more recent borrowings, and/or their use is more selective and limited within the language, so that speakers retain a clear sense of the word’s “foreignness.” In contrast to both of these situations, in many contexts we quite deliberately refer to foreign terms in our own texts without any sense that they are part of our language – examples are language instruction textbooks, essays (journalistic or academic) on a certain culture, and so forth. While in many cases the boundaries between these three phenomena are very clear, in other cases they are not – the important thing in a process of scholarly documentation is that one is consistent and explicit in application. The issue of transliteration and transcription – rendering a term from another language in one’s own script with a focus on its native spelling or pronunciation – is thus most pertinent to the third case, namely self-conscious representations of another language’s terms as deriving from that language.

In fact, the incorporation of foreign words into a language also varies significantly with the type of word and with the socio-political circumstances of the languages involved. For example, words for concepts tend to be translated rather than borrowed, while words for new products – such as tofu – tend to be incorporated wholesale and rapidly. Where a culture is dominated politically by another culture with a separate language, one also finds a more rapid explosion of new loan words and a tendency to borrow rather than translate.

One of the more interesting types of words is toponyms – names for specific places. For obvious reasons – and especially for travelers – we tend to try to learn the toponyms employed by the people who live in the place in question, albeit rendered in our own script in a way that we can pronounce easily (popular uses) and/or that reproduces the native orthography (scholarly uses). In addition, because the place in question continues to be anchored in a particular geographical part of the world inhabited by specific groups of people speaking a specific language or languages, references to a toponym from outside of a given culture tend to be very conservative in retaining their distinct status as a foreign language term even after extensive and widespread use. Thus whereas “tofu” might have quickly become in essence an English word, “Lhasa” or “Beijing” remains clearly a foreign language toponym even after years of use.

When self-conscious references to foreign language terms in a given language happen in written documents, one is faced with the challenge of how best to represent those terms in one’s own script. This process of representation is termed transliteration, defined as “to represent or spell in the characters of another alphabet,” or transcription, defined as “to represent (speech sounds) by means of phonetic symbols.” There are two distinct methodologies one can pursue: one can attempt to represent the native spelling of the term in question, or one can attempt to represent the sound of the term in question, utilizing one’s own script. We refer to these two different methodologies as orthographic transliteration and phonetic transcription, respectively. The former aims to represent terms’ standard spelling, or orthography, while the latter aims to represent terms’ standard pronunciation, or phonology. For some languages, a single transliteration system can simultaneously serve both orthographic and phonetic needs because the orthography reasonably approximates the pronunciation. An example is Sanskrit, which can be easily represented in an orthographic transliteration in Roman script that in turn can be easily pronounced and remembered by a foreign reader without special training or knowledge. In contrast, Tibetan requires two distinct schemes – transliteration and transcription – due to the considerable divergence between spelling and pronunciation.
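The distinction between the two methodologies can be sketched in code. The following Python fragment is only an illustration under stated assumptions, not part of any established system: the Term class, its field names, and the needs_two_schemes check are hypothetical, and the sample spellings (a Wylie-style transliteration paired with a popular phonetic rendering) are chosen purely for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One source-language term with two Roman-script renderings (hypothetical model)."""
    orthographic: str  # represents the native spelling
    phonetic: str      # represents the pronunciation

    def render(self, mode: str) -> str:
        """Return the rendering for the requested methodology."""
        if mode == "orthographic":
            return self.orthographic
        if mode == "phonetic":
            return self.phonetic
        raise ValueError(f"unknown mode: {mode!r}")

    def needs_two_schemes(self) -> bool:
        """Crude check: do spelling and sound diverge once spaces and case are ignored?"""
        return self.orthographic.replace(" ", "").lower() != self.phonetic.lower()


# Sample data, assumed for illustration only.
lhasa = Term(orthographic="lha sa", phonetic="Lhasa")
tashi = Term(orthographic="bkra shis", phonetic="Tashi")

for term in (lhasa, tashi):
    print(term.render("orthographic"), "->", term.render("phonetic"),
          "| diverges:", term.needs_two_schemes())
```

Keeping both renderings in one record reflects the point made above: where spelling and pronunciation diverge, neither rendering can be derived from the other by trivial means, so both have to be stored or generated by separate schemes.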

Transliteration in general is relatively straightforward. The major issues are threefold:

Disambiguation

Comprehensiveness

Graphical particularities

The first is to ensure that the system is unambiguous, so that a given transliterated letter and/or word cannot be reconstructed back into the original language in two alternative ways. The second is to ensure that the system is comprehensive, and deals with all possible characters and modifications that appear in the original language. This is the more difficult issue, especially when the transliteration system is meant to account for the full history of the language, as well as special uses. Often transliteration systems only account for “standard” uses of the language in question, and offer no guidelines for more uncommon uses of the language – such as how the source language and its script might themselves be used to represent other languages, special diacritic marks inherent to the script, and archaic forms. The third issue pertains to the need to indicate not only the spelling proper, but also the graphical details of the original script. For example, a given word with the same spelling might in different centuries represent a given conjunction of two letters in graphically different ways, or it might utilize abbreviations. The transliteration system must then be able, in addition to indicating the spelling, also to indicate these graphical particularities. It should be noted that many languages don’t have traditions of using their own script to represent foreign language words in a way that documents the original orthography of the term, but rather only represent these words phonetically.
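The disambiguation requirement can be made concrete with a small check. The sketch below is a toy model, not an implementation of any real transliteration scheme: the function name count_segmentations and both unit inventories are assumptions made for illustration, loosely modeled on the way some Tibetan romanizations use a period to separate a prefixed letter from the letter that follows. It counts how many ways a romanized string can be split into the scheme's units; more than one way means the string cannot be reconstructed unambiguously.

```python
def count_segmentations(text, units):
    """Count the ways `text` can be split into a sequence of scheme units."""
    ways = [0] * (len(text) + 1)
    ways[0] = 1  # the empty prefix has exactly one segmentation
    for end in range(1, len(text) + 1):
        for unit in units:
            start = end - len(unit)
            if start >= 0 and text[start:end] == unit:
                ways[end] += ways[start]
    return ways[len(text)]


# Toy inventory in which "g" + "y" collides with the single unit "gy".
ambiguous_units = {"g", "y", "gy", "a"}

# Toy inventory in which a standalone "g" must be written with a trailing period.
disambiguated_units = {"g.", "y", "gy", "a"}

print(count_segmentations("gya", ambiguous_units))       # 2: "g"+"y"+"a" or "gy"+"a"
print(count_segmentations("gya", disambiguated_units))   # 1: only "gy"+"a"
print(count_segmentations("g.ya", disambiguated_units))  # 1: only "g."+"y"+"a"
```

A check of this kind could be run over every string a scheme can produce, or over a corpus of attested forms, to find collisions before the scheme is adopted.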

Transcription is somewhat more complex. The main issue is that, unlike the black-and-white issue of representing spelling, representing pronunciation is not an exact science. To begin with, authors aim at different degrees of accuracy. Thus a popular publication may desire simply a rough approximation of pronunciation using simple characters that anyone could pronounce, leading to schemes often referred to as “simplified phonetic schemes,” or “simplified transcription.” In contrast, dictionaries and linguistic research in general aim at more precise modes of representing the sound, which often use special diacritic marks in conjunction with the relevant alphabet in order to allow for more precise indication of sounds. Despite the increased accuracy, such schemes have several problems: they require some degree of study, which can lead many readers/users to neglect them; their degree of divergence from ordinary practices in the target language entails that many if not most readers have difficulty in remembering and thereby utilizing them; and finally, the use of diacritics can hinder the use of these schemes in various digital contexts.
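The digital difficulty with diacritics can be illustrated with Python's standard unicodedata module. This is a minimal sketch: the helper name strip_diacritics and the sample word are assumptions made for the example, not part of any particular transcription scheme. It shows two things: the same accented letter can be encoded in two different ways that do not compare as equal until normalized, and removing the marks to obtain a "simplified" form discards the distinctions they carried.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks, yielding a plain-letter 'simplified' rendering."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The same visible letter, encoded two ways.
precomposed = "\u00e9"   # e with acute accent as a single code point
combining = "e\u0301"    # plain e followed by a combining acute accent

print(precomposed == combining)  # False: a naive search or sort treats them as different
print(unicodedata.normalize("NFC", combining) == precomposed)  # True once normalized

# Sample word, assumed for illustration only.
print(strip_diacritics("Zhikats\u00e9"))  # Zhikatse
```

Normalizing text to a single form before storing or comparing it avoids the first problem; the second is inherent, since a simplified rendering by design carries less information than the precise one.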

An international standard has evolved for precise phonetic transcription that goes by the abbreviation IPA, or International Phonetic Alphabet. However, while it is used widely by linguists, few others are able to utilize this technical scheme, and hence a wide variety of other phonetic schemes have continued to be used in various contexts. Moreover, representing sound with scripts leaves room for disagreement as to the best representation in any given context. In addition, it should be noted that each language is characterized by diverse spoken practices based upon regional location and social class. Thus any phonetic scheme intended to cover the language as a whole must be capable of representing the full spectrum of sounds possible in all dialectal variations. Just as importantly, it means that no single transcription of a term can represent the language in general; it can only represent a specific dialect of the language at a specific time period.

Practically speaking, written sources are overall inconsistent in the degree to which they employ systematic transliteration or transcription schemes. There are three broad tendencies:

Inconsistency: any given term can be rendered in multiple and inconsistent ways within the same source without rhyme or reason.

Term-based consistency: the spelling for any given term – a place name, for example – is consistently the same in a given source, but no overall system of consistent guidelines determines the spelling of each term/toponym; for example, some spellings might be transcriptions while others are transliterations, and the phonetic renderings themselves might be non-systematic and haphazard, or even follow different systems for different terms.

Systematic consistency: a scientific system is used consistently for the representation of each term from the source language in the target language, so that not only are consistent spellings used for each term, but those spellings are all determined by a single unambiguous system with explicit principles.
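The three tendencies can be told apart mechanically once the renderings found in a source have been collected. The following sketch assumes that each occurrence is recorded as a pair of a term identifier (here, an orthographic spelling) and the rendering actually printed in the source. The function classify_source, the toy_scheme rule, and all sample data are hypothetical and serve only to illustrate the distinction.

```python
from collections import defaultdict


def classify_source(occurrences, scheme=None):
    """Classify a source as 'inconsistent', 'term-based', or 'systematic'.

    occurrences: iterable of (term, rendering) pairs found in the source.
    scheme: optional function mapping a term to the rendering a system predicts.
    """
    renderings = defaultdict(set)
    for term, rendering in occurrences:
        renderings[term].add(rendering)

    # Any term rendered in more than one way makes the source inconsistent.
    if any(len(forms) > 1 for forms in renderings.values()):
        return "inconsistent"

    # Every term has one rendering; check whether a single system explains them all.
    if scheme is not None and all(
        forms == {scheme(term)} for term, forms in renderings.items()
    ):
        return "systematic"
    return "term-based"


def toy_scheme(term):
    """Toy rule, assumed for illustration: join the syllables and capitalize."""
    return term.replace(" ", "").capitalize()


mixed = [("lha sa", "Lhasa"), ("lha sa", "Lasa")]
per_term = [("lha sa", "Lhasa"), ("gzhis ka rtse", "Shigatse")]
by_rule = [("lha sa", "Lhasa"), ("gzhis ka rtse", "Gzhiskartse")]

print(classify_source(mixed))                 # inconsistent
print(classify_source(per_term, toy_scheme))  # term-based
print(classify_source(by_rule, toy_scheme))   # systematic
```

Note that a source can only be judged systematic relative to a stated scheme; without one, the most that can be observed from the renderings alone is term-based consistency.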