While there are many pages with resources for learning Pali, there is very
little available on the web today to ease the transition from (reliance upon)
Romanized phonetics to indigenous scripts. In order to make Pali seem easier to
learn, many websites and textbooks seem to suggest that the Roman alphabet is
all that you'll ever need to know. I hope I'm not the first to tell you this is
not true.

Learning at least one traditional Pali system of orthography will actually
make understanding the language easier, and learning several different
systems of writing will open up a vast world of published materials to you --not
to mention the beautiful and ancient traditions of palm-leaf manuscripts and
stone epigraphy.

The Pali language has not one script but many; the fact that there are so many scripts is hardly a pretext for learning none of them. The greatest number of books and manuscripts are found in Sinhalese, Burmese, Khmer-Muul, or closely related scripts of South-East Asia (Lao-Dhamma, Lanna, etc.). There are also some modern Indian publications that typeset Pali in Devanagari (i.e., the same script used for modern Hindi and Sanskrit), and, of course, the modern vernacular script of Thailand has been adapted to print Pali (although the classical tradition uses Lanna in Thailand’s North-West and Khom throughout the rest of the country).

This "Rosetta Stone" file will provide a basic overview of how three Asian
writing systems relate to one-another, along with a short
quotation from a Pali sutta to be comparatively examined with the Romanized phonetics
provided. While this may seem daunting at first, consider that most of what you need to know is here displayed on a single page.

There’s a further table of Pali alphabets provided here. This is not an exhaustive manual for the writing systems in question; however, this website as a whole provides sufficient information so as to enable you to make use of vernacular textbooks (viz., for modern Khmer, Burmese, etc.) with a degree of certainty as to how the writing systems apply to the classical language (and, e.g., an awareness of the differences between modern and classical phonetic values assigned to the glyphs). There are certainly some drawbacks to this method, and I recall sitting down with a copy of Learn Yourself Sinhalese [sic!] years ago, and trying to figure out how the modern ligatures compared to the classical in writing Pali (as opposed to Vernacular Sinhala).

There is yet another leap for the imagination in moving from ink to the scratches found on palm leaf manuscripts. Practice in reading printed editions needs to be supplemented both with one’s own penmanship, and, eventually, with reading the word as scribes set it down on palmyra in various eras and regions.

Without exception, all of the writing systems described on this page have, in the last 200 years, made an imperfect transition from glyphs wrought with knives to the "cold type" of the modern era. In each of these countries, the current generation is more familiar with the forms of letters produced by typewriters (and found in newspapers) than the ligatures required by the ancient language. Whereas a total outsider may find these differences small, native readers of the vernacular tend to come to a complete halt at an unfamiliar consonant cluster (so too, the difference between Latin and Greek is small, yet most native English speakers would be stumped in reading a text peppered with occasional Greek consonants). In some instances, the simplifications that have made the modern language easier to render in metal type are obstructions to the classical form, and, resultantly, we now require entirely separate typography for Pali; in other instances, the differences are negotiable.

Three "Renditions" of the Loka Sutta

The following set of three files will also be useful to beginners: each
presents an excerpt of text from the Loka Sutta (with an English translation) in
a different South East Asian script, with parallel Romanized phonetics on the
right. Thus, for practise, you might want to download all three, and compare them. This translation ("perhaps more provocative than precise") is now many years old, and so there is a temptation for the author (gradually advancing in his own ability) to remove it from the internet; however, I don't think the function of these files would be improved by smothering such a rendering in grammatical observations, and so the reader may enjoy it all the same.

Pali works are increasingly available in (imperfect) e-texts, almost all of them Romanized. In the few years since this website first appeared, a greater and greater portion of these electronic editions have come into conformity with the Unicode standard for the peculiar glyphs that Romanized Pali requires (ṭ, ḍ, ṇ, etc.) --ensuring that the text is at least stable and can be properly manipulated in various formats.
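What "stable" means in practice can be illustrated with a small sketch. It assumes Python's standard `unicodedata` module; the sample word ṭhāna is chosen here only for illustration. The same Romanized word may be stored with precomposed letters or with a plain letter followed by a combining diacritic, and the two only compare as equal once both are normalized.

```python
import unicodedata

# The same Romanized Pali word, encoded two ways (illustrative example).
precomposed = "\u1e6dh\u0101na"    # U+1E6D (t with dot below), h, U+0101 (a with macron), n, a
decomposed = "t\u0323ha\u0304na"   # t + combining dot below, h, a + combining macron, n, a

# Byte-for-byte the two strings differ, although they display identically.
same_before = precomposed == decomposed

# After normalizing both to NFC they are identical.
same_after = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

print(same_before, same_after)
```

Normalizing e-texts before searching or comparing them is one plausible way to get the stability described above; it is a suggestion, not a description of how any particular edition was prepared.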

However, in all of its native scripts, Pali combines consonants and vowels into clusters, with one cluster representing one syllable; it is thus neither a strictly alphabetic system nor truly a syllabary. The Unicode standard is not nearly so well adapted to this Indic tradition of stacks and ligatures joining complex sounds together; most of the problems are dealt with "at the software level", which is to say that a line of Unicode Pali text is unlikely to display properly across platforms. Unicode ensures that the sequence of letters is recorded, but the way they combine and display as sets of syllables is left to the software and the font to resolve. This means that separate software engineers will have to come up with separate solutions for separate platforms (Xenotype did it for Mac, but quite a lot of work has to be done and re-done with each new version of the O.S., just to keep these complex languages working on a single platform).

This is fundamentally different from the situation with (e.g.) Chinese. While Chinese may be difficult to write with a pen, from the computer’s perspective it follows a simple one-to-one correspondence between symbols and encodings (the sequence doesn't matter, and the symbols do not modify one-another by adjacency, neither combining nor stacking nor even joining with ligatures). Conversely, Sinhalese, Burmese and Cambodian are quite well suited to the pen, but the computer has to figure out how to assemble (e.g.) k+kh+i into a syllable, so that the i is on top of the k, and the kh is properly aligned underneath them both. Of course, the computer "figures it out" by following a set of instructions written by a human being.
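To make the k+kh+i example concrete, the following sketch lists what is actually stored for that syllable in Khmer script. It assumes the usual Unicode encoding order for Khmer (base consonant, then the COENG sign with the subscript consonant, then the vowel sign) and uses only Python's standard `unicodedata` module.

```python
import unicodedata

# k + kh + i as a single Khmer syllable: four code points in logical order.
cluster = "\u1780\u17d2\u1781\u17b7"

# The official Unicode names of the stored characters.
names = [unicodedata.name(ch) for ch in cluster]

for ch, name in zip(cluster, names):
    print(f"U+{ord(ch):04X}  {name}")
```

Nothing in these four code points says that the vowel sits on top or that the kh hangs underneath; that arrangement is produced by the font and the rendering software, which is why the same text can display correctly on one platform and fall apart on another.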

Thus, if you want to convert Romanized Pali into any of its native (Asian) scripts, you will need to write up a long series of instructions for every possible syllabic combination. The pattern would need to replace kkha with ក្ខ, kkhā with ក្ខា, and so on, organized in descending order of complexity, so that the program substitutes the longer sequences of combined consonants first (then moves on to shorter, simpler syllables). It is not possible to substitute isolated consonants (nor vowels) because the "combining mark" itself must be encoded (as a separate keystroke) in every syllable of more than one consonant. Obviously, to replace Latin a with the Pali initial vowel a (අ, အ, អ) would be uniformly incorrect, for any or all of the native scripts.
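The "longest sequences first" approach can be sketched in a few lines. This is a minimal illustration only: the table `SYLLABLES` and the function `to_khmer` are hypothetical names, and the table holds six syllables where a real one would need every possible combination.

```python
import re
import unicodedata

# Khmer code points (names as given in the Unicode character database).
KA = "\u1780"        # KHMER LETTER KA
KHA = "\u1781"       # KHMER LETTER KHA
COENG = "\u17d2"     # KHMER SIGN COENG: the mark that subscripts the next consonant
AA = "\u17b6"        # KHMER VOWEL SIGN AA
A_MACRON = "\u0101"  # Latin a with macron (precomposed)

# Hypothetical, deliberately tiny table of whole syllables.
SYLLABLES = {
    "kkh" + A_MACRON: KA + COENG + KHA + AA,
    "kkha": KA + COENG + KHA,
    "kh" + A_MACRON: KHA + AA,
    "kha": KHA,
    "k" + A_MACRON: KA + AA,
    "ka": KA,
}

# Longest keys first, so that "kkha" is tried before "kha" and "ka".
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(SYLLABLES, key=len, reverse=True))
)


def to_khmer(romanized: str) -> str:
    """Replace each known syllable; anything not in the table passes through."""
    text = unicodedata.normalize("NFC", romanized)
    return _PATTERN.sub(lambda match: SYLLABLES[match.group(0)], text)


print(to_khmer("kakkh" + A_MACRON))
```

If the shorter entries were tried first, "kkha" would be matched as a stray Latin k followed by the Khmer for "kha", and the COENG mark would never be written; ordering the table by length is what prevents this.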

In theory, this can be accomplished through any transliteration application or programmable editor (e.g., BBEdit). In practice, your time might be better spent copying out palm-leaf manuscripts in longhand.

Every Pali textbook opens with a description of the language’s phonology in theory, but few students will be aware of how huge a gap there is between those theoretical values and what they will encounter in Pali recitation, conversation, and cognates. A large part of this will make sense (and can only make sense) if you understand the way in which the same systems of orthography inconsistently express classical and vernacular phonetics; thus, e.g., while Cambodian textbooks rightly claim that the vernacular language is "entirely phonetic", and Pali textbooks claim that the classical language is "entirely phonetic", there are two different systems of interpretation applied to one system of orthography (viz., Khmer script, in this instance) to yield two conflicting sets of phonetic values.

The influence of the Mon and Khmer scripts spread over a huge range of mainland South-East Asia, adapting to serve as the medium of an amazing array of (vernacular) languages; when used to write Pali, the great variety of these writing systems are revealed as stylistic variations on one and the same "system" inherited from India (with two predominant sources of inspiration in the Mon and Khmer respectively) but with very different phonetic assumptions arising from the habit of reading the same script as used for the local vernacular. Thus, e.g., in modern Thai orthography, the reader has to interpret a Pali loan-word quite differently from a word of Thai origin (or, indeed, a word of Khmer origin); to pronounce one as the other would sound absurd --and, nevertheless, many absurdities have crept into the pronunciation of Pali in Thailand. Generally, the modernization of orthography has exacerbated the tendency to conflate the classical and the vernacular (in Thailand); the old Lanna and Khom scripts helped to keep the pronunciation of "proper Pali" discrete from its cognates in the spoken language.

Many Europeans have a similar problem with Romanized Pali, e.g., mis-pronouncing the Pali digraph th as indicating the English eth sound (ð), the Greek theta (θ), or the Norse thorn (Þ); but in Romanized Pali "th" always indicates a "hard t" sound of the aspirated, dental variety. I continue to meet Western PhD candidates who cannot pronounce "Theravada" correctly. I would imagine that if no-one in their PhD program has informed them that θɛrəvɑdə is wrong, their thesis examiner may owe them a refund.

This suffices to say that many of the problems of phonology (in praxis) are closely related to issues of orthography --as the latter often entails phonetic assumptions that either directly stem from the local vernacular, or are related to some prior stage of the language’s development and interchange with spoken and written forms in the region.

While I've called these phonological "problems", they aren't "problems" at all if you're properly prepared for them. With practice, you can simply adapt to hear "gissami" and interpret it as "gacchāmi" (in this instance, the monk speaking would be Burmese); but the available Pali textbooks do little or nothing to help the student prepare to deal with these issues, and it is rare to find a vernacular language textbook written by someone with enough specialized knowledge to say anything useful about Pali.

Worse, many Westerners (and Westernized Asians) seem to assume that monastic orthodoxy can be expected to adapt to the aesthetic norms set out by the "Western tradition" (viz., A.K. Warder). The study of the language as it is has been perceived to be of little importance compared to the dogmatic assertion of what it "should be". What Western Buddhists imagine they know about the language, and assert as a trans-historical ideal is certainly not informed by the Pali language’s own grammatical and prosodical literature; we should look to the latter literature if we want to speak of an historical ideal, or else direct our attention to the real (in all its quizzical imperfections) as we encounter it in the extant, indigenous traditions.

Although the observations I can offer below are by no means exhaustive, they may be numerous enough to seem rather discouraging (depending upon the attitude of the student). If you have "grown up" on Romanized Pali, imagining that you would hear the language (in Asia) the way it looks on the printed page of European editions, you may well wonder how it is possible to understand Pali chanting at all; the answer is that it is quite easy, provided you have already memorized the passage being recited.

The struggle to reform the Khmer pronunciation of Pali is ongoing, along with the general concern for the restoration of Cambodian culture, literature, and religion, after the devastation of several sequent cycles of war (ending only in 1998). The notion of what constitutes a refined accent (in Phnom Penh) has utterly changed in recent decades, and will likely change again in the decade to come. The way Pali is recited, and the way classical loan words are pronounced, must be expected to change also, though all of the same forces that ensure the mutability of the spoken word in Cambodia ensure that nothing shall change the way these words are written on the page. While derivative spellings in Thailand and Laos are slipping further and further into gobbledygook, we find the original terms preserved on the street-signs of Phnom Penh, just as they're spelled on Cambodian stone inscriptions of centuries past.

Some will wonder why Sanskrit is not mentioned before Pali in the Cambodian context. Contrary to popular belief (and much of the history as manufactured for tourists) the advent of Sanskrit influence (and its preeminence as a source of vocabulary) in the 7th century is now known to be coeval with the earliest extant Pali inscriptions (variously found at Angkor Borei and Go Xoai; Peter Skilling, 2002, JPTS XXVII, p. 160 et seq.). While Sanskrit certainly had the greater eminence (though not precedence) in that early period, Pali has been of increasing linguistic importance (in providing loan-words, etc.) since the 11th century. (Judith Jacob, 1993, Cambodian Linguistics, Literature and History, S.O.A.S., London. p. 151) It is thus a bit misleading when sources suggest that the Theravada has merely dominated the last millennium, as it is naturally this relatively recent "legacy" that the modern language is burdened with. While the early history of Buddhism and Pali in Cambodia remains speculative, there is a widespread assumption based on the epigraphic record (or lack thereof) that Pali had supplanted Sanskrit as the literary language of Cambodia from the 14th century onward; M.V. wrote in to emphasize to me that this assumption is poorly founded, and indeed the reader should be warned that this is the case. Thus, e.g., Goonatilake remarks that the succession of 1327 "…saw the abrupt end of Sanskrit inscriptions giving way to Pāli as the official language." (Hema Goonatilake, 2003, “Sri Lanka-Cambodia Relations with Special Reference to the period 14th-20th centuries”, J.R.A.S.S.L., XLVIII, 2003, p. 201). This sounds reasonable, but it is really an argument ex silentio. To this, we may contrast Vickery’s interpretation of an inscription of 1308 as the first appearance of formal, royal patronage for Theravāda Buddhism in Cambodia. (Vickery, 2004, Cambodia and its Neighbors in the 15th Century, Asia Research Institute, Working Papers Series № 27, N.U.S. [available online], p. 5).

Of the 31 contrasting vowel sounds that Huffman considers vernacular Cambodian to employ, the (mis-)pronunciation of Pali should (logically) only stumble over 16. (Huffman, 1970, p. 8-9, downloadable here) This is because Pali employs 8 vowel glyphs (a, ā, i, ī, u, ū, e, o) that are then interpreted with reference to the context provided by the two classifications of Khmer consonants (none-too-memorably dubbed "series 1" & "series 2"); these classifications each entail their own implicit assumptions (or "rules") governing adjacent vowel sounds. Thus, 8×2=16. To this must be added the reduction of more complex sequences of vowels and consonants to sounds inferred from familiar vernacular usage (and these are not so easily enumerated).
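The arithmetic can be spelled out as a short enumeration. The sketch below only pairs each of the eight Pali vowel glyphs with each of the two consonant series; it does not assign any phonetic value to the pairs, and the variable names are mine.

```python
from itertools import product

# The eight Pali vowel glyphs named in the text.
PALI_VOWELS = ["a", "ā", "i", "ī", "u", "ū", "e", "o"]

# The two classifications of Khmer consonants.
KHMER_SERIES = ["series 1", "series 2"]

# Every (series, vowel) pairing is a separate context for interpreting the vowel.
contexts = list(product(KHMER_SERIES, PALI_VOWELS))

for series, vowel in contexts:
    print(series, vowel)

print(len(contexts))
```

The count of sixteen is thus a floor rather than a ceiling: the reductions of longer sequences mentioned above fall outside this simple product.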

With time, changes in phonetic assumptions can result in changes in lexical semantics. Two different Pali words are now conflated as Khmer piel (ពាល), namely, vāla, "malicious", and bāla, "youthful; ignorant, foolish".

The Cambodian word for grammar reads veyyākaraṇa (if transcribed letter by letter, as we would read Pali in the same script), but is pronounced something like weɪ-yɪə-kɔː (retaining its v-, unlike vāla in the former example). Simplifying classical polysyllables like -karaṇa to modern monosyllables such as -kɔː is a difficult habit for a native speaker to break from, and harder still for an outsider to imitate.

English has similar vices; we are no less arbitrary in our appropriations from ancient languages. The reduction of the Greek prefix sun- to English "syn-" changes the vowel sound, but not the meaning; the adaptation of Old English āwiht into both "aught" and "ought" is, just as inexplicably, pronounced as ɔt in either case. The digraph gn appears with the same etymology and even the same modern meaning in "pugnacious" and "impugn" --yet the correct pronunciation is utterly different in the two [pəgneɪʃəs vs. ɪmpjun]. In English as in Cambodian, wherever readers see familiar patterns of spelling in alien languages (ancient or modern) they tend to impose the assumptions closest to their own experience. Untold millions of English speakers continue to mispronounce the Latin loanword "mores" as if it rhymed with "stores" (whereas the correct pronunciation [mɔreɪz] rhymes with "rays" and "faze"). So too, Pali and Sanskrit must be forgiven such idiosyncrasies as they’ve developed during their long period of retirement in Cambodia.

The sixteen principal possibilities may thus be summed up as follows, with some clarifications following after the chart:

We must clarify the information provided in the chart under two headings: the vowel sounds implied by the phonetic symbols, and the rules (along with their exceptions, sometimes not indicated above).

The sound implied by the simple vowel symbol o (e.g., #5 in the chart above, c + u = co) is not quite so simple. Huffman (1970, p. 10) remarks that his use of o sometimes tacitly indicates a "slightly diphthongized [oᵘ̯]". Possibly reflecting a real difference in the dialects observed, this same vowel sound is identified as IPA ɔ (viz., as in the pronunciation of the English word "awe" as a whole) by Jean Michel Filippi, et al., 2004, Everyday Khmer, Editions Funan. In general, Filippi’s textbook uses the IPA system to describe Khmer with a high degree of accuracy, and is thus much more useful than its modest binding might make it appear (especially as a comparative study with Huffman, 1970, as the latter is not entirely precise in its use of phonetic symbols).

We have something of a modern mystery as to exactly what sound Judith Jacob intends with the symbol ɤ (e.g., #2 in the chart above, j + ā = iɤ. Jacob’s articles on things Cambodian were collected into one rare volume as Jacob, 1993, op. cit. supra). This symbol is now technically denominated as "the ram’s horns"; the IPA modified the Greek consonant gamma (ɣ) to represent a vowel sound not easily indicated by the Latin script. Unfortunately, this was a relatively recent change to the IPA standard, and one has to raise an eyebrow at ɤ for most sources before 1993. Apart from the symbol, various published authorities differ as to the actual sound observed (i.e., likely reflecting different dialects and even idiolects found amongst Khmer speakers). Filippi (2004) prefers ɜ, whereas Huffman (1970, p. 26) simply uses the schwa (e.g., iə for the same combination given as iɤ on the chart above). In every case, it is an unrounded vowel sound of some kind, perhaps most easily imagined by English speakers as an "eu" or "uh" sound. All sources indicate that it is somewhat more "open" than the central, unrounded ɨ, but they seem to disagree as to how far back toward the throat the sound may originate.

Another example of such a problem of imprecision is the sound denoted as εe (#7, c + e = cεe); this could arguably be written as εi instead, and Huffman transcribes it simply as ei (1970, p. 9, though the definition of the sound suggests that neither e nor i is quite accurate here).

Rule 5 notes the possible addition of a glottal stop in parentheses (thus, j + u = jʊ or jʊˀ). These glottal stops are in no way marked in the Khmer script, but arise as if tacit in the vernacular formation of words; they are fundamentally alien to the logic and phonetic range of Pali. Huffman tries to describe their arising in terms of a distinction between stressed and unstressed syllables; if this is a working hypothesis, it would entail the need for a separate article broaching the logic of imputing stressed syllables to Pali loanwords and Khmer-Pali chanting. If not, a better hypothesis will be wanting. In either case, the question goes beyond the scope of this website.

Rule 3 gives an instance whereby the addition of a glottal stop also changes the vowel sound before it: c + i = ci; however, c + i + glottal = cεˀ. This lends some further interest to the question of whether or not glottalization (in Pali and Sanskrit loanwords) arises through pure idiosyncrasy.

The question of how a palatal consonant modifies the outcome of rule 7 is raised by Huffman, 1970, pg. 27 (№27). The modifications Huffman observes are (prima facie) highly plausible, as they both move the vowel sound further back in the mouth (in advance of a consonant sound similarly located); thus, e.g., permuting eː to ɨ eases the transition to a palatal consonant, such as a "ch" sound (NB: Pali c = IPA tʃ). However, no such change is observed in the Phnom Penh dialect by Filippi, et al., 2004, p. 292 et seq.

The rules, above, are merely illustrative, yet they are sufficient to demonstrate the tendency of Cambodian speakers to modify Pali and Sanskritic vowels (in accordance with the strange logic implied by the adjacent consonants they are clustered with, in the Khmer orthography). Unless an effort is made to overcome this reflex, the tendency prevails as "naturally" as the imputation of tone by native speakers of tonal languages. An outsider’s reaction to all this would likely be, "Wouldn't it be simpler to pronounce Pali as Pali, and Khmer as Khmer?"

Indeed, native English speakers may well be asked why they do not simply pronounce Latin as Latin, or Greek as Greek; evidently, the vast majority of us cannot even perceive these as separate phonetic systems at work within our own language. It would indeed be simpler to extricate the ancient from the modern, and address ourselves to them as discrete, yet what is simple cannot always be easy. Inevitably, a very small minority of Cambodians will develop an interest in Pali (either as a language and literature unto itself, or for its strange role within the vernacular Cambodian tradition) --just as a small minority of English speakers must wince as the majority go on mispronouncing mɔreɪz.

We begin with the caveat that modern Burma is comprised of many different cultures belonging to several fundamentally different language groups: what may be said (in general) about the pronunciation of a "Burman" monk may not hold true for a Shan, Mon, or Arakanese monk.

Obversely, Burmese orthodoxy has had a massively centralizing (and standardizing) effect over several centuries --so much so, that (e.g.) the Shan no longer consider their own script(s) suitable for Pali, and even prefer to use Burmese letters for tattoos (perhaps the most tangible form a dead language can assume).

My summary of Burmese pronunciation of Pali has been wholly based on audio recordings; as such, it is perhaps the weakest part of this website, because my observations have not been verified by the experience of living with the spoken language (as of 2010, I've never spent any length of time in Myanmar, cf. the map below). At long last, I received some feedback from a native speaker of Burmese (A.K.) who evidently had a very good understanding of I.P.A., and who offered some corrections, based on his own experience (and some of these suggestions are now incorporated into the text following, with thanks).

One of the problems with phonetic observations is that the use of Pali loan-words (as assimilated into the vernacular language) inconsistently correlates to the formal recitation of Pali texts. Conversely, the tiny minority of monks who have achieved a high level of literacy in Pali are likely to pronounce the language in unique idiolects (influenced by their studies, at home or abroad, by a comparative knowledge of Indian languages, or a number of language attitudes and ideologies). In the ensuing discussion, we're limited to imprecise heuristics --though my informant (A.K.) insists that the vast majority of Burmese monks formally chant the language with the same phonetic patterns applied to loan-words.

The Burmese pronunciation of Pali can be summed up in two aspects: the fairly consistent consonant-substitutions, and the inconsistent, context-sensitive vowel changes.

The consonant substitutions will not present any special difficulties for the student, as they are perfectly arbitrary, and therefore make perfect sense:

The Pali c is pronounced as "s" (& ch becomes "sh").

The Pali j is pronounced as "z" (& jh becomes "zh", with no audible distinction, I am told, from "z").

I note there is a nice coincidence here, as it may well have been that 2,500 years ago the Pali j was pronounced as "z"; I believe K.R. Norman suggested this on the evidence of comparing Avestan to Vedic, with some corroboration derived from the earliest extant transliterations of Pali and Prakrit words into Greek.

The Pali p is pronounced as "b"; there may yet be some audible distinction from the Pali b proper, but I am oblivious to it. Geminates involving b, p and their aspirated counterparts seem to be mutually-indistinguishable.

The Pali r is inconsistently pronounced as "y". It may be that sometimes an initial r is sounded out as "r", but a medial (especially subscript) r seems inescapably reduced to a "y" sound.

In the audio recordings of Pali being formally chanted, the s (သ) is normally pronounced as "t" or "d" (but, exceptionally, as "g", explained below). In normal usage, Burmese-Pali loan-words transform this "s" to (IPA:) θ (i.e., a "soft th" sound, as in the English word thought). Correspondingly, my informant tells me that Burmese monks recite (Pali) sukha as (IPA) "θṵ kʰa̰", i.e., consistent with its usage as a loan-word. The audio recordings I have heard are not consistent with this, but this may be a point whereby educated monks distinguish their own learning (I have no direct experience within Burma, but I recall a Cambodian monk telling me proudly how he reproached younger monks for failing to pronounce the Pali a vowels correctly, i.e., that he displayed his own learning with this shibboleth).

The Pali ss (သ္သ) is often pronounced like a "g"; my informant writes in to correct me, that this is more accurately a glottal stop (IPA: ʔ). As an example, he offers manussa appropriated as (IPA:) manouʔθa̰. However, in the audio recordings, I could hear this glyph instead pronounced as a double "s" sound from time to time, i.e., again showing that formal Pali chanting is not entirely consistent with the pattern of interpretation that native speakers apply to loan-words.

Even the (single) initial Pali s is sometimes spoken as a "g/ʔ" sound, especially where it follows after a complex sound/glyph, such as a velar-n compound. In the recordings, words starting with sam- are frequently indistinguishable from gam-, and this seems almost to be a form of vernacular euphony, following on the final niggahīta (ŋ) sound of the prior word.

It is needless to say that the Burmese rules for the simplification of a final consonant sound could be erroneously applied to a medial Pali consonant, but I have found this to be very rare in the Burmese recordings, most likely because the Burmese/Mon script so clearly shows the difference between a medial and final consonant (whereas modern Thai script does not) --and a competent Pali reader should be aware that the language has no final consonants except the niggahīta. I would expect that more errors of this kind are made where the reader relies on a text in simplified modern script that uses the "Thating" as a substitute for the classical ligatures and stacked glyphs; this sort of simplification is the exception, rather than the rule, in Burmese editions of Pali texts, I am pleased to say (this is an unflattering contrast to the increasing reliance of new Sinhalese editions on the "Hal Akuru", a mark that serves the same purpose, viz., serving as a substitute for properly joined letters).

The vowel changes are inconsistent in that they arise from the context, and follow the tendency (common to the Tibeto-Burmese family of languages) of eliding complex sequences of sounds; the logic of these simplifications is internal to the modern language, and applied inconsistently to classical texts. I observe that in several recordings made of one and the same Burmese monk chanting Pali, he will make some vowel errors consistently in some suttas, but then not make the error even once in reciting another sutta, likely reflecting that he learned the texts in question from different masters, and retains their shibboleths respectively. Initial and final vowels tend to be preserved, but the medial vowels are transformed most adventitiously; of these, the most striking are:

The Pali short a is pronounced as an "i" especially when interpreting the implicit vowel prior to a geminate or in any syllable involving the letter c (the latter is, recall, itself mispronounced as "s"). The pronunciation of paccayo as "pissayo", and gacchāmi as "gissami" are frequent examples.

Similarly, my informant (A.K.) points out, Pali u-vowels transform into a diphthong along the lines of (IPA:) oʊ when they appear in the context of a cluster of consonants (such as Pali ddh, cch, etc.).

Both the Pali short a and the short i are sometimes pronounced as a hard "e" sound; this seems to be most often the case prior to a geminate, and does not seem to be directly determined by the medial vowel’s antecedent consonant. ☛ My informant differs on this point, writing, "I think you may be confusing the 'e' sound with the creaky toned versions of those vowels (since the short a and i in Pali are equivalent to the creaky tone in Burmese): [ḭ] and [a̰]…". This is certainly a useful observation (and my thanks, again, to A.K.) but cf. my concerns at the opening of this section about the asymmetry between loan-words and Pali recitation.

The sequence yi is also very frequently spoken as a hard "ye" sound.

A short "a" sound is sometimes inserted between compound consonants where there is none to be found in the text (e.g., tasmiŋ read aloud as "tasamiŋ"; perhaps an especially significant example given the frequency of the word, and the clarity of the Burmese orthography on the subscript sequence sm-).

In loan-words, my informant would add, the standard substitution for Pali "o" is (IPA:) ɔ (the vowel sound formed by the entire word "awe" in English).

Further feedback is welcome, but this section is unlikely to improve significantly if I do not relocate to Myanmar myself (a possibility I am open to considering). When this website was first posted, there was hardly any material of this kind on the internet (and Unicode Burmese was very rarely seen, and just barely working for Pali) but it is now more and more likely (with each passing year) that Burmese authors and websites will supersede this sketch that I've set out.

The Lao pronunciation of Pali cannot be explained without reference to the orthography; it is both the case that some orthographic changes have been imposed upon the phonology, and, obversely, the phonological changes made to Pali cognates (appropriated into vernacular Lao) are now foisted onto the pronunciation of the classical language. Such confusion is both natural and inevitable in the interchange of two radically different languages. Lao is comprised of tonal monosyllables, whereas Pali is non-tonal and polysyllabic. The classical language is synthetic and grammatically complex, whereas vernacular Lao is in some measure analytic and agglutinative, with a grammatical system of protean simplicity (e.g., Lao neither distinguishes words according to gender, nor number, nor case, nor declension).

As discussed in the Thai section below, nine tenths of what you'll learn here is applicable to Thai, and you can combine what you'll learn in this section with the Burmese (above) to try to sort out Shan, Lanna, and Lao-Tham scripts.

In future, I may or may not find time to publish a more extensive article on this subject; for the time being, I'll provide the following observations in brief.

The modern language has fewer consonant sounds than the classical, and so both the modern orthography and the vernacular phonetic assumptions are imperfectly mapped onto the full grid of classical consonant sounds. This yields certain, consistent misapprehensions, such as:

In the modern alphabetic order, a single "j" sound (by the English "j" I mean the I.P.A. phoneme "dʒ") now stands where the classical language formerly had a range of four consonants "c-ch-j-jh". The sibilants (viz., two "s" glyphs, distinguished according to tonality) have here been interposed (as if to fill in the gap left by the collapse of these four distinct sounds into one!). One direct result of this is the imposition of a vernacular "s" sound (writ ຊ, never ສ) onto words with classical "j", e.g., jāti → sāt, and jarā → salā. This can apply equally to medial j, jj, or jjh, e.g.: vijjā → visā.

The classical distinction between k and g has largely disappeared; the modern use of the three "k" glyphs remaining in the vernacular (viz., ກ, ຂ, & ຄ) primarily distinguishes them in respect of tonality (although some lowland speakers insist on the first "k" [ກ] being spoken as "g", as with Khmer, minor consonant-sound variations of this type may or may not distinguish them in any given local dialect; this can only be considered as part of the language a posteriori, and with inconsistency).

The vernacular "d" (ດ) and "plosive d" (ຕ, translit. đ, distinguished by a phonetic criterion that did not exist in the classical language) are inconsistently used to indicate the Pali t and the aspirated th respectively. Obversely, aspiration is a distinction that does not exist in the modern vernacular, with confusion ensuing. Thus we find ຕ (đ, the "plosive d") used where we might logically expect to find d (ດ) in Pali cognates, viz., representing classical dental-"t" in the initial position after monosyllabization, e.g., kataveditā → ກະຕະເວທີ; kattari → ກະໄຕ/ກະຕັດ.

The glyph ຕ, now used to express the "plosive d" sound is, unfortunately, the same as was used in classical times to represent one of the "k" sounds (this can still be seen, e.g., in the Lanna "k", and in some styles of Lao-Tham script), and is largely similar to certain forms of the Khmer "b" (ព) --this opens another possible avenue for confusion, albeit arising rarely outside of the study of epigraphy.

The two vernacular "t" glyphs (ຖ & ທ) are a complementary source of confusion, being associated with classical d and dh, and also serving as inconsistent substitutes for the retroflex sounds (ḍ/ḍh) that exist in the classical language, not the modern. While the second (low tone-class) "t" ("ທໍ-ທຸງ") would be theoretically equivalent to Pāli dh (and as a substitute for Pāli ḍh), we find in fact that it is often used to represent the classical (unaspirated) dental "d" in cognates, e.g., dāna → tān (ທານ) --this invites further confusion.

Similarly, the paired "b" and "abrupt/plosive b" (ບ & ປ, translit. b & ƀ) are now used to represent the classical "p" and "aspirated p" sounds in an uncertain and inconsistent manner (e.g., pandita becomes ປັນດິດ with the "plosive b", but pañha becomes ບັນຫາ with "b"); even the name of the language itself (Pāli) is sometimes written in Lao with one, sometimes with the other character (ປາລີ vs. ບາລີ).

Confusion on this point will also arise with loan-words and cognates from modern languages. Despite the fact that Laotians are quite capable of distinguishing "b" from "p", the two cities of Paris (France) and Bali (Indonesia) are both transliterated into Lao with precisely the same spelling (ປາລິ), using the "Plosive b" for both (and, following Phumi Vorachit’s system, reducing "r" to "l" in phonetically rendering "Paris").

There is ever the possibility of confusion between ñ and y in the contact between the classical and the vernacular, with the proximate causes being:

The graphical similarity between the two in Lao (ຍ vs. ຢ).

Confusion over which glyph to use due to rules (internal to Lao) concerning the representation of the y sound in initial vs. final position.

Confusion due to similar looking characters in Thai (ย) and "Tai-Noi" scripts, that do not follow the same logic (viz., ย = ຢ, but ย ≠ ຍ).

Confusion in the transcription of classical "subscript-y" forms into vernacular scripts that either lack such forms entirely, or may employ the equivalent symbols for them with a logic differing from the method used in writing Pali (as in the use of Lao-Tham subscripts for old vernacular Lao, or Lanna script for vernacular Northern Tai; the subscripts are graphically the same as those used for Pali, but their signification is different, especially so far as implied vowels are concerned). The common concomitant of an alteration arising from this cause would be the insertion of spurious medial vowels in-between (classical) compound consonants where a formerly-subscript y was misinterpreted.

As an example (of historical confusion of ñ vs. y) Prapandvidya proposes that the Sanskrit word kriyā entered Thai as krayā (กระยา) from krañā, with the unusual vowel change explained by reference to the medium of a (supposed) Khmer pronunciation of the Sanskrit as kreya; thus, Prapandvidya’s semantic claim is that the modern Thai meanings "Mode, thing, edible," derive from the ancient (Sanskrit) meaning "rite/offering". [Chirapat Prapandvidya, 1996, "The Indic Origin of Some Obscure Thai Words", Proceedings of the 6th International Conference of Thai Studies, Theme IV, Vol. I, p. 415-426] It seems more likely to me that the implicit vowel "a" (in-between the first two letters) has been lost in appropriating one of the various Pali words starting with kara- and (semantically) indicating the means of action, mode, or grammatical instrumentality; thus, any number of Pali words (or compound words) related to karaṇa would provide a semantically appropriate origin for a sequence of substitutions along the lines of karaṇā → karañā → krañā → krayā. Whereas the latter sequence does not make sense if we assume the source must be Sanskrit transmitted via Khmer, it makes perfect sense for a Pali loan word transmitted via Lao (or one of various "Laoesque" pre-modern scripts of Thailand, then transliterated into modern Thai). The classical retroflex "ṇ" is commonly enough supplanted with Lao "ñ" (e.g., the related Pali term karaṇā → ກຣິຍາ, kariñā), and the latter could then be mis-read as "y" due to Thai confusion when reading a Lao (or "Laoesque") "ñ" (ຍ) as if it functioned as modern/central Thai’s graphically identical "y" (ย).

Although the modern and vernacular alphabets have maintained the pattern of ending each row with a nasal, the dental "n" (ນ) that concludes the third row must also serve to represent the classical retroflex ṇ, as the vernacular affords no closer equivalent. Generally, the (classical) retroflex sounds have (modern) dental substitutes, but the nasal sounds are especially prone to being simplified, especially where a modern reader would interpret them as being in the final position of a monosyllable, dropping the final vowel thereafter.

Confusion between b and v has both ancient and proximate causes. The similarity between the figures used for these glyphs in Fa-Kham script may be a proximate cause (Fa-Kham is a script adapted from Cambodian and used extensively in inscriptions in central Thailand from the Sukhothai period, or earlier); confusion about the separate existence of the classical v seems to have prevailed in all Khmer/Khom-related scripts from a very early period, and may derive in part from the South-Indian pronunciation of b & v in transmitting Sanskrit to mainland South East Asia [see: Michel Ferlus, 1997, "The origin of the Graph b in the Thai script", in South East Asian Linguistic Studies in Honour of Vichin Panupong, Arthur S. Abramson [ed.], Chulalongkorn University Press, p. 79-82]. While Ferlus’s article on this subject is very useful, it overlooks the fact that substitution rules and variant spellings within Pali already indicate some mutability between v and b before undertaking the passage to Cambodia, and (as Ferlus notes) no similar confusion can be seen in the Monic scripts (he posits that the solution was finally to derive a new b-glyph in the Khmer group from a Mon source/inspiration, replacing the pre-11th century square/blob b-glyph that, up to that time, still resembled the form used by Aśoka). Ferlus’s article also omits to mention the source of confusion in the use of vernacular "v/w" as both a consonant and a semi-vowel in Tai-Kadai languages, and that this was sometimes an impetus for orthographic changes (note, e.g., that in Lanna script this entails an orthographic distinction between two subscript forms of v & w respectively).

The classical language has no "f" sound whatsoever (so the two vernacular "f" glyphs do not enter into the confusion), but either of the (tonally distinct) vernacular "p" sounds may now be found representing the classical b sounds, or, less often, will be found where we should expect a v in Pāli (for the reasons outlined above).

The labial row of the alphabet presents a relatively simple instance of the "inversion" of the sequence of sounds (viz., the order of classical "p" and "b" is reversed, reading the vernacular equivalents from left to right); more uncertainty will be found in praxis, as the moderns have had to resolve many complex geminates and consonant clusters (involving classical "p", "ph", "b", "bh", or occasionally "v") into simple monosyllables with these mutually-confusing symbols. Thus, so far as initial consonants are concerned, we observe the general transformation of classical b/bh into the two (tonally distinguished) vernacular "p" sounds, and, vice-versa, classical p becomes modern "b" or "the plosive b", but with less consistency than the inversions of former rows (as discussed above). Thus, e.g.,

bhāsā → pāsā (ພາສາ)

pañha → banhā (ບັນຫາ)

padesa → ƀatet (ປະເທດ), though one might instead expect to find the latter as ບະເທດ (as per the pattern of the preceding examples).
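The inconsistency described above can be tabulated mechanically. What follows is only a minimal sketch, assuming nothing beyond the Pali/Lao pairs already quoted in this section; the names ATTESTED_PAIRS, pali_initial and lao_initials_by_pali_initial are hypothetical, invented for this illustration, and the sketch looks at the initial consonant alone (it says nothing about finals, medial clusters, or tone).

```python
from collections import defaultdict

# Pali/Lao pairs quoted in this section (romanization as given in the text).
# Assumption: none of these Lao spellings begins with a pre-posed vowel sign,
# so the first character of each Lao string is its initial consonant.
ATTESTED_PAIRS = [
    ("bhāsā", "ພາສາ"),
    ("pañha", "ບັນຫາ"),
    ("padesa", "ປະເທດ"),
    ("pandita", "ປັນດິດ"),
    ("pāli", "ປາລີ"),
    ("pāli", "ບາລີ"),
    ("dāna", "ທານ"),
]

# Romanized aspirates are written with two letters, so they must be
# checked before falling back to a single-letter initial.
ASPIRATES = {"kh", "gh", "ch", "jh", "ṭh", "ḍh", "th", "dh", "ph", "bh"}


def pali_initial(word):
    """Return the initial consonant of a romanized Pali word."""
    head = word[:2]
    return head if head in ASPIRATES else word[:1]


def lao_initials_by_pali_initial(pairs):
    """Group the attested Lao initial glyphs under each classical initial."""
    table = defaultdict(set)
    for pali, lao in pairs:
        table[pali_initial(pali)].add(lao[0])
    return dict(table)


TABLE = lao_initials_by_pali_initial(ATTESTED_PAIRS)
```

Run over these few examples, the table records two different Lao glyphs under classical "p" (ບ and ປ), against one each for "bh" and "d"; a longer word-list would be needed before drawing any conclusion about frequency.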

One of the two Lao "h" glyphs is derived from (and often graphically identical to) the Khmer "v" (ຮ vs. វ), and closely resembles "r" in either script (Lao: ຣ, Khmer: រ, Thai: ร). This allows the conflation of a range of Pali and Khmer loan-words (in both Lao and Thai). [Updated, 2009:] Apart from the obvious orthographic difficulty involved, there is a native source of confusion in the Phnom Penh dialect; in recent centuries, it seems, the residents of Cambodia’s capital regarded it as very refined to accent their speech by eliding the written r (រ) and supplying both an h-sound and a change-of-pitch in place of the elision. [Naraset Pisitpanporn, 1999, "A note on colloquial Phnom Penh Khmer", in: The 9th Annual Meeting of the Southeast Asian Linguistics Society, Arizona State University.] This seems to belong to the artificial changes of language that denote social status (and thus, as Pisitpanporn observes, it became less widespread during the Communist period) --and was influential far beyond Phnom Penh. At any rate, the logic of these changes is innate to Khmer, and difficult to anticipate in its effects on other languages; thus, one has to watch for the interchange of "h", "v" and "r" (with "r" sometimes reduced to "l" thereafter), with no consistent direction or pattern to the permutation. Vernacular Lao frequently has h (ຮ) where Thai preserves the written r (ร) scilicet, in imitation of old Khmer cognates, that would continue to be written /r/ (រ) in Cambodia, but given a lilting h sound among the sophisticates of pre-revolutionary Phnom Penh. These changes do not consistently reflect confusion between classical v and h.

In Filliozat’s catalogue of the Wat Po collection of manuscripts (viz., poorly copied from Sinhalese sources, into Khom script, in the second quarter of the 19th century, in Bangkok), she comments on the confusion of r vs. h as the most peculiar of the numerous transliteration errors made by the scribes, such as a scribe reading hemāyavatthu and writing out romāyavatthu. In sum, the abysmal quality of the MS demonstrates "…that the scribes had no real understanding of Pāli language or were not paying attention to the meaning…", but she remarks that confusion of r & h is especially inexplicable as they "cannot be confused in any script". This only seems true if the scripts under consideration are Khom and Sinhalese only; however, the confusion is easier to understand if we keep in mind that the Bangkok court was (at that time) brim-full of educated slave labour brought back as captives from the total depopulation, sacking, and incineration of Vientiane in 1828. (This may also give some context to the lack of zeal on the part of the scribes!) Even if they were put to the task of copying Sinhalese into Khom, "r" in the latter would have resembled "h" in the script they were most familiar with --viz., Lao. [See: The Pāli Manuscript Collection Kept in the Vat Phra Jetuphon Vimol Mangklaram (Vat Po) the Oldest Royal Monastery of Bangkok, Jacqueline Filliozat, École Française d'Extrême-Orient, 2002-2003].

Tertiary patterns of simplification of geminate morphemes, and substitution of dental sounds for retroflexes, etc., are pretty well self-evident, and are not much worse than the attempts of Europeans to pronounce Sanskrit.

Much that applies to Thai has already been explained in the section on Lao (above). The problems that are unique to the Thai recitation of Pali can be most easily (although not with great certainty) explained by orthographic developments that were ancient in their causes, but modern in their effects.

From about the 13th century until the modern period, central Thai vernacular languages were written in a tradition of "Fa-kham" scripts. Notwithstanding several nationalistic myths to the contrary, these scripts were, originally, derived from classical Cambodian, and imperfectly modified in response to the phonetic (and tonal) needs of Tai-Kadai languages; this vernacular development can be thought of as a separate line of succession from the scripts used to write Pali. At the dawn of the modern era, Khom (classical Khmer) script was still used for Pali in the majority of Thailand’s land-mass, with the Lanna script used in the North-West. There are some remarkable exceptions to that generalization, with (ornate) local variations in Pali scribal traditions (extant both on Palm leaf and on stone), presumably fostered by the patronage of local rulers, and monasteries that acted as educational institutions.

For the vernacular, this mix of influences has had mixed results. There are now many monstrous problems of interpretation that Thai speakers encounter both when reading Pali (or Sanskrit) cognates in their own language, and also when attempting to learn or recite Pali (or Sanskrit), precisely because the vernacular modifications that the Fa-Kham orthographic tradition underwent have now been foisted back onto the classical language.

While Lao orthography has been modernized to be almost perfectly phonetic, Thai has moved somewhat in the opposite direction: official spellings are heavily Sanskritized, as if to draw special attention to both the Indic and classical Cambodian origins of much of its vocabulary. This only makes it more difficult for native Thai speakers to pronounce Pali (or Sanskrit) correctly, as they are accustomed to eliding so many Sanskritic elements that appear in their written language, but are not now (and likely never were) part of the spoken, vernacular form. For example, "…confusion may arise because there is no indication if two consonants are a compound or not, such as: candra may be pronounced can-tha-ra, can-thra, or even can-thom." ["Changes of Pali-Sanskrit Loan-words in Thai", Prof. Visudh Busyakul, in Sanskrit in Southeast Asia, 2003, Sanskrit Studies Centre, Silpakorn University, Bangkok, pg. 522]

The example just given also shows that Thais are burdened with an inherently confusing system of implicit vowels, and, when faced with Pali text (or cognates) in their own script, will frequently misapprehend where the implied vowel is "a", "o", or none at all. A native Thai reader will be accustomed to guessing where to reduce a consonant to a nasal sound, or where to treat it as non-final and assign an implicit vowel sound following it (note, above, "can-tha-ra", vs. "can-thom"). On the whole, this entails that Thais are highly inclined to omit/elide morphemes from Pali words, ranging from the simplification of initial compound consonants, to the reduction of medial geminates to terminal consonants, or, very often, the omission of the entire terminations of polysyllabic words, i.e., making it impossible to determine the grammatical significance of any/all words in a sentence.

So far as Pali is concerned, it may be complained that these problems are not endemic to the Thai orthography (per se), but merely arise from the inappropriate (vernacular) assumptions of native speakers in reading it. Naturally, the overlap of the modern and the ancient in the form of cognates used in everyday language has a powerful influence over the interpretation of the classical language (as the script used for both is now one and the same, at least in the Royal editions, and the government-controlled monastic education system). In reading classical cognates (etc.) the reader has no clear direction or consistent rule to follow (in modern Thai script), and so inevitably develops a habit of anticipating what is left indeterminate by the script. Needless to say, these "anticipations" (that serve to fill in the unwritten portion of the phonics of ancient words) are subject to variations of dialect and locality, and project social status and ethnicity within Thai society.

I have already made reference to a very short article on this subject (fewer than four pages) titled "Changes of Pali-Sanskrit Loan-words in Thai", by Prof. Visudh Busyakul [Op. cit. supra]. One of the peculiarities of the article is that it describes the changes in the Thai pronunciation of Sanskrit (and, thus, by extension, Pali) as if they were part of a quite intentional plan carried out by king Ram Khamheng (who may well be fictional, N.B., as according to Michael Vickery’s articles in The Ramkhamheng Controversy, published by the Siam Society). Busyakul thus regrets that no phonological distinction was assigned to the first four consonants of the classical alphabet by the latter king, as the Khmer system of distinguishing the consonants by means of vowel changes was lost, and nothing to replace it was devised. Thus, the sequence that was originally k, kh, g, gh, now appears to the Thai reader as a nearly undifferentiated sequence of four "k" sounds; this would indeed be an astounding error if we believed that any such change was actually devised by a single man’s conscious intention, and we can even less believe that this was the grand plan of a mythic king.

Ferlus instead presumes that at a remote date (both unspecified and unknown) ancient spoken Thai distinguished "a non-voiced dorsal fricative" and also a "voiced dorsal fricative", and that these have since dropped out of the spoken language, leaving their fossils (so to speak) in the odd array of "k" glyphs that were modified from Khmer to make up the first row of the Fa-Kham alphabet. [Op. cit. supra, p. 79-80] This is certainly a more plausible explanation, but relies to an uncomfortable extent on speculation.

Busyakul’s account of Thai phonological simplifications (of classical, Indic phonemes) provides another detail of significance in contrast to Lao: "As a rule," he writes, "the unaspirated sonant and aspirated sonant of all five series [i.e., rows of the alphabet] are pronounced as the aspirated surd of the corresponding series". [Op cit., pg. 521] Thus, e.g., he would define the Thai pronunciation of the second line of the (pseudo-pali) alphabet as "ca-cha-cha-cha-ña". This is a significant difference from the Lao interpretation of the equivalent row of glyphs (see above), and my experience would tend to affirm that the central Thais do apply a hard "cha" sound to many Pali/Sanskrit loan-words where a Laotian would read "s" (both being equally incorrect, as the classical spelling of the words in question is j or jh).
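Busyakul's rule, as quoted, is simple enough to state as a transformation on each row of the classical alphabet. The following is a minimal sketch of the rule as worded in the quotation only, and not a description of how any given Thai speaker actually recites; the names VAGGAS and busyakul_row are hypothetical, and the sketch deliberately ignores the further interchanges (t/d, p/b) discussed elsewhere in this section.

```python
# The five rows of classical stops, each in the traditional order:
# surd, aspirated surd, sonant, aspirated sonant, nasal.
VAGGAS = [
    ["k", "kh", "g", "gh", "ṅ"],
    ["c", "ch", "j", "jh", "ñ"],
    ["ṭ", "ṭh", "ḍ", "ḍh", "ṇ"],
    ["t", "th", "d", "dh", "n"],
    ["p", "ph", "b", "bh", "m"],
]


def busyakul_row(row):
    """Apply the quoted rule to one row: both sonants (unaspirated and
    aspirated) are read as the aspirated surd of the same row."""
    surd, aspirated_surd, _sonant, _aspirated_sonant, nasal = row
    return [surd, aspirated_surd, aspirated_surd, aspirated_surd, nasal]


# The second row, as Busyakul would have it read aloud.
SECOND_ROW = busyakul_row(VAGGAS[1])
```

Applied to the second row, this reproduces the sequence c, ch, ch, ch, ñ, i.e., the "ca-cha-cha-cha-ña" cited above.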

Another example in the history of Thai phonetic and orthographic shifts is examined at some length by Michel Ferlus, op. cit. supra. Ferlus provides some interesting illustrations as to how the Fa-Kham scripts (that were later reduced to modern Thai) both initially diverged from Cambodian (to suit Thai phonetic requirements) and then changed over time with the vernacular.

The interchange of classical "t" for modern "d", and "p" for "b" (described at length for Lao, above) is very simply accounted for by Busyakul as follows: "...these words have been imported into Thai not directly, but through the Khmer medium." [Op cit., pg. 522] Although there is some small measure of truth to this, I honestly do not see how the Cambodians can be blamed any more than Ram Khamheng for the change; this seems to merely defer the question (petitio principii). Briefly, the Khmer system provides vowel-sound distinctions as substitutes for classical consonant distinctions (i.e., the listener can distinguish one classical consonant from another by hearing a difference in the associated vowel sound). More likely by accident than by design, the Thais dispensed with this system and (as mentioned briefly above) have instead created new grounds for confusion as to which vowels are associated with which consonants (both for cognates and classical texts rendered in modern script); but even so, the particular problem discussed would not have existed before the mid-19th century, when vernacular Thai script was suddenly foisted onto the ancient language, and a combination of Western model schools (created by Christian missionaries) and Thai state education (following the former "model") replaced the monastic transmission of literacy, with predictable results for the Pali tradition.

So far as listening comprehension of Pali chanting is concerned, the issues in Thailand are largely similar to those with modern Lao in the "Buddhist heartland" of Thailand, viz., the upper Issan plateau in the North-East, where the predominant language remains lowland-Lao (but state education is entirely in central-Thai). Although the Issan country is among the regions least often visited by tourists in Thailand, all the quantitative measures of Buddhist education and religious activity seem to affirm what many would report anecdotally, i.e., that the Issan remain (disproportionately) the staunchest supporters of Buddhist monasticism in Thailand. Thus, while some students who are new to the field may find it odd that so much attention is given to Lao on this website, the fact is that the language spoken in the environs of the monasteries (in modern Thailand) where so many Westerners ordain is not Thai, but Lao (e.g., Wat Nanachat outside of Ubon Ratchathani, or the famous Dhutanga monasteries along the Mekong, both to the west and east of Nong Khai; notably, Ajan Chah spoke Lao as his first language, and, for the sake of Thai nationalism, editions of his work tend to euphemistically mention that they were translated from lectures given in "a north-eastern dialect"). In brief, the Lao section will be of more utility for those intending to ordain in Thailand than they might expect.

Although I have more enthusiasm for adapting my ear to dialectical changes of this kind than most, it must be complained that the paucity of (mutually-distinguishable) consonant sounds in the vernacular (without the Khmer remedy of systematically-associated vowel changes) when combined with the tendency toward "monosyllabization" (e.g., omitting final sounds, and so depriving the classical language of its marks of declensions and conjugations) has resulted in the real incomprehensibility of Pali as it is recited in most of Thailand today.

This reduction of the language to un-grammatical, mutually-indistinguishable, and genuinely incomprehensible monosyllables in the context of ritual performance has encouraged the tendency of religious followership to presume to take the source texts as tabula rasa, attributing to them pre-Buddhist myths that are wildly at variance with the explicit meaning of the texts in question, or, with equal ease, taking the texts as a corroboration for relatively recent innovations in the popular faith. Obversely, I must imagine that it is discouraging to a native Thai reader to have to figure out the obtuse way in which the familiar (vernacular) script is made to express the classical sounds, with an unfamiliar system of both implicit vowels and explicit consonant values --though it is a very small minority of monks in Thailand who learn even this much about the ancient language. The tradition of Thai word-for-word glosses (which is the one part of the Pali tradition that is indispensable for sermons and rituals) effectively severs the study of lexis from grammar or even pronunciation; in modern Thailand, it is primarily the ability to gloss Pali words in isolation that is cultivated among the clergy.

Introductory note & recommended reading

Considered as a text, the Pali canon is unwieldy; there is more than one system of organization to be found within it, and all of them aim at ease of memorization --not ease of reference.

One of my first experiences in grappling with the canon arose when a Sinhalese monk (who briefly tutored me) asked me to find and photocopy a particular passage in the monastic library. I was told only the name of the Sutta (and of the Nikaya) and proceeded to sit down in front of the (Sinhalese-script) B.J.E. Tipitaka, to leaf through the lengthy tables of contents, searching for the correct volume, correct chapter, sub-chapter, and, eventually, the correct page. I found the sutta, but, more, I gained a direct appreciation of how difficult it is for someone without a very rigorous introduction to the organization of the texts to find anything in them whatsoever. Although I was still pre-occupied with very elementary questions of the language, I began to study the structure of the canon as a separate matter. With the advent of computer indexing, many will eschew such an introduction to the difficulty of searching the canon on paper; to search through stacks of palm leaves is an even more visceral demonstration of how puny a single scholar is against the mass of text that we would presume to study.

The single most useful resource (if you can only have one) is Oskar von Hinüber’s A Handbook of Pali Literature; however, this work was planned as a chapter in a larger volume, and so (as its foreword explains) it has certain explicit limitations. The survey of Pali literature in Sri Lanka by G.P. Malalasekera approaches much of the same material from a different perspective, and works within a different set of limitations (e.g., Malalasekera’s work includes a brief history of Pali grammatical literature, but, as its title suggests, does not treat Thai sources). I have not yet found a copy of M. Bode’s survey of Pali Literature in Burma; although the book is a century old, it has not yet been superseded, except for the usual tide of academic articles. It was suggested to me that Bode’s work will soon be reprinted by the B.P.S. in Kandy. I do not recommend Bhante H. Saddhatissa’s articles in this genre, such as his survey of the history of Pali literature in Thailand, and so on. In this wise, I also do not recommend the Guide to the Tipitaka authored by an anonymous council of monks in Burma (and reprinted by White Lotus), although the latter is instructive as a kind of cross-section of Burmese monastic opinion on some of the more renowned passages.

The map here provided shows the sub-divisions of the Tipitaka as simply as possible. By far the most confusing part of the Theravada canon is the Vinaya, which was re-organized after its arrival in Sri Lanka (now providing a striking contrast to the organization of the Mahasanghika, Mulasarvastivadin, and Dharmaguptaka recensions).

The map makes use of brackets to indicate optional aspects of titles and, in some cases, the relationship between volumes:

Square brackets mark inconsistencies in titles, such as alternate spellings, prefixes and suffixes that are sometimes omitted. For example:

[Sutta-]Vibhaṅga

Parivāra[-pātha/-pāli]

So-called "curly brackets" are employed to indicate an embedded text, i.e., a text that appears distributed within a larger work, but may (or may not) sometimes be extracted and presented as a separate chapter, or even as a wholly separate text. The presentation of these will vary considerably from one edition to the next (thus, they are well worth drawing attention to in this fashion), e.g., while Kammavācā texts have grown into a broad genre of books and manuscripts independent from the Vinaya, any particular edition of the Vinaya may not list "Kammavācā" as a separate section in its table of contents or index (as it is "embedded" in the flow of the text). This is a recurrent feature of Pali literature, most often discussed in terms of the Mātika ("Matrix") system of organization found in the Abhidhamma Pitaka; however, similar embedded texts (with similarly confusing patterns of being extracted and presented as stand-alone works, sometimes as a preface to their source) are found in the Vinaya, grammatical and paracanonical literature. Debatably, e.g., the last two suttas of the Digha Nikaya could also be considered part of the Mātika tradition.

In just a few cases, the embedded texts marked with these brackets were formerly independent texts that are now subsumed within other units of organization (and so should not be described as a Mātika), e.g., the {Bhikkhunīkkhandhaka}.

A few works are marked with an asterisk, to indicate that their inclusion in the canon is disputed, e.g., the Jātaka may (or may not) be considered paracanonical.

In just a few cases I have provided further information such as the chapter number or total number of chapters in parentheses.
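For readers who want to search an index or catalogue for every form a title may take, the square-bracket convention can be expanded mechanically. This is only a minimal sketch, assuming (from the two examples given above) that a bracketed element may be omitted altogether and that a slash inside the brackets separates alternatives; the function name expand_title is hypothetical, and the curly brackets and asterisks are not handled.

```python
import re
from itertools import product


def expand_title(title):
    """List every variant of a title written with optional [..] elements.

    Assumptions: a bracketed element may be left out entirely, and "/"
    inside the brackets separates mutually exclusive alternatives.
    """
    # Split into plain text and bracketed groups, keeping the groups.
    parts = re.split(r"(\[[^\]]*\])", title)
    choices = []
    for part in parts:
        if part.startswith("[") and part.endswith("]"):
            # Optional element: omitted, or any one of its alternatives.
            choices.append([""] + part[1:-1].split("/"))
        else:
            choices.append([part])
    return ["".join(combo) for combo in product(*choices)]


VIBHANGA_FORMS = expand_title("[Sutta-]Vibhaṅga")
PARIVARA_FORMS = expand_title("Parivāra[-pātha/-pāli]")
```

With the two examples above, the first title expands to Vibhaṅga and Sutta-Vibhaṅga, and the second to Parivāra, Parivāra-pātha and Parivāra-pāli.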

My thanks are owed to Sebastian Krauß for making the computer program that allowed me to generate this flow-chart; as I presume it will be of use primarily to students at an early stage, I have provided it in Romanized text.

The organization of the commentaries is not too terribly confusing; however, unresolved questions as to their respective authorship and dates of origin are a considerable area of study unto themselves. This map does not list all Pali commentarial/exegetical literature, but the primary commentaries corresponding to each of the major sections of the canon, and the primary sub-commentaries that relate to the former. I have excluded the titles that Hinüber informally groups under the heading of "Later Subcommentaries" (this category consists primarily of the 15th century works of Bhante Ñāṇakitti in Chiangmai); I have also omitted to mention (both Sinhalese & Burmese) sub-commentaries that are as late as the 18th or 19th centuries.

The orthodox Pali commentaries (per se) are considered historically "closed", and are mythically associated with a single generation that assisted Buddhaghosa in expanding the Visuddhimagga into a systematic gloss of the first four Nikayas, and then, in uncertain stages, further developed the literature to cover the entire Tipitaka. As I have said, this is mythic; assigning actual historical dates and authors to these texts is another matter entirely, and Hinüber provides an inspiring introduction to the questions of philology that remain unresolved.

The assumptions surrounding these texts are a source of confusion both for followers of the religion, and for non-specialists who develop an interest in Pali, e.g., crossing over from Sanskrit or other Asian studies. All that I need to make clear in this place is that each canonical text has only one commentary per se, but an unlimited number of sub-commentaries or other explanatory works can (of course) be written about it --and many have been composed, but they are not strictly called "commentaries" (Aṭṭhakathā). Thus, a student may be confused that we often speak of the commentary to a given Nikaya, and abbreviations to this effect are frequently found in scholarly articles (e.g., DA = Dīgha Aṭṭhakathā, viz., the Sumaṅgalavilāsinī), while there seems to be a profusion of such texts available under various (rather vague) Pali titles.

Similarly, when scholarly sources speak of the sub-commentary to a given work, they will invariably mean the corresponding work written at the Mahavihara in Anuradhapura (at any time after Buddhaghosa), although not all parts of the canon have a sub-commentary by this definition, and many have additional sub-commentaries from Burma, Thailand, or other sources, that would not be cited in this fashion as "the sub-commentary".

However, the voluminous commentaries and sub-commentaries are not the whole of the Pali exegetical literature. Hinüber refers to hermeneutic works that are outside of these traditional categories as "Handbooks" (a term he applies even to the Parivāra and Nettippakaraṇa); we thus seem to have a working definition whereby any exegetical work that is excluded from the semi-historical narrative of the authoring of the commentaries at Anurādhapura is called a "Handbook" --and this would exclude works that pre-date Buddhaghosa as well as most of those that follow after.

In popular belief, the extent to which the commentaries were all written at one and the same place and time is exaggerated (viz., at the Mahavihara in Anuradhapura, during Buddhaghosa’s lifetime), and this tends to be supplemented with the belief that this huge bulk of literature was produced (so quickly) by directly translating it from (earlier, no longer extant) Sinhalese commentaries into Pali. Hinüber’s account is an excellent antidote to these and many other misconceptions about the commentarial literature; he reviews (in brief) all of the information available as to assigning dates to these texts, and, more generally, in reading his descriptions one gains a more balanced appreciation that while Buddhaghosa’s commentaries do (selectively) quote earlier material from Sinhalese sources, these quotations are most often contrasted against other opinions, and are frequently enough refuted. The composition of the commentaries should be considered an act of original authorship, although (as with all religious orthodoxy) their authors worked closely from earlier sources. In our time a large portion of the beliefs and practices of the popular faith originate from the commentarial literature, and this often enough stands at odds with the original (Buddhavacana) text. Many of these commentarial sources may be described as the invasion of (wildly spurious) Jātaka-type stories into the earlier layers of the canon --and these narratives have done more to obscure the point of the source texts than to elucidate them.
As with interpretations of our own times, the commentaries tend to depart from the reflective (and at times provocative) tone of the source texts by insisting that (spurious) dogmatic preconceptions (contemporaneous with the authors) are implicit therein, with the primary effect of discouraging the reader from doing their own thinking as to what the sutta (read in isolation) suggests, and with the pervasive secondary effect of foisting a lot of later (popular) religious accretions onto earlier layers of the tradition (e.g., refusing to allow that the Buddha did not already know the answer to every question before asking it, including practical questions such as "How many monks are staying here?", or "What were you talking about before I arrived?", but insisting that the Buddha only asked such things out of the coy omniscience of one who has foreknowledge of the reply, etc.).

In the map below, it will be observed that the commentaries on the first four Nikayas are grouped together as subsidiary to the Visuddhimagga, whereas the last Nikaya has a more-or-less miscellaneous list. This represents both the Visuddhimagga’s claim to be a commentary on the entire Sutta-pitaka, and the widely-held assumption that it served as the common foundation for the composition of the commentaries on the first four Nikayas (with Dhammapala and other authors later supplying the material for the fifth). Refer to Hinüber’s handbook for further arguments as to how these texts relate to one another. This chart would be somewhat complicated if we followed the traditional practice of attributing all anonymous commentaries to Buddhaghosa (and accordingly grouped them under the heading of the Visuddhimagga).

I have noted the titles of the canonical texts glossed in parentheses; this will be found useful because of the common practice of citing the commentaries by the name of the text glossed rather than by title (e.g., a reference to "the Dīgha-Aṭṭhakathā" is more common than stating the commentary’s own title as "the Sumaṅgalavilāsinī").

I have included the commentaries to the Milinda-pañha and the Jātaka in this map for the sake of convenience; as noted in the first map (above), they are considered either quasi-canonical, or para-canonical, by the Burmese and Sinhalese respectively.

The implicit suggestion of the title for this section is that electronic editions of the canon should only be used as guides to
printed editions. This is not solely because the electronic texts contain errors: printed editions and manuscripts contain errors, too,
but there is a significant difference in that they make errors for more interesting reasons, and it is therefore more rewarding for the
reader to discover and correct them. The difference between a good edition and a bad one (be it on paper or not) is not the absence of
errors, but the relative significance of the errors: in the comparative reading of Asian editions and manuscripts, one can encounter
errors of real philological and historical significance, whereas the electronic texts (and many modern, western editions) present us
with errors that are merely in need of correction.

With that caveat stated, many thanks are owed to the Sri Lanka Tipitaka Project, which is not the only project to produce
canonical etexts, but is all the more significant because it is the only one to distribute the results freely. This is a striking contrast
to several of its "competitors" in Thailand, who charge hundreds of dollars for the privilege. I visited the project's offices on my first trip
to Colombo, and met the venerable Mettavihari, who has been the main inspiration (and technical advisor) responsible for the project reaching its current stage of development.

The nature of the project would seem to be one of perpetual incompletion, as the task of proof-reading the texts,
adding notes on variant readings, and consolidating available translations, aims at a very distant horizon --and traverses the distance
with a very small staff of unpaid volunteers. Periodically, the etexts available on the web are updated by the project, and various
secondary websites (such as GRETIL and the JBE) re-format and re-distribute the texts.

For my own use, I found it necessary to re-format the texts as Unicode-compliant PDF files; although the latter format is "larger" than TXT,
it is currently superior for automated search features, and has all issues of font and glyph assignment resolved internally. Below are
links to the four Nikayas and the complete text of the Vinaya; the Khuddaka Nikaya is excluded simply because I found too many problems with
the current version of the source texts (e.g., one of the Sk. recensions of the Dhp was erroneously included instead of the Pali!) and it may be
added in the future. Note that the romanization standard followed is the same one set down in the various charts
above, and that the file sizes listed below reflect the final size of the file after decompression (or "unpacking"), not during the download.

I seem to find the time to create some extra files to post to this part of the website every few months. These are "exceedingly miscellaneous" in their subject-matter, as many of them began as side projects or prospective appendixes for the textbook I'm writing; others developed in response to an opportunity to bring together disparate (hard to find) materials into a reference of some kind. Although you may not find all of these files immediately useful, they tend to supply deficiencies in the available textbooks, and may become more useful as a student becomes more demanding of his (or her) resources.

The file below will be of more interest to advanced students, as it provides
a fairly extensive list of "indeclinable particles" (in both Sinhalese and
Burmese script) with English definitions. The Avyaya & Upasagga
("indeclinable particles") are short words equivalent to "but", "if", "and",
etc., as well as prefix and suffix syllables equivalent to English "con-", "syn-",
"para-", and so on. I have never seen an attempt at a complete list of these particles before; it seems to be a strange omission in both modern and classical sources. My advice would be to memorize them all.

Pali grammatical terminology has shifted around over the 2,500-year history
of the literature --and so students should expect some variation when comparing
various classical or modern sources. I here provide a short list of Pali terms
and "suggested abbreviations"; this is most of what a Pali student would need to
know, but it is neither exhaustive nor definitive for the terms that will arise
in all sources (e.g., Buddhaghosa uses significantly different terms from
Kaccayana and his followers). It is fairly important for every Pali student to
know "at least" these grammatical terms; English grammatical terms (such as
"continuative past participle") are not much use outside of a university
classroom, and are not even used consistently by modern Pali lexicons and
textbooks.

If the Ashokan-script abbreviations in the left-hand column are too
obscure, by all means ignore them and invent your own set of scribbles; every scholar may freely invent their own way to jot these things down in brief. However, the prevalent abbreviations of English terms such as "pr. pt. p." tend to be confusing to native English speakers --and are even more confusing for those who have studied English as a foreign language. An innovation of some kind is in order, and Ashokan "Brahmi" seems to be a more likely candidate for an international system of Pali annotation than any one "national" script.

This set of files will continue to grow as I gather more salient material from various manuscript traditions. These charts have a twofold inspiration: I was dissatisfied with the lunar calendar provided by the Pali-English Glossary of Buddhist Technical Terms, and I became increasingly concerned that no single source had brought together the disparate material on concepts of time and date found in Pali manuscripts from across South-East Asia. By putting these preliminary (but moderately fascinating) files on the web, I hope to encourage a few of my expert correspondents to send me some further sources to continue the series (update: many thanks to M.L., the first to help me out in this respect).

I am distributing revised, reformatted, and corrected editions of three Pali textbooks in the public domain. Some of these files have been revised and re-issued in several stages, roughly once each year since their advent on the website, with corrections and improved formatting (such as tables).

Narada’s work offers a somewhat simplified description of the language, accompanied by rote exercises, whereas Duroiselle’s is quite compendious in its description of the language, but offers almost nothing in the way of exercises for those engaged in self-study. The third, De Silva’s "Pali Primer", provides a series of graded exercises intended for beginners, with very succinct instructions; it will rapidly expand your vocabulary if you've already worked through the grammar with the prior two textbooks.

Each of these textbooks has a separate, small web-page for direct downloading, with further description of the books' respective origins and contents as follows:

A few years ago, I also created a new edition of F. Mason’s 1868 Pali textbook Kachchayano’s [sic] Pali Grammar with Chrestomathy & Vocabulary; however, I am not inclined to distribute it on the internet. Each of these digital editions has involved a large volume of exceedingly minute labour on my part, but the revision of Mason’s work required typing out the full text (ex nihilo) in both Burmese and Sinhalese script, with innumerable corrections. If any scholars have a special interest in this work of Mason’s (or in Kaccayana generally) they may contact me by mail to arrange receipt of a manuscript.

…and a Cambodian Textbook (free to download)

If you're interested in learning more about Cambodian/Khmer/Khom, either as a modern language or as a Pali tradition, there’s a great deal of useful information in Huffman and Proum’s textbook, which I've scanned in and made freely available for download here. The book is, of course, mentioned in the section on Pali in the Cambodian tradition, above.

If you put my name into Google, it should be pretty easy to find a list of my recent articles and public lectures (I maintain such lists elsewhere, so that this website isn't updated too frequently).

My public e-mail address can be inferred from the tag on the map below, showing some of the places I may have been seen during my first five years in Asia (2004–2009). During that period, updates to this website normally stated my "current location" somewhere along the dotted line. If you're not sure whether or not you recognized me down at the archives the other day, you can consult this portrait --a photo taken by Bhante Nyanatusita in Sri Lanka. I'm smiling somewhat ruefully, as the Pali manuscripts I'm "surveying" have been stacked up behind glass, on top of the statues, where we can be certain nobody will read them.

Author: Eisel Mazard.

History:

2010-09: Revisions to the sections on Burmese and Cambodian phonology.

2009-03-03: Added the long-belated section on Cambodian phonetics, with one interesting clarification under the Lao rubric, and something of a general overhaul to the opening sections. For the first time, due to stated demands from readers, a public e-mail address was made available (here).

2007-10-30: Added Dr. Lily De Silva’s textbook to the selection; new editions of both Narada and Duroiselle (PDFs) rolled out ("…for 2008"). The mention of the author’s current location was (none too mysteriously) removed.

2007-10-16: Changed "Bokeo" to "Yunnan", reflecting my recent exit from Laos; a few new charts and minor additions; the page’s encoding was (laboriously) changed to UTF8.