The abbreviation "NLP" stands for natural language processing. "Natural language" here means human language, not a programming language. In this case, we refer to processing the Myanmar language on computers.

Wednesday, September 12, 2007

Syllable Segmentation

Syllable segmentation is the process of determining syllable boundaries in a piece of text. Because Burmese is a tonal, analytic language and its writing system is syllabic, syllables are the fundamental building blocks of the language. Syllable boundaries in Burmese script can be of two types: 1) the phonological boundary of a syllable, and 2) the orthographic boundary of a syllable. Since Burmese script is a phonetic script, phonological segmentation is the basic segmentation. The phonological boundary of a syllable is defined, as the name suggests, by how the syllable is pronounced, whereas the orthographic boundary is defined by how it is written. An orthographic syllable need not correspond exactly to a phonological syllable. An orthographic syllable is simply one or more phonological syllables joined together by the non-breaking rules. Example: the word မႏၱေလး has 3 phonological syllables, မန္, တ and ေလး, but only 2 orthographic syllables, မႏၱ and ေလး.
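The relationship between the two kinds of boundary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real segmenter: the function name `group_orthographic` and the `no_break_after` parameter are hypothetical, and the sketch assumes the phonological syllables and the positions where a non-breaking rule applies are already known. Since the written form of a merged syllable in the example (မႏၱ) is not the plain concatenation of its parts, the sketch returns groups of phonological syllables rather than joined strings.

```python
from typing import List, Set


def group_orthographic(phonological: List[str],
                       no_break_after: Set[int]) -> List[List[str]]:
    """Group phonological syllables into orthographic syllables.

    `no_break_after` (a hypothetical input) holds every index i for which
    a non-breaking rule forbids a break between syllable i and syllable i + 1.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    for i, syllable in enumerate(phonological):
        current.append(syllable)
        # Close the current orthographic syllable unless a rule forbids it.
        if i not in no_break_after:
            groups.append(current)
            current = []
    if current:
        # A non-breaking mark on the last syllable has nothing to join to.
        groups.append(current)
    return groups


# The example word from the text: 3 phonological syllables.
mandalay = ["မန္", "တ", "ေလး"]
# Assumption: no break is allowed between the first and second syllable.
groups = group_orthographic(mandalay, {0})
print(len(mandalay), "phonological syllables")
print(len(groups), "orthographic syllables")
```

With an empty `no_break_after` set, every phonological syllable forms its own orthographic syllable, which matches the case where the two kinds of boundary coincide.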