As has been mentioned, normal Arabic text written for adults does
not contain vowels and other phonological markings necessary to
expand the orthography to a reasonably phonetic form. This is
in some sense analogous to the grapheme-to-phoneme problem for
English; the correct pronunciation of an English word is often
not obvious from its spelling, and many words admit more than
one pronunciation. For English, however,
we can rely on electronic lexicons that provide the correct
pronunciation for an orthographic string. A comparable body of
work does not exist for Arabic.

For synthesis, we must know the correct vowels. Diacritics
indicating the correct MSA vowels appear in religious texts and in
literature for children; they are known as the vocalization or
the voweling. The process of adding all of the diacritics
to an unmarked text is called diacritization.
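
To make the relation between marked and unmarked text concrete, the following minimal sketch performs the inverse of diacritization: it removes the marks from a fully vowelled word and recovers the form an adult reader normally sees. It assumes Unicode input and treats only the code points U+064B through U+0652 (the short vowels, tanwin, shadda and sukun) as diacritics; neither assumption comes from the system described here, and the function name is hypothetical.

```python
# Assumption: text is Unicode, and the diacritics of interest are the
# combining marks U+064B..U+0652 (tanwin, fatha, damma, kasra, shadda, sukun).
ARABIC_DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_diacritics(text: str) -> str:
    """Return text with the Arabic diacritics listed above removed."""
    return "".join(ch for ch in text if ch not in ARABIC_DIACRITICS)


# "kataba" (he wrote): kaf, ta and ba, each followed by a fatha.
vowelled = "\u0643\u064E\u062A\u064E\u0628\u064E"
unvowelled = strip_diacritics(vowelled)
print(len(vowelled), len(unvowelled))
```

Diacritization has to run in the opposite direction, restoring the marks that this function discards, which is why it cannot be done by a simple character-level rule.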

There are two obvious approaches to solving the
voweling problem for spoken language: inferring the vowels and
enumerating the lexicon. The former has been applied with some
success in recognition; vowels were guessed with 80%
accuracy [6]. Synthesis, however, requires a much higher
level of accuracy than recognition, so we have selected the
enumerative approach to voweling.
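
As an illustration only, the enumerative approach can be sketched as a lookup table that maps each unvowelled string to the fully vowelled forms listed for it. The three toy entries, the function names and the use of Unicode input are all assumptions made for this sketch; they do not describe how the actual lexicon is organized or what it contains.

```python
from collections import defaultdict

# Assumption: Unicode input; diacritics are the marks U+064B..U+0652.
ARABIC_DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_diacritics(text: str) -> str:
    """Return text with the Arabic diacritics listed above removed."""
    return "".join(ch for ch in text if ch not in ARABIC_DIACRITICS)


def build_lexicon(vowelled_words):
    """Index fully vowelled words by their unvowelled spelling."""
    lexicon = defaultdict(list)
    for word in vowelled_words:
        lexicon[strip_diacritics(word)].append(word)
    return dict(lexicon)


def lookup(lexicon, unvowelled):
    """Return every vowelled form listed for a spelling (empty if unknown)."""
    return lexicon.get(unvowelled, [])


# Toy entries sharing one unvowelled spelling (kaf, ta, ba):
entries = [
    "\u0643\u064E\u062A\u064E\u0628\u064E",  # kataba, "he wrote"
    "\u0643\u064F\u062A\u064F\u0628",        # kutub, "books"
    "\u0643\u064F\u062A\u0650\u0628\u064E",  # kutiba, "it was written"
]
lexicon = build_lexicon(entries)
candidates = lookup(lexicon, "\u0643\u062A\u0628")
print(len(candidates))
```

The lookup returns all listed candidates rather than a single answer. Even with a complete enumeration, a spelling with several vowelled forms still has to be resolved to one of them before synthesis, and the sketch does not attempt that step.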