Arabic letters and diacritics

The original Qur'an and many other Islamic sources (some of them not translated yet) are in the Arabic language. Also, when Islam is discussed in English, many Arabic words are used, such as salah, Allah, masjid and Muhammad. Many Arabic words cannot be properly transliterated into English because the alphabets are incompatible.

Even if you don't want deep knowledge of Arabic, knowing the alphabet (and diacritics) lets you at least read the Islamic terms correctly and be less confused about their pronunciation (and also look more professional). This article is for people who speak English and have little to no knowledge of Arabic.

The Arabic Letters

Arabic letters often change form and connect to each other, when they are in one word (that is when there is no space between them).

Arabic letters don't have upper case and lower case forms.

The Arabic "alphabet" is not compatible with the English alphabet. There are letters in English, which cannot be transliterated using the Arabic alphabet and vice versa.

The word "alphabet" is derived from the words "alpha" and "beta", but Arabic doesn't have a letter like "a" (it has a letter "alif", which is however more complicated). Also many Arabic words consist of only consonant letters and the vowels are added by diacritics. So "alphabet" might not be the best way to describe the Arabic letters and abjad is used instead (الأَبْجَدِيَّة العَرَبِيَّة‎‎, al-abjadīyah al-ʻarabīyah, the Arabic abjad).

Although there are letters which represent vowels, they are used only to represent long vowels. The short vowels in words are either indicated by (optional) diacritics, or just not indicated at all and the reader just has to know how to pronounce it. There are two "levels" of diacritics. The first is I‘jām (إِعْجَام), small dots which are considered to be a part of the letter. They were not present in the earliest Qur'anic manuscripts. They are absolutely important as they determine the consonant. For example:

ب b

ت t

ث th

Then there is the next "level" of diacritics, the harakat (حَرَكَات), which is optional and often not present in the Arabic text. The Arabic word "harakat" حَرَكَات without the harakat, would look like this: حركات. The three little lines above the letter determine the three "a" in harakat.
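The difference between the two "levels" can also be seen in how the text is stored on a computer. The following is a minimal Python sketch, assuming the text is encoded in Unicode: the harakat are separate combining characters that can be removed, while the i'jam dots are part of the letter itself and stay. The function name strip_harakat is made up for this illustration.

```python
import unicodedata

def strip_harakat(text: str) -> str:
    """Drop combining marks (harakat); letters and their i'jam dots are kept."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))

# The word "harakat" with its three fathah marks (U+064E), written with
# escapes so that every character is visible: ha, ra, kaf, alif, ta.
with_marks = "\u062D\u064E\u0631\u064E\u0643\u064E\u0627\u062A"
bare = strip_harakat(with_marks)

print(with_marks.count("\u064E"))  # number of fathah marks in the word
print(bare)                        # the same word without harakat
```

The dots of ت (the last letter) survive, because the dotted letter is a single character and not a letter plus a mark.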

Missing Arabic letters

There are English letters, which cannot be transliterated using the Arabic abjad. There are many words which could be used as examples. The examples used are the cases where the words are similar in Arabic and English.

E like in America

The Arabic abjad cannot express the word "America", since it doesn't have a letter corresponding to the "e" sound. So Arabs drop the "e" between "m" and "r" (and prolong the "i") and pronounce this word as "Amreeka" (أمريكا).

O like in Europe

The letter o has no equivalent in the Arabic abjad. Arabs have to say "Awruba" (أوروبا).

P like in Europe

There is no letter P in the Arabic abjad. Arabs have to say "Awruba" (أوروبا).

The word Pakistan is Bakistan (باكستان) in Arabic.

The word Palestine is Filastin (فلسطين) in Arabic.

The word Egypt is actually Misr (مصر) in Arabic.

V like in video

The English letter "v" has no equivalent in Arabic and it is often replaced by the letter "f". So the word "video" is "fidyu" (فيديو) in Arabic.

G like in Gabriel

There is no letter g (like in garden) in Arabic. So Gabriel is Jibreel (جبريل).

The Arabic letter ج (jim) is read as "g" in Egyptian Arabic.

CH like in Charles

The "ch" sound is also missing in Arabic, but it can be emulated using a combination of the letters t (ت) and sh (ش). English also writes this sound with two letters, but there is still a little difference between "ch" and "tsh", because "ch" is really pronounced as a single sound.

Arabic letters compatible with the English alphabet

The form of a letter differs depending on whether it is written alone or at the beginning, middle or end of a word, so besides the letter alone, this article also shows a sequence of three copies of the same letter written together. Even that is not enough, since some letters connect to the next letter and some don't. When a letter doesn't "like to" connect to the next letter, you won't see a connected form. For example, the letters lam-dal (ل د) together are written لد, but the letters dal-dal (د د) are written دد (not connected), because the Arabic letter د doesn't "want to" connect. The Arabic name of each letter is given in the brackets.
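The connecting rule can be written down as a small lookup. This is only a sketch: the set below lists the six basic letters that do not connect to the following letter (alif, dal, dhal, ra, zay, waw) and is an assumption of this example, not a complete table (variants such as ة or ى are left out), and the function name is hypothetical.

```python
# Basic letters that never connect to the letter after them
# (assumed, incomplete list: alif, dal, dhal, ra, zay, waw).
NON_CONNECTING = set("ادذرزو")

def connects_to_next(letter: str) -> bool:
    """Return True if the letter joins to the following letter in a word."""
    return letter not in NON_CONNECTING

print(connects_to_next("ل"))  # lam: joins, so lam-dal is written connected
print(connects_to_next("د"))  # dal: does not join, so dal-dal stays separate
```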

ب - ba (باء)

ببب

ث - tha (ثاء), like in "through"

ثثث

ج - jim (جيم), like in "Jeep"

ججج

ر - ra (راء)

ررر

ز - zay (زاي)

ززز

ش - shin (شين)

ششش

ف - fa (فاء)

ففف

ل - lam (لام)

للل

م - mim (ميم)

ممم

ن - nun (نون)

ننن

و - waw (واو)

ووو

It can also be described as "ue", like in "blue".

ي - yaa (ياء)

ييي

Arabic letters ambiguously compatible with the English alphabet

There are pairs of Arabic letters which are pronounced a little differently, but are transliterated into English as the same letter. For example, the words سلام (salaam, peace) and صلاة (salah, prayer) start with a different kind of "s".

These letters (t, d, s) have two versions. The difference is like "je vais" and "je veux" in French. The best way to learn the difference would be to watch some videos on the Arabic alphabet and hear the difference.

d

د - dal (دال)

ددد

ض - daad (ضاد)

ضضض

s

س - sin (سين)

سسس

ص - saad (صاد)

صصص

t

ت - ta (تاء)

تتت

ط - ta (طاء)

ططط

There is also another kind of "t", which is pronounced like ت. It usually occurs at the end of a word, and words ending in this letter are usually feminine. This "t" is not read when the word is read alone, but it is read when the word is in the context of an Arabic text. For example, the word prayer, صلاة, can be read as either "salat" or "salah". When the "t" is not read, it is usually transliterated as "h".

ة - ta marbuta (تاء مربوطة)

ةةة

This letter occurs only at the end of words.

Then there are these two similar letters (k, q).

k, q

ك - kaaf (كاف)

ككك <-- very different forms

This is just a regular "k", like in the word "key".

ق - qaaf (قاف)

ققق

When pronouncing ق, the tongue goes up as when pronouncing a regular "k", but deeper in the throat.

Then there are these two letters, both transliterated as "h". Sometimes ح is transliterated as "h" with a dot below it.

h

ه - haa (هاء)

ههه <-- very different forms

When the word الله (Allah) is written in calligraphy, the ه is often written as a curved line, like a horizontal "s".

A regular "h" like in "honey".

ح - haa (حاء)

ححح

Very different kind of "h". It has the typical exhale sound.

And a special letter:

ا - alif (ألف)

ااا

It can be either "a" (like in Arabic), "i" (like in Italy) or "w" (like in Washington). It depends on the diacritics.

ى - alif maksura (ألف مقصورة)

ىىى

Something like alif.

Arabic letters not compatible with the English alphabet

Probably the ugliest sound of all letters:

خ - khaa (خاء)

خخخ

It's better to listen to how it sounds. You lift the tongue as if pronouncing ق (qaf), but leave it up and breathe through it.

These two are a pair similar to د/ض and others, but the basic one doesn't have a proper equivalent in the English alphabet:

dh, z

ذ - dhal (ذال)

Pronunciation similar to "d", but the tongue touches the upper teeth. Similar to "the".

ظ - dha (ظاء)

These two are unique and definitely have to be heard to be learned:

ع - 'ayn (عين)

Throat squeezing sound.

غ - ghayn (غين)

Similar to the French "r".

And a glottal stop:

ء - hamza (همزة‎‎)

It is pronounced by quickly closing the throat (so that you can't breathe).

The word Qur'an has a hamza after the "r", which is indicated by the ' sign. So you shouldn't exhale during the whole pronunciation of the word, but (quickly) close your throat after "r" and then (after a little moment of silence) continue with "aan". The hamza is indicated by diacritics in modern Arabic script (القرآن‎‎, al-qur'an), but in the Uthmani script, it was a letter (القرءان, al-qur'an).

Arabic letters mixed together

When the letters lam (ل) and alif (ا) are one after the other ("la"), they are written in a special way:

لا

The lam starts at the upper left and the alif eventually goes to the right, even though Arabic is written from right to left.

There are also many forms where, for example, ل "goes down into" the letter م, but these are not used as often in a regular text.

However in the Arabic calligraphy this is used very often.

The Arabic Diacritics

The diacritics will be demonstrated with the letter د (d). The diacritics add additional "letters" (sounds in pronunciation) after the letter.

The basic diacritics

دَ - fathah (فَتْحَة)

Fathah (a line above the letter) means a short vowel "a" after the letter. So دَ is "da", بَ is "ba" and so on.

دِ - kasrah (كَسْرَة)

Kasrah (a line below the letter) means a short vowel "i" (ee) after the letter.

Dagger alif (vertical line above a letter) means a long vowel "aa" after the letter.

It is used a lot in the Qur'an in the Uthmani script, but in the modern Imla'ei script, the words are often written with a regular alif placed after the letter instead.

In the words zakat (زكاة‎‎) and salat (صلاة), there is a و (waw) with a superscript alif in the Uthmani script and it is read as a regular alif (صَّلَوٰةِ and زَّكَوٰةَ) without the "w" sound.

This diacritic is missing on Arabic keyboard layout. The Unicode number is U+0670.
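Since the sign is not on the keyboard, one way to get it is through its code point. A minimal Python sketch (standard library only) that looks up the character U+0670 mentioned above and prints its official Unicode name and category:

```python
import unicodedata

dagger_alif = "\u0670"  # the code point given in the text above

print(unicodedata.name(dagger_alif))      # official Unicode character name
print(unicodedata.category(dagger_alif))  # "Mn" means a non-spacing mark
print(f"U+{ord(dagger_alif):04X}")        # back to the code point notation
```

The character can then be copied from the output, or typed in a program as "\u0670" right after the letter it belongs to.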

آ - alif maddah (أَلِف مَدَّة) <-- don't confuse with "dammah"

Maddah (a tilde above the letter) indicates a glottal stop before the alif (alif is long "aa"). For example the word Qur'an: قُرْآن. You have to do a "glottal stop" (close the throat) between "qur" and "aan". In the Uthmani script there was a letter hamza before the alif: القرءان. The hamza represented the glottal stop before alif.

Waslah (similar to the first part of the letter ص) above alif means that the alif is not pronounced when it is preceded by a vowel (from the previous word). However, if the alif waslah is at the beginning of the speech, it is read as a regular alif.

The ال (al-) prefix has an alif waslah. So the alif is usually not pronounced and you will hear only the ل (l). Or you won't hear the "l" either, if the first letter of the word is a sun letter.

Nunation

Nunation (تنوين, tanween) is adding the sound of the letter ن (nun) to the end of a word (using diacritics). It is used on ء (hamza), ة (ta marbuta) and ا (alif). If the word ends in another letter, ا (alif) is added. Besides adding the "n", these diacritics add a vowel, similarly to fathah, kasrah and dammah. These word endings also determine whether the word is in the nominative, genitive or accusative case. This list uses the letter ا (alif):

اٌ "un" <-- don't confuse with the sign dammah (اُ) or waslah (ٱ)

The letter و with a break on the left side, above a letter, besides the "un" sound, symbolizes the nominative case of the word.

اٍ "in"

Two kasras below the last letter mean that the word is in the genitive case.

اً "an"

Two fathas above the last letter mean that the word is in the accusative case.
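The three endings can be summed up as a small lookup table. The sketch below is only an illustration: the function name describe_tanween is made up, and it simply checks the last two characters of a word for one of the three Unicode nunation marks (U+064B, U+064C, U+064D).

```python
# Nunation marks mapped to (added sound, grammatical case), as described above.
TANWEEN = {
    "\u064C": ("un", "nominative"),  # ARABIC DAMMATAN
    "\u064D": ("in", "genitive"),    # ARABIC KASRATAN
    "\u064B": ("an", "accusative"),  # ARABIC FATHATAN
}

def describe_tanween(word: str):
    """Return (sound, case) if the word ends with a nunation mark, else None."""
    # The mark sits on the last letter, or on the letter before an added alif.
    for ch in reversed(word[-2:]):
        if ch in TANWEEN:
            return TANWEEN[ch]
    return None

# The three examples from the list above: alif with each nunation mark.
for example in ("\u0627\u064C", "\u0627\u064D", "\u0627\u064B"):
    print(describe_tanween(example))
```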

Qur'an stop signs

There are little diacritics in the Qur'an, which indicate, for example, if a person can (or can't or shouldn't..) stop (to take a breath) during the recitation. They are called stop (waqf) signs.

Meaningless diacritics

In the Arabic calligraphy, sometimes there are signs around the word, which are not diacritics and they just fill the space.

The Arabic Numerals

Bigger numbers are written in the same order (from left to right) as in English. So 390 would be written as ٣٩٠.

Modern Arabic text often uses the same numerical symbols that are used in English.
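Because the digits keep the same order, converting between the two sets of symbols is a one-to-one replacement. A minimal Python sketch, using the example number 390 from above (the table name WESTERN_TO_ARABIC is made up for this illustration; Python's built-in int() already understands the Arabic digit characters):

```python
# The number 390 written with Arabic digits: 3 (U+0663), 9 (U+0669), 0 (U+0660).
arabic_390 = "\u0663\u0669\u0660"

# int() accepts any Unicode decimal digits, read in the same order as in English.
print(int(arabic_390))

# The other direction: replace each Western digit with its Arabic counterpart.
WESTERN_TO_ARABIC = str.maketrans(
    "0123456789",
    "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669",
)
print("390".translate(WESTERN_TO_ARABIC))
```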

Beginner problems

How to "extract" letters from an Arabic word

The Arabic letters are written differently at the beginning, middle and end of a word, and also when they are written alone. Some of the letters are connected to the others and some are not. The easiest way to determine the letters is to copy the word into a text editor and insert spaces between the letters. That way, you will see their basic form. Some letters look very similar in some cases, like the letters ف and غ when they are in the middle of a word. Look at the middle letter:

ففف (f-f-f)

غغغ (gh-gh-gh)

Of course the correct way to determine the letters is to learn all the forms of the letters.
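The "insert spaces" trick can also be done by a short program instead of by hand. This is a sketch with a made-up function name, split_letters: it separates a word into its letters and keeps any diacritics attached to the letter they belong to, so each letter is shown in its basic (isolated) form.

```python
import unicodedata

def split_letters(word: str) -> list[str]:
    """Split a word into letters, keeping combining marks with their letter."""
    groups: list[str] = []
    for ch in word:
        if unicodedata.combining(ch) and groups:
            groups[-1] += ch   # a diacritic: attach it to the previous letter
        else:
            groups.append(ch)  # a new letter
    return groups

salaam = "\u0633\u0644\u0627\u0645"   # the word salaam: sin, lam, alif, mim
print(" ".join(split_letters(salaam)))

fff = "\u0641\u0641\u0641"            # the f-f-f example from above
print(" ".join(split_letters(fff)))
```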

How to read an Arabic word

If you know the alphabet, you can read the sounds of the letters in a word. But the vowels are often missing, so you have to either determine them from the Arabic diacritics, or from the English transliteration of the word, or from hearing the word.

The words often start with the ال (al-) prefix. The prefix indicates a definite article, like "the" in English. Reading this prefix is complicated. The pronunciation of the alif (ا) is determined by the ending vowel of the word before, and the pronunciation of the lam (ل) is determined by the first letter after the ل. The alif is just read as the last vowel of the word before. The lam is sometimes read as "l", but sometimes it is read as the first letter of the word. When it is read as the first letter of the word, there is a shadda diacritic sign on the first letter of the word, indicating that the letter is to be pronounced with double length. The Arabic alphabet is divided into sun and moon letters. The sun letters "eat" the lam, the moon letters don't. When a word starts with a moon letter, the ل in ال is read as a regular "l".

So Al-Quran (القران) is read as Al-Quran, but Al-Rahman (الرحمان) is read as Ar-Rahman, because ر is a sun letter.
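The sun letter rule can be sketched as a small function. The list of the 14 sun letters is standard, but the Latin spellings attached to them and the function name read_with_al are assumptions of this example. The function only handles a word read alone, where the alif of the prefix is a simple "a".

```python
# The 14 sun letters with a simplified transliteration (assumed spellings).
SUN_LETTERS = {
    "ت": "t", "ث": "th", "د": "d", "ذ": "dh", "ر": "r", "ز": "z", "س": "s",
    "ش": "sh", "ص": "s", "ض": "d", "ط": "t", "ظ": "z", "ل": "l", "ن": "n",
}

def read_with_al(arabic_word: str, transliteration: str) -> str:
    """Prefix a word (given without al-) with the article as it is pronounced."""
    first = arabic_word[0]
    if first in SUN_LETTERS:
        # A sun letter "eats" the lam: the first sound of the word is doubled.
        return "A" + SUN_LETTERS[first] + "-" + transliteration
    # A moon letter: the lam is read as a regular "l".
    return "Al-" + transliteration

print(read_with_al("قران", "Quran"))    # ق is a moon letter
print(read_with_al("رحمان", "Rahman"))  # ر is a sun letter
```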

In the word الله (Allah), the second l "eats" the first one, and is pronounced twice as long, but they are the same letters, so the ل being a sun letter doesn't make much difference.

When the word is alone, the alif in al- prefix is read as a simple "a". Otherwise it depends on the case of the word before. If it ends on w (ue), the word (before) is in the nominative. If it ends on i (ee), it is in the genitive. If it ends on a, it is in the accusative.

How to write an Arabic word

If you know only the English transliteration, you will have to search for the original Arabic word. There is a list of Islamic terms on WikiIslam, with the original Arabic words. When you know which letters the word consists of, you can write the letters. You will probably have to use a virtual keyboard on the screen. On modern operating systems you can easily switch between different keyboard layouts. After a while, you can memorize the positions of the Arabic letters on the English keyboard and write "blindly" over the English letters on your keys. Or you can write directly on an Arabic keyboard on a touch screen. You add diacritics after writing the letter.

If you want to know how to write the letters with a pen, watch some tutorial video.

How to read the Qur'an

Apply all the rules from this article and, to be sure, read the English transliteration and listen to a recitation. In the "fancy" printed version, the letters are often in a ligature, so focus more on the dots to determine the letter. For example, two dots below mean that there is definitely the letter ي. Two dots above mean that there is definitely the letter ت.

There are many versions of the Qur'an written with different scripts. The Uthmani script is older, but more complicated. The Simple (Imla'ei) script is easier. It's easier to read it in the form of a "computer-generated" text, than from scanned images. The printed Qur'an has additional "stop marks" and widened letters and other "fancy" content, which might be too complicated/distracting/misleading for a beginner reader.

In the printed "fancy" Qur'an, the verses usually don't end with a new line, but with a verse number, written in a circular symbol.

There are English websites which provide word-by-word translation and grammatical analysis of the Qur'an [2] so you practically don't need to know the Arabic grammar.

If you want to discuss the meaning of some word, search for all the words derived from the same triliteral root and see the basic meanings of the root. Or look into the tafsirs.

How to read the Islamic calligraphy

The calligraphy often shows some common Islamic phrase, so if you know the Islamic Arabic phrases, you can guess what it is just from the first few letters.

Conclusions

The Arabic alphabet is not perfect. It cannot express some basic sounds, which can be pronounced. So considering the Arabic language to be the best or universal is not accurate, since it is limited from the beginning by its limited alphabet (abjad).

The English alphabet can express some sounds which the Arabic one can't, and vice versa.

Although the Arabic alphabet doesn't have a simple letter like "p", it has two different letters for a letter like "t". The two "t" are a little different in pronunciation, but having two letters for this might be considered redundant. Not to mention the ta marbuta.

Although vowels have their own letters (ا و ي), they are often omitted in words and indicated by diacritics.

The diacritics look like an ad hoc solution for sounds missing from the script, so the Arabic writing system seems to be badly designed.

The writing system would probably have been redesigned and simplified more if Arabs didn't believe that the way the Qur'an was written is the best way.

The absence of any diacritical marks in the first written Qur'an is a big problem for the full preservation of Muhammad's recitations.