THE DOHA DICTIONARY.

The Arab Center for Research and Policy Studies announced the official launch of the Doha Historical Dictionary of the Arabic Language, on May 25, 2013, following two years of extensive preparation by a select group of linguistic experts, lexicographers, and computational scientists from a variety of Arab countries. […]
During the meeting, they also announced the launching of a temporary website for the lexicon, hosted on the ACRPS domain for the time being[…]
The new dictionary, which will chronicle the history of Arabic terms over 2,000 years, is projected to take 15 years until completion, with achievement highlights being presented every three years. The dictionary hopes to make possible the facilitation of research on Arab intellectual legacy through the work it uncovers. As a comprehensive electronic corpus, the dictionary will be able to assist a number of projects related to machine language in Arabic, including machine translation and automated spelling and grammar checkers. A number of specialist lexicons will also be published as auxiliaries to the main project, including dedicated works on scientific terms, terms related to the study of civilization, a complete dictionary of contemporary Arabic, and educational dictionaries.

I’ve been complaining about this lack since 2004, and I’m thrilled it’s being dealt with. (Of course, “15 years until completion” is pure fantasy, but let’s not tell them that…) Hat tip to Paul Ogden, who has provided me with so many great links!

I wonder how they plan to deal with the modern ‘dialects’.
Since almost all published works – lexicography’s raw material – are produced in Modern Standard Arabic, by cultural convention the variants will get short shrift in the dictionary.
The Arabic dictionary project has elements in common with the Historical Dictionary Project of the Hebrew Language, yet in other ways they’re quite unlike.
Unlike Hebrew, Arabic has been continuously spoken through the centuries. Yet in the Arab world nothing was printed in Arabic until Napoleon introduced the printing press to Egypt around 1800, while books in Hebrew were published just a few years after Gutenberg. Hebrew evolved even within the Bible, continued evolving in written works before the invention of the printing press, and continues to evolve today. Written/published Arabic today, however, appears to have become quite rigid, even if it differs from the classical form.
The 2002 Arab Human Development Report says “(t)here are no reliable figures on the production of books, but many indicators suggest a severe shortage of writing; a large share of the market consists of religious books and educational publications that are limited in their creative content. The figures for translated books are also discouraging. The Arab world translates about 330 books annually, one fifth of the number that Greece translates.” (I was able to locate some older UN statistics here.) It may be noteworthy that only four percent of Arabs get their news from a printed medium. (From the same report: 55 percent of respondents indicated that they have never used the internet, while 42 percent reported that they used it infrequently.)
A June 7 article in the Jerusalem Post (subscription) says that in 2012 there were about 6,525 books of all types published in Hebrew in Israel, “with English next in line at 472 publications, followed by 220 in Russian and 181 in Arabic. Of the Hebrew-language offerings, 1,419 were translations.” About 20 percent of Israel’s eight million people are Arabs, about a million were born in countries of the FSU, and perhaps 150,000 speak English as their first language.
The relative scarcity of published material in Arabic should make the lexicographers’ job easier than, say, that of the folks at OUP. Yet, as Hat says, to complete the dictionary in 15 years is fantasy. However long it takes, one wishes the project success.

Sounds like a great idea. Unfortunately, details are pretty sparse – I hope they’re serious about this, but a political think tank isn’t exactly the most promising startup environment.
There’s no a priori reason to assume that the project will restrict itself to published works; certainly I wouldn’t do so if I were going to make a dictionary of Arabic. The influence of the dialects on the standard language, especially on so-called Modern Standard (which is not all that modern, and strikingly full of variation), is such that for any serious discussion of etymology they will have to do some work on the dialects too, even if they don’t attempt to include all of their vocabulary.
As noted here, there are no genuinely reliable statistics on Arabic publishing. However, those quoted in the AHDR appear rather low – according to the site I just linked, in 2006-8, an average of 1670 books a year were translated into Arabic. I find it unlikely that the number of books translated went up by a factor of 5 in four years. And it won’t make much difference for the lexicographers whether the publication is religious, educational, or “creative”.
For printed media, I’m surprised the figure was even as high as 4%. The coverage of world affairs on the satellite channels is far superior to that of any Arabic newspaper I know, and is much less subject to censorship. Before the spread of satellite dishes in the 1990s it would no doubt have been much higher.

There’s no a priori reason to assume that the project will restrict itself to published works.
Dictionaries, in the conventional sense of the term, make their entry-choices based on published, or at minimum, printed materials (internet text can now also be included, assuming one can overcome link-rot/citation issues). But a problem arises if you’re going to produce a dictionary with at least some entries based on recorded speech: How do you determine the difference between a vagary of pronunciation and a difference of dialect? And how do you decide the spelling of an utterance when you render it as text?
