Thursday, November 27, 2014

CDLI search

CDLI has recently implemented at <http://cdli.ucla.edu/search/> a search function, similar to that for transliterations, for the lines of translation and comment that form part of our core text annotation files. Although this is only a fraction of the number of translations available through the Oracc consortium, CDLI files currently contain some 54,000 lines of translated cuneiform text, mostly in English but including some in German, French, and even Catalan; 14,700 lines of interlinear annotation, ranging from comments on sign preservation to the calculations that underlie numbers in accounts and metrological-mathematical texts; and 88,000 lines of (usually formulaic) comment on text structure.

The bulk of current CDLI translations consists of those created by Dan Foxvog for the Mesopotamian Royal Inscriptions component of the website (nearly 30,000 lines in 1550 texts; see <http://tinyurl.com/mdhzlrg> and <http://cdli.ucla.edu/projects/royal/royal.html>), and we anticipate more translation content for Sumerian literary texts as ETCSL migrates to CDLI; but 13,600 lines in 1530 administrative texts are also now in some form of translation (<http://tinyurl.com/kjkcut4>). For the record, CDLI restricts translation of texts liable to appear in multiple witness artifacts to their artificial composite entries.

As with transliteration search, the exact string of searched characters in translations and comments is highlighted in blue to make it easier to find within the displayed texts. Exact string here means that a search for “pig” will find that string as a discrete word, but also within “pigs,” “pigherder,” and so on; only the characters “pig” will be highlighted. Please note that the search engine results pages report only the number of texts found, not the individual occurrences of a given search string. Thus a search for “calculation:” in comment results in 228 texts found, but altogether 1026 uses of “calculation:”.

As with transliteration search, users can enter multiple character strings in a field, each separated by a comma, for instance “lukalla,account” in translation (currently just six hits, at <http://tinyurl.com/pegtatb>). Unlike transliteration searches, however, these searches always run over full texts and cannot be restricted to a single line, and they are not case sensitive; neither line restriction nor case sensitivity seemed to us to contribute materially to search strategies.
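For readers who want to see the matching rules spelled out, here is a minimal sketch in Python of the behavior described above: case-insensitive substring matching over full texts, comma-separated strings in one field, and the difference between texts found and individual uses. It is not CDLI's implementation. The function names, the bracket markers standing in for blue highlighting, and the three sample texts are all invented for illustration, and the sketch assumes that a text must contain every comma-separated string to count as a hit, which the post does not state outright.

```python
import re


def parse_terms(query):
    """Split a comma-separated query into lowercase, non-empty search strings."""
    return [t.strip().lower() for t in query.split(",") if t.strip()]


def search(texts, query):
    """Return {text_id: number_of_uses} for texts containing every search string.

    Matching is case-insensitive substring matching over the full text,
    never restricted to a single line. Requiring ALL strings to be present
    is an assumption made for this sketch.
    """
    terms = parse_terms(query)
    hits = {}
    if not terms:
        return hits
    for text_id, lines in texts.items():
        full = "\n".join(lines).lower()
        if all(term in full for term in terms):
            hits[text_id] = sum(full.count(term) for term in terms)
    return hits


def highlight(line, term, open_mark="[", close_mark="]"):
    """Mark only the matched characters, leaving the rest of the word unmarked."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return pattern.sub(lambda m: open_mark + m.group(0) + close_mark, line)


# Hypothetical sample data, not taken from CDLI files.
SAMPLE_TEXTS = {
    "T1": ["1 pig, fattened", "for the pigherder"],
    "T2": ["2 Pigs", "account of Lukalla"],
    "T3": ["1 sheep"],
}

if __name__ == "__main__":
    found = search(SAMPLE_TEXTS, "pig")
    print(len(found), "texts found,", sum(found.values()), "uses")
    print(highlight("for the pigherder", "pig"))
    print(sorted(search(SAMPLE_TEXTS, "lukalla,account")))
```

The count of dictionary keys corresponds to the "texts found" figure on a results page, while the summed values correspond to the total number of uses, which is why the two numbers can differ as much as 228 and 1026 do for “calculation:”.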

The primary focus of the project is notice and comment on open access material relating to the ancient world, but I will also include other kinds of networked information as it becomes available.

The ancient world is conceived here as it is at the Institute for the Study of the Ancient World at New York University, my academic home at the time AWOL was launched. That is, from the Pillars of Hercules to the Pacific, from the beginnings of human habitation to the late antique / early Islamic period.

AWOL is the successor to Abzu, a guide to networked open access data relevant to the study and public presentation of the Ancient Near East and the Ancient Mediterranean world, founded at the Oriental Institute, University of Chicago in 1994. Together they represent the longest sustained effort to map the development of open digital scholarship in any discipline.