Description: This project will develop an extensive Arabic learner corpus of written samples produced by L2 and heritage students, collected over 15 years of teaching. The samples will be transcribed into a database and cross-referenced by level (beginning, intermediate, advanced), learner type (L2 vs. heritage), and genre (description, narration, instruction). The corpus will serve as a source of empirical data for hypothesis testing and as a resource for developing Arabic teaching materials. Once completed in Summer 2010, it will be made available through the CERCLL and CMES websites and offered to the Linguistic Data Consortium (LDC) for national dissemination. A workshop/demonstration took place in Spring 2010 at the Western Consortium of Middle East National Resource Centers' Language Workshop.