LINGUIST List 15.842

Thu Mar 11 2004

Calls: Computational Ling/Spain; Computational Ling

Editor for this issue: Andrea Berez <andrealinguistlist.org>

Computational Approaches to Arabic Script-based Languages
Short Title: coling2004 workshop
Date: 28-Aug-2004 - 28-Aug-2004
Location: Geneva, Switzerland
Contact: Karine Megerdoomian
Contact Email: karineminxight.com
Meeting URL: http://members.cox.net/karinem/COLING2004
Linguistic Sub-field: Computational Linguistics
Subject Language: Arabic, Standard; Kurdi; Pashto, Southern; Farsi,
Western; Urdu
Call Deadline: 25-Mar-2004
This is a session of the following conference: 20th International
Conference on Computational Linguistics
Meeting Description:
Recently, there has been a surge of interest in the study of the
languages of the Middle East, especially Arabic, Persian (Farsi),
Pashto and Urdu. Computational applications for proper name
identification, entity recognition, categorization, information
retrieval, summarization, machine translation and other
implementations are currently in high demand. The goal of this
workshop, being held as a session of COLING 2004, is to provide a
forum for those involved in the development of NLP systems in Arabic
script languages to exchange ideas, approaches and implementations of
computational systems; to discuss the common challenges faced by all
practitioners; and to assess the state of the art in the field.
SECOND CALL FOR PAPERS
COLING 2004 WORKSHOP ON
COMPUTATIONAL APPROACHES TO ARABIC SCRIPT-BASED LANGUAGES
Geneva, Switzerland, 23-27 August 2004
http://members.cox.net/karinem/COLING2004
WORKSHOP DESCRIPTION
Recently, there has been a surge of interest in the study of the
languages of the Middle East, especially Arabic, Persian (Farsi),
Pashto, Kurdish and Urdu. This sudden and urgent interest is
manifested by the availability of funding for rapid development of
practical systems for processing large volumes of data in these
languages. Computational applications for proper name identification,
entity recognition, categorization, information retrieval,
summarization, machine translation and other implementations are
currently in high demand. This comes at a time when advances in formal
and computational linguistics over the last fifty years are being
consolidated, while work on machine learning and statistical methods
has been showing great promise.
Although there exists a considerable body of work in computational
linguistics specifically targeted at these Middle Eastern languages,
much of the research and development has been the result of
initiatives by individual research establishments or industry
firms. Furthermore, the use of the Arabic script gives rise to
certain issues that are common to all these languages even though
they belong to distinct language families. These languages therefore
share properties such as the absence of capitalization, right-to-left
direction, lack of clear word boundaries, complex word structure, a
high degree of ambiguity due to non-representation of short vowels in
the writing system, and related encoding issues.
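As a rough illustration of three of the script properties listed above
(unwritten short vowels, absence of capitalization, right-to-left
direction), the sketch below uses only Python's standard unicodedata
module. The helper name strip_diacritics and the two example words are
assumptions chosen for illustration; they do not come from the
workshop materials, and dropping every combining mark is a
simplification that also removes signs other than short vowels.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop all nonspacing combining marks (Unicode category Mn).

    In Arabic script this covers the short-vowel signs such as
    fatha and damma, which ordinary text usually omits.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Two fully vocalized words sharing the consonant skeleton k-t-b.
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kataba, "he wrote"
kutub = "\u0643\u064F\u062A\u064F\u0628"          # kutub, "books"
skeleton = "\u0643\u062A\u0628"                   # unvocalized k-t-b

# Without short vowels, both words collapse to one written form.
same_form = strip_diacritics(kataba) == strip_diacritics(kutub) == skeleton

# Arabic letters have no case, so upper() leaves the string unchanged.
no_case = skeleton.upper() == skeleton

# Bidirectional class of the first letter ("AL" = Arabic Letter,
# a right-to-left class).
bidi_class = unicodedata.bidirectional(skeleton[0])

print(same_form, no_case, bidi_class)
```

The two distinct vocalized words map to the same unvocalized string,
which is the kind of homograph ambiguity an analyzer has to resolve
from context.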
The goal of this workshop is to provide a forum for those involved in
the development of NLP systems in Arabic script languages to exchange
ideas, approaches and implementations of computational systems; to
discuss the common challenges faced by all practitioners; and to
assess the state of the art in the field. In addition, one of the aims
of the workshop is to identify promising areas for future
collaborative research in the development of NLP systems for Arabic
script languages. Solutions that are designed to solve the specific
problems of these languages could very well have wider applications
and relevance to the rest of the NLP community.
WORKSHOP TOPICS
Authors of papers in any area of NLP for Arabic script-based languages
are encouraged to submit. We welcome submissions dealing with
language-specific issues, as well as discussions of challenges imposed
by the use of the Arabic script. Papers employing various
methodologies, such as statistical approaches, shallow parsing and
linguistically based analyses, are also welcome. Topics include, but
are not limited to, the following:
* Morphological analysis
* Syntactic ambiguity resolution
* Machine translation from and to Arabic script languages
* Sense disambiguation
* Homograph resolution
* Semantic analysis
* Entity recognition
* Information retrieval
* Classification of documents
* Text mining
* Summarization
* Speech recognition and generation
* Lexical databases
* Knowledge and domain representation
* Spelling and grammar checking tools
Proposals for formal demonstrations of advanced operational systems as
well as research prototypes are welcome.
SUBMISSION REQUIREMENTS
Papers should be original, previously unpublished work and should not
identify the author(s). They should be no longer than 8 pages
(including figures and references) and should emphasize completed work
rather than intended work. Papers that are being submitted to other
conferences must reflect this fact on the title page. Submissions are
limited to one individual and one joint paper per author.
Demonstration proposals should give a short description of the system,
provide its technical specifications and indicate how the
demonstration illustrates new ideas and contributes to the
computational work on Arabic-script languages. Proposals must not
exceed 4 pages.
Email submissions (PostScript or PDF) are preferred and should be sent
to both AliFarghalyaol.com and karineminxight.com. Submissions should
be in English. Papers should be attached to an email giving contact
information for the author(s) and the paper's title. The hardware,
software and network requirements for the system demonstrations should
also be indicated in the text of the email. Formatting requirements
for the final version of accepted papers will be posted as soon as
they become available.
Hardcopy submissions should be sent to:
Ali Farghaly
SYSTRAN Software, Inc.
9333 Genesee Ave, Pl 1
San Diego, CA 92121
USA
PROCEEDINGS AND WORKSHOP ORGANIZATION
Accepted papers and formal demonstrations will be published in a
proceedings volume. For the workshop to take place, the COLING 2004
organizers require at least 20 participants to register for it.
Speakers and participants are therefore asked to register
via the official COLING 2004 site as soon as possible.
IMPORTANT DATES
Submissions due: March 25th, 2004
Notification date: April 25th, 2004
Deadline for camera-ready copy: May 25th, 2004
ORGANIZING COMMITTEE
This workshop is organized by
Ali Farghaly (SYSTRAN Software, Inc.)
Karine Megerdoomian (Inxight Software and University of California,
San Diego)
The call for papers as well as future information on the workshop can
be found at
http://members.cox.net/karinem/COLING2004
PROGRAM COMMITTEE
Jan W. Amtrup, Bowne Global Solutions
Tim Buckwalter, Linguistic Data Consortium
Miriam Butt, Konstanz University, Germany
Violetta Cavalli-Sforza, Carnegie Mellon University
Joseph Dichy, Lyon University
Abdel Kadir Fassi Fehri, Arabization Bureau, Rabat, Morocco
Andrew Freeman, University of Washington
Nizar Habash, University of Maryland, College Park
Masayo Iida, Inxight Software, Inc.
Simin Karimi, University of Arizona
Martin Kay, Stanford University
Kevin Knight, USC/Information Sciences Institute
Farhad Oroumchian, University of Wollongong in Dubai
Ahmed Rafea, The American University in Cairo
Jean Senellart, SYSTRAN Software
Bonnie Glover Stalls, University of Southern California
Remi Zajac, SYSTRAN Software