CRATER corpus

CRATER

ID: ELRA-W0003

The Corpus Resources and Terminology Extraction (CRATER) project (MLAP-93 20) extended the bilingual annotated English-French International Telecommunications Union corpus to include Spanish, and also debugged the existing corpus. The resource is a multilingual aligned corpus of 1,000,000 tokens per language for English, French and Spanish, with human-edited morphosyntactic annotations.

An extended version of CRATER (ref. ELRA-W0003) is available as CRATER 2 (ref. ELRA-W0033).