8 Language Resources

This is a phonetic lexicon of 21,560 tokens in Pashto with their phonetic transcription in IPA. It covers the major dialect of the TRAD Pashto Broadcast News Speech Corpus (see ELRA Catalogue reference ELRA-S0381) from which the most frequent words were extracted. The pronunciation dictionary of ...

This is a parallel corpus, which contains 10,000 Pashto words translated into English by two different translators. The source texts have been collected from the following news websites: Azadiradio, Mashaal and Voice of America Pashto.
The content has also been translated into French (see ELRA-W...

This is a parallel corpus, which contains 10,000 Pashto words translated into French by two different translators. The source texts have been collected from the following news websites: Azadiradio, Mashaal and Voice of America Pashto.
The content has also been translated into English (see ELRA-W...

This is a parallel corpus, which contains 10,000 Pashto words translated into French by two different translators. The source texts come from 3 broadcast news transcriptions of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). These texts are VOA Ashna TV programs recorded on 15/01/2011,...

The corpus consists of the transcription of 106 hours of recordings in Pashto, translated into French. The transcriptions are extracted from the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). The corpus contains about 832,000 source words and 747,000 target words. No audio files are provided.
Pasht...

This is a monolingual text corpus in Pashto. The corpus contains about 112,000,000 tokens collected from 46 different blogs and websites.
Sources that were either identified and negotiated or freely available were crawled in 2012, then cleaned and formatted as XML.
Pashto is an Indo-Iranian language spoken by th...