CorpusBuilder Slovenian Corpus

The documents in this corpus were collected in January 2001 by the
CorpusBuilder system. They were all filtered using van Noord's TextCat
language filter.
A document was included only if TextCat assigned Slovenian as its most probable language.
Some documents may contain small amounts of English or other languages.
No manual filtering has been performed on these pages.
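
The inclusion rule described above can be sketched as follows. This is a
minimal illustration only, not the CorpusBuilder code and not TextCat's
interface: the function names, the "slovenian" label, and the shape of the
classifier output (a mapping from language name to a score where higher means
more probable) are all assumptions made for the example.

```python
def is_most_probable(scores, target="slovenian"):
    """Return True if `target` has the highest score in `scores`.

    `scores` is a hypothetical mapping of language name -> score,
    where a higher score means the language is more probable.
    """
    if not scores:
        return False
    best = max(scores, key=scores.get)
    return best == target


def filter_corpus(documents, classify, target="slovenian"):
    """Keep only documents whose top-ranked language is `target`.

    `classify` is a stand-in for a language identifier: any callable
    that takes a document's text and returns a scores mapping.
    """
    return [doc for doc in documents if is_most_probable(classify(doc), target)]
```

Because the rule looks only at the top-ranked language for the whole document,
a page that is mostly Slovenian but quotes some English still passes, which is
consistent with the note that small amounts of other languages may remain.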