Working with the web as a source for dictionaries of informal vocabulary

Abstract

Informal vocabulary, e.g. slang, jargon and other forms of expression particular to small or closed groups, is usually suppressed in writing that has passed through an editorial process, with at least one important exception: the dialogue in works of fiction. As a result, this type of vocabulary has not been readily available for the purpose of lexicon-making, or at least this was the case until recent years. The constant stream of linguistically diverse text on the Internet has given us new possibilities to tap into the flow of colloquial and informal language. The aim of this presentation is, first, to give a brief account of how the Internet can be ‘harvested’ to create corpora that include substantial amounts of informal language and, second, to show how these corpora (in this case Swedish and Icelandic) can be used to gather candidate headwords carrying informal usage labels such as coll., slang, and the like.
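
The abstract does not say how headword candidates are selected from the corpora. As a rough illustration only, one simple possibility is to compare word frequencies in a web-harvested corpus with those in an edited reference corpus and keep the words that are clearly over-represented in the web material. The sketch below assumes this approach; the function name `headword_candidates`, the thresholds `min_count` and `min_ratio`, and the add-one smoothing are all hypothetical choices, not the method of the presentation.

```python
import re
from collections import Counter

# Runs of Unicode letters, so Swedish and Icelandic characters (å, ö, þ, ð) are kept.
TOKEN = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Lower-case a text and split it into letter-only tokens."""
    return [t.lower() for t in TOKEN.findall(text)]


def headword_candidates(informal_texts, edited_texts, min_count=2, min_ratio=3.0):
    """Return (word, ratio) pairs for words over-represented in the informal corpus.

    Hypothetical selection rule: the ratio is the add-one-smoothed relative
    frequency in the informal corpus divided by that in the edited corpus.
    """
    informal = Counter(t for doc in informal_texts for t in tokenize(doc))
    edited = Counter(t for doc in edited_texts for t in tokenize(doc))
    vocab = len(set(informal) | set(edited))
    n_informal = sum(informal.values()) + vocab
    n_edited = sum(edited.values()) + vocab

    scored = []
    for word, count in informal.items():
        if count < min_count:
            continue  # too rare to judge
        # Same value as ((count + 1) / n_informal) / ((edited[word] + 1) / n_edited).
        ratio = (count + 1) * n_edited / ((edited[word] + 1) * n_informal)
        if ratio >= min_ratio:
            scored.append((word, ratio))
    # Highest ratio first; ties broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    web = ["asså det var typ najs", "typ najs asså", "det var bra"]
    ref = ["det var bra", "det var en bra dag", "det är bra"]
    for word, ratio in headword_candidates(web, ref, min_ratio=2.0):
        print(word, round(ratio, 2))
```

The output of such a filter is only a list of candidates: whether a word deserves a label such as coll. or slang would still be an editorial decision made by the lexicographer.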