Call for Papers

Over the past decade, there has been
a growing interest in collecting, processing and analyzing data from genres of
social media and computer-mediated communication (CMC). On the one hand, as part of large
corpora that have been automatically crawled from the web, CMC data are often
regarded as an unloved “bycatch” that is difficult to handle with NLP tools
optimized for processing edited text. On the other hand, the
presence of CMC data in web corpora is relevant for all research and
application contexts that require data sets representing the full diversity
of genres and linguistic variation on the web. For corpus-based variational
linguistics, CMC corpora are an important resource for closing the "CMC
gap" both in corpora of contemporary written language and in corpora of
spoken language: since CMC and social media make up an important part of contemporary
everyday communication, investigations into language change and linguistic
variation need to be able to include CMC and social media data in their
empirical analyses.

Nevertheless, the development of approaches and tools for processing the
linguistic and structural peculiarities of CMC genres and for building CMC
corpora is lagging behind the interest in dealing with these types of data in
the fields of language technology, corpus-based linguistics and web mining.

The goal of the NLP4CMC workshops, which are organized by the GSCL special interest group "Social Media / Computer-Mediated Communication", is to provide a platform for the
presentation of results and the discussion of ongoing work in adapting NLP
tools for processing CMC data and in using NLP solutions for building and
annotating social media corpora. The main focus of the workshops is on German
data, but submissions on NLP approaches, annotation experiments and CMC corpus
projects for data of other European languages are also welcome.

The 1st NLP4CMC workshop was
held in September 2014 at KONVENS at the University of Hildesheim.
Proceedings of the workshop have been published as part of the KONVENS 2014 workshop proceedings.

The 2nd NLP4CMC workshop will
be held in September 2015 at the annual conference of the German Society for
Language Technology and Computational Linguistics (GSCL) at the University of
Duisburg-Essen. Online proceedings.

TOPICS OF INTEREST:

We encourage the submission of long and short research and demo papers on topics including, but not restricted to, the following areas related to social media / CMC:

Corpora and lexical semantic resources for the analysis of social media / computer-mediated communication