Student

Formalities

Lectures
Information Retrieval
or
Information Mining and the use of tools such as RapidMiner
(essential)

Task description

Social media platforms allow millions of internet users to easily create and share
multi-media content. This generates a continuously increasing volume of big data
that harbours precious knowledge of the crowds. Much of crowd wisdom is bundled
up in arguments, i.e. claims that are supported or refuted by evidence. This
evidential data could be used to answer questions, understand complex
phenomena, or evaluate services and products, if it were easily accessible.
At present, however, analytic tools can only tell what users report in big data, not
why.

This project involves the acquisition of data. Once the data has been collected, it should be manually annotated for arguments,
i.e. claims and the evidence for those claims should be extracted. Finally, supervised machine learning should be used to extract arguments automatically.
Here the student should investigate different features (attributes) as well as different machine learning algorithms. Each of these should be evaluated against the gold standard data
(the manually annotated data).
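
To make the notion of features (attributes) concrete, the following is a minimal sketch of how one sentence might be turned into a feature dictionary. The cue-word lists, the feature names and the function extract_features are hypothetical and chosen only for illustration; which features are actually useful is what the project is meant to find out.

```python
import re

# Hypothetical cue words, chosen only for illustration (not prescribed by the task).
EVIDENCE_CUES = {"because", "since"}
CLAIM_CUES = {"should", "therefore", "think", "believe"}


def extract_features(sentence):
    """Map one sentence to a small dictionary of simple surface features."""
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    return {
        "n_tokens": len(tokens),
        "has_evidence_cue": any(t in EVIDENCE_CUES for t in tokens),
        "has_claim_cue": any(t in CLAIM_CUES for t in tokens),
        "has_digit": any(ch.isdigit() for ch in sentence),
    }


features = extract_features("I think the camera is bad because it broke after 2 days.")
```

Feature dictionaries of this kind can be passed to most supervised learners once they have been converted to numeric vectors.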

Tasks:

Literature scan. This should be done before the actual project starts. The student will be given some initial papers.
Based on these papers the student should collect further papers, review all of them and prepare a 30-minute oral presentation that introduces the field. This should take 2-3 weeks.
Actual work:

Acquisition of data. This could be done either automatically or manually.

Preprocessing of data. This can be done automatically using Natural Language Processing techniques.
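
As an illustration of such preprocessing, here is a minimal sketch that splits a post into sentences and lowercased word tokens using only regular expressions. The function names and the example post are hypothetical, and the splitting rule is deliberately naive; a dedicated NLP toolkit would handle abbreviations, emoticons and URLs more robustly.

```python
import re


def split_sentences(text):
    """Naively split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase a sentence and return its word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


# Hypothetical example post, invented for illustration.
post = "The battery is great because it lasts two days. I would buy it again!"
sentences = split_sentences(post)
tokens = [tokenize(s) for s in sentences]
```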

Manual annotation of the data. This means that the data collected above needs to be annotated for arguments.

Performing the argument mining. The student should perform automatic feature extraction and apply machine learning to perform the argument extraction automatically.
Results of the automatic system should be evaluated using precision and recall.
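
The evaluation step can be sketched as follows, assuming each sentence carries one gold label and one predicted label. Precision is the share of predicted arguments that are correct, TP / (TP + FP), and recall is the share of gold arguments that were found, TP / (TP + FN). The label names "argument" and "none" and the function precision_recall are hypothetical.

```python
def precision_recall(gold, predicted, positive="argument"):
    """Compute precision and recall for one label against gold annotations."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    # Return 0.0 when a denominator is zero (nothing predicted or nothing in gold).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for five sentences, invented for illustration.
gold = ["argument", "none", "argument", "argument", "none"]
predicted = ["argument", "argument", "none", "argument", "none"]
precision, recall = precision_recall(gold, predicted)
```

In this toy example there are two true positives, one false positive and one false negative, so both precision and recall are 2/3.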