Workshop this November at Google Hamburg, Germany.

Task Description

We invite you to participate in our challenge on the detection of clickbait posts in social media. Clickbait refers to social media posts that are, at the expense of being informative and objective, designed to entice their readers into clicking an accompanying link. The task of the challenge is to develop a classifier that rates how clickbaiting a social media post is. For each social media post, the content of the post itself as well as the main content of the linked target web page are provided as JSON-Objects in our datasets.

As the primary evaluation metric, Mean Squared Error (MSE) with respect to the mean judgments of the annotators is used. For informational purposes, we compute further evaluation metrics such as the Median Absolute Error (MedAE), the F1-Score (F1) with respect to the truth class, as well as the runtime of the classification software. For your convenience, you can download the official Python evaluation program.
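The metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the official evaluation program: the function names and the class labels "clickbait" and "no-clickbait" are assumptions made for the example, and the official program may differ in how it derives the truth class.

```python
from statistics import median


def mean_squared_error(truth, predicted):
    """Average of the squared differences between predicted and true scores."""
    return sum((p - t) ** 2 for t, p in zip(truth, predicted)) / len(truth)


def median_absolute_error(truth, predicted):
    """Median of the absolute differences between predicted and true scores."""
    return median(abs(p - t) for t, p in zip(truth, predicted))


def f1_score(truth_labels, predicted_labels, positive="clickbait"):
    """F1 for one class; the label string is an assumption for this sketch."""
    pairs = list(zip(truth_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because MSE squares each error, a few badly misjudged posts weigh more heavily than many small deviations, whereas MedAE is robust to such outliers.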

Related Work

Clickbait tweets typically aim to exploit the "curiosity gap", providing just enough information to make readers curious, but not enough to satisfy their curiosity without clicking through to the linked content.

A tweet is Clickbait if (1) the tweet withholds information required to understand what the content of the article is; and if (2) the tweet exaggerates the article to create misleading expectations for the reader.

Clickbait is saying "this town" or "this state" or "this celebrity" instead of saying Los Angeles or Colorado or Justin Timberlake. It's over-promising and under-delivering. It's leaving out the one crucial piece of information the reader may want to know.

This paper presents the first machine learning approach to clickbait detection: the goal is to identify messages in a social stream that are designed to exploit cognitive biases to increase the likelihood of readers clicking an accompanying link.

Datasets

Over the course of the competition, three datasets are going to be released. Each dataset is provided as a zip archive with the naming pattern clickbait17-<dataset>-<version>.zip. It contains the following resources (the unlabeled dataset lacks the truth file):

instances.jsonl: A line delimited JSON file (JSON Lines). Each line is a JSON-Object containing the information we extracted for a specific post and its target article. Have a look at the dataset schema file for an overview of the available fields.

truth.jsonl: A line delimited JSON file. Each line is a JSON-Object containing the crowdsourced clickbait judgments of a specific post. Have a look at the dataset schema file for an overview of the available fields.

media/: A folder that contains all the images referenced in the instances.jsonl file.
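Reading an unzipped dataset could look roughly like the following Python sketch. The helper names read_jsonl and load_dataset are hypothetical, and keying the truth records by an "id" field is an assumption; the dataset schema file remains the authoritative description of the fields.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def load_dataset(directory):
    """Return (instances, truth_by_id) for an unzipped dataset directory.

    truth_by_id is empty when truth.jsonl is absent, as in the unlabeled
    dataset. The "id" key used for the lookup is an assumption.
    """
    directory = Path(directory)
    instances = read_jsonl(directory / "instances.jsonl")
    truth_path = directory / "truth.jsonl"
    truth_by_id = {}
    if truth_path.exists():
        truth_by_id = {record["id"]: record for record in read_jsonl(truth_path)}
    return instances, truth_by_id
```

Reading line by line keeps the parser simple and avoids loading one large JSON document, which is the main reason to use the JSON Lines format.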

Software Submission

We use the Evaluation-as-a-Service platform TIRA to evaluate the performance of your classifier. TIRA requires that you deploy your classifier as a program that can be executed from the command line with two arguments, one for the input directory and one for the output directory. For example, the syntax could be:

> myClassifier -i path/to/input/directory -o path/to/output/directory

example command line call for tira.io

At runtime, the input directory contains the unzipped dataset (i.e., instances.jsonl and the media/ folder) that your classifier has to process. The predictions of your classifier should be written to a file called results.jsonl in the given output directory. Each line of results.jsonl should contain a valid JSON-Object with the id and the predicted clickbaitScore for a post (cf. the dataset schema file).
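A command line wrapper that follows this input and output convention might be sketched as below. The -i and -o options mirror the example call above; predict_score is a hypothetical placeholder that returns a constant and stands in for a real model, and reading the "id" field from each instance is an assumption based on the results format.

```python
import argparse
import json
from pathlib import Path


def predict_score(instance):
    """Hypothetical placeholder model: always returns a neutral score."""
    return 0.5


def run(input_dir, output_dir):
    """Read instances.jsonl and write one result object per line to results.jsonl."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    with open(input_dir / "instances.jsonl", encoding="utf-8") as src, \
            open(output_dir / "results.jsonl", "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue
            instance = json.loads(line)
            result = {"id": instance["id"], "clickbaitScore": predict_score(instance)}
            dst.write(json.dumps(result) + "\n")


def main(argv=None):
    """Parse the -i and -o options and run the classifier."""
    parser = argparse.ArgumentParser(description="Clickbait classifier sketch")
    parser.add_argument("-i", dest="input_dir", required=True, help="input directory")
    parser.add_argument("-o", dest="output_dir", required=True, help="output directory")
    args = parser.parse_args(argv)
    run(args.input_dir, args.output_dir)
```

Calling main() under an `if __name__ == "__main__":` guard at the end of the file turns this into a script that accepts the same two arguments as the example call.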

We will ask you to deploy your classifier onto a virtual machine that will be made accessible to you after registration. You can choose freely among the available programming languages and among the operating systems Microsoft Windows and Ubuntu. You will be able to reach the virtual machine via ssh and via remote desktop. More information about how to access the virtual machines can be found in the user guide.

Once your classifier is deployed on your virtual machine, we ask you to access TIRA at www.tira.io, where you can self-evaluate your software on the test data.

Note: By submitting your software you retain full copyrights. You agree to grant us usage rights only for the purpose of the Clickbait Challenge. We agree not to share your software with a third party or use it for other purposes than the Clickbait Challenge.

Paper Submission

Paper submission information and paper templates will be provided as soon as the organization of the workshop has been finished.

Workshop

The workshop takes place on November 27, 2017 at the Google campus in Hamburg, Germany. Details concerning travel and accommodation options will follow.