hate speech

Abstract

No, hate speech in the German social media buzz did not decrease during the last 6 months. As in the previous 2 years, the largest share of hate speech posts is found on Twitter (70%), followed by Facebook (10%), blogs and forums.

I have developed a detector which finds public hate speech posts on these page types. Read on to see how this detector works.

Findings

Here are the data the detector produced for all 4 page types from May to October 2017:

The peaks mark terrorist attacks during this summer: the knife attack in Germany on July 28th and the vehicle attack in Barcelona on August 20th.

Due to the dominant role of Twitter in the diagram above, Facebook’s curve is somewhat flattened. Here is the diagram for Facebook only:

A decreasing line would look different.

Before you interpret or use any of the numbers shown in the diagrams, however, you should read the following section.

Method

The detector works very simply, using a common social media monitoring tool. The queries look for posts containing “hate-words”. Hate-words are the insults that mean-spirited people use for those they hate: Muslims, Black people, Jews, foreigners and so on. You know some of these words of course; I don’t have to list them here. They are all ugly.

Now when a post contains such a hate-word, the post is counted as hate speech.
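The counting rule can be sketched in a few lines of Python. This is only an illustration, not the monitoring tool's actual query engine. The word list holds neutral placeholder tokens, and the function names `is_hate_speech` and `count_per_day` are made up for this example.

```python
import re
from collections import Counter

# Placeholder tokens standing in for the real word list (not reproduced here).
HATE_WORDS = {"slurone", "slurtwo"}

def is_hate_speech(text, hate_words=HATE_WORDS):
    """Return True if the post contains at least one listed word."""
    tokens = re.findall(r"\w+", text.lower())
    return any(token in hate_words for token in tokens)

def count_per_day(posts, hate_words=HATE_WORDS):
    """Count flagged posts per day; posts is an iterable of (day, text) pairs."""
    counts = Counter()
    for day, text in posts:
        if is_hate_speech(text, hate_words):
            counts[day] += 1
    return counts
```

Plotting the result of `count_per_day` over time would give a curve of the kind shown in the diagrams, assuming the posts come with a date.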

Does that cover all posts in question correctly? No, it does not.

The detector finds posts which are not hate speech. For instance, if I had given an example of a hate-word in this post, this very post would count, because it would contain one of the ugly words.

The detector also misses true hate speech. The phrase “One should really hang all xyz!” contains no hate-word (as long as xyz itself isn’t one). Hence, seen through the detector’s lens, this phrase is not hate speech.
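These two kinds of error are usually summarised as precision (how many flagged posts really are hate speech) and recall (how many real hate speech posts get flagged). I have not measured either for this detector. The sketch below only shows how one could do it, assuming a sample of posts that a human has labelled by hand; the function name and the inputs are hypothetical.

```python
def precision_recall(flagged, truth):
    """Compare detector output with human labels.

    flagged[i] is True if the detector counted post i,
    truth[i] is True if a human judged post i to be hate speech.
    """
    pairs = list(zip(flagged, truth))
    true_pos = sum(1 for f, t in pairs if f and t)
    false_pos = sum(1 for f, t in pairs if f and not t)   # flagged, but harmless
    false_neg = sum(1 for f, t in pairs if t and not f)   # hateful, but missed
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall
```

A post that merely quotes a hate-word lowers precision; a phrase like the “hang all xyz” example lowers recall.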

So the real volume of hate speech may very well be quite far from the number we have found with the little detector here.

But how relevant is this absolute volume? Much more important to me is how this number develops over time. I expect the true curve to have a shape very similar to the detector's, as long as the detector's error rates stay roughly stable: when the true hate speech volume increases, the detector will show an increase too, and vice versa.
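The reasoning behind this can be made explicit. If the detector's precision and recall stay constant over time (an assumption, not something measured here), the detected count equals the true volume times recall divided by precision, so both curves differ only by a constant factor. The small sketch below uses invented numbers and hypothetical function names to show that the relative changes are then identical.

```python
def scale_to_true_volume(detected, precision, recall):
    """Estimate the true volume, assuming constant precision and recall."""
    return [d * precision / recall for d in detected]

def relative_change(series):
    """Change from each point to the next, as a fraction of the earlier point."""
    return [(later - earlier) / earlier
            for earlier, later in zip(series, series[1:])]

# Invented monthly counts and invented error rates, for illustration only.
detected = [100, 150, 120]
estimated_true = scale_to_true_volume(detected, precision=0.5, recall=0.25)
```

Here `relative_change(detected)` and `relative_change(estimated_true)` give the same values. If the error rates drifted over time, for example because new slang appears that the word list does not cover, the two shapes could diverge.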

More information

I have separated the queries by “peer-group”. One query, for instance, is called “jew-enemy”, another one “muslim-enemy”. The curves for these queries are not completely parallel and hence provide further interesting insights.
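One simple way to put a number on “not completely parallel” is the Pearson correlation between two query curves. This is just a suggestion for readers who want to work with such data, not part of the detector; the function name is made up for this example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long series of counts."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)
```

A value close to 1 means two curves rise and fall together; a clearly lower value means the two groups are targeted at different times.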

Since the detector has now been running for more than two years, I have collected some data from that time. It is available in principle.