Technology

Using machine learning to slow the spread of hate speech.

You can’t stop what you don’t understand.

The first key to countering hate speech is to have a clear definition of what it is.

We adapted Dr. Gregory Stanton’s Ten Stages of Genocide to create a structure for hate speech identification. First presented in a briefing to the U.S. Department of State in 1996, Stanton’s framework helped us understand the process of classification and dehumanization.

We adapted it by condensing the stages and removing those that aren’t relevant to Twitter (e.g., extermination). We also added contemporary phenomena found on social media (e.g., coded language).

Our hate speech classifications

Mode: 5. Intention
Output: Incitement to genocide; Incitement to general violence; Incitement to specific violence; Incitement to degrade and discriminate

Mode: 4. Polarization
Output: Inculpation of target group; Historical negationism; Promotion of known hate groups; Exclusion of target group

When there is such a volume, we have to ask ourselves: what can we do? What can the Internet service providers do? What can vast segments of society do? How do we hold people accountable and create safe spaces online the way we expect those spaces to be in the real world?

— Oren Segal | Director of ADL's Center on Extremism

We teach machines to help us.

The power of machine learning is that it allows us to analyze thousands of tweets and return hate classifications within milliseconds. The flexibility of our platform allows us to continually adapt our model to the evolving terminology hate groups use on social media.

Step 1: Build a Machine

We leverage enterprise-level AI platforms, using Natural Language Processing and Image Recognition APIs to digest and interpret messages as they are posted, in near real time.

Step 2: Train the Machine

Our Machine needs to be good at sniffing out one thing: hate speech. So we need to feed it a stream of hate speech from social media to break down and learn from. We use Spredfast, an intelligent social listening platform, to moderate incoming messages and categorize them into streams of hate speech. Those streams are fed, on an ongoing basis, into our Machine so it can learn the linguistic nuances of hate speech.
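The training step described above can be sketched with an open-source stack. Here scikit-learn stands in for the enterprise NLP platform, and the labeled messages are invented placeholders rather than real moderated data; the structure (labeled stream in, fitted classifier out) is the point.

```python
# Minimal sketch of training a supervised hate-speech classifier.
# scikit-learn stands in for the enterprise NLP platform; the
# messages and labels below are invented placeholders, not real
# moderated data from the Spredfast stream.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder for a human-moderated stream of labeled messages.
messages = [
    "great game last night, what a finish",      # benign
    "they are vermin and should be driven out",  # dehumanizing
    "love this new coffee place downtown",       # benign
    "our country must be cleansed of them",      # incitement
]
labels = ["not_hate", "hate", "not_hate", "hate"]

# TF-IDF features + logistic regression: a simple baseline model.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
model.fit(messages, labels)

# Classify a new incoming message in near real time.
prediction = model.predict(["they should all be driven out"])[0]
print(prediction)
```

A real deployment would train on far more data and many more labels (the classification modes above), but the feed-and-fit loop is the same shape.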

But even with artificial intelligence, there are challenges in identifying hate speech online.

Machines have trouble understanding the subjectivity and nuance of hate speech. See the examples below, both referencing "third world" in very different ways.

Not Hate Speech

Folks like to say baseball is boring but we're about to have the third World Series Game 7 in four years. This one needed to go the distance

Uber Hateful

We defeated Hitler so we could pay for endless third worlders who gang rape our women while bearded “women” demand to be called women and Berlin could pay for a teaching manual that provides instruction for teachers on how to teach gender diversity issues to pre-school children
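The two tweets above illustrate why simple keyword matching is not enough. A naive filter that flags any message containing "third world" treats both the same, even though only one is hateful. This is a minimal sketch of that failure mode (the `naive_flag` function is hypothetical, for illustration only):

```python
# A naive keyword filter has no notion of context: it flags both the
# benign baseball tweet and the genuinely hateful one, because both
# contain the substring "third world". This is the gap a trained
# model has to close.
def naive_flag(message: str) -> bool:
    """Flag a message if it contains a watchlist term (no context)."""
    watchlist = ["third world"]
    text = message.lower()
    return any(term in text for term in watchlist)

benign = ("Folks like to say baseball is boring but we're about to have "
          "the third World Series Game 7 in four years.")
hateful = "We defeated Hitler so we could pay for endless third worlders ..."

print(naive_flag(benign), naive_flag(hateful))  # True True -- both flagged
```

A learned model with word- and phrase-level features can weigh the surrounding context ("World Series Game 7" vs. "third worlders who...") instead of a bare substring.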

Supervised machine learning.

We Counter Hate is a human-moderated platform.

Our machine learning platform is continuously finding hate speech for us to counter, and we continuously give it feedback on its classifications. This loop refines our model over time, increasing the reliability of the hate speech we counter.
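The human-in-the-loop cycle described above can be sketched as follows. The model proposes a label, a human moderator confirms or corrects it, and corrected examples are folded back into the training set before retraining. scikit-learn, the toy data, and the hard-coded moderator decision are all stand-ins for illustration.

```python
# Sketch of the human-moderated feedback loop: predict, review,
# fold corrections back in, retrain. All data here is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

messages = ["they are vermin", "what a great game"]
labels = ["hate", "not_hate"]

def retrain(msgs, lbls):
    """Fit a fresh model on the current labeled set."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    return model.fit(msgs, lbls)

model = retrain(messages, labels)

# A new message arrives; the model proposes a label ...
incoming = "send them all back where they came from"
proposed = model.predict([incoming])[0]

# ... and a human moderator reviews it. The decision is hard-coded
# here; in production it would come from the moderation platform.
moderator_label = "hate"

if proposed != moderator_label:
    # Disagreements are the most valuable training signal:
    # add the corrected example and retrain.
    messages.append(incoming)
    labels.append(moderator_label)
    model = retrain(messages, labels)
```

Each pass through this loop makes the next round of predictions slightly more reliable, which is the "supervised" part of supervised machine learning.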