The goal of the study was twofold. First, the researchers wanted to see whether they could quantify disingenuous behavior, such as leading people on by pretending to be legitimate, harassing others, or intentionally trying to derail forum threads. Second, having identified quantitative metrics that could discern trolling behavior, the team wanted to construct a classifier that would analyze the first few posts a person made and predict whether that person would go on to engage in trolling.

The data set used in the study consisted of comments collected from the forums at CNN, Breitbart, and IGN. Users were sorted into two groups: those who were eventually banned, and those who became productive, sincere members of the community. The researchers hypothesized that the users who were eventually banned would show tell-tale signs, such as different ways of speaking and different posting habits, that a machine learning algorithm could pick up.

Examples of the features that the team identified as characteristic of trolls, or “future banned users”, include vocabulary and overall writing style, as trolls tended to write with much harsher, more antagonistic language. Specifically, the researchers used sentiment analysis to determine whether the general tone of a comment was positive or negative, and looked for signals such as nuanced language versus absolutist statements.
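As a rough illustration of what tone features like these could look like, here is a minimal sketch. The word lists, the `tone_features` function, and the scoring scheme are all hypothetical and chosen for brevity; they are not the lexicon or the sentiment model the researchers used.

```python
import re

# Hypothetical word lists for illustration only; not the study's lexicon.
NEGATIVE = {"stupid", "idiot", "hate", "garbage", "pathetic"}
POSITIVE = {"thanks", "great", "agree", "interesting", "helpful"}
ABSOLUTIST = {"always", "never", "everyone", "nobody", "totally"}
HEDGES = {"maybe", "perhaps", "could", "might", "seems"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fraction_in(tokens, vocab):
    """Fraction of tokens that appear in vocab (0.0 for an empty comment)."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in vocab) / len(tokens)


def tone_features(comment):
    """Return crude tone features for one comment."""
    tokens = tokenize(comment)
    return {
        # Positive minus negative share: below zero means a harsher tone.
        "sentiment": fraction_in(tokens, POSITIVE) - fraction_in(tokens, NEGATIVE),
        # Share of absolutist words versus hedging (more nuanced) words.
        "absolutist": fraction_in(tokens, ABSOLUTIST),
        "hedged": fraction_in(tokens, HEDGES),
    }
```

A real system would likely replace the hand-written lists with a trained sentiment model, but the output has the same shape: a handful of numbers per comment that a classifier can consume.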

The research team also looked at the general readability of comments, and they employed a naïve Bayes algorithm to identify posts that were deviant or dissimilar from the rest of a given thread. The researchers noticed that in almost any given thread, the normal posters would communicate with each other using similar tones, vocabulary, and structure, while trolls’ posts would typically deviate from these social norms. In fact, trolls’ adherence to a forum’s norms tends to decline over time: they start out roughly following the guidelines and then deviate further and further until they are eventually banned.
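To make the idea of a post “deviating” from its thread concrete, the sketch below scores a post by how little vocabulary it shares with the other posts. Note that this swaps in plain bag-of-words cosine similarity as a simpler stand-in; it is not the naïve Bayes approach mentioned above, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens in a piece of text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def deviation_from_thread(post, other_posts):
    """Return 1 - similarity between a post and the combined other posts.

    0.0 means the post uses the same words in the same proportions as the
    rest of the thread; 1.0 means it shares no vocabulary with it at all.
    """
    thread = Counter()
    for other in other_posts:
        thread.update(bag_of_words(other))
    return 1.0 - cosine_similarity(bag_of_words(post), thread)
```

Tracking this score across a user's successive posts would give one way to observe the gradual drift away from a forum's norms described above.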

The team used all of these metrics to create a classifier that would predict, based on a user’s first 5-10 posts, whether that user would end up being banned at some point.
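The sketch below shows one way such a classifier could be wired together: per-post features are averaged over a user's first few posts and fed to a small logistic regression. The choice of logistic regression, the two features (negativity and thread deviation), and the training data are all assumptions made for illustration; none of them come from the study.

```python
import math

FIRST_N_POSTS = 10  # upper end of the 5-10 post window mentioned above


def user_features(post_features, n=FIRST_N_POSTS):
    """Average each per-post feature over a user's first n posts."""
    first = post_features[:n]
    return [sum(column) / len(first) for column in zip(*first)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ban_probability(model, features):
    """Predicted probability that a user with these features gets banned."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression with plain stochastic gradient descent."""
    model = ([0.0] * len(X[0]), 0.0)
    for _ in range(epochs):
        for features, label in zip(X, y):
            weights, bias = model
            error = ban_probability(model, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            model = (weights, bias - lr * error)
    return model


# Made-up training data: each user is a list of per-post
# [negativity, deviation] vectors; label 1 means eventually banned.
X = [
    user_features([[0.8, 0.9], [0.7, 0.8]]),
    user_features([[0.6, 0.7], [0.9, 0.9]]),
    user_features([[0.1, 0.2], [0.2, 0.1]]),
    user_features([[0.2, 0.3], [0.1, 0.2]]),
]
y = [1, 1, 0, 0]
model = train_logistic(X, y)
```

Calling `ban_probability(model, user_features(new_user_posts))` then yields a score between 0 and 1 for a new user, where `new_user_posts` is a hypothetical list of that user's per-post feature vectors.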

One application of the classifier is flagging specific users for increased attention from community moderators. It would act as a sort of “Troll Early Warning System”, notifying moderators that they may want to keep an eye on certain users. A human would still make the final call about whether a user is a troll and should be banned, but the algorithm would help moderators make more educated guesses about which users need their attention, keeping moderation teams from spreading themselves too thin.
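A minimal sketch of such an early-warning queue follows. The `RiskScore` type, the `review_queue` function, and the 0.7 threshold are hypothetical; the point is only that the code ranks users for human review and never bans anyone itself.

```python
from dataclasses import dataclass


@dataclass
class RiskScore:
    user_id: str
    ban_probability: float  # classifier output between 0 and 1


def review_queue(scores, threshold=0.7, limit=None):
    """Return users at or above the threshold, riskiest first.

    The result is only a watch list for human moderators. `limit` caps
    the list so that a small team is not handed more than it can review.
    """
    flagged = [s for s in scores if s.ban_probability >= threshold]
    flagged.sort(key=lambda s: s.ban_probability, reverse=True)
    return flagged if limit is None else flagged[:limit]
```

Lowering the threshold catches more potential trolls at the cost of more false alarms, which is a trade-off each moderation team would need to tune for itself.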

More recent efforts to fight trolls with machine learning have focused on weeding out toxic comments. Google’s Jigsaw unit released a piece of software dubbed “Perspective”, an API that lets developers use Jigsaw’s own AI to automatically detect toxic and abusive comments. Much like the work done by the Stanford and Cornell teams, the algorithm won’t have the final say in which comments get published, but it might make maintaining a website’s community easier.
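For readers curious what calling such an API involves, here is a hedged sketch of a toxicity request. The endpoint URL, the `TOXICITY` attribute, and the response fields reflect Perspective's public documentation as best understood at the time of writing and should be checked against the current docs; the helper function names are hypothetical, and a valid API key is needed for the network call.

```python
import json
import urllib.request

# Assumed from Perspective's public documentation; verify before use.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text, api_key):
    """Build (but do not send) a request asking for a toxicity score."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    return urllib.request.Request(
        f"{ANALYZE_URL}?key={api_key}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def toxicity_from_response(body):
    """Extract the summary toxicity score (0 to 1) from a parsed response."""
    return body["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def score_comment(text, api_key):
    """Send the comment to the API and return its toxicity score."""
    with urllib.request.urlopen(build_request(text, api_key)) as response:
        return toxicity_from_response(json.load(response))
```

A site would typically compare the returned score against a threshold of its own choosing and route borderline comments to a human moderator rather than rejecting them outright.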

The technology created by Jigsaw isn’t without its problems: the algorithms currently seem to have difficulty discerning the context of a comment. While the algorithms will hopefully become more sophisticated and better able to discern context, there are also concerns about possible discrimination and censorship.

The Jigsaw team acknowledges that some of these problems are genuine and has created ways of addressing them, such as the ability to report a false positive. Jigsaw founder Jared Cohen admits that the system has its flaws, but he believes it is a step in the right direction. Cohen says it’s a “milestone, not a solution.”

As for the issue of censorship, Cohen argues that the default position is actually one of censorship, as many sites simply employ blacklists of words deemed offensive, or else disable comments altogether. Cohen hopes that technology like Perspective will facilitate conversation by creating a better atmosphere for it. He hopes it will draw people back into having civil discussions with each other when they don’t agree, breaking up echo chambers.

It remains to be seen whether Perspective will have the effect Cohen hopes for. Yet as online communities grow ever larger and more complex, it seems increasingly likely that the efforts of Cohen and other researchers will pave the way for more AI-based methods of moderation.
