Google fights online trolls with new tool

The Internet can be an ugly place — one where the mere act of expressing an opinion can result in a barrage of name-calling, harassment and sometimes threats of violence.

Nearly half of U.S. Internet users say they have experienced such intimidation; a third say they have resisted posting something online out of fear, according to the nonprofit Data and Society Research Institute. Women, particularly young women and women of color, are disproportionately targeted.

Now Google is zeroing in on the problem. On Thursday, the company publicly released an artificial intelligence tool, called Perspective, that scans online content and rates how “toxic” it is based on ratings by thousands of people.

For example, you can feed an online comment board into Perspective and see the percentage of users that said it was toxic. The toxicity score can help people decide whether they want to participate in the conversation, said Jared Cohen, president of Jigsaw, the company’s think tank (previously called Google Ideas). Publishers of news sites can also use the tool to monitor their comment boards, he said.

People can also feed specific words and phrases into Perspective to check how they’ve been rated. A quick scan of some very ugly words yielded counterintuitive results: The n-word was rated as 82 percent toxic; c---, a term for women’s genitalia, was 77 percent toxic; k---, a derogatory word for a Jewish person, was 39 percent toxic; and c----, a slur for a Chinese person, was 32 percent toxic. If you add the phrase “you are a” to any of those words, the toxicity score goes up.

Cohen emphasized that Perspective was a work in progress and would only improve if people contributed to it.

Google’s troll-fighting efforts trail those of other tech companies and nonprofit groups. Earlier this month, Twitter, which has developed a reputation as a playground for abuse, launched new tools to cut down on trolling. The company said it would begin retaining more user data as part of an effort to prevent people who have harassed others from deleting their accounts and then reemerging under a new username. It also said it was tweaking its algorithms to flag certain tweets as “potentially abusive or low-quality.”

Hack Harassment, a group founded by Intel, Vox Media and Lady Gaga’s Born This Way Foundation, is working to raise awareness. Other organizations, such as TrollBusters and Crash Override Network, are support groups for people who have experienced harassment online.

Last year, gaming entrepreneur Brianna Wu, along with Harvard professors, the International Women’s Media Foundation and the social-services agency Digital Sisters, sent a list of demands to technology companies for how they can clean up their services.

To build a model that can predict hate speech, Cohen said that the Jigsaw team had originally acquired a data set of 17 million reader comments from the New York Times. The company also mined the comment sections of Wikipedia and collected data from victims of online harassment who had kept a record of their experiences.

From there, the company hired several thousand people who rated the comments as toxic or not toxic, as well as whether they thought such comments were personal attacks. Jigsaw developed its toxicity scores based on the annotations. Because there was widespread disagreement on what constituted a personal attack, Cohen’s team ultimately decided not to use “personal attack” as a rating category.
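
As a rough illustration of how annotations could turn into a score, the sketch below assumes the score is simply the share of raters who marked a comment as toxic, which is how the percentages above are described. Jigsaw's actual model is not detailed here; the function name and the sample ratings are hypothetical.

```python
from typing import List


def toxicity_score(labels: List[bool]) -> float:
    """Return the percentage of raters who marked a comment as toxic.

    Assumption: each rater gives one toxic / not-toxic label per comment.
    """
    if not labels:
        raise ValueError("at least one rating is required")
    # True counts as 1 and False as 0, so sum() gives the number of "toxic" votes.
    return 100.0 * sum(labels) / len(labels)


# Hypothetical ratings from ten annotators for a single comment.
ratings = [True, True, False, True, False, True, True, True, False, True]
print(f"{toxicity_score(ratings):.0f} percent toxic")
```

A real system would likely train a model on such labels so it can score comments no one has rated yet; this sketch covers only the aggregation step.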

It’s up to publishers to decide how to use the tool. Some may choose to show the scores to readers and use crowdsourcing methods that are similar to current systems where readers can flag offensive language to human moderators. Others may choose to use the data to clean up the language on their own sites.

Asked whether the tool could result in censoring free speech, Cohen said that the software wasn’t intended to bypass human judgment, but to flag “low-hanging fruit” that could then be passed on to human moderators.
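
One way a publisher might apply that idea is sketched below: comments scoring above a cutoff are queued for a human moderator rather than removed automatically. The 80 percent cutoff, the comment IDs and the function name are all assumptions made for illustration, not values or interfaces from Jigsaw.

```python
from typing import Dict, List

REVIEW_THRESHOLD = 80.0  # assumed cutoff in percent, chosen only for this example


def flag_for_review(scores: Dict[str, float],
                    threshold: float = REVIEW_THRESHOLD) -> List[str]:
    """Return IDs of comments at or above the threshold, most toxic first.

    Nothing is deleted here; the list is meant to be handed to human moderators.
    """
    flagged = [cid for cid, score in scores.items() if score >= threshold]
    return sorted(flagged, key=lambda cid: scores[cid], reverse=True)


# Hypothetical toxicity scores keyed by comment ID.
scores = {"c1": 12.0, "c2": 91.0, "c3": 82.0, "c4": 45.0}
queue = flag_for_review(scores)
print(queue)
```

Keeping the final decision with a moderator reflects Cohen's point that the scores are a first filter, not a verdict.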