Training bias in AI "hate speech detector" means that tweets by Black people are far more likely to be censored

More bad news for Google's beleaguered spinoff Jigsaw, whose flagship project is "Perspective," a machine-learning system designed to catch and interdict harassment, hate-speech and other undesirable online speech.

Specifically, researchers report that candidate texts written in African American English (AAE) are 1.5x more likely to be rated as offensive than texts written in "white-aligned English."

The authors do a pretty good job of pinpointing the cause: the people who hand-labeled the training data for the algorithm were themselves biased, and systematically misidentified AAE writing as offensive. And since machine learning models are no better than their training data (though they are often worse!), the bias in the data propagated through the model.

We analyze racial bias in widely-used corpora of annotated toxic language, establishing correlations between annotations of offensiveness and the African American English (AAE) dialect. We show that models trained on these corpora propagate these biases, as AAE tweets are twice as likely to be labelled offensive compared to others. Finally, we introduce dialect and race priming, two ways to reduce annotator bias by highlighting the dialect of a tweet in the data annotation, and show that it significantly decreases the likelihood of AAE tweets being labelled as offensive. We find strong evidence that extra attention should be paid to the confounding effects of dialect so as to avoid unintended racial biases in hate speech detection.
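Figures like "1.5x more likely" and "twice as likely" are ratios between two rates: the share of AAE texts flagged as offensive, divided by the share of other texts flagged. Here is a minimal Python sketch of that arithmetic. The function name `flag_rate_ratio`, the group labels, and the toy records are all made up for illustration. They are not the paper's data or code, and the paper's own methodology may differ.

```python
from collections import Counter

def flag_rate_ratio(records, group_a="AAE", group_b="other"):
    """Return (rate_a, rate_b, rate_a / rate_b).

    `records` is an iterable of (dialect_label, was_flagged) pairs.
    The labels and this helper are hypothetical, for illustration only.
    """
    totals = Counter()
    flagged = Counter()
    for dialect, was_flagged in records:
        totals[dialect] += 1
        if was_flagged:
            flagged[dialect] += 1
    rate_a = flagged[group_a] / totals[group_a]
    rate_b = flagged[group_b] / totals[group_b]
    return rate_a, rate_b, rate_a / rate_b

# Made-up toy data: 3 of 10 AAE texts flagged, 2 of 10 other texts flagged.
toy_records = (
    [("AAE", True)] * 3 + [("AAE", False)] * 7
    + [("other", True)] * 2 + [("other", False)] * 8
)

rate_aae, rate_other, ratio = flag_rate_ratio(toy_records)
print(f"AAE flagged: {rate_aae:.0%}, other flagged: {rate_other:.0%}, ratio: {ratio:.2f}x")
```

A ratio above 1 only says that one group is flagged more often. On its own it cannot say whether the extra flags are mistakes, which is why the authors also look at how the annotators labeled the training data.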
