On Tuesday, more than 8,000 Starbucks locations across the U.S. closed their doors for an afternoon of racial sensitivity training. As the company has indicated in interviews, the training will include “critical” instruction on the problem of “unconscious” bias.

Like countless other corporate seminars on unconscious bias, Starbucks’ is likely to rely heavily on the “implicit association test,” a psychological measure purporting to reveal hidden bias within a person’s subconscious.

There’s just one problem: The test is seriously flawed and may not predict racist behaviour.

This test is used absolutely everywhere

The race version of the implicit association test is a simple computer test that asks users to sort faces by black and white. Then, it asks users to sort those faces simultaneously with a number of words: white faces and “good” words (glorious, peace, etc.) go in one category, while black faces and “bad” words (failure, awful, etc.) go in the other category. Finally, the word categories are switched, and users must pair “bad” words with white, and “good” words with black. A complex algorithm undergirds the test, but essentially, if a user takes longer to associate “good” words with black faces than with white, the test is going to diagnose them as having an “automatic preference for white people compared to black people.”

Invented in the late 1990s at the University of Washington, the test has now utterly permeated how American society views racism. Universities, governments and corporations have adopted the test as a keystone of racial sensitivity training. Statisticians are using the test to measure the relative racism of American demographic groups. Former Democratic presidential candidate Hillary Clinton endorsed implicit bias testing in an election speech, linking implicit bias to police shootings of black men. The test’s creators have called for it to be a mandatory component of jury selection. A version of the IAT has even been put forward as a foolproof lie detector test. In short, it’s kind of important that a number of people are accusing the test of being extremely flawed.
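
For readers curious about the arithmetic, the core idea described above (slower responses on one pairing than on the other) can be sketched in a few lines of Python. This is a simplified illustration only, not the scoring algorithm the test actually uses: the function name, the variable names and the idea of dividing by the spread of all response times are assumptions made for this example.

```python
from statistics import mean, stdev


def simplified_iat_score(white_good_ms, black_good_ms):
    """Return a rough latency-difference score (hypothetical, simplified).

    white_good_ms: response times in milliseconds when "good" words
                   share a category with white faces.
    black_good_ms: response times in milliseconds when "good" words
                   share a category with black faces.

    A positive score means the user was slower, on average, when
    pairing "good" words with black faces.
    """
    all_trials = list(white_good_ms) + list(black_good_ms)
    spread = stdev(all_trials)  # variability across every trial
    if spread == 0:
        # Every response took exactly the same time: no difference to report.
        return 0.0
    return (mean(black_good_ms) - mean(white_good_ms)) / spread
```

Because the score depends entirely on response times, anything that slows or speeds a user on a given day shifts the number, whether or not it has anything to do with race.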

The tests have wildly inconsistent results

If the implicit association test determines you’re a “moderate” racist, there’s a surprisingly large chance that you’ll get a different result if you take the test again. Psychological tests are routinely graded on their “test-retest” reliability, with a grade of 0.7 being considered the minimum for “acceptable reliability.” The IAT can only manage a grade of about 0.5, the threshold of “unacceptable reliability.”

Many test-takers have experienced the IAT’s unreliability firsthand. Vox reporter German Lopez took the test three times and got three different results: biased against blacks, biased against whites and, finally, the rare result of not being biased against anybody. “I felt like I had gotten no real answers about my bias from this test,” he wrote. It’s for this reason that critics have accused the IAT of repeatedly failing to live up to basic scientific standards.
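
One common way to put a number on test-retest reliability is to have the same people take a test twice and compute the correlation between their two sets of scores. The Python sketch below assumes that approach (a plain Pearson correlation); the function names are hypothetical, and the 0.7 cutoff is simply the “acceptable reliability” figure cited above.

```python
from math import sqrt


def pearson_r(first_session, second_session):
    """Pearson correlation between paired scores from two sittings."""
    n = len(first_session)
    if n != len(second_session) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(first_session) / n
    mean_y = sum(second_session) / n
    # Co-variation of the paired scores around their means.
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(first_session, second_session))
    var_x = sum((x - mean_x) ** 2 for x in first_session)
    var_y = sum((y - mean_y) ** 2 for y in second_session)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores in a session must not all be identical")
    return cov / sqrt(var_x * var_y)


def is_acceptably_reliable(r, threshold=0.7):
    """True if the correlation meets the cutoff cited in the article."""
    return r >= threshold
```

A result near 1.0 means people land in roughly the same place each time they sit the test. By this yardstick, the IAT’s reported figure of about 0.5 falls short of the cutoff.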

There is startlingly little evidence that the test predicts racist behaviour

More than 17 million people have taken an online version of the IAT hosted by Harvard University. While the results have definitively revealed that millions of people have slower reaction times when associating positive words with black faces, the evidence isn’t nearly as clear on whether this is an accurate way to diagnose hidden racism.

There are now enough published studies on the IAT that researchers have been able to conduct a number of “meta-analyses” averaging the results. “When you use meta-analyses to examine the question of whether IAT scores predict discriminatory behaviour accurately enough for the test to be useful in real-world settings, the answer is: No,” reads a detailed 2017 takedown of the IAT in New York magazine.

Author Jesse Singal highlighted a whole galaxy of published critiques against the test, including the fact that someone who does crosswords every morning is likely to score as less biased by simple virtue of being better at word association games. Being hungover, in a loud room, or even just worried that you’ll get a bad result could all have dramatic effects on the final test results. “If the test can’t predict individual behavior, it’s unclear exactly what it does do or why it should be the centre of so many conversations and programs geared at fighting racism,” wrote Singal in a follow-up.

One meta-analysis out of the University of Wisconsin came to a particularly troubling conclusion: If someone is truly a racist, taking an IAT probably isn’t going to stop them. As researchers wrote, “changing implicit bias does not necessarily lead to changes in explicit bias or behaviour.”

Even the test’s creators have warned not to take it too seriously

In a 2014 paper, the creators of the IAT themselves warned that its most popular use was also its most flawed. The authors, who included the University of Washington’s Anthony Greenwald and Harvard’s Mahzarin R. Banaji, wrote that the test should not be used to “classify persons as likely to engage in discrimination.” The paper adds “attempts to use such measures diagnostically for individuals therefore risk undesirably high rates of erroneous classifications.”

Critics including Jesse Singal and Quartz science writer Olivia Goldhill have pointed out that this note of caution differs strongly from the numerous public statements in which Greenwald and Banaji have touted the test as a new and valuable tool in predicting discriminatory behaviour. Goldhill excerpted one section from the pair’s 2013 book, Blindspot, in which they cite the “empirical truth” of egalitarian-minded individuals having their “discriminatory behaviour” revealed by the test. “Among research participants who describe themselves as racially egalitarian, the Race IAT has been shown, reliably and repeatedly, to predict discriminatory behaviour that was observed in the research,” it read.

It might be preventing us from dealing with actual racial problems

The rise of the implicit bias test has popularized the notion that if only well-meaning Americans could slay the racist beast within their subconscious, everything would be fine. But there are voices on both sides of the political spectrum arguing that this admittedly enticing idea allows society to ignore much more meaningful fixes to racial inequity. Vox.com speculated that all the energy spent on hunting for “implicit bias” in the masses might be providing cover to a subset of “conscious” racists who are actually perpetrating all the racism. Tech giants like Google and Facebook have enthusiastically adopted implicit bias training, only to see it have almost no effect on their corporate diversity.

A lot of this might be Canada’s fault

If history ultimately determines that the IAT has been an enormous, time-wasting red herring, a Canadian will shoulder some of the responsibility. Although the test was invented by Americans, it wasn’t truly propelled into stardom until the 2005 publication of Blink, by Canadian author Malcolm Gladwell. The test features prominently in the book, and receives one of its more full-throated endorsements as a foolproof detector of hidden racism. “The IAT is more than just an abstract measure of attitude,” he wrote. “It’s also a powerful predictor of how we act in certain kinds of spontaneous situations.” Gladwell, who is mixed race and from Waterloo County, Ont., found that his own result was “a moderate automatic preference for whites.”