Tag Archives: WeaponsofMathDestruction

On the hiring side, I’m not sure whether algorithmic arbitrariness or human arbitrariness is worse. I have a sense that, distinct from the expected biases (ethnicity, gender, geography/wealth), algorithms might bias for similarity. That is, they bias against candidates who have the skills to do the job but whose previous job titles or majors aren’t a close, word-for-word match for the job description. Of course, humans might be just as likely to have that bias, but a human who wanted to think “outside the box” could at least be metacognitively aware of it.
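To make the similarity bias concrete, here is a toy sketch (my own construction, not any real hiring product) of a naive resume screen that scores candidates by verbatim word overlap with the job posting:

```python
# Hypothetical illustration of similarity bias: score candidates by
# word-for-word overlap between the job description and the resume.

def keyword_score(job_description: str, resume: str) -> float:
    """Fraction of job-description words appearing verbatim in the resume."""
    job_words = set(job_description.lower().split())
    resume_words = set(resume.lower().split())
    return len(job_words & resume_words) / len(job_words)

job = "senior data engineer python pipelines"

# A candidate whose titles echo the posting outranks one with equivalent
# skills described in different words, regardless of actual ability.
match = keyword_score(job, "data engineer building python pipelines")
skilled = keyword_score(job, "statistician who automates etl in pandas")
print(match, skilled)  # 0.8 0.0
```

A screen like this never sees the skills themselves, only the vocabulary used to describe them, which is exactly the bias against "outside the box" candidates described above.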

I found the next chapter, “Sweating Bullets,” more alarming. The core of the problem is that outside of widget production for a factory worker or sales volume, the link between what an individual worker does and an institutional KPI is often tenuous. My instinct is that bad algorithms full of second- or third-order proxies make this much worse than a human-based system with safeguards (such as a 360-degree evaluation).

Did anyone else find the sociometric badge used in the call center (132) seriously creepy?

As to one of Bryan’s questions, about whether boycotts can provide a meaningful check on this sort of thing, it seems to me it might work in the public sector where transparency can be enforced via FOIA, but I have little hope for the private sphere. Boycotts sound good, but are rarely well enough organized or maintained to provoke real change.

Notes and Quotes

“…we’ve seen time and again that mathematical models can sift through data to locate people who are likely to face great challenges, whether from crime, poverty, or education. It’s up to society whether to use that intelligence to reject and punish them — or to reach out to them with the resources they need.” (118)

“The root of the trouble, as with so many other WMD’s, is the modeler’s choice of objectives. The model is optimized for efficiency and profitability, not for justice or the good of the ‘team.’ This is, of course, the nature of capitalism.” (129-130)

I was struck the other day by how similar Cory Doctorow’s whuffie system (from Down and Out in the Magic Kingdom), the rating system in the Black Mirror episode “Nosedive” and the Chinese social credit system I described in last week’s post are.

Here’s last week’s prompt for the Weapons of Math Destruction Book Club, I have chosen to ignore the provided questions, however. (Sorry, Bryan)

My big takeaway from these chapters is the importance of the decisions that are made about how to use data. Both predatory recruiting and nuisance policing seem to start with explicitly harmful (the former) or flawed (the latter) justifications. This makes the issue one of big data making it easier for people to do bad things.

The description of how the Chicago predictive policing initiative included social network analysis reminded me of the Social Credit system China is developing. (See this article from the Independent or this one from the Financial Times [warning: paywall].) Incidentally, the Independent article has a video, and I was shown a pre-roll Lexus ad that was in Mandarin.

Unlike the Chicago system, where one’s score is presumably known only to police, the Chinese system, whose “social credit score” algorithm includes your activity on social media and the scores of your friends, makes those scores public. This encourages you either to lean on your friends with low scores in an effort to improve their behavior or to shun them; both approaches will improve the social component of your score. I wonder to what extent social credit scores are used in the Western world and we just don’t know about it yet.
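The incentive to shun low-scoring friends can be sketched in a few lines. This is my own toy model, not the actual Chinese algorithm; the blend weight is an arbitrary assumption:

```python
# Toy model of a score with a "social component": your score blends
# your own behavior with the average of your friends' scores.

def social_score(own_behavior: float, friend_scores: list[float],
                 social_weight: float = 0.3) -> float:
    """Weighted blend of one's own behavior and friends' mean score."""
    if not friend_scores:
        return own_behavior
    friend_avg = sum(friend_scores) / len(friend_scores)
    return (1 - social_weight) * own_behavior + social_weight * friend_avg

# Dropping a low-scoring friend raises your score with no change
# whatsoever in your own behavior.
with_friend = social_score(80.0, [90.0, 40.0])   # 75.5
after_shunning = social_score(80.0, [90.0])      # 83.0
print(with_friend, after_shunning)
```

Once scores are public and mutually dependent like this, shunning becomes the cheapest way to improve your own number, which is the dynamic that makes the system so effective at enforcing conformity.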

NOTES AND QUOTES*

(96) Justice cannot just be something that one part of society inflicts upon the other.
(102) Part of the analysis that led police to McDaniel involved his social network.

*Yes, I’m aware that it should probably be Notes and Quotations, but I will sacrifice grammatical accuracy for rhyme scheme.

I’m moving on to Chapters 2 and 3 of Weapons of Math Destruction. In this week’s prompt, Bryan asks:

If creating or running a WMD is so profitable, how can we push back against them?

By making them less profitable. The only way I see to do this is to require some human intervention in the decision processes these algorithms facilitate. Making a person responsible for verifying the algorithmic outputs would at least improve accountability. In the event of egregious harms, the person who signed off on the algorithmic output could be held to account for his or her decision.

Do you find other university ranking schemes to be preferable to the US News one, either personally or within this book’s argument?

I don’t know enough about them to say.

At one point the author suggests that gaming the US News ranking might not be bad for a university, as “most of the proxies… reflect a school’s overall quality to some degree” (58). Do you agree?

This doesn’t matter. Even if the proxies are good proxies, the fact that it’s a ranking system creates the arms race condition which forces colleges to game the system aggressively by doing things like rejecting qualified applicants who are unlikely to enroll. O’Neil discusses this decline of the safety school. The root of the problem is the role of reputation in the whole system.

Not surprisingly the discussion of college rankings in Chapter 3 resonated strongly with me for two reasons:

I applied to colleges just at the end of Ivy League collusion on financial aid offers. I wonder how the early effects of the US News rankings might have affected me as an Ivy League applicant.

As the parent of teenagers, I had only a vague sense of how the admissions process has changed since I was a college applicant. I worry for my children.

Note:

“However, when you create a model from proxies, it is far simpler for people to game it.” (55)

A. “What would it take for an education algorithm to meet all of O’Neil’s criteria for not doing damage?”

The big problem with almost all educational measurement is the use of proxies (O’Neil’s term) that are clumsy at best for the learning we want to measure. Since the fundamental output metric is a test, with all the possibilities for manipulation that suggests, when we then try to measure which input changes improve that output we are at least two levels of abstraction removed. Until we can measure educational outcomes some way other than by means of a crude, manipulable proxy, I’m not sure we can fix this.

B. “What are the best ways to address the problem of ‘false positives’, of exceptionally bad results, of anomalies?”

I think the best way to solve this problem is to place some limits (preferably not determined by an algorithm) on the kinds of decisions we allow algorithms to make without human input. The potential harm of a bad book recommendation from Amazon is much lower than that of the high-stakes decisions O’Neil describes. That probably means thoughtful review of every adverse algorithmic recommendation by at least one live human being. Of course, this undermines the efficiency and scale that algorithms are designed to create. An important step is to acknowledge that algorithms are not neutral even if they manage not to be arbitrary. They encode the assumptions and biases of their creators, and acknowledging those assumptions and biases is a key part of the design process.

The DC schools example draws attention to the importance of checking for flawed input data. After all, the algorithm is only as accurate as the data you feed it.

Notes: O’Neil’s three criteria for a Weapon of Math Destruction are “opacity, scale, and damage.” She uses the initialism WMD. I wish she had come up with something else, because of the namespace confusion with chemical, biological and nuclear weapons.

Opacity makes me think of Frank Pasquale’s The Black Box Society, which I haven’t read yet. The synopses of Pasquale’s book make me wonder how his and O’Neil’s work intersect.