In case you don’t know the lingo, A/B testing is a test done by marketers to decide which of two ad designs is more effective – the ad with the dark blue background or the ad with the dark red background, for example. But in this case it was more like, the ad with Obama’s family or the ad with Obama’s family and the American flag in the background.

The idea is, as a marketer, you offer your target audience both ads – actually, any individual in the target audience either sees ad A or ad B, randomly – and then, after enough people have seen the ads, you see which population responds more, and you go with that version. Then you move on to the next test, where you keep the characteristic that just won and you test some other aspect of the ad, like the font.
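To make the mechanics concrete, here is a minimal sketch of one such test, under assumptions of my own: the response rates (10% for ad A, 13% for ad B), the visitor count, and the function names are all made up for illustration, and the comparison uses a standard two-proportion z statistic with the conventional 1.96 cutoff. It is not a description of what the campaign actually ran.

```python
import math
import random


def assign_variant(rng):
    """Each visitor sees ad A or ad B at random, with equal probability."""
    return "A" if rng.random() < 0.5 else "B"


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference in response rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


def run_test(true_rates, visitors, seed=0):
    """Simulate showing the ads and counting who responds."""
    rng = random.Random(seed)
    shown = {"A": 0, "B": 0}
    responded = {"A": 0, "B": 0}
    for _ in range(visitors):
        variant = assign_variant(rng)
        shown[variant] += 1
        # Hypothetical "true" response rates; in real life these are unknown.
        if rng.random() < true_rates[variant]:
            responded[variant] += 1
    return shown, responded


shown, responded = run_test({"A": 0.10, "B": 0.13}, visitors=20000)
z = two_proportion_z(responded["A"], shown["A"], responded["B"], shown["B"])
# |z| > 1.96 is the usual 5% two-sided threshold for calling a winner.
winner = "B" if z > 1.96 else "A" if z < -1.96 else "no clear winner"
```

The "after enough people have seen the ads" part is doing real work here: with only a few hundred visitors, a three-point difference like this one would often fail to clear the threshold.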

As a mathematical testing framework, A/B testing is interesting and has structural complications – how do you know you’re getting a global maximum instead of a local maximum? In other words, if you’d first tested the font, and then the background color, would you have ended up with a “better ad”? What if there are 50 things you’d like to test, how do you decide which order to test them in?
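Here is a toy illustration of the ordering problem, with invented numbers. I assume two features with two options each, a made-up table of "true" response rates in which background and font interact, and noiseless tests, which real tests never are. Testing one feature at a time and keeping the winner is a greedy search, and the sketch compares it with simply trying every combination.

```python
from itertools import product

OPTIONS = {"background": ["blue", "red"], "font": ["serif", "sans"]}

# Hypothetical true response rates; note the interaction between features.
RATE = {
    ("blue", "serif"): 0.10,
    ("red", "serif"): 0.12,
    ("blue", "sans"): 0.11,
    ("red", "sans"): 0.09,
}


def rate(ad):
    return RATE[(ad["background"], ad["font"])]


def sequential_tests(start, order):
    """Test one feature at a time, keeping whichever option wins."""
    ad = dict(start)
    for feature in order:
        ad = max(({**ad, feature: choice} for choice in OPTIONS[feature]), key=rate)
    return ad


def exhaustive_best():
    """Try every combination; only feasible when there are few features."""
    names = list(OPTIONS)
    candidates = (dict(zip(names, combo)) for combo in product(*OPTIONS.values()))
    return max(candidates, key=rate)


start = {"background": "blue", "font": "serif"}
background_first = sequential_tests(start, ["background", "font"])
font_first = sequential_tests(start, ["font", "background"])
best = exhaustive_best()
```

In this table, testing the background first ends at the red, serif ad, which is also the best of the four. Testing the font first ends at the blue, sans ad, which beats both of its one-change neighbors but is not the best overall: a local maximum. With 50 features of two options each, the exhaustive search would need 2^50 combinations, which is why nobody does it that way.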

But that’s not what interests me about Kyle’s Obama A/B testing blogpost. Rather, I’m fascinated by the definition of success that was chosen.

After all, an A/B test is all about which ad “works better,” so there has to be some way to measure success, and it has to be measured in real time if you want to go through many iterations of your ad.

In the case of the Obama campaign, there were two definitions of success, or maybe three: how often people signed up to be on Obama’s newsletter, how often they gave money, and how much money they gave. I infer this from Kyle’s braggy second sentence, “Overall we executed about 500 a/b tests on our web pages in a 20 month period which increased donation conversions by 49% and sign up conversions by 161%.” Those were the measures Kyle and his team were optimizing.

Most of the blog post focused on getting people to donate more, and specifically on getting them to fill out the credit card donation page form. Here’s what they A/B tested:

Our plan was to separate the field groups into four smaller steps so that users did not feel overwhelmed by the length of the form. Essentially the idea was to get users to the top of the mountain by showing them a small incline rather than a steep slope.

What I find super interesting about this stuff (and of course this is not the only “data science” that was used in Obama’s campaign; there was a separate team focused on getting Facebook users to share their friends’ lists and such) is that nowhere is there even a slight nod to the question of whether this stuff will improve or even maintain democracy. They don’t even discuss how maintainable this is.

I mean, we gave the Obama analytics team lots of credit for stuff, but in the end what they did was optimize a bunch of people’s donation money. Is that something we should cheer? It seems more like an arms race with the Republican party, in which the Democrats pulled ahead temporarily. And all it means is that the fight for donations will be even more manipulative, by both sides, by the next presidential election cycle.

As Felix Salmon pointed out to me over beer and sausages last week, the problem with big data in politics is that the easiest thing you can measure in politics is money, which means everything is optimized to that metric of success, leaving all other considerations ignored and probably stifled. And yes, “sign ups” are also measurable, but they more or less correspond to people who will receive weekly or daily requests for money from the candidate.

Readers, please tell me I’m wrong. Or suggest a way we can measure something and optimize to something that is less cynical than the size of a war chest.


Yesterday, the Wall Street Journal reported they had obtained the internal staff report from the Federal Trade Commission, where top staff urged action against Google for abusing its monopoly position.

On four key questions, (1) Did Google illegally favor its own content over rivals, (2) Did Google illegally copy content from rival sites, (3) Did Google illegally restrict advertisers' ability to run campaigns on rival search engines, and (4) Did Google illegally restrict other websites that publish its search results from working with rival search engines, the staff report found that Google had violated the law and that the FTC should take action.

However, when the Commission made its ruling, while a few commissioners expressed concern, they argued in the end that Google had done nothing illegal and no action was warranted. Notably, the commissioners did not even reference most of the evidence outlined in this unredacted report, which was based on nine million pages of documents from Google and other parties. One important detail the FTC staff had found was that Google's dominance of search was not 65% as widely reported but potentially as high as 84% of U.S. search queries.

The FTC Commissioners may well have just had a different view of the evidence than their staff -- although I obviously think the staff had the better arguments (see my law review articles here and here on other approaches I argue the government should be taking against Google on antitrust) -- but what's surprising is that anyone would expect the political appointees at the FTC to take any other action. The Republican-chosen members had an ideological bias against antitrust action of almost any kind to begin with, while the Democratic nominees were just very unlikely to go after one of the most important political friends of President Obama.

Eric Schmidt, Chairman and previously CEO of Google, was not any ordinary political funder of Obama. He was a core political operative for the campaign who played a key role in establishing the digital operations that wowed the nation as Obama built a campaign network that took first the nomination and then the Presidency. As David Plouffe wrote in his campaign autobiography, The Audacity to Win: The Inside Story and Lessons of Barack Obama's Historic Victory:

“With the help of supporters like Eric Schmidt of Google, we dramatically improved our digital strategy and execution, and I’d say we were competitive digitally with any business-world start-up.”

It is not overly cynical to note that while Obama has taken on corporate interests in a number of fields from health care to the for-profit health care industry to the energy sector, he has been notably softer on a few of the sectors where he got the most core initial campaign support, notably Wall Street finance folks clustered around Goldman Sachs and Silicon Valley. And Google through Eric Schmidt was one of the key conduits for that Silicon Valley support, so expecting Obama's FTC to take what would be a groundbreaking antitrust action against Obama's key political supporter was unlikely to ever be in the cards.

This emphasizes that one of the core dangers of monopoly is not just that it distorts the economic marketplace, but that large centralized corporate monopolies end up distorting the political field as well, to the point that they can't be challenged either by economic rivals or by elected officials.

Public Citizen last fall published a report Mission Creep-y: Google Is Quietly Becoming One of the Nation’s Most Powerful Political Forces While Expanding Its Information-Collection Empire. The report noted that in the first three quarters of 2014, Google ranked first among all corporations in lobbying spending in the United States. It has steadily hired former government officials into its lobbying operations and "a steady stream of Google’s employees has been appointed to high-ranking government jobs," including helping save Obama's mismanaged initial rollout of the Obamacare website. More broadly, Google has disclosed that it contributes to about 140 trade associations and other non-profits to spread its influence across multiple sectors.

As Public Citizen noted in conclusion of its report:

As Google’s forays into new technologies far outpace the relevance of existing regulations, Google is seizing the opportunity to influence what new regulation will look like. Citizens must ensure that new technologies are designed and regulated through open, democratic processes, not to further empower dominant entities like Google, but to protect and empower consumers.

That the FTC barely acknowledged the criticisms and evidence marshalled by its own staff of Google's abuses in dismissing action against the company reflects that Public Citizen's expressed concerns are well warranted.


Under both Democratic and Republican administrations, over more than fifteen years, the Federal Trade Commission has ignored privacy concerns in approving merger after merger. The Electronic Privacy Information Center (EPIC) details that history in extensive comments submitted to the FTC as part of its review of its own merger remedy process.

Just a sampling of the mergers detailed where privacy concerns were ignored:

2000- EPIC and a coalition of consumer groups highlighted the danger to privacy in the proposed merger of Time Warner and AOL.

2007- EPIC highlighted the clear danger to consumer privacy of combining Google’s own extensive profiling of consumers with Doubleclick’s database in a corporate merger.

2014- As recently as last year, consumer groups asked that privacy concerns be taken into account to block the merger of Facebook with WhatsApp.

As EPIC argues, in each case:

[T]he practical consequence of the merger would be to reduce the privacy protections for consumers and expose individuals to enhanced tracking and profiling. The failure of the Federal Trade Commission to take this into account during merger review is one of the main reasons consumer privacy in the United States has diminished significantly over the last 15 years.

In many cases, companies that had built their businesses on promises not to collect or share personal data were then absorbed by companies without such commitments, betraying the trust users had placed in the original companies. Notably, after the DoubleClick merger, “Google has continued to expand the tracking and profiling Internet users, often ignoring prior commitments it had made to protect the privacy of these same users.”

Notably, European competition regulators are increasingly seeing protecting personal data from corporate control as an integral part of their responsibility. Recently appointed European Union Competition Commissioner Margrethe Vestager argued recently:

Very few people realize that, if you tick the box, your information can be exchanged with others. Actually, you are paying a price, an extra price for the product that you are purchasing. You give away something that was valuable. I think that point is underestimated as a factor as to how competition works.

The Federal Trade Commission in the United States, however, has, as EPIC notes, been extremely resistant to integrating analysis of the harms to competition and consumers from control of personal data by increasingly centralized data platforms.

Unfortunately, it’s not clear how soon that may change. I spoke at a panel yesterday at George Mason’s Law and Economics Center on the topic, Big Data, Privacy, And Antitrust. Keynoting the briefing was FTC Commissioner Maureen K. Ohlhausen, who essentially doubled down on the traditional argument of the FTC that they should generally not address privacy in antitrust or merger proceedings. She outlined arguments from an upcoming law review article she has co-written, which argues vigorously that the FTC should focus exclusively on maximizing competition, defined in traditional “free market” Chicago School terms, and leave protection of privacy to separate consumer protection laws.

Part of the problem may be the word “privacy” itself, which tends to conflate everything from the icky “harms to dignity” of intrusion that Justice Brandeis wrote about a century ago to equal rights for women embodied in abortion rights to protecting the rights of consumers to receive a fair return on the value of the personal information that big data platforms use to generate their megaprofits.

If companies too easily collect user data without fair compensation to those users, they will end up with a distorting edge in dominating particular business sectors. Preserving “privacy” – i.e., consumer control of their own data and the ability to demand fair compensation – is therefore a critical tool for competition regulators in ensuring a competitive marketplace. And where competition is failing, as it too often is in multiple online sectors, agencies like the FTC should be stepping up their intervention to do what the market is failing to do: protect consumer control of their data.

As EPIC details, we have had a bipartisan failure of the Federal Trade Commission to properly factor privacy concerns into merger reviews. We need a change in agency direction to ensure we don’t have a similar 15 years of inaction going forward or we will end up with a near irreversible obliteration of both privacy and competition across the economy.

Even so, I already feel capable of critiquing this review of his book (hat tip Jordan Ellenberg), written by Columbia Business School Professor and Investment Banker Jonathan Knee. You see, I’m writing a book myself on big data, so I feel like I understand many of the issues intimately.

The review starts out flattering, but then it hits this turn:

When it comes to his specific policy recommendations, however, Mr. Schneier becomes significantly less compelling. And the underlying philosophy that emerges — once he has dispensed with all pretense of an evenhanded presentation of the issues — seems actually subversive of the very democratic principles that he claims animates his mission.

That’s a pretty hefty charge. Let’s take a look into Knee’s evidence that Schneier wants to subvert democratic principles.

NSA

First, he complains that Schneier wants the government to stop collecting and mining massive amounts of data in its search for terrorists. Knee thinks this is dumb because it would be great to have lots of data on the “bad guys” once we catch them.

Any time someone uses the phrase “bad guys,” it makes me wince.

But putting that aside, Knee is either ignorant of or is completely ignoring what mass surveillance and data dredging actually creates: the false positives, the time and money and attention, not to mention the potential for misuse and hacking. Knee’s opinion on that is simply that we normal citizens just don’t know enough to have an opinion on whether it works, including Schneier, and in spite of Schneier knowing Snowden pretty well.

It’s just like waterboarding – Knee says – we can’t be sure it isn’t a great fucking idea.

Wait, before we move on, who is more pro-democracy, the guy who wants to stop totalitarian social control methods, or the guy who wants to leave it to the opaque authorities?

Corporate Data Collection

Here’s where Knee really gets lost in Schneier’s logic, because – get this – Schneier wants corporate collection and sale of consumer data to stop. The nerve. As Knee says:

Mr. Schneier promotes no less than a fundamental reshaping of the media and technology landscape. Companies with access to large amounts of personal data would be “automatically classified as fiduciaries” and subject to “special legal restrictions and protections.”

That these limits would render illegal most current business models – under which consumers exchange enhanced access by advertisers for free services – does not seem to bother Mr. Schneier.

I can’t help but think that Knee cannot understand any argument that would threaten the business world as he knows it. After all, he is a business professor and an investment banker. Things seem pretty well worked out when you live in such an environment.

By Knee’s logic, even if the current business model is subverting democracy – which I also argue in my book – we shouldn’t tamper with it because it’s a business model.

The way Knee paints Schneier as anti-democratic is by using the classic fallacy in big data which I wrote about here:

Although professing to be primarily preoccupied with respect of individual autonomy, the fact that Americans as a group apparently don’t feel the same way as he does about privacy appears to have little impact on the author’s radical regulatory agenda. He actually blames “the media” for the failure of his positions to attract more popular support.

Quick summary: Americans as a group do not feel this way because they do not understand what they are trading when they trade their privacy. Commercial and governmental interests, meanwhile, are all united in convincing Americans not to think too hard about it. There are very few people devoting themselves to alerting people to the dark side of big data, and Schneier is one of them. It is a patriotic act.

Also, yes Professor Knee, “the media” generally speaking writes down whatever a marketer in the big data world says is true. There are wonderful exceptions, of course.

So, here’s a question for Knee. What if you found out about a threat on the citizenry, and wanted to put a stop to it? You might write a book and explain the threat; the fact that not everyone already agrees with you wouldn’t make your book anti-democratic, would it?

MLK

The rest of the review basically boils down to, “you don’t understand the teachings of the Reverend Dr. Martin Luther King Junior like I do.”

Do you know about Godwin’s law, which says that as soon as someone invokes the Nazis in an argument about anything, they’ve lost the argument?

I feel like we need another, similar rule, which says, if you’re invoking MLK and claiming the other person is misinterpreting him while you have him nailed, then you’ve lost the argument.

The control of personal data by “big data” companies is not just an issue of privacy but is becoming a critical issue of economic justice, argues a new report issued by the organization Data Justice (www.datajustice.org), which itself is being publicly launched in conjunction with the report.

“This steady loss of data by individuals into the hands of increasingly centralized corporate hands is helping to drive a large portion of the economic inequality that has become central to political debate in our nation,” said Data Justice Director Nathan Newman, the report’s author.

Big data platforms collect so much information about so many people, details the report, that correlations emerge that allow individuals to be slotted into hiring and marketing categories in unexpected and often unwelcome ways that usually leave them at a distinct disadvantage in negotiations. This enables advertisers to offer goods at different prices to different people, what economists call price discrimination, to extract the maximum price from each individual consumer. Such online price discrimination raises prices overall for consumers, while often hurting lower-income and less technologically savvy households.
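A small worked example shows why sellers find this attractive. Everything in it is an assumption for illustration: four hypothetical consumers with made-up maximum prices they would pay, a seller with no costs, and, in the personalized case, a seller who knows each consumer's maximum exactly, which real profiling only approximates.

```python
def uniform_price_revenue(willingness):
    """Best single posted price: everyone whose maximum meets it buys."""
    best_price, best_revenue = 0, 0
    for price in sorted(set(willingness)):
        buyers = sum(1 for w in willingness if w >= price)
        revenue = price * buyers
        if revenue > best_revenue:
            best_price, best_revenue = price, revenue
    return best_price, best_revenue


def personalized_revenue(willingness):
    """Perfect price discrimination: each person is charged their maximum."""
    return sum(willingness)


# Hypothetical maximum prices four consumers would pay for the same good.
willingness = [5, 8, 12, 20]
price, uniform = uniform_price_revenue(willingness)
personalized = personalized_revenue(willingness)
```

With one posted price the seller does best charging 8 and collecting 24 in total, and the consumers willing to pay 12 and 20 keep the difference. With personalized prices the seller collects 45, and no buyer keeps anything. The better the profile, the closer a seller can get to the second case.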

Data crunchers were key to manipulating financial markets and securities throughout the financial industry and big data platforms were critical parts of the marketing machine that used various forms of consumer profiling and price discrimination to push subprime financial products out to the most vulnerable members of the American public. Notably, by the mid-2000s, the lion’s share of the online advertising economy was being driven by subprime and related mortgage lenders, highlighting the ways the profits of big data platforms have often come at the expense of consumer welfare.

“At the same time, big data is fueling economic concentration across our economy,” argues Newman. As a handful of data platforms generate massive amounts of user data, the barriers to entry rise since potential competitors have little data themselves to entice advertisers compared to the incumbents who have both the concentrated processing power and supply of user data to dominate particular sectors. With little competition, companies end up with little incentive to either protect user privacy or share the economic value of that user data with the consumers generating those profits.

The report argues for a threefold approach to making big data work for everyone in the economy, not just the big data platforms’ shareholders:

First, regulators need to strengthen user control of their own data by both requiring explicit consent for all uses of the data and better informing users of how it’s being used and how companies profit from that data.

Second, regulators need to factor control of data into merger review and to initiate antitrust actions against companies like Google where monopoly control of a sector like search advertising has been established.

Third, policymakers should restrict practices that harm consumers, including banning price discrimination where consumers are not informed of all discount options available and bringing the participation of big data platforms in marketing financial services under the regulation of the Consumer Financial Protection Bureau.

Data Justice itself has been founded as an organization “to promote public education and new alliances to challenge the danger of big data to workers, consumers and the public.” It will work to educate the public, policymakers and organizational allies on how big data is contributing to economic inequality in the economy. Its new website at

Have you ever heard of phrenology? It was, once upon a time, the “science” of measuring someone’s skull to understand their intellectual capabilities.

This sounds totally idiotic but was a huge fucking deal in the mid-1800s, and really didn’t stop getting some credit until much later. I know that because I happen to own the 1911 edition of the Encyclopedia Britannica, which was written by the top scholars of the time but is now horribly and fascinatingly outdated.

For example, the entry for “Negro” is famously racist. Wikipedia has an excerpt: “Mentally the negro is inferior to the white… the arrest or even deterioration of mental development [after adolescence] is no doubt very largely due to the fact that after puberty sexual matters take the first place in the negro’s life and thoughts.”

But really that one line doesn’t tell the whole story. Here’s the whole thing, it’s long:

Pages 1 and 2

Pages 3 and 4

Pages 5 and 6

As you can see, they really go into it, with all sorts of data and speculative theories. But near the beginning there’s straight up racist phrenology:

From page 1

To be clear: this was produced by a culture that was using pseudo-scientific nonsense to validate an underlying toxic and racist mindset. There was nothing more to it, but because people become awed and confused around scientific facts and figures, it seemed to work as a validating argument in 1911.

Anyhoo, I thought this was an interesting backdrop to the NPR story I wanted to share with you (hat tip Yves Smith) entitled Recruiting Better Talent With Brain Games And Big Data. You can read the transcript as well, you don’t have to listen. Basically the idea is you play video games and the machine takes note of how you play and the choices you make and comes back to you with a personality profile. That profile will help you get a job or will exclude you from a job if the company believes in the results. There have been no scientific tests to see if or how this stuff works; we’re supposed to just believe in it because, you know, data is objective and everything.

Here’s the thing. What we’ve got is a new kind of awful pseudo-science, which replaces measurements of skulls with big data. There’s no reason to think this stuff is any less biased or discriminatory either: given that there’s no actual science behind it, we might simply be replicating a selection method to get people who we like and who remind us of ourselves. To be sure, it might not be as deliberate as what we saw above, but that doesn’t mean it’s not happening.

The NPR reporter who introduced this story did so by saying, “let’s start this hour with a look at an innovation in something that’s gone unchanged, it seems, forever.” That one sentence already gets it wrong, though. This is, unfortunately, not innovative. This is just the big data version of phrenology.


If you are childless, shop for clothing online, spend a lot on cable TV, and drive a minivan, data brokers are probably going to assume you’re heavier than average. We know that drug companies may use that data to recruit research subjects. Marketers could utilize the data to target ads for diet aids, or for types of food that research reveals to be particularly favored by people who are childless, shop for clothing online, spend a lot on cable TV, and drive a minivan.

We may also reasonably assume that the data can be put to darker purposes: for example, to offer credit on worse terms to the obese (stereotype-driven assessment of looks and abilities reigns from Silicon Valley to experimental labs). And perhaps some day it will be put to higher purposes: for example, identifying “obesity clusters” that might be linked to overexposure to some contaminant.

2) Helping match those offering products to those wanting them (food marketing)

3) Promoting the classification and de facto punishment of certain groups (identifying a certain class as worse credit risks)

At present, law does not do enough to recognize how valuable goals like 1) are, and how destructive 3) could become. In fact, to the extent 1 is highly regulated, and 3 is unregulated, law may perversely help channel capital into discriminatory ventures and away from socially productive ones.

“So deregulate all of it!”, a well-funded lobby might reply. But we need to update anti-discrimination law and policy, not simply give up on it in the face of big-data driven construction of new minorities. Reputation intermediaries outside the health sector are now using data not covered by HIPAA to impute health conditions to individuals. As the former CIO of Google (& CEO of ZestFinance) puts it, “[A]ll data is credit data, we just don’t know how to use it yet.” A lawyer might respond: “all data is health data,” too, and should be subject to HIPAA and HITECH strictures.

We need to distinguish between innovation and discrimination. If a firm like ZestFinance finds out that the obese (or people with minivans) are worse credit risks, and imposes a higher interest rate on them, I question whether that is “innovation” as valuable as, say, finding better ways of curing a disease, growing food, or cooking a meal. It may, instead, merely be a way for industry to arrogate to itself a quasi-juridical role of punishing one group and forcing them to generate more rents for the finance sector.

A recent review of Julia Angwin’s excellent book “Dragnet Nation” (a muckraking take on privacy) concluded that its “lack of a more radical critique of digital capitalism may say more about the scope of the problem than our paucity of solutions.” But some academics and activists are addressing fundamental issues. Our innovation (and privacy) law must recognize that a cancer cure is of greater value than a tool that helps companies avoid hiring people who are likely to have cancer. The ever-insightful Tarleton Gillespie offers one way of doing so:

The third party data broker who buys data from an e-commerce site I frequent, or scrapes my publicly available hospital discharge record, or grabs up the pings my phone emits as I walk through town [is] building commercial value on my data, but offer me no value to me, my community, or society in exchange. So what I propose is a “pay it back tax” on data brokers. . . .

If a company collects, aggregates, or scrapes data on people, and does so not as part of a service back to those people . . . then they must grant access to their data and access 10% of their revenue to non-profit, socially progressive uses of that data. This could mean they could partner with a non-profit, provide them funds and access to data, to conduct research. Or, they could make the data and dollars available as a research fund that non-profits and researchers could apply for. Or, as a nuclear option, they could avoid the financial requirement by providing an open API to their data. . . . . I think there could be valuable partnerships: Turnstyle’s data might be particularly useful for community organizations concerned about neighborhood flow or access for the disabled; health data could be used by researchers or activists concerned with discrimination in health insurance. There would need to be parameters for how that data was used and protected by the non-profits who received it, and perhaps an open access requirement for any published research or reports.

Gillespie’s proposal addresses a core problem of our increasingly big data driven (and intermediary driven) economy: law’s agnosticism as to the ultimate productive value of what innovators are doing. Yiren Lu recently asked, “Why do . . . smart, quantitatively trained engineers, who could help cure cancer or fix healthcare.gov, want to work for a sexting app?” The answer is pretty obvious: the money. If we develop an elaborate set of laws that channels billions of dollars to at best reallocative (and at worst, flat out discriminatory) endeavors, we shouldn’t be surprised when tech talent flocks to them. If we want entrepreneurs to use big data for higher ends, we have to change the incentives. The question is not: “should the US have an industrial policy for big data?”–we already have a highly dysfunctional one. We should, instead, focus on improving returns for those who contribute to real gains in productivity.


As a handful of data platforms generate massive amounts of user data, the barriers to entry rise since potential competitors have little data themselves to entice advertisers compared to the incumbents who have both the concentrated processing power and supply of user data to dominate particular sectors. The upshot of this market power by big data platforms is that the marketplace is doing little to create options for consumers that might alleviate the misuse of consumer data or encourage big data platforms to better compensate users who are willing to share their data.

Data Justice has been launched as a project to promote public education and new alliances to challenge the danger of big data to workers, consumers and the public. Our work will include:

A Focus on Financial Exploitation: Data Justice will educate key stakeholders, allies, and the public on approaches to prevent big data platforms from using that data in ways that harm consumers. We will highlight the way big data platforms facilitate the exploitation of employees, consumers and citizens by abusive financial services companies and thereby increase economic discrimination and inequality in the economy.

Outreach to Allies: We will work to expand the coalition of organizations focused on the problem of big data platforms by bringing in the consumer, civil rights, union and other organizations currently mobilized around financial reform in the wake of the recent financial crisis. Data Justice will work with a range of organizations to highlight the problem of big data platforms and how a focus on their role in promoting economic exploitation in financial services fits within those groups' current work.

Public Education Campaign: Data Justice is engaged in a broad public education campaign, including developing a public website, social media campaign, public policy documents and placing individual articles, blog posts and other media pieces to support the effort. Our goal is to focus media and public attention on the issue of online price discrimination and the power of big data platforms, as well as the way concentrated control of user data feeds increasing economic inequality in the economy.

Develop Policy Research: On an ongoing basis, Data Justice will produce policy reports, articles and policy briefs that outline how big data impacts different sectors of consumer and workers' rights, along with policy options for reducing those harms to the public.

Educate Government Officials and Other Targets of Campaign: We will work to educate key members of government agencies, elected officials and other non-government institutions about what we see as the danger of data platforms' impact on economic justice issues and why regulation is warranted.