A San Francisco judge has rebuffed LinkedIn’s attempts to stop a third-party data-analytics startup from using the publicly available data of its users. According to legal experts, the case could wind up in the Supreme Court, given the important constitutional and economic issues it raises.

As we reported in July, HiQ, a San Francisco startup, has been marketing two products, both of which depend on whatever data LinkedIn’s 500m members have made public: Keeper, which identifies employees who might be ripe for being recruited away, and Skills Mapper, which summarizes an employee’s skills.

To reiterate: HiQ isn’t hacking anything away – it’s just grabbing the kind of stuff you or I could get on LinkedIn without having to log in. All you need is a browser and a search engine to find the data HiQ’s sucking up, digesting and selling.

LinkedIn has tolerated this for years. Then, for whatever reason, it told HiQ to stop. Bad news for the startup – without a steady stream of data from LinkedIn, HiQ cannot function.

HiQ CEO Mark Weidick was a bit surprised. It’s not as if LinkedIn suddenly discovered what the company was up to. LinkedIn employees had attended a conference HiQ put on, he told the San Francisco Chronicle:

I thought we were on good terms. They knew perfectly well what we are doing. We were doing it in the broad light of day.

Nonetheless, in May, LinkedIn sent a cease-and-desist letter to HiQ, alleging that the startup was violating the Computer Fraud and Abuse Act (CFAA) and the Digital Millennium Copyright Act (DMCA), and engaging in unfair business practices under California state law. In the letter, LinkedIn noted that it had used technology to block HiQ from accessing its data.

HiQ filed for relief in early June, asking for a temporary injunction and recommending that the parties take 30 days to discuss the matter and, hopefully, to come up with an amicable solution.

On Monday, the San Francisco judge sided with HiQ, saying that the “balance of hardships tips sharply in HiQ’s favor” and that LinkedIn’s argument about HiQ having violated the CFAA is pretty dubious. The law wasn’t put in place to gum up access to publicly available data, the judge said in a court order granting HiQ’s motion for a preliminary injunction.

Indeed. The CFAA, which prohibits accessing a computer without authorization, has been used in many criminal cases, such as to prosecute ex-employees who hack their former employers. It was also used, infamously, to prosecute internet activist Aaron Swartz. Rights groups have called the act “infamously problematic”.

But to use the CFAA to prosecute a company for scraping publicly available data? Um, no, that’s not a thing, the judge said on Monday:

The broad interpretation of the CFAA advocated by LinkedIn, if adopted, could profoundly impact open access to the internet, a result that Congress could not have intended when it enacted the CFAA over three decades ago.

The order requires LinkedIn to dismantle any technical roadblocks it put in place to fend off bots that scrape its members’ data. The BBC reports that LinkedIn is considering an appeal.

HiQ is far from the first company to spin a business model out of whatever it can siphon off another service. You can think of social media platforms – say, LinkedIn, Twitter, and Facebook – as trees. They’ve got an ecosystem of epiphytes, sucking up their data to package and sell in some form.

Sometimes, that parasitic relationship can carry on for years. Take, for example, Geofeedia’s use of the APIs of Twitter, Facebook and Instagram.

For five years, Geofeedia used their data streams to create real-time maps of social media activity in protest areas. As was made clear in a report from the American Civil Liberties Union (ACLU) about police monitoring of activists and protesters via social media data, police have used the maps to identify, and in some cases arrest, protesters shortly after their posts became public.

The metadata – including images, geolocation data, and screen names available on Instagram’s public feed – on Geofeedia’s map of Ferguson protests was all publicly available. But the scale at which police were identifying and retaining data on protesters was beyond what any individual could achieve without special access to social media platforms’ APIs.

LinkedIn is rationalizing its opposition to HiQ not in terms of scale but rather in terms of user privacy and HiQ’s ability to retain user data. It’s pointing to what it says are more than 50m LinkedIn members who’ve used a “Do Not Broadcast” feature that prevents the site from notifying other users when a member makes profile changes, even when a profile is set to Public.

LinkedIn says it’s also received user complaints about the use of data by third parties. In particular, two users complained that information that they had previously featured on their profiles, but subsequently removed, remained visible to third parties (other than HiQ).

LinkedIn maintains that even though HiQ wants to collect data that’s publicly viewable, it could use profile tweaks – even those listed as Do Not Broadcast – to label an employee as being at high risk of leaving under its Keeper product. It could also retain and make available data that LinkedIn users have deleted – including entire profiles.

OK, those arguments have some merit, the judge wrote. But is data privacy seriously at risk? Out of 50m users who turned on “Do Not Broadcast”, LinkedIn only managed to scare up a measly three complaints about data privacy related to third-party data collection. And none of those three mentioned HiQ or the Do Not Broadcast option.

LinkedIn is even willing to sell profile change data to third parties, if they subscribe to its Recruiter product, according to marketing materials HiQ presented to the court. What’s good for the goose is definitely not good for the gander in LinkedIn’s opinion: for years, it’s charged recruiters, salespeople and job hunters for higher levels of access to profile data, but now it’s telling HiQ to keep its hands off.

Where does this leave LinkedIn and its users? It’s a question with obvious relevance to anybody who’s looking for a new job but would like to keep the search on the QT, not served up on a platter to their current boss. Sure, we want our professional information to be public. How else would potential employers find us? But does that leave third parties free to romp, able to do whatever they like with our data, without our say-so?

We’ve seen multiple social platforms fight against the data-sucking epiphytes, for good reason: the bots have scraped publicly available data for a host of privacy-challenging and/or unsavory purposes. For example, last year, without users’ permission, Danish researchers publicly released data scraped from 70,000 OkCupid profiles, including their usernames, age, gender, location, what kind of relationship (or sex) they’re interested in, personality traits, and answers to thousands of profiling questions used by the site.

But the LinkedIn/HiQ case could have far wider implications than just that of a scuffle between two companies. The constitutional scholar and Harvard law professor Laurence Tribe is weighing in to advise HiQ in the case, due to what he told the San Francisco Chronicle are its important constitutional and economic issues.

For a long time, this has been a central concern for me. Today, social media is the new equivalent of the public square. [LinkedIn’s actions present] a serious challenge to free expression in the modern world.

Freedom of speech is not just about flag-burning. It’s about how you use information in the digital economy. Data is the new form of capital in creating products and services.

If it does reach the Supreme Court, we’ll be sure to keep following the case.

It seems to me there are two different issues here. The first is whether Microsoft can use an outdated federal computer crimes law to stifle HiQ’s behavior. Past case law has suggested that violations of a website’s terms of use are crimes under the CFAA. But that has always seemed dubious, and this case (and the judge’s ruling) highlight the limitations of that interpretation.

The second issue is whether Microsoft can enforce terms of use that dictate what users (including corporations) can do with its web application. I don’t see why they shouldn’t be able to stop HiQ from scraping their data. It’s their service, and HiQ is using it in a way that isn’t allowed. Further, I don’t see why Microsoft shouldn’t be able to sue for damages (if they can demonstrate that damages have indeed occurred) if someone violates their terms of use.

Note: All opinions are my own and are not posted on behalf of my employer.

I see a corollary to financial data protection where the laws are a bit more up-to-date. Still, new credit bureaus can spring up and consumers are left with the task of policing the accuracy of the data they obtain.

None of these laws provides adequate consumer protection. Instead, the focus continues to be on commerce. Following the money has resulted in shallow considerations for real privacy.

Why would Microsoft sue for damages when it is actually the LinkedIn users who can be damaged? They may not even get to know about it if, say, their bosses don’t promote them because they know that they have been looking around for another job. Laurence Tribe isn’t the only one to be worried about this. Little old me is as well, for what it’s worth.