Sites log your keystrokes and mouse movements in real time, before you click submit.

If you have the uncomfortable sense someone is looking over your shoulder as you surf the Web, you're not being paranoid. A new study finds hundreds of sites, including microsoft.com, adobe.com, and godaddy.com, employ scripts that record visitors' keystrokes, mouse movements, and scrolling behavior in real time, even before the input is submitted and even if it is later deleted.

Session replay scripts are provided by third-party analytics services that are designed to help site operators better understand how visitors interact with their Web properties and identify specific pages that are confusing or broken. As their name implies, the scripts allow the operators to re-enact individual browsing sessions. Each click, input, and scroll can be recorded and later played back.
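To make the mechanism concrete, here is a minimal TypeScript sketch of the record-and-replay idea described above. It is not any vendor's code: the `SessionRecorder` class, the `RecordedEvent` shape, and the field names are all hypothetical. In a browser, a script of this kind would typically feed `record` from event listeners for `input`, `click`, and `scroll`. That wiring appears only as a comment so the sketch stays self-contained.

```typescript
// Hypothetical event shape; real replay services use their own formats.
type RecordedEvent =
  | { kind: "input"; field: string; value: string; t: number }
  | { kind: "click"; x: number; y: number; t: number }
  | { kind: "scroll"; y: number; t: number };

class SessionRecorder {
  private events: RecordedEvent[] = [];

  // In a page this would be called from DOM listeners, for example
  // document.addEventListener("input", ...), once per keystroke.
  record(event: RecordedEvent): void {
    this.events.push(event);
  }

  // Return the events in time order, as a dashboard would step through them.
  replay(): RecordedEvent[] {
    return [...this.events].sort((a, b) => a.t - b.t);
  }

  // Every value a field ever held, including text the visitor later deleted.
  fieldHistory(field: string): string[] {
    const history: string[] = [];
    for (const e of this.replay()) {
      if (e.kind === "input" && e.field === field) {
        history.push(e.value);
      }
    }
    return history;
  }
}

// A visitor types "ab" into an email field, deletes it, and never submits.
const recorder = new SessionRecorder();
recorder.record({ kind: "input", field: "email", value: "a", t: 1 });
recorder.record({ kind: "input", field: "email", value: "ab", t: 2 });
recorder.record({ kind: "input", field: "email", value: "", t: 3 });
```

Because each keystroke is stored as its own event, `recorder.fieldHistory("email")` still contains the deleted text, which is why clearing a field does not undo the recording.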

"Collection of page content by third-party replay scripts may cause sensitive information, such as medical conditions, credit card details, and other personal information displayed on a page, to leak to the third-party as part of the recording," Steven Englehardt, a PhD candidate at Princeton University, wrote. "This may expose users to identity theft, online scams, and other unwanted behavior. The same is true for the collection of user inputs during checkout and registration processes."

Englehardt installed replay scripts from six of the most widely used services and found they all exposed visitors' private moments to varying degrees. During the process of creating an account, for instance, the scripts logged at least partial input typed into various fields. Scripts from FullStory, Hotjar, Yandex, and Smartlook were the most intrusive because, by default, they recorded all input typed into fields for names, e-mail addresses, phone numbers, addresses, Social Security numbers, and dates of birth.

The following video captured data as it was transmitted in real time to FullStory:

User replay fullstory demo.

Even when services took steps to mask some of the data, they often did so in ways that continued to jeopardize visitor privacy. Smartlook and UserReplay, for instance, collected the number of characters typed into password fields. UserReplay also logged the last four digits of visitors' credit card numbers.
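The two leaks described above can be illustrated with a small TypeScript sketch. The functions `maskPassword`, `maskCardNumber`, and `redact` are hypothetical and are not taken from Smartlook or UserReplay. They only show how a mask that replaces characters one for one still reveals a password's length, and how keeping the last four card digits leaves part of the number in the recording.

```typescript
// Replaces each character with "*": the value is hidden, its length is not.
function maskPassword(password: string): string {
  return "*".repeat(password.length);
}

// Hides all but the last four digits, which still end up in the recording.
function maskCardNumber(card: string): string {
  const digits = card.replace(/\D/g, "");
  const hidden = "*".repeat(Math.max(digits.length - 4, 0));
  return hidden + digits.slice(-4);
}

// A fixed placeholder leaks neither the length nor any part of the content.
function redact(_value: string): string {
  return "[redacted]";
}
```

With these definitions, `maskPassword("hunter2")` yields seven asterisks, so anyone replaying the session learns the password is seven characters long, while `redact` returns the same placeholder for every input.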

Englehardt said the services provide manual and automatic tools website operators can use to redact information that is collected on their properties. But the tools in many cases require large amounts of developer time and skill. And even then, sites with strong legal incentives not to leak sensitive data were found doing just that. Walgreens.com, for instance, sent medical conditions and prescriptions alongside user names to FullStory despite the extensive use of manual redactions on the pharmacy site.

Another example: the account page for clothing store Bonobos leaked full credit card details—character by character as they were typed—to FullStory. Adding insult to injury, Yandex, Hotjar, and Smartlook all offer dashboards that use unencrypted HTTP when subscribing publishers replay visitor sessions, even when the original sessions were protected by HTTPS.

Representatives for both Walgreens and Bonobos have said the sites have stopped sharing information with FullStory, according to reports from Motherboard and Wired.

It's not clear what meaningful recourse Internet users have for preventing the data collection. The researcher said that ad-blockers can filter out some, but not all, of the replay scripts. Checking the "do not track" option built into some browsers also failed to stop the logging. That means every keystroke typed into a Web field may be logged, character by character, even if the visitor later deletes the text and never presses a submit button.

Until more robust protections are available, people should remember that just about anything they do while visiting a website can be logged.

Ghostery tells me that ars is running ten advertising trackers, including ones with somewhat ominous names like "SkimLinks" and "Yieldbot". Is there any easy way to tell if all my stuff is being recorded on random tracking things like those?

I'm curious, do these capture mouse and keyboard strokes even if the web page is just running in the background? Or does the web page have to be in current focus?

The short answer is that the page has to be the current focus (and the page itself, not just the browser). Browsers (and the specs they implement) have all kinds of mandated sandboxing going on under the hood to prevent information leakage and protect user privacy (much to the chagrin of many commercial entities).

Well this is alarming... I'm curious, do these capture mouse and keyboard strokes even if the web page is just running in the background? Or does the web page have to be in current focus?

Unless it installs a system-wide/in-memory keylogger, for the most part, these have to be active web pages. Most webpages (and even browsers) have key listeners and mouse listeners to help shortcut the desired action. For example, for Yahoo email (yes, I have no shame), CTRL+N will take you to the create-new-email view (instead of having to click the new email icon or menu item).

Years ago (at a previous employer) we used an HP session recording tool for automated UI testing. This was ONLY used in-house. I was not aware these session recorders were being used on public websites, but I guess it's not that surprising.

The only valid reason I could think of for a public website to use this is to analyze the UX (user experience) to see how the website/application is most widely used and where UX improvements may be made.

Ghostery tells me that ars is running ten advertising trackers, including ones with somewhat ominous names like "SkimLinks" and "Yieldbot". Is there any easy way to tell if all my stuff is being recorded on random tracking things like those?

My Ghostery is blocking 12, so it would seem you are still being tracked by 2.

How is this alarming? Yes, the tool records what you're doing on the page, even inserting stuff into the forms, etc, which is data you're giving to the company anyways. Why would you be upset when the company is recording activity on the page when you're filling out the form to give them the information anyways?

It records what you're doing and sends it to a third party, including sensitive information. When I put my credit card number or Social Security number in somewhere, I don't give them permission to send it out unencrypted to a third party. That's just absolutely ridiculous.

As a front-end web developer I find this sad but not surprising. People often do not even think about security. By the way, you can basically only stop this for sure by turning off JavaScript (assuming everything else is normal). This often makes the page unusable though.

This seems like something an ad-blocker would have trouble with, even if it set out to prevent it. What about the guys that make the browsers? I doubt Google would be interested in shutting this down, but MS and Apple might be (acknowledging that MS had one of the offending websites).

Not that I needed more reasons to have my content blockers waging war on 3rd party crap but...

The real question at this point is if I'm safe in assuming that these services will be getting detected and blocked by Privacy Badger for ignoring my DNT settings or if I need to go back to assuming anything that looks like an analytics engine is malware and manually blocking it. (After installing Privacy Badger I stopped doing so, on the belief that if they were behaving well the risk in letting the site operator collect some metrics was acceptably low.)

Doesn't surprise me at all. I've had two or three experiences where I was in the process of completing an online account then didn't like something in the terms of service and never finished the account creation. Yet they still retained my email and started marketing to me.

This seems like something an ad-blocker would have trouble with, even if it set out to prevent it. What about the guys that make the browsers? I doubt Google would be interested in shutting this down, but MS and Apple might be (acknowledging that MS had one of the offending websites).

If your adblocker's getting feeds instead of only running on a self-created list I assume the list maintainers will start adding these people to their hit lists.

I also wouldn't be so quick to write off Google. They've been using the insecure hammer on sites leaking passwords/credit card info via HTTP where anyone could steal it. This seems similarly awful, so I wouldn't be surprised if it ends up in their unacceptable column at some point in the future.

How is this alarming? Yes, the tool records what you're doing on the page, even inserting stuff into the forms, etc, which is data you're giving to the company anyways. Why would you be upset when the company is recording activity on the page when you're filling out the form to give them the information anyways?

If you have to ask that question, you don't understand the implications in the first place.

Or, perhaps, you'd like a potential new employer to see all of the information on a resume you filled out with the mistakes you made before you submitted the version you wanted to?

When asking questions, it's always better to actually, you know, think about the possible answers than to just fall down in front of everyone and embarrass yourself like that. If you had spent as much time thinking about the answer as you did typing the question, you'd probably never have had to type the question in the first place.

The company I work for uses one of these. I get why they do it, but I really don't like them either.

If you want to tell if a website is doing something like this, it is usually pretty easy. Open the developer tools (in Chrome, hit Ctrl+Shift+I). If you see network requests firing after you do things (like type in a text field or move your mouse), then it's probably tracking.

You should even be able to see what information it's sending by clicking on it and looking at the "Request Body" or "Form Data" portion at the bottom of the "Headers" tab.
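As a rough aid to the manual check described above, request URLs copied out of the Network tab can be compared against a list of replay-service domains. The TypeScript sketch below does that. The `REPLAY_DOMAINS` list is an illustrative assumption based on company names mentioned in the study, not a vetted blocklist, and `findReplayRequests` is a hypothetical helper name.

```typescript
// Illustrative only: assumed domains for some services named in the study.
const REPLAY_DOMAINS: string[] = [
  "fullstory.com",
  "hotjar.com",
  "smartlook.com",
  "yandex.ru",
];

// Returns the URLs whose hostname is a listed domain or one of its subdomains.
function findReplayRequests(urls: string[]): string[] {
  return urls.filter((raw) => {
    let host: string;
    try {
      host = new URL(raw).hostname;
    } catch {
      return false; // skip anything that is not an absolute URL
    }
    return REPLAY_DOMAINS.some(
      (domain) => host === domain || host.endsWith("." + domain)
    );
  });
}

// Made-up example URLs; the paths are placeholders, not real endpoints.
const suspicious = findReplayRequests([
  "https://rs.fullstory.com/rec",
  "https://example.com/app.js",
  "https://notfullstory.com/x",
]);
```

A match only indicates that a request went to one of the listed domains. Reading the request body, as described above, is still needed to see what was actually sent.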

There's a real opportunity for browser writers NOT connected to an advertising network (I'm looking at you, Firefox) to fix this. Simply add an abstraction layer that holds mouse and keyboard input back from scripts and web pages until the user clicks or presses a submit button. If you want to track my every mouse movement on your page you'll need an opt-in from me to get it - which ain't gonna happen.

CFAA applies. Just so long as you have used your PC for interstate commerce (bought anything on eBay/Newegg/Amazon/etc.?) or interstate communication (posting on Ars).

The only reason nobody is pushing forward with a class action, much less criminal charges under CFAA, is because jurists and DAs act as if TOSs/EULAs/AUPs on the Internet somehow get around federal law. This has never been the case until recently. Contract line items which deprived one of protections under federal / state law were consistently ruled against up until the last couple decades. For some reason, our legal system is doing a 180 on this in order to appease the tech and intellectual property industries.

A website just running targeted ads is in principle in violation of the CFAA, as you are affecting interstate commerce and gathering information without permission.

Decades of the WWW and we still don't have a sufficient body of law or enforcement of the existing laws. The FCC had broadcast television figured out and standards in place within a couple years of the first black and white broadcast. Demonstrably, our government has become slower and more onerous.

It'd be nice if passing regulations actually became politically feasible again. I have no problem with websites collecting usage information. But there are some reasonable things they should be doing anyway, and legal encouragement would help make sure they happen: requiring the trackers to report over HTTPS, standards for keeping the data private, and requirements for disclosure beyond "By using our website you agree to our Terms and Conditions," including a clear explanation of what data is being captured and how long it will be retained. Those kinds of things.

Walgreens.com, for instance, sent medical conditions and prescriptions alongside user names to FullStory despite the extensive use of manual redactions on the pharmacy site.

Well, that is surely a massive HIPAA violation. No really - there needs to be a massive fine. I worked in a pharmacy around the time that HIPAA went into effect. If I had done anything even vaguely similar - I would have been fired (at the least) and the pharmacy would have been fined.

The story getting out would also have probably led to a fair amount of lost business, with nearby competitors eager to market against it to boot.

amen. i love uMatrix and the clear view you get (once your eyes adjust and you get over the 'omg' moment)

being in control of your browsing experience is important and when you're on the world's largest repository of knowledge i find it difficult to excuse the ignorance. I've had my kids trained up on this stuff since they were very little!

So as I have often mentioned here, corporations have long since overstepped the data collection line and we are long overdue for legislation that:

Explicitly outlaws any data collection from customers/users without full disclosure and an opt-in from the customer/user.

Opt-ins must explicitly outline what data is being collected and for what purpose.

Corporations must purge their current databases of all current data except the bare minimum required for customer/user relations.

Online monitoring of persons browsing the Web is illegal, period. The only exception is for law enforcement purposes after establishing probable cause and the issuance of a court sanctioned warrant.

There is more, but just seeing this article has just peeved me to no end. This attitude by government agencies, law enforcement, and, more so than the latter two, corporations that they have the right to covertly engage in de facto surveillance of every private, or otherwise (expectedly) anonymous, activity in which anyone engages needs to stop. I am seriously tired of this shit!