There’s a brave new world in book publishing, and it’s being shaped by and around audience insights. Not only are publishers becoming more adept at using data to work smarter, but code and algorithms are also getting better at gathering information and executing tasks without the help of humans.

At Jellybooks, we recently developed a piece of code called candy.js, which is embedded inside an ebook to track how users actually read. Penguin Random House UK was among our earliest partners in a pilot program of the technology, and the insights we gathered were fascinating. The question now becomes what story this data tells us and what impact it might have.

More at the source.

Nice of them to warn us.

They don't say which ebook formats or which vendors carry this spyware, so has anybody with randy penguin UK titles run into this?

All of my ebooks are "cleansed," but none of my ereaders are connected to wifi. So, while I am bothered by unannounced payloads in any digital product, and by the vendors who think they somehow have the right to do this, I'm pretty well still under the radar.

It's going to have to be specific to a device/app, I'd think. They can't just make any old ebook start collecting data willy-nilly. Unless they're in cahoots with the firmware devs for device manufacturers, that javascript file is going to sit there like a lump in most cases. Kobo does some js, but I certainly don't think they do any old js. Epub3 allows js, but there are still going to be limitations on what it can do.

I rip open a lot of ebooks (including Penguin ones) from a lot of vendors and I've never come across this file. I'm guessing it's going to be found in something cloud-based or in the dedicated apps for ebook subscription programs.

This is a new service that allows authors, agents and publishers to conduct virtual focus groups that work as follows:

Selected readers receive a complimentary ebook (ARC) prior to publication date for reading on a third party app or device of their choice (for example iBooks by Apple). While reading the ebook, Jellybooks collects reading data for each individual reader based on unique tracking software embedded inside the ebook. The collected data is stored inside the ebook and the user can be reading online or offline. In return for receiving the free ebook, readers are prompted to upload the data with a single click form inside the ebook. Jellybooks then distribute the results as online data graphs and figures to authors and publishers.

If this .js stores the data in the ebook, does that mean that it changes the ebook? If it is a DRMed ebook, which means it is encrypted, how are they going to accomplish this?

They are probably using HTML5 local storage (the Web Storage API). This is an HTML5 feature that lets a page save small amounts of key-value data locally. It is supported by modern browsers and probably also works in some browser-based e-reader software.
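To make the local storage idea concrete, here is a minimal sketch of how a script could log reading events locally and later bundle them for a one-click upload. This is a guess at the general technique, not Jellybooks' actual code: the names `ReadingEvent`, `recordChapterOpened`, `buildUploadPayload` and the `reading-log` key are all made up for illustration. In a browser, Web Storage data sits in the browser's own per-origin storage rather than inside the page's files, so a script working this way would not need to rewrite the ebook file itself.

```typescript
// Hypothetical sketch: none of these names come from candy.js.

// The two Web Storage methods the sketch relies on.
// In a browser, `window.localStorage` satisfies this interface.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data: Record<string, string> = {};
  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key)
      ? this.data[key]
      : null;
  }
  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}

interface ReadingEvent {
  chapter: number;
  openedAt: number; // milliseconds since the Unix epoch
}

const STORAGE_KEY = "reading-log"; // assumed key name

// Read back everything logged so far (empty list if nothing is stored yet).
function loadEvents(store: KeyValueStore): ReadingEvent[] {
  const raw = store.getItem(STORAGE_KEY);
  return raw === null ? [] : (JSON.parse(raw) as ReadingEvent[]);
}

// Append one event and write the whole log back as a JSON string,
// since Web Storage only holds string values.
function recordChapterOpened(
  store: KeyValueStore,
  chapter: number,
  now: number = Date.now()
): void {
  const events = loadEvents(store);
  events.push({ chapter: chapter, openedAt: now });
  store.setItem(STORAGE_KEY, JSON.stringify(events));
}

// Bundle the log into the body a "submit my data" form could send.
function buildUploadPayload(store: KeyValueStore): string {
  return JSON.stringify({ events: loadEvents(store) });
}

const store: KeyValueStore = new MemoryStore();
recordChapterOpened(store, 1, 1000);
recordChapterOpened(store, 2, 2000);
console.log(buildUploadPayload(store));
```

Nothing in the sketch touches the network until the payload is explicitly sent, which would fit the description of data being collected offline and uploaded only when the reader clicks.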

I doubt that e-ink devices from Amazon or Kobo are involved with this particular effort, as their e-ink devices already collect information, including of course page read data. Unfortunately it is not entirely clear exactly what data is being collected and how. This is a link to an Electronic Frontier Foundation paper on the matter from 2012.

1. In relation to many of the questions addressed in their chart, the answer is uncertain.
2. It appears that the software itself has not been analysed. To quote from the paper:

Unfortunately, unpacking the tracking and data-sharing practices of different e-reader platforms is far from simple. It can require reading through stacked license agreements and privacy policies for devices, software platforms, and e-book stores. That in turn can mean reading thousands of words of legalese before you read the first line of a new book.

Legal agreements are useful to determine what can explicitly be done within the agreement, but do not really tell us much about whether it is being done or whether the agreement is being complied with. I suspect some of the answers have been obtained by asking the companies concerned, or by implication.

3. It is three years old. And I doubt the privacy situation has improved in the meantime.

If you want to be certain of preserving your privacy, the only solution is to see that your device never goes online. It is almost certainly okay to connect your device to an online computer via Calibre, but I wouldn't be so sure about closed source products like, say, Kindle4PC or Kobo desktop and the like. Privacy is probably completely out the window on Android or iOS.