Reading online privacy policies cost us $781 billion per year

Michael Kassner interviews two privacy researchers who feel we are spending too much to understand privacy policies.

A close friend called a few weeks ago, asking for my opinion on Facebook's proposed changes to their Privacy Policy. I felt a certain obligation; my friend has patiently endured my online-security sermons for as long as I can remember. Besides, there was mention of a free meal.

After 4727 words and significant mental effort, I managed to grasp most of the Facebook changes. One particularly interesting alteration was the wholesale replacement of "privacy" with "data use." For example:

"Your privacy is very important to us. We designed our Privacy Policy Data Use Policy to make important disclosures about how you can use Facebook to share with others and how we collect and can use your content and information."

All in all, it was worth the sacrifice, as I enjoyed a pleasant (free) lunch with my friend while explaining Facebook's revisions.

Who reads them?

On the drive home, I wondered how many people would actually read nine pages of legalese. My next thought - are they all that long? I checked. TechRepublic's privacy policy contained 2872 words or six pages; others were in the same ballpark.

Now multiply 3000 words (a rough average) by the number of websites you visit, and it gets a bit daunting. Besides, time and effort spent deciphering a privacy policy is a tangible cost. Hey, I might be on to something here - visions of a journalistic scoop came to mind.

Too late

It seems I'm too late. Two privacy experts already figured it out. In the United States during 2008, reading privacy policies cost companies and individual users 781 billion dollars. My son, a business guru, said that figure is more than some states' GDP.

"In this paper we explore a different way of looking at privacy transactions. What if online users actually followed the self-regulation vision? What would the cost be if all American Internet users took the time to read all of the privacy policies for every site they visit each year?"

Determine parameters

Now I'd like to share some of the paper's results. The first slide graphs the privacy-policy word count of the 75 most popular websites:

I was curious how the researchers would determine the cost associated with time spent reading a privacy policy. The paper explains:

"Economics literature suggests time should be valued as salary plus overhead, which is the value corporations lose. In the United States, overhead is estimated as twice the rate of take home pay.

Through revealed-preferences and willingness-to-pay studies, studies estimate people value their leisure time at one quarter of their take home pay."

For March of 2008, the Bureau of Labor Statistics determined the average hourly wage to be 17.93 dollars. With that in mind, the researchers decided to use the following costs:

At home: 4.48 dollars per hour

At work: 35.86 dollars per hour
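Both rates appear to follow directly from the wage figure, assuming leisure time is valued at one quarter of pay and work time at twice pay, as the quoted passage suggests. A minimal Python sketch of that arithmetic (the function name value_of_time is made up for illustration and does not come from the paper):

```python
# Average hourly wage for March 2008, as quoted above from the
# Bureau of Labor Statistics.
AVERAGE_HOURLY_WAGE = 17.93


def value_of_time(hourly_wage):
    """Return (at_home, at_work) dollar values for one hour of reading.

    Assumptions taken from the quoted passage:
      - leisure time is valued at one quarter of pay
      - work time (salary plus overhead) is valued at twice pay
    """
    at_home = hourly_wage / 4
    at_work = hourly_wage * 2
    return round(at_home, 2), round(at_work, 2)


at_home, at_work = value_of_time(AVERAGE_HOURLY_WAGE)
print(f"At home: {at_home:.2f} dollars per hour")
print(f"At work: {at_work:.2f} dollars per hour")
```

Running it reproduces the 4.48 and 35.86 figures listed above.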

Next the two doctors determined how much time an individual - if diligent - would spend reading privacy policies in one year. Their results:

Finally, all the information was tossed into the hopper and here's what they came up with:

I also had some questions, which Dr. McDonald gladly answered. But before getting to them, I should note that this article covers only a few of the parameters the researchers looked at. An example of the researchers' thoroughness is their consideration of whether an individual is likely to skim the privacy policy or read it in its entirety.

Kassner: Just to make sure, your research has determined the cost to read privacy policies is on the order of $781 billion using 2008 dollars?
McDonald: Yes, that was our estimate for the United States. We measured how long it takes to read and skim privacy policies. We estimated how many privacy policies US Internet users would need to read for all of the sites they visit in a year. Then we used economic estimates of how much their time would be worth, both at work and as leisure time. Putting that all together, we had an estimate of $781 billion as the value of people's time to read privacy policies in the United States, in 2008.
Kassner: In the paper's conclusion, were you trying to point out Internet users would read privacy policies if the time-cost was reduced?
McDonald: More Internet users might read privacy policies if policies took less time to read, but even doubling or tripling the rate of users who read privacy policies would still end with a very low readership rate. Improving the format only goes so far. But privacy policies are not going to go away, either, and even one percent of Internet users is a lot of people affected if we can make privacy policies work better.
Kassner: The paper was written in 2008. Has anything changed since then?
McDonald: The idea for this paper came in 2007 when I heard someone interviewed say, "we know people don't care about privacy because they don't bother to read privacy policies." I think that notion has been put to rest: many people do care very much about their privacy, but reading privacy policies is an unworkable general solution. The Notice and Choice approach asks people to spend as much time reading policies as they do using the web. It does not work.

Now in 2012, I hear people talk about the benefits of privacy policies in terms of how the process of creating privacy policies helps companies think through their policies, how they create a legal minimum standard, and how they are useful for a very few, very dedicated people who read policies and highlight unusual practices in the press.

We were not the first authors to point out privacy policies are a huge burden on users. There is fantastic scholarship on how hard it is to read privacy policies written in legal jargon and technical jargon, and that users feel there is no point reading policies when they cannot make choices.

What was new in 2008 was that our findings suggest if you were able to cure those defects and write in plain English, that wouldn't help enough. We need a new plan. Since our work, there is solid progress on getting users more useful information by rethinking privacy notices altogether.

The Internet has changed over the past four years as well, with more third-party data gathering and more Americans online. If we were updating the study we would need to include the time to read policies from the approximately 120 third-parties that most Americans run across in a year, and multiply by more Americans online.

The second big change is a huge surge in mobile Internet use, often from cell phones. We could update with time estimates for how much longer it would take to read website policies on a tiny screen, but we cannot do a good job estimating the time to read privacy policies for mobile apps. That is because right now, the majority of mobile apps do not have privacy policies.

Thanks to work from the California Attorney General, that will change soon, and if we talk again in a few years it will be a different story again.

Kassner: Now for the tough question. If you had the ability to fix the problems surrounding user privacy while online, what would you do?
McDonald: That is an ambitious question! It is not as if there were an optimal level of privacy for all people, or if people want the same privacy in all contexts. It's so personal and particular. Let me give you a metric for how we know we are there, rather than an answer.

We can say we have "fixed" data privacy when users are able to make choices about how their data is collected and used, in ways that let them make tradeoffs and set the right level of privacy for them at that time. We will have some exceptions to picture: someone who had a car repossessed may not want a potential lender to know that, but for public policy reasons, they won't get to hide their mistakes on that one. But overall, privacy is fixed when people can make good choices for themselves.

Final thoughts

Obviously we aren’t spending 200 hours a year reading privacy policies. Does that mean we aren’t being diligent or is it because privacy policies are so complex it’s a waste of time to read them?

Thank you, Dr. McDonald and Dr. Cranor, for the thought-provoking research.

I skip the privacy policies and assume they'll use my information in any way that profits them.
I try as much as possible not to give them any information I don't want to be public. If they make some information I feel they don't need a required field, I lie.
I apologize to the people at the addresses "1 main street, smalltown, USA" and a@a.com for the spam you've received.

There's just one little error in the article. These are not privacy policies, but privacy violation policies: "These are the ways we will violate your privacy... and there may be more we make up from time to time."

If they restrained the text to simple point form with straightforward wording, it would save everyone a lot of problems. Lawyers, it seems, are hired to confuse and annoy people instead of making issues clear. If you cannot explain something with simplicity and clarity, then you do not know what you are talking about.

Apart from people's time that is wasted by Facebook, now it is good to see how much money they waste on a simple thing that they made over-complicated.
Privacy policy should be as short and simple as this:
Our privacy policy: We respect ALL of your privacy all of the time.
But no, big greedy FB even changes privacy policy into data use policy????
thus ignoring privacy altogether????
They should not only be boycotted, someone should tell Obama to fly a few planes into this tower as well. Brutality policy is what I call it.
Was Zuckerberg as a kid also this brutal? How can it be possible that idiots like this may run destructive companies like that?
How to delete this madness? Lowlevel format the damn thing and then strike it with a hammer.

Personally, I usually don't bother to read Privacy Policy statements anymore, if only because they usually proceed to detail exactly what I don't want the policy-makers to do with the data that they collect about me.
I do, however, adopt some measures to frustrate, and perhaps prevent, their efforts to collect such data. Ordinarily I run Firefox 12 with Do Not Track Plus, and with NoScript, which I configure to universally block JavaScript from some 3rd-party collectors in particular.
To the extent that I succeed, it doesn't matter what their Privacy Policy may be. There are, however, some firms with whom I have declined to "register" a website "account", and others whose websites I rarely, if ever, visit because I have read and don't accept their Privacy Policy and/or their "Terms of Service / Terms of Use".
What we really need is a law, perhaps a Constitutional amendment, which declares that our personal data is our personal property, and, among other provisions, declares that no person or organization has any right to acquire or to use such data without our prior knowledge and consent.

The first example of whether we pay attention to these kinds of things is that you state in the article that the cost is $781 million, and then in the section "Too Late" you state that the cost is $781 BILLION dollars. Who do we believe? This just shows it is hard to grasp what we are reading.

"McDonald: The idea for this paper came in 2007 when I heard someone interviewed say, “we know people don’t care about privacy because they don’t bother to read privacy policies.” I think that notion has been put to rest: many people do care very much about their privacy, but reading privacy policies is an unworkable general solution."
I'd put money on this 'people don't care' attitude persisting among corporate exec types.
BTW I know it's possible to determine how long someone has lingered on a given web page, though I don't know how often (or why) such capability is deployed. I always imagined "they knew" whether someone spent enough time to actually read the policies or not. Has any data of that type, e.g., actual dwell times on policy pages, been collected?
I'm one of those stick-in-the-mud types that reads these things and I try to comprehend everything. There have been many policies and EULAs I've turned down. I've also chided users for violating spirit and letter of an agreement, and even 'fired' a client or two over the issue. (in those cases the EULA in question was hatched in Redmond, Washington... )
People are adopting technologies in droves, without the slightest clue whether there are any long term, fundamental consequences, let alone whether any of those consequences are bad.
To update the classic, "They were the most interesting of times, they were the scariest of times..."
There you go, Michael, making me think again. :D

This has been a growing trend, and I think it will continue for many years until someone gets bitten hard by the misuse of their info and we see our test case.
Right now it seems to me that all website owners, service providers, software producers and similar make the assumption that because you're using their product, reading their pages, playing their game, watching their media or what-have-you, that they have a right to collect any info they can about you and the experience you are having and use it in any way they see fit. This includes, but is certainly not limited to, behaviour tracking and the wholesale selling of your information for the purposes of advertising and marketing.
To me, this situation is wrong. Just because I want to use a particular website it doesn't mean I consent to you tracking my movements and selling my details to advertisers. Even telling me up front that by visiting the site I agree to these terms is rubbish - if I opened a book and the first page was "By reading this book you are agreeing to us collecting information about you. This may include your e-mail address, address of where you read this book from, the pages you turn over the most, the words you linger on and other similar information. We reserve the right to pass such information to selected third parties in either our, or your, interest. We also reserve the right to change these terms at any time without prior notification and without calling your attention to it", I WOULDN'T READ IT.
Take the same idea about shops. If I walk into a real shop I know security are watching me and I know that the store and my card provider records what I've purchased. I know this information is used to stock the right sorts of things in the right quantities and to provide new items that I may like next time I visit. I also know that from time to time someone may analyse the layout of the store and typical consumer behaviour to optimise the layout and to boost sales and convenience for visitors. Now, apply that to a website - analysis of purchased items and of visitor experience to improve the site and stock the right sort of things. Totally acceptable and expected - even without a stupid user agreement.
I suppose I could come out and state that if I'm on your site go ahead and watch what I'm doing there, where I click, what I buy. Use it to improve your site and boost your sales and readership. Do it without my express consent - it's a reasonable thing to do. Don't then collect every bit of data about me you can and potentially share that with other entities for whatever purpose. That isn't cool and you wouldn't get away with that in the physical realm. Digitality (which isn't a word but just roll with it) doesn't excuse grubby practices. You don't own me or what I do so don't assume that you do on the web.
Right now the Internet is a data-miner/advertiser's wet dream. All the power is with those who are watching us. There ARE NO ALTERNATIVES right now (other than 'don't use the Internet'). Wherever you go you're tracked, your data is collected in various ways and sold or used with your assumed consent. No other environment we consume media in is like this and we need to start thinking about how this affects us seriously before we find ourselves in a world where to use any service you have to give up all your personal details to some grubby marketing company to be endlessly advertised at. You thought spam was bad - try aggressively targeted advertisement, all with your own consent.
For a vague idea about how power is shifting firmly into the hands of the producer rather than the consumer, check out what's happening in the games industry right now. The current trend in online passes and such to combat second hand game sales and ever increasing digital distribution channels has led to the rise of the game that the gamer never actually owns. Free to play and microtransaction fuelled games are already at a state where the real costs are the data you feed them with - and the subsequent advertising you then receive.
Right now it all seems benign enough and it's a long way off "We're all doomed! They own us! There is no privacy!" but the debate must be had early enough that we don't complacently slip into a world where assumptions about what data companies and producers own about us are far beyond what most of us would have been comfortable with.
PHEW! Sorry about the long post. Guess I had my Weetabix this morning. If you got this far, thanks for putting up with this!! :)

Andrew is spot on and a good friend of mine does that a lot. It occurs to me that if everyone used the same info in protest we would get data miners to maybe notice as they review the reasons they get so many errors or bouncebacks.
I propose we all use:
you.may.not.have@my.data.net
555-5555
0870080085
Address:
Mydatais mine
Mydataisnotyours
USA
TX
DOB: 01/01/1901
Earnings: over 100,000 pa
Occupation: Protester!

"Furthermore, we assume the rights to all creative writings, ideas, products and other commodities (both real and imagined) posted in our forums, on our blog service, on our facebook links, to our hashtag and blah blah blah blah.
Essentially, what we're saying is, we own you and everything about what you do while you're here. Now cough up a piece of your soul - there's a good user."
Sad fact is, policies that assume ownership over commentary and feedback on any given site are actually more common than you think. Check your blog provider's policies and the forum policies of any big companies whose forums you may use.

In the O'Reilly book, "Practical C Programming", the author explains why programming languages are necessary by stating that traditional spoken languages are very bad at procedural instructions -- just look at a legal document.
It was an amusing quip, but also dead-on accurate. The depth of legal documents exists because of spoken language's ambiguity. It's very difficult to define exactly what is and is not permissible, because it can always be argued that (e.g.) something particular wasn't specifically prohibited, and then it's someone else's (expensive) job to justify the application of common sense and decency. Then you run into spirit vs. letter debates, and so on...
The author essentially points out that "if (x > 9) ..." will always be true if x is 10, and there's just no room for debate, and that is why we don't program in English.

When lawsuits like the current one against them are lost by FB, they will change their tune and act responsibly.
It's a sad day when large Legal Claims need to be launched against companies to make them behave.
Now who's going to support the 15 Billion Claim against FB? Personally as I don't use it I don't really care but they are a major Security Attack Vector that needs to be closed.
Col

Sure I use no-script to block anything but the root site. What happens when the advertisers demand that their scripts run via the root site...
Those monetizing our data have a vested interest in subverting our security.
A law will push this behind a layer of obfuscation and/or into the gray & black market. Of course the credit card companies already do all the same privacy hi-jacking things that these sites do with an addition of being able to verify your identity.

I can't believe I messed up that bad. It is 781 billion. It's been a while since I've gotten that kind of mistake past my personal editors and TechRepublic's. I guess when I oops, I do a good job of it.

I'm OK with how things are, but I do sense the push of the envelope. This is why I still buy physical media. I don't own a Kindle because I like to know that my paper can't be redacted. I buy CDs because I prefer higher quality source material (even if I do often listen to my own lossy encodes) that I can physically transfer to another device at will. I buy Blu-ray discs and then make my own digital copies for home theater and mobile playback.
The rights we give up in the name of convenience aren't acceptable, so I do things on my terms. My own private coup against selling my freedom, for whatever it's worth.

Facebook is one of the easiest ways to use 'social engineering' to set up an attack vector on any given entity. This has been the case since it became popular and, arguably, since it began. Sites like 'friends reunited' and dating sites were also easy kills in terms of data collection on your targets.
Social networking will always be so if users are to get the benefits they wish from them. These days users of such services are less clueless about sharing sensitive info but enough people will always do it for convenience or naivete for such sites to be a go-to for those using a social engineering approach in their activities.

The opt-in/opt-out arguments could do with a little common sense application. Then we'd find it fell firmly on one side or the other. I think online content producers need to put themselves in the situation where they look for the real world analogy to what they're trying to do. In the vast majority of cases if you collect information about me in the real world, follow me about, observe me, then use that information for your own financial gain without my express permission you'd be looking at criminal charges or a lawsuit.
Why should that be any different in the digital realm?

I recently purchased an ebook for my Android Kindle app. I have the real deal -- a valuable Mark Twain edition -- and I did not want to take it on a trip. I was mystified by the amount of alterations. I'm hoping they are honest mistakes, but one has to wonder.

You're welcome. I think we were typing at the same time to be fair. One moment there were no comments and by the time I was done, lots before me :)
I've got to cut my post lengths down though....
Glad you found my rambling interesting. Weetabix was good ;)

It is not surprising for a large entity like Facebook. It probably costs more to run FB for a day than the GDP of many countries. When is the last time you paid your Facebook bill? *THAT'S* why they track your habits. Anyone that believes they really are getting something for nothing is horribly naive.
Can you even imagine how attractive FB would be to marketing analysts? SOooo much information, given freely, constantly, and without any bias -- posted just because it's on someone's mind, not because (as in a survey) it's the "right answer" or the message they want to send to a particular company. And marketing firms must be chomping at the bit for statistics and demographics from such a heavily-trafficked site.
Ol' Zuck gots bills to pay.

I hardly find that surprising somehow.
The sad fact of it all is, other companies will follow suit when a popular company, site, producer or what-have-you successfully uses a tactic to increase their revenue. Facebook will absolutely not be the only people using this sort of tracking tactic.
Cookies were designed to track preferences on a site and make your life as an internet user better. These days they're the gateway most companies are using to gather info on you, all with you and your browser's implicit consent.
Hateful cheating web-ruining ultratrolls!
EDIT: This makes me wonder how the Android and iOS apps track you beyond cell location or GPS co-ordinates. FB is integrated on many Android builds so even if you aren't a user, how does that affect you? This may require some looking into.
+1 for making me think and supplying the link.
hey! I'm a poet, and I didn't know it :)

the particulars of the lawsuit filing, and the article to which you link is unclear about them. The lawsuit is a class-action, seeking damages for the alleged specific violations of several federal and California statutes that pertain to privacy.
So, the principal issue is whether Facebook has violated any of the cited laws, i.e., whether Facebook policies and/or practices violate them. That is not quite the focus of the article, though (quote follows):
".... Fundamentally, the case revolves around accusations that Facebook tracks its users even after they have left the company's website and subsequently browse to other Internet locations."
First, as far as I know, doing that does not violate any law(s), and, if it does, there are a huge number of companies that are committing criminal acts.
Second, is that practice something which is not disclosed in the Facebook Privacy Policy and/or the ToU/ToS? If memory serves, Facebook not only does "track" users, but also publicly *advertised* the practice as a *benefit* of using Facebook when they implemented it.
Don't ask *me* why I don't have a Facebook account -- just read their Privacy Policy and ToU, then try to use their huge number of vaguely-documented "privacy options" to stop them from selling everything that they can learn about you to the highest bidder.

For the money factor to be overcome it's the good old USA that would have to lead the way. At a risk of insulting our 'special friends' the current USA is partly based on the idea that money brings freedom and happiness. I know it isn't meant to be that way - the ideal was pure - but let's face it in a capitalist society you need law to defend your freedoms and in the USA (and every other 1st world country to one degree or another) money is needed to buy you effective access to use the law to defend your rights.
Look at a multitude of court cases - a legitimate tactic on both sides of the pond is to simply out-last the other guy when you know you can afford to keep a case running but they can't. Sad but true.
So, to challenge the lobbyists in a court of law to defend your freedoms to defend your right to decide what happens to your own data you will need more money than they've got. Either that or the law system will need to change to take money out of the equation. Hardly likely in the US, UK or anywhere in Europe for that matter.
Gawd - I really am pessimistic today. I'm sorry!

A lot of the comments here come down to exactly that - "Cus they can" applies a lot in the physical realm too and when it's damaging behaviour laws are passed, regulators are appointed and penalties imposed to discourage people from continuing.
Is this possible on the Internet? Is it feasible that we appoint a watchdog backed with harsh penalties to hit companies that collect your info despite you and make money on it without your consent?
That is the difficult path and could even be seen as the first step on the slippery slope to creating 'Teh Interweb poh-leese' if it isn't handled with the appropriate delicacy. Could simply a law banning data collection practises be enough without a watchdog?

(1) I am not a lawyer, and (2) any pending lawsuits aside, there are constant and many campaigns to pass various laws on this matter and/or to change existing ones, so I'm seldom totally up-to-date.
Nonetheless, what I stated is, as far as I know, correct: as long as a person or organization does not commit a specifically illegal act, such as installing "malware", they can install and/or store anything they want on a computer system or network device which they do not own or lease.
First, doing that or attempting to do that does not prevent the owner of that computer system or network device from doing what they want with their property (within the law, of course) -- e.g., preventing one or more persons and/or organizations from installing and/or storing anything on their property, and/or removing anything that someone else installs and/or stores on it.
Second, these are still sensitive issues that often have "gray areas". It is *usually* assumed that the person or organization must have, at least, *implicit* permission to install software and/or to store data, especially for an indefinite span of time. Whether a firm has "permission" to do that simply because you have instructed the browser on the computer that you are using -- one that is not necessarily *your* property, note -- to fetch a page from the firm's website is one such "gray area" (i.e., as far as I know, it is not a clearly-decided matter).
For example, when a website instructs a browser to retain a "cookie" *indefinitely*, the website owner/operator is using the property of another party to do it -- whether or not with that party's prior knowledge and consent. On the face of it, the website owner/operator has implicit permission to use the cookie feature of a browser as a necessary and desirable means to "maintain state" of the current "connection" between the browser and the website (the apparent original purpose for using a cookie). Whether they also have *implicit permission* to use a cookie(s) for other purposes such as tracking is another matter. Regardless, most browsers have features that allow the user to "manage" cookies, including preventing a website(s) from setting a cookie. The website cannot stop the user from applying such a feature(s), but they can deny the user access to the website or limit their use of it when the website cannot set a cookie on the other party's computer via the browser.
In contrast, *usually* it has been assumed that the person or organization must have *explicit* permission (a) to gather data about a computer or a network device, i.e., except for data which the other party voluntarily discloses, and/or, (b) to gather or alter existing data, especially data which has not been previously stored on the computer or network device by that person or organization. Doing the first without permission is, generally, illegal "hacking" and doing the second without permission is, generally, illegal "data theft" or "data alteration".
That said, we must be vigilant about proposed legislation which might change those assumptions, whether reinforcing them instead would be better. These issues are often addressed in the context of Privacy Policies and in Terms of Use / Terms of Service disclosures. The fundamental issue is not disclosure, however.
Ultimately, the fundamental issue is whether we have the right to control the use of our property by another person or organization who uses it *for their benefit*, regardless of whether there is any benefit that we might receive, wanted or not, as a result of their control of it. Succinctly: which party has the power?

I had no idea that this was true:
"Currently there are no legal constraints as to what a person or organization can install or store on a computer system or network device which they do not own or lease"

Quote: " .... In the vast majority of cases if you collect information about me in the real world, follow me about, observe me, then use that information for your own financial gain without my express permission ..."
Actually, there is nothing that you can do about that, unless you can prove that the person following you around is violating a law against "stalking". If s/he is a threat to you or harassing you, then you might be able to obtain a restraining order from a court.
Mobs of photographers, for example, surround the residences of movie stars and other famous (or infamous) people, watching and waiting for an opportunity to take a picture which they can sell to publishers. From time-to-time one of them "crosses the line" between public and private, then the victim can obtain a restraining order against that one from a court -- but not against all of them, generically.
There is no privacy in a public space (in the USA). As long as the Internet is considered to be a "public space" data collectors can "follow you around", but you don't have to let them place or maintain "web beacons", or permanently store data, on a computer system, either. It's not their personal property even if it isn't yours.
Currently there are no legal constraints as to what a person or organization can install or store on a computer system or network device which they do not own or lease -- as long as they don't violate a law which prohibits and punishes them for some specific act(s). There are laws, for example, against installing "malicious software" on computers, against gaining "unauthorized access" to computers and networks, against "unauthorized use" of computers and networks, and against data theft, alteration and/or destruction.
What it boils down to is that anything which is not prohibited is permitted. As we say, "it's a free country". However, if someone causes damage to your property and/or inflicts injury and suffering upon you, then you can sue them -- but there's no guarantee that you will win.

For the current model of the internet to prosper, yes, it would. Many of the internet's most optimistic fans have prophesied that the Internet (or a global network very like it) would be one of the keys to breaking down international barriers though - creating an international law globally accepted could be the snowflake that starts that avalanche.
Let's face it though - in today's climate this would never happen. Our nations are too far away from truly co-operating so how would this sort of law really work?
I believe that we'd end up with the 'AOL effect' - each country would own and operate its own national network governed by its own laws and hooked up to the global internet for access. Censorship, prosecution and conditions would push Internet companies into favourable territories much the same as what happens in the pirate community right now.
Pessimism or realism? I can't quite decide right now.

Client side security is unworkable, full-stop. As bboyd says: "Those monetizing our data have a vested interest in subverting our security."
If we engage in protecting ourselves client-side all that will happen is the old protect and crack argument. Those monetising the data will simply find a new way to track us and get at our information, thus continuing their grubby little practices. It will be the same as virus writers and security experts - client side protective mechanisms will always be the reaction to new trends and will always be one step behind.
Fact is this - it needs to be enshrined in law that our data is our own and that nobody has a right to it unless we specifically state otherwise before these sorts of things will cease. Harsh penalties need to be applied to this or companies will simply 'take the hit' as the detection rate of what they're doing from the average internet user will be low - after all, how many of us actually try and find out how a cold-calling company got hold of our phone number?
(I heartily recommend asking to speak to a company's dialler manager next time a sales person calls you out of the blue and then demanding information on where they buy their data in from. Watching them try and wriggle out of telling you anything is hilarious.)

This is why there is such a lot of pressure to move everything to the "Cloud".
"Shady" Corporate and Government groups will have unlimited access to your info and complete control over what you are allowed to access.

Remember Winston Smith's job, in "1984" by George Orwell?
Wikipedia
http://en.wikipedia.org/wiki/Nineteen_Eighty-Four
"At the Minitrue, Winston is an editor responsible for historical revisionism, concording the past to the Party's contemporary official version of the past; thus making the government of Oceania seem omniscient. As such, he perpetually rewrites records and alters photographs, rendering the deleted people as "unpersons"; the original documents are incinerated in a "memory hole". Despite enjoying the intellectual challenges of historical revisionism, he becomes increasingly fascinated by the true past and tries to learn more about it."
Eliminating "hard copy" is every petty dictator's dream (Corporate and Government).

Talk a lot? That makes two of us.
I agree with a lot of what you say here. I may not like Apple and its control freak mentality but I have to admit that without them portable digital music and tablet/slate computers would have struggled more than they did. Kudos to them for that at least.
Content producers will choose the highest visibility platform in the vast majority of cases. This is how Amazon, iTunes and Facebook absolutely rule in their spheres of influence. It's a bit like the old adage - money begets money. In this case, content begets users begets content begets users begets content begets MONEY $$$
The protect and crack argument also covers many areas relevant to a site like TR. Again, I refer the right honourable ladies and gentlemen to my earlier gaming example - and look where content control mentalities are getting us. Gamers are being punished because their chosen pastime is being made needlessly difficult because producers can't adjust to the digital age. Producers still try to enforce old strictures of payment, distribution and control which conversely makes those who aren't buying their product enjoy the better experience. You buy a game with a silly DRM system (anything by Ubisoft or the recent Diablo 3 debacle) and the chaps with the cracked copies, who aren't as restricted as you, seemingly get the better deal. Same with music and digital publishing - why buy protected DRM laden music or e-books when the cracked or unprotected files offer the better experience?
Most of us want cheap, sure, but as users we want to see more content and will be happy to pay a reasonable amount for the right experience. Just because it's digital doesn't mean I don't own it - I'm not 'renting' files from the producer. The data is on my devices being consumed by me after I've paid for it. I may not own the copyright but those files are sure as hell mine to consume how I please.
Bottom line - protect and crack is destructive, costs money, inconveniences real users and doesn't work. The movie industry found it, the music industry found it, the games industry found it, the digital publishing industry is finding it now and not a one of them is learning the lesson. Digital distribution demands a new strategy and a new business model. I hope they catch on soon.

"... does not space between a period and a W ..."
They probably scanned the printed text with an optical character reader for the input to the software that created the e-book file.
When I took the typing course, we were taught to space TWICE after a period before typing the capitalized letter of the next word that started a new sentence. Two spaces on a monospace font make the text very easy to read rapidly. About ten years later there was a big debate among some Usenet groups as to whether the second space was just a waste of expensive and limited storage space, and of 300 bits-per-second modem bandwidth. So people began writing Usenet messages with just one space between sentences, after colons (:), etc.. (Do I need TWO periods after an abbreviation that ends a sentence? Any more nits to pick today?) The practice spread to other texts created with software and stored on computers.
Eventually monitors became able to display graphic modes with enough resolution to make variable-space fonts feasible. They had been used in printing since the beginning, but ordinary typewriters used only monospace fonts (until IBM introduced the Selectric with the whirling plastic ball). Now they could be used by text-editing software AKA "word processors". The problem is that the variable "space" character in printing fonts is often quite small, so even more than two must sometimes be used between sentences (especially) to make the text legible.
That wouldn't be "cool" in today's electronic computer age, would it? So texts are printed with just ONE space between a period and a "W" .... and an OCR might simply fail to recognize that tiny *extra* space for many fonts.
But spelling and grammar errors? Those are just either poor or non-existent editing!!

I believe much of this is a way of seducing the content producers. Take Apple for example, with iTunes. Jobs really pushed for the 99c song. Then he pushed to remove DRM. Was this altruistic, or a strong belief in people's rights? I doubt it, but he had to woo both parties:
Producers want their content protected while consumers want freedom. Producers want to maximize their profit while consumers want to pay as little as possible. However, producers will follow the channel with the most visibility, and consumers will purchase from the channel with the most content. So... it behooves both sides to compromise. Jobs did a stellar job of walking this line. He was a champion for both parties.
Interestingly (to me, anyway) I find the small number of online music purchases I've made are more often from Amazon than iTunes. It's almost counter to my preferences -- AAC vs. MP3 is no contest, since MPEG4 is superior in terms of quality vs. size. Amazon has a less-than-perfect track record for QC, and even though the MP3s are encoded at the highest possible bitrate, the quality of those encodes is dependent on the encoder and the result is not always as good as it sh/could be.
Nevertheless, I find myself gravitating toward the MP3. This is mostly for the ubiquity (I DJ from time to time, and MP3s have better software compatibility), but I also trust the format more. I know it can't be locked down -- there's no mechanism for that in the file format.
The industry learned with CDs -- you can't just put it out there without a way to constrain it, otherwise you WILL lose control of it. It's a never-ending cycle of protect and crack, so the rules get more and more strict until we end up with products we never truly own. The sad part is, this never really protects anyone, it just makes life difficult for everyone.
Geez, I talk a lot. Sorry guys.

I think this generation is a little more lax about professional appearance. Blame it on social media, texting, whatever. I'm regularly surprised when respected publishers (e.g., Forbes) post something online that is full of typographical or grammatical errors. It seems like there isn't as much concern over errors -- especially when something is published online. Maybe because of the decreased time-to-publish? But, the content we read is shifting heavily toward online media, so it's no longer an auxiliary outlet -- it's becoming the primary delivery channel.
TV is getting this way, too. Particularly, newscasts, talent shows, that sort of thing -- live interaction, mostly. The production quality is beginning to decline despite better graphics, a sharper picture, multichannel audio. The technology gets better, and the human element gets sloppy.
I waffle between chiding myself on being pedantic over things that ultimately aren't important (hey, I'm a casual guy -- who wants everything all rigid and stuffy, and who am I to judge anyway?) and fretting that these are signs of a dwindling attention to detail that will bleed over into other areas that are more important.
Take games for example. Back when you paid $50 for a Nintendo game cart, the publisher got one chance to get it right. If there were show-stopper bugs, that meant an expensive product exchange. Now, you buy a game and wait for hours while it downloads hundreds of MBs of patches. This isn't totally fair -- after all, NES games were typically 256KB or less, and had to run on exactly one hardware configuration, but this only excuses problems that can be attributed to the complexity of the game or the operating platform.
I don't lose sleep over any of this, but I do find it interesting to ponder sometimes.

I trust Amazon for the most part, and it's my choice. What I have a problem with is when a third party sticks their nose into the mix and I may not have a clue who they are. Which, by the way, is the topic of my upcoming article.

But, I would expect spelling and grammatical errors to be non-existent.
I am currently reading a freshly-published book that for whatever reason does not space between a period and a W -- in every case.

I could have used this alongside my gaming anecdote. Damn my memory!
E-books are going the same way as games and Amazon are at the forefront of grubby practices in this arena. That's a shame because they're also at the forefront of pushing the tech forward (much like Zuckerberg and Facebook are to social media - pioneering yet useful). You buy a Kindle edition but Amazon still control whether you can use the item you bought and how you can use it and collect data on the whole experience whether you like it or not. Again, assumed ownership and consent.
Here's the danger. We tend to accept the grubby or uncomfortable practices because of the genuinely pioneering approaches coming in alongside them. We need to be careful to separate what's tasty from that which makes us queasy and feed back to our pioneers appropriately so they can truly supply exemplary services we all love to use and are safe to trust.
For me, my own personal coup against that one is to own a Sony e-reader. Better screen, no silly company controlling what I do with it. In a similar vein I stay away from Apple's products. I'll use my tech my way on my terms, thank you very much.

As I was told by an acquaintance who worked with creating e-books, differences between e-book & paper-book texts arise because the "source" file(s) that contain the paper-book text have a format that is not used for the input file(s) of the e-book reader. So the source content must be transmogrified to a file which the e-book reader can use. The number and kinds of errors that will occur during that process vary considerably. The source file(s) often have a format which is proprietary and peculiar to the software that was developed for its editors by the publisher just for their own use -- and the software that is used to transmogrify it must be modified so that it can read it. (Frankly, I don't know much about input files that e-book readers read .... but, as far as I know, the various readers only read files that are in a proprietary format which is peculiar to the brand of reader).
Usually a human editor(s) will review the two respective texts, either visually reading and comparing them, or using software to compare them and find differences. I don't know why it would be so, but I've been told that the editor cannot always alter the e-book data file text(s) to exactly match the text in the source file(s). I gather that sometimes they aren't given enough time to do more than a cursory read-through and correction, too.
Then there is the huge multitude of books for which no editing software data-input file was ever created. After all, we printed and read books for about 300 years before any electronic data processing device was ever built.
For various reasons, there are often differences in the text in two respective books that contain the "same" work, even if they were produced by the original publisher, especially if they weren't printed in the same "run" of the press. Any popular work is likely to be re-published by other publishers (with permission of the current copyright holders), and the respective texts are usually not identical word-for-word, if only because of typographical errors.
An e-book such as a work by Mark Twain is, of course, a re-publication, but of which printed source? Even if the source is the original manuscript (not likely), there could be differences in interpretation of Mr. Clemens's handwriting -- if you've ever seen a photocopy of it, you would see why. :-)