The lesson of this last post is not about gmail in particular; it is that web-based software, provided as a service, isn’t going away. If anything, it will keep expanding, because the user benefits are of a sort that traditional, user-managed software will have an extremely hard time matching.

It isn’t just that web-delivered software can be very flexible (especially once greasemonkey is involved), or as powerful as all but the most powerful rich client software, or that it can be really convenient to have access to your data at any time and any place, or that it is nice to be social and trivially share things with friends. All those things are very nice, but none of them is particularly exclusive to software as a service, and traditional software either already does better or can catch up if it wants to. If these were the only questions, I’d put my money on locally managed software.

But these relatively shallow software features aren’t the only issues. The problem here for any provider of locally-managed software, be it the Free Software Foundation or Microsoft, is that software as a service is a different architecture, one that provides features that go beyond pure software. Most importantly, this architecture abstracts away the most hated and unreliable parts of the self-managed software ecosystem: hardware, support, security, and maintenance. Those are someone else’s problems; all you have to do is log in and use the software. In Jesse’s words, ‘I no longer have to worry.’

In the locally-managed software world, those issues can be truly resolved only with redundant hardware in redundant locations, reliable bandwidth, complex mirroring setups, and the application of lots of manpower, both to set things up and to minister to them when they go wrong. You can improve every part of the system, but the need for time, maintenance, and redundancy will never be completely eliminated. Hardware and software will always require maintenance, and time and skill will be needed to resolve the inevitable failures. The million dollar question is whose responsibility these things are. Hosted software promises to make that responsibility go away, so that you can focus on other things and sleep easily at night.

In an age where everyone has gigabytes of data to back up, hundreds of pieces of software to keep up to date, and so on, this ability to sleep easily at night – to not worry – to put the responsibility on someone else’s shoulders – is not to be undervalued. People will make many compromises, in functionality and in other freedoms, in order to reduce that worry and get that security. Of course, the security provided by some (all?) of the hosted service providers is to some extent illusory. Hosted service providers can be subpoenaed, or fail, or decide to hold your data for ransom. But people strongly believe (with some reason) that software and hardware are even more likely to fail, and at high cost given the centrality of our data to our lives. So until that expectation changes (either because service providers get worse, or because self-managed software gets radically better) software as a service will only become more common.

The implications of this for personal freedom will be the subject of an upcoming post; suffice it to say right now that we need to start thinking about principled services now so that we can design and implement them.

The way around that would be a p2p system. Storing and distributing your data redundantly (optionally encrypted) on a peer to peer system, (and caching it of course on your local machines) solves the problems of server management/failure/maintenance, backups, universal access, keeping stuff in sync on several of your machines, etc.
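As a rough illustration of the "storing and distributing your data redundantly" idea, here is a minimal sketch of how a client might decide which peers hold each chunk. The peer names, the `copies=3` default, and the choice of rendezvous hashing are all assumptions made for the example, not a description of any particular project; encryption is left out.

```python
import hashlib


def place_replicas(chunk: bytes, peers: list[str], copies: int = 3) -> tuple[str, list[str]]:
    """Return the chunk's content ID and the peers chosen to hold a copy."""
    chunk_id = hashlib.sha256(chunk).hexdigest()

    # Rendezvous (highest-random-weight) hashing: score every peer against
    # this chunk and keep the `copies` highest scores. Any machine that knows
    # the peer list computes the same answer without a central coordinator.
    def score(peer: str) -> str:
        return hashlib.sha256(f"{peer}:{chunk_id}".encode()).hexdigest()

    ranked = sorted(peers, key=score, reverse=True)
    return chunk_id, ranked[:copies]


def surviving_holders(holders: list[str], offline: set[str]) -> list[str]:
    """Peers that can still serve the chunk when some holders are offline."""
    return [p for p in holders if p not in offline]
```

With three copies, the chunk stays reachable as long as any one of its three holders is online, which is the redundancy argument in miniature.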

There are several projects that I’m keeping an eye on that are working towards something like this; I don’t know if this is ever going to be realistic (it depends on performance and adoption) – but in my mind this would be the best solution. I say this as someone who’s tired of shuffling and maintaining data on 4 personal computers and 4 servers, but who can’t rely on a centralized online service either, for a variety of reasons.

p2p solves some of the problems, but not all. Putting aside the theoretical difficulty of making some classes of software p2p, in the end you still have to sink time and effort into hardware maintenance and dealing with hardware failure; you have to update the software when there are vulnerabilities; you have to fix the software when it fails, and of course you have to set it up in the first place.

Are any of these hugely crushing burdens? No. Are they likely to get even less crushing with time? Sure. Are they likely to still be more than most people will want to do? I think so, at least for the very foreseeable future.

But clearly I didn’t make that clear, so thanks for bringing it up. I’ll try to make it more obvious in the next draft (since this is turning into an essay.)

In a “centralized” case you still have to do the same thing: to use gmail you need a machine that is maintained enough to have internet access, meaning a web browser and operating system + security updates, and some hardware that’s not failing yet.

In the p2p case you need almost exactly the same: a “p2p-browser”, os + updates, and some hardware.

So when you say: “Are they likely to still be more than most people will want to do? I think so” – what exactly do these people have to do _more_ in a hypothetical p2p case?

What you gain in the gmail/centralized case is that you don’t have to run your own servers, and that by putting your data (mail, calendar, bookmarks, IM rosters, gconf settings?) in one place, they’re made accessible to all your machines.

But in the p2p case there are no extra servers to maintain, and you get the redundancy and safety a “professional” service (like Google) provides by spreading your data on the network.

But yes – there is the theoretical difficulty of making some classes of software p2p. But to be honest I’d rather see some more work sunk into this kind of infrastructure than into an “Online Desktop” where we just tell everyone to use Google, Yahoo, Facebook etc – even though I do accept that these aren’t going to go away soon, and some way of interacting with these will always be necessary.

In this series, you’ve talked about how you’re offloading worry and responsibility. What if Google were to have a datacenter go down (e.g. through espionage) and your files were gone as a result? You can’t even really complain, because the service is free.

Google Gears is going to make gmail work offline. It may be working already in the current beta. It will cache all of the mail on your local disk; it’s just a cache, so it doesn’t matter if it gets lost.

Of course, the security provided by some (all?) of the hosted service providers is to some extent illusory. Hosted service providers can be subpoenaed, or fail, or decide to hold your data for ransom.

If you’re storing that data at home, it can be subpoenaed too, of course. And if you try to delete that data afterward you are going to jail for obstruction of justice. Maybe you’ll get lucky and have your sentence commuted by the president.

And yes, those providers might go out of business or hold your data for ransom, but I don’t really consider those cases to be any different than a hard drive failure… except that they’re probably less likely, and are definitely less likely with a larger, reasonably well-trusted $200b company like Google or a $40b company like Yahoo.

The privacy aspects are really the big ones here, and worth being concerned about. In light of some huge privacy breaches in the US Government and at large corporations, my hope would be that SaaS becomes closer to operating like your bank (which you almost certainly use online, and to which you happily hand over all your money; analogy: data). Of course, email is incredibly insecure anyway and easy to eavesdrop on. Your privacy is an illusion, etc.

If you’re storing that data at home, it can be subpoenaed too, of course.
Sure, though you can encrypt if it is local, and (most likely) you can get away with not handing over the password under a fifth amendment defense (though that is still very much contested law, IANAL, etc.)

And yes, those providers might go out of business or hold your data for ransom, but I don’t really consider those cases to be any different than a hard drive failure…
I consider them different; the hard drive failure is more likely. Sorry if I didn’t make this clear- this passage was intended to address skeptics of hosted services, not to indicate that I myself am terribly skeptical of them.

my hope would be that SaaS becomes closer to operating like your bank
There are, of course, banking privacy laws, and nothing similar (in the US) for data. I’m not a huge fan of government regulation, but I have a hard time seeing how we avoid it here. See the first bullet point here.

I consider them different; the hard drive failure is more likely. Sorry if I didn’t make this clear- this passage was intended to address skeptics of hosted services, not to indicate that I myself am terribly skeptical of them.

Oh, no, I understand your position. I am merely pontificating on your blog. You know, because I have a lot of free time these days.

I’m not even sure that offline matters that much for most people and most applications. We’ll have wifi in planes shortly, and I got net access via cell in the middle of the swamp of central Florida a couple months ago. (I got there via swamp buggy, but I ‘watched’ game 3 of the world series on my blackberry.)

The work in maintaining a web browser: finding a device with a web browser is easy, and requires approximately no work. Within 2-3 years, your TV will have it (via Wii or something else), your phone will have it (via iphone or gphone or blackberry or something else), and every computer you own will practically boot into it. It will take effort to find a device with a display that does not have a competent web browser. The data will be stored remotely, so the most failure-prone part of the device (the hard drive) will be irrelevant. If for some reason the browser in one device doesn’t work, you’ll just pick up the browser in one of the other two devices within reach.

Note that p2p only solves some problems. In particular, if you want to store data elsewhere, you have difficulty with search or performance or data security; you don’t get all of them. If you don’t want to store your own data elsewhere, you then run into the same problems you’ve currently got: to reliably access it from anywhere, you’ve got to have an always-on, high-bandwidth connection from wherever your primary host is, either home (which makes it your problem) or colo (which means you’ve got to contract with a colo).

I do agree with you that anyone who wants to advance freedom would be best off dropping everything else they are doing and working on solving self-hosting p2p. Or at least, some solution where data is controlled by the user instead of the service- openauth and openid seem potentially promising here.

to not worry – to put the responsibility on someone else’s shoulders – is not to be undervalued. People will make many compromises, in functionality and in other freedoms, in order to reduce that worry and get that security.

There is another axis of compromise. If you are willing to pay money (a shocking concept, I know) then you can have reliability and retain functionality and freedom — today this is called a VPS but who knows what it may evolve into. PC “system administration” isn’t that hard today, so there is hope that servers can be made equally easy.

Wes: fair, though I don’t think the barrier for most of us is money- as I pointed out (perhaps in the other post’s comments?) I’m willing to pay for freedom, but I’ve yet to actually find a host which both handles the shitwork /and/ gives me meaningful amounts of freedom- you seem to either get a set of dumb, locked-down services, or a blank box. (I currently pay for the blank box for tieguy.org.) I’m all ears, though.

Right. What we want doesn’t exist yet, although for only double the price of a regular VPS you can get a “managed” VPS with automated daily off-site backups, 24×7 hardware replacement, and maybe even someone to do the security patches for you. But it’s still mostly a blank box.

I say replace the use of the term ‘p2p’ here with “encrypted data storage that I don’t have to worry about, reliability-wise”. That could be p2p, that could be Amazon S3, or even Google. I think the argument can be made (re Luis’ post #10, para 4) that with only moderate internet access and storage requirements, you can have search and performance with the privacy afforded by encryption.

A document search index can be stored in an encrypted form remotely; assuming we’re talking keywords/faceted data (and not something weird like an XPath search over every HTML e-mail you’ve ever received), you will only need random access to a small subset of the data. If latency is more of an issue than bandwidth, you can begin speculative fetching of portions of the index as the user types (slowly) on their mobile phone. (Information leakage from the random access might be a problem, and you would still need to own/control a more powerful/connected system.)
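To make the remotely stored keyword index a bit more concrete, here is a minimal sketch under some loud assumptions: keywords are blinded with an HMAC under a key that never leaves the client, so the remote store only ever sees opaque tokens, and a query fetches a single entry rather than the whole index. The function names are made up for the example. For brevity the document IDs are left in the clear; a real design would also encrypt each entry with a proper authenticated cipher. The store still learns which tokens get fetched, which is the information leakage mentioned above.

```python
import hashlib
import hmac


def blind(key: bytes, keyword: str) -> str:
    """Derive an opaque token for a keyword; the remote store never sees the word."""
    return hmac.new(key, keyword.lower().encode(), hashlib.sha256).hexdigest()


def build_index(key: bytes, docs: dict[int, str]) -> dict[str, list[int]]:
    """Map blinded keyword tokens to sorted document IDs (built on the client)."""
    index: dict[str, set[int]] = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(blind(key, word), set()).add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


def search(key: bytes, remote_index: dict[str, list[int]], keyword: str) -> list[int]:
    """Fetch only the one entry the query needs instead of downloading the index."""
    return remote_index.get(blind(key, keyword), [])
```

Here the plain dict stands in for the remote store; over a network, `search` would become one small request per keyword, which is what keeps the bandwidth requirement modest.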

Additionally, intelligent caching can be used to keep e-mails you are likely to care about locally, namely recent e-mails and e-mails historically of interest or involving contacts you care about. A search index for a larger set of e-mails (perhaps still avoiding e-mails you are likely to totally not care about) could also be cached on the device.
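One way to picture that caching policy is a small ranking function, sketched here with invented field names and thresholds (`recent_days=30`, a size budget in kilobytes). It only models "recent" and "from a contact you care about"; the "historically of interest" signal is left out because the comment does not say how it would be measured.

```python
from dataclasses import dataclass


@dataclass
class Mail:
    msg_id: str
    sender: str
    age_days: int
    size_kb: int


def pick_for_cache(mails: list[Mail], favorite_senders: set[str],
                   budget_kb: int, recent_days: int = 30) -> list[str]:
    """Choose message IDs to keep locally, most valuable first, within a size budget."""

    def priority(m: Mail) -> int:
        score = 0
        if m.age_days <= recent_days:
            score += 2  # recent mail matters most
        if m.sender in favorite_senders:
            score += 1  # mail from contacts the user cares about
        return score

    # Highest priority first; among equals, newer mail first.
    ranked = sorted(mails, key=lambda m: (-priority(m), m.age_days))
    chosen, used = [], 0
    for m in ranked:
        if priority(m) == 0:
            continue  # old mail from nobody in particular stays remote
        if used + m.size_kb <= budget_kb:
            chosen.append(m.msg_id)
            used += m.size_kb
    return chosen
```

Anything not chosen would still be reachable through the remote store, just with more latency.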

Given the current trends of 1) faster mobile phone internet, 2) ever-larger solid-state storage, and 3) everyone carrying a cellphone, I think this is all reasonable.

I do recognize Joe Shaw’s point that e-mail is currently wildly insecure by default, but it doesn’t always have to be that way.

Maybe encrypted data storage is enough for email (Rice did some work on this: http://epostmail.org/), but there’s much more to life than email. Until P2P or S3 can implement email, IM, identity, blog hosting, photo/video sharing, social networking, etc. then it’s not a replacement for today’s services.

I wasn’t suggesting that encrypted S3/P2P is going to replace the internet in its entirety, nor is it very likely as things are currently trending. And I am definitely not claiming any of this is here today; I only claim that it’s feasible and could be made largely painless with dedicated/concerted effort by concerned parties. I do think, though, that in the very near future it can easily cover many of the personal communication bases that don’t implicitly give up private information (e.g. e-mail and IM, which is very similar).

Things like private photo/video sharing and private blogs can be done as long as the other people have network access to the same data store. I cryptographically sign an authorization for S3 to serve certain directories of encrypted files to my family/friends/whomever, and include a copy of the symmetric key for each picture/video/blog post, encrypted asymmetrically for that user.

I think the PGP/GPG web-of-trust thing can handle identity, although it’s more likely for someone like Google to make it popular and go with a global approach rather than a distributed mechanism. (Or perhaps a federation of large corporate/government entities with a tenuously involved group of popular OpenID servers.) You could probably shoe-horn social networking into this framework, but it seems diametrically opposed to why people seem to use myspace/facebook in the first place. Unless, that is, the whole encrypted side of things lets flourish a shady underworld that people secretly want but are afraid to expose on such a public forum…