Thursday, March 02, 2006

In a world with infinite storage, bandwidth, and CPU power

In particular, I liked slides 19, 20, and 31, all of which make it clear that Google isn't losing its wide-eyed optimism.

Slide 31 says that Google's philosophy for new product development is "no constraints" and that they initially ignore "CPU power, storage, bandwidth, and monetization."

Slide 20 says (in the notes) that Google plans to "get all the worlds information, not just some."

And slide 19 (in the notes) talks about how their work is inspired by the idea of "a world with infinite storage, bandwidth, and CPU power." They say that "the experience should really be instantaneous". They say that they should be able to "house all user files, including: emails, web history, pictures, bookmarks, etc and make it accessible from anywhere (any device, any platform, etc)" which leads to a world where "the online copy of your data will become your Golden Copy and your local-machine copy serves more like a cache". And, they say that they want "transparent personalization" that uses user "data to transparently optimize the user's experience ... implicitly."

Google also recommits to a future with personalized search. They say in the notes on slide 12 that they will "introduce new personalization elements" and that they view that as one of two major directions for their efforts to improve relevance rank.

Some might be inclined to dismiss all this talk as the wild fantasies of engineers with too much caffeine, but I think Google does see their ability to build out their massive cluster as one of their primary competitive advantages. I think they do intend to continue extending their computing infrastructure until everyone everywhere really does feel that they have near infinite CPU power and storage at their fingertips.

Unfortunately, this new PDF version of the slides no longer has the notes attached to each slide, so you can't see some of what I was referring to in my comments above.

However, I did download the original PPT presentation. Though I didn't keep a copy, I recently discovered that my Google Desktop cache does contain a text-only copy of notes for slide 12 and most of slide 19. The cached copy ends in the middle of the notes for slide 19.

Here are the notes from slide 12 with the reference to using personalized search to improve relevance rank:

Lead in Search

As the market leader, we need to ensure search doesn't become a commodity. Our focus on search is nothing new. We built our brand on being the best search engine, with the best results, and as our competitors have caught up to us, it's become even more important for us to focus on:

1) Speed
Solve international speed issues and bring international users to US performance

2) Comprehensiveness and freshness
"All webpages included in the Google index and searched all the time" -- Teragoogle makes this possible
Expand to other sources of data
Become the leader in geo search (any search with a geographic component).
New forms of content -- video, audio, offline printed materials

3) Relevance
Leverage implicit and explicit user feedback to improve popular and nav queries
Introduce new personalization elements

4) User Interface
Experiment with several new UI features to make the user experience better

And here is part of the notes from slide 19. Unfortunately, my cached copy ends right before the discussion of "transparent personalization" that I mentioned above:

In a world with infinite storage, bandwidth, and CPU power, here's what we could do with consumer products --

Theme 1: Speed
Seems simple, but should not be overlooked because impact is huge. Users don't realize how slow things are until they get something faster.
Users assume it takes time for a webpage to load, but the experience should really be instantaneous.
Gmail started to do this for webmail, but that's just a small first step. Infinite bandwidth will make this a reality for all applications.

Theme 2: Store 100% of User Data
With infinite storage, we can house all user files, including: emails, web history, pictures, bookmarks, etc and make it accessible from anywhere (any device, any platform, etc).
We already have efforts in this direction in terms of GDrive, GDS, Lighthouse, but all of them face bandwidth and storage constraints today. For example: Firefox team is working on server side stored state but they want to store only URLs rather than complete web pages for storage reasons. This theme will help us make the client less important (thin client, thick server model) which suits our strength vis-a-vis Microsoft and is also of great value to the user.
As we move toward the "Store 100%" reality, the online copy of your data will become your Golden Copy and your local-machine copy serves more like a cache. An important implication of this theme is that we can make your online copy more secure than it would be on your own machine.
Another important implication of this theme is that storing 100% of a user's data makes each piece of data more valuable because it can be access across applications. For example: a user's Orkut profile has more value when it's accessible from Gmail (as addressbook), Lighthouse (as access lis... [...TRUNCATED...]

Update: Derrick made the full notes for slide 19 available in the comments to this post.

Update: The full story about why the PPT version of these slides disappeared is now clear.

When I first posted a few excerpts from the notes to the slides, I had assumed that the notes were intended for the speakers of the presentation. I was annoyed and even a bit angry when the PPT was pulled, not fully comprehending why Google wouldn't want to make the notes generally available.

It now appears that many of the notes in the slides were cut-and-pasted from other presentations, never intended for Google Analyst Day. As mb points out in the comments to this post, the notes for slide 10 contain an odd reference to CBS, something I didn't notice when I originally was reviewing the slide deck.

Even worse, the notes to slide 14 contain revenue projections for next year, also something I didn't notice previously. Because Google published these projections to their website, even briefly, they were forced to file an 8-K with the SEC. In that filing, they say that the notes were "not speaker notes prepared for the Analyst Day presentation."

All very unfortunate.

Google's mission may be "to organize the world's information and make it universally accessible," but some information is not intended to be accessed by all.

Update: After waiting for the press storm to fade, Paul Kedrosky posts the original PPT file with the troublesome notes included.

Update: Nearly two years later, the WSJ reports that "a service that would let users store on its computers essentially all of the files they might keep ... could be released as early as a few months from now."

Here is the full text comment of pg 19. What the heck is lighthouse and is there a GDrive project within Google?

Purpose of this slide:
In a world with infinite storage, bandwidth, and CPU power, here's what we could do with consumer products…

Theme 1: Speed
Seems simple, but should not be overlooked because impact is huge. Users don't realize how slow things are until they get something faster.
Users assume it takes time for a webpage to load, but the experience should really be instantaneous.
Gmail started to do this for webmail, but that's just a small first step. Infinite bandwidth will make this a reality for all applications.

Theme 2: Store 100% of User Data
With infinite storage, we can house all user files, including: emails, web history, pictures, bookmarks, etc and make it accessible from anywhere (any device, any platform, etc).
We already have efforts in this direction in terms of GDrive, GDS, Lighthouse, but all of them face bandwidth and storage constraints today. For example: Firefox team is working on server side stored state but they want to store only URLs rather than complete web pages for storage reasons. This theme will help us make the client less important (thin client, thick server model) which suits our strength vis-a-vis Microsoft and is also of great value to the user.
As we move toward the "Store 100%" reality, the online copy of your data will become your Golden Copy and your local-machine copy serves more like a cache. An important implication of this theme is that we can make your online copy more secure than it would be on your own machine.
Another important implication of this theme is that storing 100% of a user's data makes each piece of data more valuable because it can be access across applications. For example: a user's Orkut profile has more value when it's accessible from Gmail (as addressbook), Lighthouse (as access list), etc.

Theme 3: Transparent Personalization
The more data, access, and processing Google can handle for the user, the greater our ability to use that data to transparently optimize the user's experience.
Google Desktop w/ RSS Feeds is a good first example: the user should not have to tell us which RSS feeds they want to subscribe to. We should be able to determine this implicitly.
Other potential examples: User should not have to specify the "From" address in Google Maps; user should not have to specify which currency they want to see Froogle prices in; user should not have to manually enter their buddy list into Google Talk.

I like the story of the disappearing ppt: - For some weird (?) reason, if you search for 20060302_analyst_day.ppt on Google, nothing comes up (not even in the Google cache ;-). alltheweb at least finds Greg's mention, Yahoo even tomcaster's copy (which lacks the comments) ...

An important implication of this theme is that we can make your online copy more secure than it would be on your own machine.

Except now the government no longer needs a warrant to access your information; it can be requested with just a subpoena, which is surprisingly easy to get. Admittedly, though, for the average user this would be more secure; it's just the legal ramifications that worry me.

I think it's ironic. Google promises "no constraints" and then promptly deletes its own content, so it has actually created a constraint and has prevented readers from seeing the PPT! Sounds like something Big Brother would do. "Constraints" are defined by Google, not by the user!

Great stuff, Greg. It’s a classic blunder that many companies make, inadvertently leaving information in Microsoft Office documents that they don’t want the outside world to see. They also revealed that they have little understanding of usability for investor relations websites. I've explained that here.

Perhaps lighthouse is an IM app based on the google talk format that implicitly finds your important contacts. Maybe the bandwidth constraint is the use of voice recognition to search gtalk voice calls, or maybe some sort of video conferencing and video conference archive search. I wouldn't put it past them, especially given their recent foray into "voicemail" inside gmail.

Lighthouse. If you look at the context, it is referring to an interface which can access information. Gmail accesses information from Orkut. Lighthouse must also be a Web interface. My guess is that it is file access. Similar to Flickr, except for files. So, when I upload an MS Word document, I can post it to Lighthouse for it to be viewed/reviewed by anyone I designate who has a Gmail account. Consider the technology of SharePoint and put it into Google terms... makes a lot of sense.

It's kinda funny that Google got burned by accidental release of information, just like privacy advocates are worried that users will get burned when the Google Grid archives our digital lives.

Today Google filed an 8-K with the SEC since some of the presentation comments contained financial projections. In the 8-K, they say some slides were copied from an "internal product strategy presentation." (See http://www.sec.gov/Archives/edgar/data/1288776/000119312506047267/d8k.htm)

From reading the full notes at Derrick's site, it seems that other slides may have been copied from presentations to CBS. Witness Slide 10 which attempts to sell CBS on exposing their video "assets" to the "wisdom of crowds."

Yeah infinite storage so they can sell all our secret information to the U.S. government damn it! No, Google will never cooperate with the U.S. government. -- they won't give child porn records to the Justice Department.

I remember Mr. Ellison from Oracle wanting to do this a long time ago... everything will be in a thin client experience with no "pc" needed. I disagree with the part about upgrading everyone to U.S. speeds....try Eastern world speeds with their fiber everywhere. The U.S. needs to catch up man.

> An important implication of this theme is that we can make your online copy more secure than it would be on your own machine.
> ...
> Another important implication of this theme is that storing 100% of a user's data makes each piece of data more valuable

Google likes to throw ideas out there, to use the market as their computing device for what works. However, for the reasons (*) listed below, google already knows that this one will not fly.

Why do this, then? They might be just trying to raise market awareness of the problems of such an approach. Even though Microsoft already had to pull the plug on a very similar program (google "hailstorm microsoft"), Microsoft is still in an ideal situation to try it again and better. Which (given users' notorious naivete) would kill a large market segment for google's search -- namely, every Internet user. Of course, google's search appliance for enterprises (and later, a more affordable gadget for the masses) would not have these problems...

Brilliant preventive move by google, it seems, as it looks for options -- and time -- to better place its technology.

Cheers,
Ed Gerck

(*) What google proposes is a direct contradiction, for several reasons:

(1) Because you *still* have your local copy, the online copy becomes an _additional_ risk. Risk MUST increase with the added online copy.

(2) Even if the online copy is encrypted (best case) with a key that google does NOT have, the file may still be attacked and decrypted by a variety of methods -- some of them not even cryptographically or computationally limited.

(7) Either contradicts legal requirements for confidentiality or makes google legally liable for safekeeping everyone's data against any disclosure risk (including disclosure that is legally mandated, which is always a risk to comply with because any order is potentially disputable).

Why the google paranoia and "evil" witch-hunt? Storing files on the net IS the future. Don't store stuff online if you don't want to, but don't imagine it won't happen and don't say it's a bad thing that should be stopped.

I did this same copy-and-post of the proof when I found Kaiser Permanente System Diagrams online. Though they had been on a public web site for up to five years and Kaiser had unclean hands in retaliating against someone who blew the whistle on them, Kaiser sued me. Furthermore, a California State agency backed them up and issued a public order against me: though this agency had no jurisdiction over me, they justified their actions because they thought I might be doing something dangerous on the Internet.

When I try to tell people what happened, all they see is some disgruntled person who should quit complaining and move on. I can't seem to get anyone to understand that what I did was common Internet activity (note you just did it), and my case has great ramifications for anyone operating under the assumption that publicly posted web documents are in the public domain. Because of my case, corporations now have the means to perform a shakedown (the court system is a burden in itself) on anyone who engages in this sort of Internet whistleblowing.

During the months I sought help for this, I was turned down by Stanford Cyberlaw, the EFF, and the ACLU. No one cared about my situation, yet people continue to act as if the Internet could still be assumed to be public domain. I tried to point this out to John of AMERICAblog when he did his whistleblower victory lap over the cellphone records scandal, just as I'm trying to point it out again here to you.

The U.S. is now acting as a tyranny where they see people as weak defendants (me). Please don't assume that the targets will always be the weak defendants. The precedents will eventually be used against all citizens, and the effect on free speech and public participation will be chilling indeed.

Nice coverage on this matter, Greg. Around six months back, I was wondering why Google was not delving into a chat service and GDump (my abbreviation). And now I see Google launching its chat, which is taking a toll on other chat services. As for GDump, I thought of something like a big dumping storage space provided by Google, where users can dump the things they want to share with others, with some categorization based on regions or types of dumped materials. And now I am hearing about this online storage space from Google. Quite excited! Waiting to see what it will be like.

With all the ruckus raised by this, I've yet to see one shred of info on where the original ppt posted by Kedrosky came from, or if it's legitimate.

Greg Linden retrieved partial comments from his Google Desktop cache, but not the original file. Then Derrick mysteriously posted all the comments, but didn't post the original. Now Kedrosky posts what is supposed to be the original but with no explanation of where it came from.

It's a little suspicious that the zip file Kedrosky posted contains a ppt doc dated 3/6/06 at 11:15 PM, while the Analyst Day was held back on 3/2/06. Did someone change the document after the Analyst Day? Did someone copy the comments from Derrick's site into the redacted slide deck?

Sorry if I sounded like a crackpot conspiracy theorist. I'm not disputing that Google published the ppt notes accidentally, or that your Google Desktop cached some of the notes.

I was pointing out that there's not enough information to determine if the ppt published by Kedrosky is authentic. He doesn't say where he got it, and the file date is four days after the Analyst Day. For all the drama (Google Desktop archives, SEC filings), it would be nice to have some reason to trust the authenticity of the file.

Paul said today (http://paul.kedrosky.com/archives/2006/03/08/the_google_ppts.html) that he personally downloaded the file, and may have changed the modified date of the file. That's good enough for me.

ha they're using microsoft software to present their ish...lol microsoft owns the world and google is no competition. also..with this leaked, microsoft now has the upper edge with this new user storage thing and they can improve on google's way of things...look out google, the new live is coming strong and hard!

I would not trust Google at all or for that matter any company with such a vast amount of personal information. Google has been known to track people, it would not stop there. This sounds like they are moving into P2P?

I find it quite funny that many people distrust any Government institution with personal information and yet they are readily willing to hand over their desktop and content to some private institution that is much less constrained by law and can do as it pleases with the information ...and with an ultimate goal of simply making money.

It should also be noted that any corporate Goliath can be brought down by simple events so read the fine print. If gdrive is where you will store your cherished data then you may be interested to know what would happen to that data should the company ever become insolvent. Nothing is "free" and there is a price to pay for any service ...pay me now or later is the only difference.

Sorry folks, gdrive is a no-go from the get-go for me. As disk drives become less costly by the day, I will continue to store my data at home on a couple of external drives ...simple and effective.

Installing a secure remote desk-top to access my data from anywhere allows me the freedom which gdrive suggests. Add to this that my important data is encrypted in case my remote desktop ever gets compromised.

All Microsoft needs to do, for the average home user, is simplify the above process to install and support remote desktop, storage device add-ons and data encryption.

I remember when the server was the king and I didn't like it and neither did my colleagues ...it was called the "mainframe". More power to the desktop baby!!!

Back-up needs to be more than just make an extra encrypted copy to leave at the house! Sure, you can remote into your desktop from anywhere in the world but what would you do if you were remoted into your PC and your house burned down? Where would your data be then? Also, what makes you think the media you are using to copy/save your data is going to be available in 10 or 20 years? You can't find anything to translate music from an 8-track tape and VHSs are on their way out now. In digital format the playback media isn't a concern and you are always up to date!

Google can't shoot anybody, so I think any worries are overstated. They're trying to make a buck, not take over the world. Now if they're selling private data to the Federal Government, then that IS a problem.

Well, beyond comments and doubts, this is going to happen. Why? World power has gone through three phases: first, whoever had the most land was the most powerful; then whoever had the most capital was the most powerful; and now, since 1970, the more INFORMATION you have, the more powerful you are.

First of all, the internet is dangerous for any government, because it (currently) puts any kind of personal opinion or argument within your reach, and those can differ from what governments want us to believe or accept as true. This is going to be solved before long by the "internet privatization" that has been set in motion by all the "pipes" companies like AT&T, Cisco, Motorola, etc. These companies want to charge more the more you use their pipes, for example for YouTube. This means that you get a package for the kind of information you want. Besides that, other programs like P2P will be eliminated or reduced to the lowest connection speeds, to say nothing of websites with "incorrect" material, like pages with political information that they find "incorrect" for the people. That means the extermination of free expression on the internet. So you'll hear what they want you to hear, exactly like TV or radio.

And finally, the centralization of information makes all this easier, and of course Google will give the government access to all your web information: not porn information, but the real information, like what beliefs you hold, whether you are a "terrorist", or whether you have or handle information that could be dangerous for "national security". The stupid media news about Google denying information to the government is just news to make people believe that Google is going to be a guardian of their most important and confidential information, that they can trust it. Well, you cannot avoid it, but you can make money from all this. For example, buy stocks of companies that create the next generation of massive information storage "disks", and short sell the stocks of companies that make and are dedicated to PCs. This may sound paranoid, but well, it is how the world moves and always has moved.