Google 2000 vs. Google 2011

I sometimes hear people say “Remember when Google launched and the results were so good? Google didn’t have any spam back then. Man, I wish we could go back to those days.” I know where those people are coming from. I was in grad school in 1999, and I remember that Google’s quality blew me away after just a few searches.

But it’s a misconception that there was no spam on Google back then. Google in 2000 looked great in comparison with other engines at the time, but Google 2011 is much better than Google 2000. I know because back in October 2000 I sent 40,000+ queries to google.com and saved the results as a sort of search time capsule. Take a query like [buy domain name]. Google’s current search results aren’t perfect, but the page returns several good resources as well as some places to actually buy a domain name. Here’s what Google returned for that query in 2000:

1. http://buy-domain-name.domain-searcher.com/domains/buy-domain-name.shtml
2. http://buy-domain-name.domain-searcher.com/buy-domain-name.shtml
3. http://buy-domain.domain-searcher.com/domains/buy-domain.shtml
4. http://buy-domain.domain-searcher.com/Map3.shtml
5. http://domain-name-broker.domain-searcher.com/domains/domain-name-broker.shtml
6. http://users5.50megs.com/buydomain32/
7. http://users4.50megs.com/buydomain02/
8. http://domain-name-service.domain-searcher.com/domains/domain-name-service.shtml
9. http://domain-name-service.domain-searcher.com/Map2.shtml
10. http://dns-id.co.uk/

Seven of the top 10 results came from one domain, and the URLs look a little… well, let’s say fishy. In 1999 and early 2000, search engines would often return 50 results from the same domain for a single query. One nice change that Google introduced in February 2000 was “host crowding,” which showed at most two results from each hostname (the hostname is the full host part of a URL, such as buy-domain.domain-searcher.com). Suddenly, Google’s search results were much cleaner and more diverse! It was a really nice win, and we even got email fan letters. Unfortunately, just a few months later people were creating multiple subdomains to get around host crowding, as the results above show: each subdomain counts as a separate hostname, so each one got its own two slots. Google later added more robust code to prevent that sort of subdomain abuse and to ensure better diversity. That’s why it’s pretty much a wash now when deciding whether to use subdomains vs. subdirectories.
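To make the host crowding idea concrete, here is a minimal Python sketch. It is not Google’s actual code, just an illustration of the behavior described above. The names crowd, by_hostname, and by_naive_domain are made up for this example, and treating the last two labels of a hostname as “the domain” is a crude assumption (it lumps dns-id.co.uk under co.uk; a real system would need something like the Public Suffix List).

```python
from collections import Counter
from urllib.parse import urlsplit

# The ten results for [buy domain name] from the October 2000 snapshot.
RESULTS_2000 = [
    "http://buy-domain-name.domain-searcher.com/domains/buy-domain-name.shtml",
    "http://buy-domain-name.domain-searcher.com/buy-domain-name.shtml",
    "http://buy-domain.domain-searcher.com/domains/buy-domain.shtml",
    "http://buy-domain.domain-searcher.com/Map3.shtml",
    "http://domain-name-broker.domain-searcher.com/domains/domain-name-broker.shtml",
    "http://users5.50megs.com/buydomain32/",
    "http://users4.50megs.com/buydomain02/",
    "http://domain-name-service.domain-searcher.com/domains/domain-name-service.shtml",
    "http://domain-name-service.domain-searcher.com/Map2.shtml",
    "http://dns-id.co.uk/",
]


def crowd(urls, key, limit=2):
    """Keep at most `limit` results per group, preserving rank order."""
    seen = Counter()
    kept = []
    for url in urls:
        group = key(url)
        if seen[group] < limit:
            seen[group] += 1
            kept.append(url)
    return kept


def by_hostname(url):
    """Group by full hostname, as in the original host crowding idea."""
    return urlsplit(url).hostname


def by_naive_domain(url):
    """Group by the last two hostname labels (illustrative assumption only)."""
    return ".".join(urlsplit(url).hostname.split(".")[-2:])


per_host = crowd(RESULTS_2000, by_hostname)
per_domain = crowd(RESULTS_2000, by_naive_domain)
print(len(per_host), len(per_domain))
```

Grouping by hostname lets all ten results through, because no single hostname appears more than twice; that is exactly the subdomain trick. Grouping by the (naively computed) domain keeps only five: two from domain-searcher.com, two from 50megs.com, and dns-id.co.uk.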

Improving search quality is a process that never ends. I hope in another 10 years we look back and say, “Wow, most queries were only a few words back then. And we had to type queries. How primitive!” Mostly I wanted to make the point that Google looked much cleaner compared to other search engines in 2000, but spam was absolutely an issue even back then. If someone harkens back to the golden, halcyon days when Google had no spam, take those memories with a grain of salt.