GeoCities to be kept alive on BitTorrent

Just a year after the massively popular web host GeoCities was shut down, a team of hardy archivists are planning on re-releasing the service, in its entirety, as a single, giant torrent file.

This time last year, GeoCities was in its final death throes as Yahoo, which bought the site in 1999 for $3.57 billion, announced that it would be closing down the much-loved service.

Not content with just freezing accounts and bouncing back new registrations, Yahoo planned to forcibly toss hosted content off the web. From the closure announcement in April to October 27th 2009, more than 38 million user-built pages were ditched from the internet.

Several archival teams attempted to save parts of the site as a form of digital archaeology. Despite GeoCities’ penchant for <marquee> and <blink> tags, animated GIFs and embedded MIDI soundtracks, its hilarious primitivity served as an important reminder of just how far we’ve come. From crude HTML to CSS3 and from single-purpose pages to complicated content management databases, it would be a shame to lose so much historical context.

“We feel that it is a waste to leave the internet with a hole of this magnitude,” says ReoCities, a website that offers a Firefox extension to “fix” broken GeoCities links and redirect URLs to re-hosted content. “By doing this, [Yahoo] were taking away a big part of the history of the internet with them,” writes GeoCities.ws, another archival project.

But perhaps the most ambitious attempt yet to restore GeoCities comes from Jason Scott and his Archive Team, who proclaimed this week “we are releasing GeoCities on a torrent”.

As the site was crashing down last year, his team sent out “hacked” Googlebots to automatically recover and download content. “At one point we were well past 100 megabits of bandwidth yanking onto all our archives,” Scott writes. He mentions that the vast majority of sites took up less than 10 megabytes each.

They didn’t get every single page, “but we know we got a bunch of GeoCities sites -- a significant percentage, especially of earlier, pre-acquisition data”. They also compared notes with other archival teams and merged recovered data.

And soon, you’ll be able to look through it all yourself. Scott says the torrent file is for “anyone who feels like browsing among the artefacts of yesterday, who wants some data to play with, who is doing research into history or who wants to get some mileage out of a few weblog postings of crazy glittery animated GIFs and MIDI music”.

The file will likely get close to a terabyte in size -- the team is aiming for a 900GB file after compression -- so you might want to buy a new hard drive if you plan on downloading it. The team doesn’t have an exact date for the torrent’s release, but you can send an email to geotorrent@textfiles.com to get a one-time notification when it’s available to download.

“While it’s quite clear this sort of cavalier attitude to digital history will continue, the hope is that this torrent will bring some attention to both the worth of these archives and the ease at which it can be lost -- and found again,” says Scott. “Clear your disk space -- this one’s going to be a doozy.”

Edited by Duncan Geere

Comments

"the team is aiming for a 900MB file after compression"

Just thought I should point out the file will in fact be 900GB, not MB. :)

Harry

Oct 29th 2010

You're absolutely right. Thanks for pointing it out. Corrected!

Duncan

Oct 29th 2010

This is a copyright nightmare. Many of the GeoCities websites that were archived probably contained infringements of copyright (MIDI or mp3 versions of popular songs, pictures snagged from elsewhere on the web). In addition, I don't see any mention of the guys at textfiles having obtained copyright permissions from the thousands (hundreds of thousands?) of people whose websites they will be distributing in the torrent.

I wouldn't be surprised to see lawsuits result from this.

Spoilsport

Oct 29th 2010

Really, this is what they get for taking web history away from us!!! A 1TB drive costs less than 100 dollars; hell, today 2TB = $100.00. Like I said, HISTORY. Even though the pages were not that good, I'm glad to see that it's still alive.

billy

Oct 31st 2010

Alas, something systemic seems to have recently happened to the Reocities-archived Geocities pages. Background images have ceased to display, font coding seems to have been lost, and page formatting has largely broken down. Sad to say, but some pinhead has apparently vandalized the museum.

Tiberius

Nov 3rd 2010

I've managed to recover 104 sites from Geocities from the torrent. The torrent is only seeded to 23% so most of the zips are corrupted. But here is what I could rescue: http://networkprogramming.wordpress.com/2010/11/17/saving-geocities/