The short form of this post is that I and Archive Team would like to ask you to donate money to the Internet Archive. During the month of December, they’ve got a 3-1 matching partner, which means that every dollar you donate results in $4 going to the Internet Archive’s funds. That is unbeatable and if you want to support what Archive Team is doing and support the Internet Archive at the same time, you will not. find. a better. deal.

So, the great news: Archive Team has been KICKING ASS. This band of people are pretty much an establishment now, with various sub-groups doing daily, hourly work to rescue at-risk websites, retrieve lost data, change the regard of user-generated content, and even get in the face of people with influence and decision-making and make them change long-held beliefs. We are doing really well.

Among our greatest additions is the Archive Team Warrior, a virtual machine that runs on a bunch of platforms and produces an easy-to-use, fast, and well-coordinated effort to download the entire content of a website. It’s friendly, it’s beautiful, and oh man, does it work.

Seriously – this thing is a monster. It took us 9 months to download all of Geocities, which was roughly one terabyte of data. Now, we can run through with this distributed preservation of service attack and, within a week or a month, download sites that dwarf Geocities handily. And it’ll be in REALLY nice shape, REALLY great integrity.

We have been rescuing a LOT of data, people. We’re past 320 terabytes.

THREE HUNDRED AND TWENTY TERABYTES OF HISTORY, OF USERS, OF LIVES.

That’s a big deal. And so big, it got on the Internet Archive’s radar, as in “blip, that’s a lot of data you just uploaded”. And yeah, it is! 320 terabytes is, by any current standard, nothing to sneeze at. By our estimation, it represents over 4 million user accounts spread across dozens of now-defunct services and sites.

And as of this month, they’re showing up in the Wayback Machine. The Internet Archive is now putting up the newest load, with over 10 petabytes of web history and media available, including 240 billion website snapshots. The vast majority of Archive Team downloads are going to be up on the new Wayback Machine, meaning those sites that were referenced by others will return. Fun fact: When Geocities went down, Wikipedia had over 100,000 links to Geocities sites for their citations. Wiped out in a night.

We continue to monitor the world and bring in data by the truckload and the Internet Archive has been kind enough to host that data. Without questioning it. Without complaining.

Now it’s time to pay back.

The Internet Archive is a non-profit (that, I disclose, I work for as a “free-range archivist”) that has, since the mid-90s, provided many petabytes and millions of items for free, to the world, to better the world along the way. Movies, radio, books, TV news, software, you name it… the Internet Archive has it, and continues to make it go for everyone. Every day, every night, with an eye on “forever” as a goal, and not just “until we try to sell you an upgrade” or “until we’re bought by someone else”. It’s a library and an archive and it just kicks ass.

Archive Team alone is costing the Internet Archive tens of thousands of dollars. That’s a cold hard fact – we’re doing the work that companies should be doing themselves, and Internet Archive has taken that brunt. But there’s good news.

A partner has come forward to do 3-1 donation matching for December 2012. It’s a holiday hard drive! Every dollar you donate results in $4 for the Internet Archive. I’d been dreaming up campaigns and kickstarters and a whole other range of potential fund-raisers, but the fact is, nothing I can come up with beats a 300% instant return on investment. Nothing.

So please do it. Here’s the breakdown of how Internet Archive spends that money:

Pretty much all this money goes into hard drives, and the hope is to raise enough money for 4 petabytes of disk space, which will wipe out Archive Team’s effect AND budget lots of space for next year.

That would be tax deductible in the USA only, wouldn’t it? So for the rest of us, we’ll just have to do it out of the goodness of our hearts.

Quick question: when the IA gets bitcoins, what happens to them? Are they immediately converted to dollars (at an exchange) and then that amount is considered as the amount to match? Are they immediately converted to dollars by the process of an employee handing over cash and then getting bitcoins in return (and then spending them somewhere else presumably)? Or what?

Donated. Weird request, if you get a chance: could you possibly publish the specs for the typical fileserver being built for use by archive.org? I recognize the case, but the internals/OS/filesystems and the choices and expertise involved are some metadata I’d love to see in a public archive.