Kill Google AMP before it kills the web

Trust, independence, credibility – we've heard of those

Open source insider

There's been a good deal of ongoing discussion about Google AMP – Accelerated Mobile Pages.

Quite a few high-profile web developers have this year weighed in with criticism and some, following a Google conference dedicated to AMP, have cautioned users about diving in with both feet.

These, in my view, don’t go far enough in stating the problem and I feel this needs to be said very clearly: Google's AMP is bad – bad in a potentially web-destroying way. Google AMP is bad news for how the web is built, it's bad news for publishers of credible online content, and it's bad news for consumers of that content. Google AMP is only good for one party: Google. Google, and possibly, purveyors of fake news.

It's time for developers to wake up and, as Jason Scott once said of Facebook, stop "shoveling down the sh*t sherbet" Google is now serving with AMP.

Announced in 2015, then duly open sourced and integrated into Google’s mobile search, AMP has been pitched by Google as a way to speed up the mobile web. It employs something the ads slinger calls AMP HTML, which the firm describes as a “new open framework built entirely out of existing web technologies.”

What it is, is a way for Google to obfuscate your website, usurp your content and remove any lingering notions of personal credibility from the web.

If that appeals to you, here's what you need to do. First, get rid of all your HTML and render your content in a subset of HTML that Google has approved along with a few tags it invented. Because what do those pesky standards boards know? Trust Google, it knows what it's doing. And if you don't, consider yourself not part of the future of search results.

Why a subset of HTML, you ask? Well, mostly because web developers suck at their jobs and have loaded the web with a ton of JavaScript no one wants. Can't fault Google for wanting to change that. That part I can support. The less JavaScript the better.
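For a rough sense of what "a subset of HTML ... along with a few tags it invented" means in practice: AMP HTML swaps some standard elements for custom ones, for instance `<amp-img>` in place of `<img>`. The Python sketch below flags a handful of such tags in a page. The mapping is a small illustrative sample, not the full AMP validator rule set, and the names `AMP_REPLACEMENTS`, `AmpTagChecker` and `find_replaced_tags` are hypothetical, made up for this example.

```python
from html.parser import HTMLParser

# Illustrative, incomplete sample of standard tags that AMP HTML replaces
# with its own custom elements. This is an assumption-laden sketch and
# NOT the complete rule set used by the real AMP validator.
AMP_REPLACEMENTS = {
    "img": "amp-img",
    "video": "amp-video",
    "audio": "amp-audio",
    "iframe": "amp-iframe",
}


class AmpTagChecker(HTMLParser):
    """Collects (standard tag, AMP replacement) pairs found in a page."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in AMP_REPLACEMENTS:
            self.problems.append((tag, AMP_REPLACEMENTS[tag]))


def find_replaced_tags(html):
    """Return the standard tags in `html` that would need an AMP equivalent."""
    checker = AmpTagChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    page = '<p>Hello</p><img src="cat.png"><iframe src="map.html"></iframe>'
    for tag, replacement in find_replaced_tags(page):
        print(f"<{tag}> would need to become <{replacement}>")
```

Even this toy check makes the point of the paragraph above: ordinary, standards-compliant markup has to be rewritten before it counts as AMP.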

So it's not really about speed. As with anything that eschews standards for its own modified version thereof, it's about lock-in. Tons of pages in Google AMP markup mean tons of pages that are optimized specifically for Google and indexed primarily by Google and shown primarily to Google users. It's Google's attempt to match Facebook's platform. And yes, Facebook is far worse than AMP, but that doesn't make Google AMP a good idea. At least Facebook doesn't try to pretend like it's open.

The second thing you need to do is get rid of all your analytics data. Instead, you can peek at a small subset of the data Google gathers. That's the AMP analytics deal in a nutshell.

Why would anyone want to strip out their own analytics and homegrown features like interactive maps or photo galleries, and create pages that won't even be shown with their own URL or branding? To get into Google's top stories carousel, of course. All the cool publications are doing it.

And that’s a problem.

Anybody can cram an illegitimate idea into a web page and – so long as it's encoded as AMP content – it'll look like it's from a legit news organization endorsed by Google. Because everything in AMP looks the same. Content shown in Google's AMP view is stripped of all branding, as if the content were from a legitimate news agency. There's a not-so-subtle message behind this lack of branding: the source of information doesn't matter so long as Google got you there.

So, liberal and left-leaning newspaper The Guardian, one of Google AMP’s early adopters, gets to share space with Russian propagandists, as Andrew Betts of Fastly recently pointed out. Betts found content from Russia Today, an organisation 100 per cent funded by the Russian government and classified as propaganda by the Columbia Journalism Review and by the former US Secretary of State.

Google AMP, by its design, disassociates content from its creator. Google does not care about creators and never has; all it wants is content, the more the better, to churn through its algorithms, surround with advertising and serve up to the world.

Perhaps this is all too hyperbolic for you. Here are the facts, though: Google AMP is a Google project designed such that you must restrict your layout options, forgo sending visitors to your website and accept whatever analytics data Google is willing to share. If that sounds like a good deal to you, email me using the link at the top of the page; I can get you a killer deal on a bridge.

All the Google AMP cheerleaders I've been able to find are Google employees who either created the project or are paid to promote it, which ought to tell you something right out of the box. And that Google AMP conference? Yeah, that was all Google. As far as I can tell, using Google's own AMP promotional pages and some experimental searching as a guide, the only people actually using AMP are publishers so desperate to find a new way to make money they'd try anything.

The rest of us, though, can change this. The rest of us collectively have the power to reject the AMP deal. To say “no” to Google. Because here's the thing: it's true that AMP content gets high priority in Google's carousel, but it's equally true that if there is no content in AMP there will be nothing to prioritize. It might seem like Google is the 800-pound gorilla here, but it only seems that way.

As I've said before, the power of the web lies in its decentralization; it lies with its edge nodes – that is, with you and me. If we reject AMP, AMP dies.