Amazon is now hosting all United States TIGER Census data in its cloud. We just finished moving 140 gigs of shapefiles of U.S. states, counties, districts, parcels, military areas, and more over to Amazon. This means that you can now load all of this data directly onto one of Amazon’s virtual machines, use the power of the cloud to work with these large data sets, generate output that you can then save on Amazon’s storage, and even use Amazon’s cloud to distribute what you make.

Let me explain how this works. The TIGER data is available as an EBS store. EBS, or Elastic Block Store, is essentially a virtual hard drive. Unlike S3, you don’t read the data through a separate API, and there are no special limitations. Instead, an EBS store appears just like an external hard drive when it’s mounted to an EC2 instance, which is a virtual machine at Amazon. You can hook up this public virtual disk to your virtual machine and work with the data as if it’s local to your virtual machine. It’s that fast.
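To make the "hook up this public virtual disk" step concrete, here is a minimal sketch that prints the commands involved. It assumes today's `aws` command-line tool, which this post does not name, and every ID, the device name, and the mount point are hypothetical placeholders rather than the real values for the TIGER data set.

```python
import shlex


def attach_plan(snapshot_id, volume_id, instance_id, zone,
                device="/dev/sdf", mount_point="/mnt/tiger"):
    """Return the shell commands, in order, as argument lists.

    All IDs are placeholders supplied by the caller. volume_id is only
    known after the first command has run, so in practice you would copy
    it from that command's output before running the second one.
    """
    return [
        # 1. Create your own volume from the public snapshot. The volume
        #    must live in the same availability zone as the instance.
        ["aws", "ec2", "create-volume",
         "--snapshot-id", snapshot_id,
         "--availability-zone", zone],
        # 2. Attach the new volume to the running instance.
        ["aws", "ec2", "attach-volume",
         "--volume-id", volume_id,
         "--instance-id", instance_id,
         "--device", device],
        # 3. On the instance itself: mount it like any other disk.
        #    (The device may appear under a different name on some
        #    instances; that detail is an assumption to verify.)
        ["sudo", "mkdir", "-p", mount_point],
        ["sudo", "mount", device, mount_point],
    ]


if __name__ == "__main__":
    plan = attach_plan("snap-EXAMPLE", "vol-EXAMPLE", "i-EXAMPLE", "us-east-1a")
    for command in plan:
        print(shlex.join(command))
```

The script only prints the commands and does not run them, so you can review each step and substitute real IDs before touching your account.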

The TIGER data is one of the first Public Data Sets to be moved off of S3 and switched to EBS. Because it runs on EBS, users can mount the volume on an EC2 instance as a drive and easily run their processes (like rendering tiles with Mapnik) directly against the data. If you're a geo-hacker, this makes a rich set of geo data readily available to you without consuming your own storage resources or dealing with the normally slow download process.
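Once the volume is mounted, the data behaves like ordinary files, so plain file-system code is enough to take stock of it before pointing a renderer at it. The sketch below assumes a hypothetical mount point of `/mnt/tiger` and makes no assumption about how the directories on the volume are laid out; it simply walks everything and counts the `.shp` files.

```python
import os


def find_shapefiles(root):
    """Yield (path, size_in_bytes) for every .shp file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Match the extension case-insensitively (.shp, .SHP).
            if name.lower().endswith(".shp"):
                path = os.path.join(dirpath, name)
                yield path, os.path.getsize(path)


def summarize(root):
    """Return (number_of_shapefiles, total_bytes) under root."""
    files = list(find_shapefiles(root))
    total = sum(size for _path, size in files)
    return len(files), total


if __name__ == "__main__":
    # /mnt/tiger is a hypothetical mount point, not one named in the post.
    count, total = summarize("/mnt/tiger")
    print(f"{count} shapefiles, {total / 1e9:.1f} GB")
```

Note that a shapefile is really a group of sibling files (`.shp`, `.shx`, `.dbf`, and so on), so the total reported here covers only the geometry files, not everything on the volume.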

I love the idea of Amazon's Public Data Sets. It's an obvious win-win scenario. The public is able to get access to rich data stores at a relatively cheap price and Amazon is able to lure said public onto their service. Smart.