Database Mirroring

Our world is filled with magnificent, useful databases. Our cell phones connect to a distributed database of phone numbers, iPhone and Android apps usually connect to some sort of database, and our contact lists are little mini-databases. MySpace, Facebook, and LinkedIn are social networking databases that improve social mobility for those who are willing to try. Nearly every website, blog, and forum is run by a database, and search engines like Google, Yahoo, and Bing are just databases of database-driven websites.

Why

Structured Query Language was originally designed to be an end-user language. The trend toward oversimplified, big-red-button interfaces often takes most of the power of these databases away from us: we can't run aggregate functions on a web frontend that shows one record at a time. Ripping an entire database record by record is costly, but sometimes it must be done.
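To make the payoff concrete, here is a minimal sketch of what a local mirror gives back: once the records sit in a database you control, aggregate functions work again. The `records` list, its columns (`id`, `city`, `amount`), and the helper names are all made up for illustration; a real mirror would use whatever fields the ripped site actually exposes.

```python
import sqlite3

# Hypothetical records, as if fetched one at a time from a web frontend.
records = [
    {"id": 1, "city": "Orlando", "amount": 120.0},
    {"id": 2, "city": "Orlando", "amount": 80.0},
    {"id": 3, "city": "Sanford", "amount": 50.0},
]


def build_mirror(rows):
    """Load the ripped rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, city TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO records (id, city, amount) VALUES (:id, :city, :amount)",
        rows,
    )
    conn.commit()
    return conn


def totals_by_city(conn):
    """Run the kind of aggregate query the web frontend never offered."""
    cur = conn.execute(
        "SELECT city, COUNT(*), SUM(amount) "
        "FROM records GROUP BY city ORDER BY city"
    )
    return cur.fetchall()


conn = build_mirror(records)
print(totals_by_city(conn))
```

Swapping `":memory:"` for a file path would keep the mirror on disk between runs.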

Caution

Do a quick search of the website for a database-dump download before you hammer their bandwidth. Wikimedia offers database downloads, and Web 2.0 apps are likely to have APIs that make ripping easier. Database dumps are fantastic but rarely available, and APIs are likely to be limited to prevent database ripping. You might need to resort to DOM parsing.
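As a rough illustration of that last resort, the sketch below pulls table rows out of an HTML page using only Python's standard library. Note that `html.parser` is an event-based tag parser rather than a full DOM tree, which is enough for simple tabular pages. The `SAMPLE_PAGE` markup, the `TableRowParser` class, and `parse_rows` are hypothetical; a real site needs its own selectors and will likely have messier markup.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up page standing in for one record listing on a target site.
SAMPLE_PAGE = """
<table>
  <tr><th>Name</th><th>City</th></tr>
  <tr><td>Acme Hardware</td><td>Sanford</td></tr>
</table>
"""

print(parse_rows(SAMPLE_PAGE))
```

Each parsed row can then be inserted into the local mirror, with a pause between page fetches to go easy on the site's bandwidth.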

Databases I'd Like To Mirror

Correctional Facility Inmate Databanks

John E. Polk

Local jails

MySpace

Facebook

Twitter

LinkedIn

Blogs

Government Public Records

Traditional 'public records': loan info, marriage certificates

Tax info

Circuit Court arrest/offense records

DBA registrations

Incorporations

License plate :: person database?

Local businesses: locations, owners, etc.

Can we somehow derive expected income or market share from census data or something?