Screen scraping for data in 2011? Really?

Dave Bouwman just posted about developing a mobile app for the ESRI dev summit. The thing that piqued my interest was that in order to get the data, he had to do some screen scraping. That's no slight on him; it was no doubt the only way to get it. What boggles my mind, though, is that in 2011 we developers are still having to resort to screen scraping to acquire public data.

I guess it’s a fundamental flaw in the way we think of publishing data. The process we go through is this…

1. Collect the data
2. Decide on the best way for people to consume it
3. Publish it for people to read

The problem is with steps two and three. You see, when I decide on the best way for people to consume the data, I'm looking at it through my own world view and experience. There may be, and probably are, much better ways to consume the information for people with different priorities to mine. Take Dave's example: ESRI didn't anticipate that people might want to know what's on right now, or have a quick reference on their mobile device.

We need to get to the point where we acknowledge that other people can probably make better use of our public data if we just get out of the way.

Now I'm not saying we shouldn't present tables of data and the like. We do, however, need to act as just one of potentially many consumers of our data. Where we have influence, let's start publishing links to the data in JSON/XML for small data sets. Where the data is large, let's provide APIs.
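To make the point concrete, here's a minimal sketch in Python of what publishing a small data set as JSON buys a consumer. The session data is entirely made up for illustration; the point is that the consumer filters structured data in one line instead of scraping and parsing HTML.

```python
import json

# Hypothetical conference session data -- the kind of small data set
# worth publishing as a JSON link alongside the human-readable page.
sessions = [
    {"title": "Building Mobile Apps", "room": "Ballroom A", "start": "09:00"},
    {"title": "Intro to the REST API", "room": "Room 204", "start": "10:30"},
]

# Publishing: one call to emit a machine-readable feed.
feed = json.dumps(sessions, indent=2)

# Consuming: no fragile HTML scraping -- just parse and filter.
parsed = json.loads(feed)
on_now = [s["title"] for s in parsed if s["start"] == "09:00"]
print(on_now)  # -> ['Building Mobile Apps']
```

The consumer decides what "best way to consume it" means for them; the publisher just gets out of the way.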