Archive for March, 2006

I’ve known about some work coordinating Postgresql and XML for years, looking at that post and it’s predecessor, it’s been in the works for at least 3 or 4 years. While I wasn’t watching, it made it into the Postgres standard distribution in the contrib section.

The general idea is that you can store an xml blob in the database, and then use it as semi-structured information for queries and the like. It’s not relational, normalized, nor is it a replacement for actually putting things into rows and columns like a database should be, but I actually have a use for it. (It can also be used to do XSLT transforms for output, but that’s not my problem now).

For a while, I’ve been caching complete results in an xml format in my database after parsing out the bits that I need — stuff that I’m not immediately interested in that might have fields added to it as time goes on, but that’s really not worth pulling out into it’s own relational model right now. Occasionally, it’s nice to be able to do some ad hoc mining of those blobs to see how well things have been performing, and that’s where some XML parsing in the database really beats eyeball or regex parsing.

So now that I’ve just been saving these chunks, I can issue a query like:

and see two fields from deep into the xml pulled out for me to filter and join against.

This is one of those cases where there is an impedance mismatch between relatively dynamic data and static sql modeling — similar to the impedance mismatches that make object relational mapping so much fun. This sort of thing has been described in terms of type safety and compile time type checking. Often times you just need a way to defer parsing or understanding chunks of data until later. That article mentions blobs of XML in interfaces, this one is in data storage.