2004.10.21

OK, so [WHAT- hey now stop that!]

a few things I've been mulling over

These are some placeholders for more coherent conversations later.

There has been a lot of banter back and forth on O/R mappings in the past few months. Ted Neward made the Vietnam comment, and it has stuck with me for a number of reasons. The major one is that the statement could only, in my estimation, really apply to the notion of objects at all in an integration world, rather than to O/R. I think that is because when Ted et al. talk about O/R, they are talking about a *product.* When I talk about O/R, I am talking about *techniques.* My take is simple: if you are persisting your data in a relational database, AND you are reifying that data at runtime in objects, then you HAVE an object-relational mapping problem. That means you have to come up with an object-relational mapping solution. I think the real question is more along the lines of Ted's comment about Hibernate, i.e. at what point do you declare victory and move on. The solution space for O/R is quite broad, and the solutions range from ad hoc/unaware to generative to fully metadata-driven, reflective solutions. I will post a bit more on O/R at some point, because it is something I just need to get off my chest (out of my head).
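To make the "reflective" end of that range concrete, here is a minimal sketch in Python. It assumes a hypothetical `Customer` class and a table named after the class, and it derives the SQL from the class's own field metadata instead of hand-written mapping code. It is an illustration of the technique, not how any particular O/R product works.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Customer:  # hypothetical domain class, used only for illustration
    id: int
    name: str


def insert(conn, obj):
    """Build an INSERT statement from the object's field metadata."""
    cols = [f.name for f in fields(obj)]
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        type(obj).__name__.lower(),       # assumption: table name == class name
        ", ".join(cols),
        ", ".join("?" for _ in cols),
    )
    conn.execute(sql, [getattr(obj, c) for c in cols])


def load_all(conn, cls):
    """Reify every row of the class's table as an instance of cls."""
    cols = [f.name for f in fields(cls)]
    sql = "SELECT {} FROM {}".format(", ".join(cols), cls.__name__.lower())
    return [cls(*row) for row in conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
insert(conn, Customer(1, "Ann"))
customers = load_all(conn, Customer)
```

Even at this size the mapping problem is visible: the table name, the column names, and the column order are all decisions that some layer has to own, whether that layer is ad hoc code, a generator, or metadata.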

I think the more interesting question is whether it makes sense to use objects in the middle anymore at all. I think the answer is "sure, why not?" But at the same time, I think that there are a lot of compelling XML-related technologies that are making that middle less and less compelling. My good friend Chris talks about how things seem to be getting turned inside out again, which is to say, maybe our data and its associated behavior don't cohabit, but they at least live in the same apartment complex or whatever. I think that Schematron does a good job of externalizing lightweight business rules about the instances, and it optionally scales up to support document-oriented workflow.

I had a discussion with Stuart Celarier of Fern Creek after Daniel Cazzulino's Schematron talk, and Stuart was not inspired by Schematron in general, and by the notion of a document that is passed and grown between different services participating in a workflow in particular. His point seemed to be that each of the interactions in the workflow should be a separate document, each allowed to have its own schema. (Sorry if I misrepresent your points, Stuart.) He also didn't seem too inspired by Daniel's statement about not needing multiple schemas. I think that the many-worlds aspect of schemas relative to infosets makes what Stuart was going after a compelling argument. In Enterprise Integration Patterns (again, I think one of the best books on service orientation on the planet today), they talk about three distinct message patterns: command, event, and document. I get the impression that what Stuart believes is necessary and sufficient is the command message, but I tend to buy into the notion that there are different reasons for different types of messages.

To me, a document-oriented approach has a number of tasty attributes, architecturally speaking. First, the conversational state accumulates and gets carried along as the document grows from message to message. This, of course, allows for fewer stateful operations, and statefulness often correlates inversely with scalability. Allowing the data to float around in the overall workflow, and split and merge, will let you develop interesting peer-to-peer architectures. Second, the notion of the document is relevant in the sense that we as people think of documents as legally binding, and that can help provide guidance on what goes into the document. The information that you pass between services is the stuff that all of the services participating in the workflow at the same level can see. This forms a canonical, shared schema between the parties - a mini (gulp) ontology of sorts. [Don't say the 'O' word. It's not PC] The validation requirements change over time, and so something like Schematron phases can be pretty powerful, especially because Schematron is extremely simple and leverages XPath, allowing for a flexible in-memory processing model. The approach also has drawbacks, like the conversation getting fatter and fatter with time.
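The idea of phases over a growing document can be sketched in a few lines of Python. This is not Schematron itself: the pattern names, phase names, and the loan document are all hypothetical, and a "test" here just means that a path in ElementTree's XPath subset matches something.

```python
import xml.etree.ElementTree as ET

# Hypothetical patterns: name -> list of (path that must exist, message).
PATTERNS = {
    "has-applicant": [("applicant/name", "applicant name is required")],
    "has-credit": [("credit/score", "credit score is required")],
}

# Hypothetical phases: each workflow stage activates a subset of patterns.
PHASES = {
    "intake": ["has-applicant"],
    "underwriting": ["has-applicant", "has-credit"],
}


def validate(doc, phase):
    """Return the messages of the assertions that fail in the given phase."""
    errors = []
    for pattern in PHASES[phase]:
        for path, message in PATTERNS[pattern]:
            if doc.find(path) is None:
                errors.append(message)
    return errors


# The document as the first service emits it.
doc = ET.fromstring("<loan><applicant><name>Ann</name></applicant></loan>")
intake_errors = validate(doc, "intake")        # nothing to complain about yet
early_errors = validate(doc, "underwriting")   # too early: no credit data

# A later service adds its contribution to the same document.
credit = ET.SubElement(doc, "credit")
ET.SubElement(credit, "score").text = "710"
underwriting_errors = validate(doc, "underwriting")
```

The same document is valid or invalid depending on where it is in the workflow, which is the point of phases: the rules stay outside the document, and only the active subset changes.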

This type of approach has been useful in the entity aggregation work I have been a part of at my client. They don't know it yet, but the new and improved version is going to be Schematron just above the wire, and then shove the XML straight into the database. Nothing else in the middle. This is pretty different for me, because I have been involved primarily in OLTP systems or operational data stores as web interfaces to legacy systems, and I haven't really dealt much with data warehousing. But we're aggregating entities from lines of business and we absolutely don't want to understand all of the business "logic," so having an object model in the middle just doesn't make much sense, but the validation does. XML Schema as it exists in .NET today definitely doesn't cut it. The validation error messages tend not to be useful, and so we have to do some level of home-grown validation. The simplest thing was to have boolean XPath assertions and associated validation error messages, and the implementation started looking a lot like a trivial Schematron, e.g. pattern down, or maybe even just rule down.
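For illustration, a "rule down" validator of that shape might look roughly like the sketch below. It is in Python rather than .NET, it is not the client implementation, and the rule contents and the order document are made up. Python's standard library only supports a subset of XPath, so an assertion here holds when its path matches at least one node relative to the context element, rather than being a full boolean XPath expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: a context path plus (test, message) assertions,
# loosely mirroring Schematron's rule/assert structure.
RULES = [
    {
        "context": ".//customer",
        "asserts": [
            ("name", "customer must have a name"),
            ("address[@type='billing']", "customer must have a billing address"),
        ],
    },
]


def validate(root, rules):
    """Return the message of every assertion that fails on any context node."""
    errors = []
    for rule in rules:
        for node in root.findall(rule["context"]):
            for test, message in rule["asserts"]:
                if node.find(test) is None:
                    errors.append(message)
    return errors


doc = ET.fromstring(
    "<order><customer><name>Ann</name>"
    "<address type='shipping'/></customer></order>"
)
errors = validate(doc, RULES)
```

The appeal is that the error message lives right next to the assertion that produces it, which is exactly what the schema validator's messages were not giving us.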

It's been a pretty good day of talks so far.

(I modified this entry slightly to fix some grammatical problems that were highlighted elsewhere. Oops. Gracias.)