Decentralizing media types

Here's a case, I think, where the solutions that work at the scale of a corporate intranet differ from the ones that work at Internet scale.

Say you're in an IT department, want to use RESTful web services for your SOA, but have your own canonical XML schemas for representing data in many of your business domains. How do you register those media types?

You could

1. use the plain application/xml media type and hope that consumers sniff the XML namespace, and that the namespace accurately describes what's in the document (most common, not very RESTful)

2. use your own media type with your own private registry (pretty common, but not necessarily interoperable, and consumers need a priori knowledge of where the registries are)

3. use the most general media type you can for the representation, plus a URI as a media type parameter that points to a registry with more metadata (which could lead to some interoperability, cacheability, etc.)

4. go back to using SOAP and UDDI. (....)
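To make the third option a bit more concrete, here is a minimal Python sketch of how a consumer might pull a registry URI out of a media type parameter. The parameter name "profile" and the registry URI are assumptions made up for illustration; nothing here is a registered or standardized convention.

```python
from email.message import Message


def split_media_type(content_type):
    """Return (media_type, params) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_params() yields (name, value) pairs; the first pair is the
    # media type itself, so skip it and keep only the real parameters.
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params


# Hypothetical header: "profile" and the registry URI are illustrative only.
header = 'application/xml; profile="http://registry.example.com/types/purchase-order"'

media_type, params = split_media_type(header)
registry_uri = params.get("profile")  # None when the hint is absent

print(media_type)
print(registry_uri)
```

A consumer that doesn't understand the parameter can ignore it and still process the body as generic XML, which is the main appeal of keeping the base media type general.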

Obviously #3 seems to make the most sense, with caveats. I echo other commenters when I say that "application/data-format" is too general, that the metadata shouldn't just be RDDL (an HTML page may be more useful in practice!), and that the number of registries should be minimal.

Media type proliferation is a governance problem. On the Internet, the IANA is the governing body. In an intranet, it depends on your governance model. What's clear is that having every IT department register its own vnd. media types with the IANA seems both silly and untenable, because those media types are unlikely to be general. So each organization will end up with its own registry, shared between the corporation and its partners.

As for mixed vocabulary semantics, we do have a problem, but RDF/OWL is a non-starter for most IT departments. I agree this should change some day, but baby steps are needed. So, what can an IT department that wants to use RESTful media types for its SOA do to indicate representation meaning *today*, without adopting the Semantic Web?

For this I imagine a registry that points to a model, whether written text, UML, ERD, or something more formal, that shows an architect or developer how the mixed elements relate to one another. In other words, use configuration management as a palliative. This does not solve the problem in general, but it arguably makes for a workable solution at a smaller scale.

So, coming back to decentralized media types, here's what I see:

Many people feel a need to introduce a standardized "more information on this representation" hook, beyond just the IANA media type.

... But to work best with the deployed web, and to be most general-purpose, it seems this URI should be somewhere in the HTTP header.

The debate is mostly a matter of whether a) there is such a thing as a general-purpose "more info on this media type" resource, and b) if so, where to place the link so that it fits well with the deployed Web and doesn't cause problems for a future Semantic Web.
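As one possible shape for a header-level hook, here is a small Python sketch that looks for a "more info" URI in an HTTP Link header. Using the Link header with the "describedby" relation is an assumption chosen for illustration, as is the registry URI; the parser is deliberately simplified and only handles a rel parameter that directly follows the target.

```python
import re

# Simplified pattern: <target>; rel="value". Real Link headers allow more
# parameters and orderings than this sketch handles.
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def find_link(link_header, rel):
    """Return the first target URI whose rel list contains `rel`, else None."""
    for target, rels in LINK_RE.findall(link_header):
        if rel in rels.split():
            return target
    return None


# Hypothetical response headers; the URI is illustrative only.
headers = {
    "Content-Type": "application/xml",
    "Link": '<http://registry.example.com/types/purchase-order>; rel="describedby"',
}

info_uri = find_link(headers["Link"], "describedby")
print(info_uri)
```

Because the hint travels outside the body, intermediaries and clients that don't care about it can skip it without parsing the representation.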

Note that the "type" attribute for the Atom content element does NOT need to provide a MIME type that has been REGISTERED with IANA. The Atom spec only requires that the value "MUST conform to the __syntax__ of a MIME media type." (emphasis added)

Using this approach you get the benefits of a general/universal format like Atom, with the ability to label your custom content in a legal/standardized way.
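A rough Python sketch of what that labeling could look like follows. The media type application/vnd.example.order+xml, the order namespace, and the element names are all hypothetical, and the entry omits other elements (id, updated, author) that a complete Atom entry would carry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ORDER_NS = "http://example.com/schemas/order"  # hypothetical corporate schema

ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("ord", ORDER_NS)

entry = ET.Element(f"{{{ATOM_NS}}}entry")
ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = "Purchase order 1001"

# The type value only has to be syntactically valid as a media type;
# this one is made up and not registered anywhere.
content = ET.SubElement(
    entry,
    f"{{{ATOM_NS}}}content",
    {"type": "application/vnd.example.order+xml"},
)

# For an XML media type, the custom payload sits inside atom:content.
order = ET.SubElement(content, f"{{{ORDER_NS}}}order")
ET.SubElement(order, f"{{{ORDER_NS}}}id").text = "1001"

xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

A consumer can dispatch on the type attribute of atom:content without knowing anything else about the payload.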

There's nothing wrong with it; it fits in with bullet #2, putting metadata hints in the representation itself, though in this case it's providing a wrapper.

This is how the deployed web works today. It's a good alternative to mucking with existing headers.

The drawbacks are that:
1. We're now extending the HTTP envelope with an XML envelope called an Atom entry. This doesn't play well with deployed intermediaries, and there are debates about performance. But, then again, this was the original intent behind SOAP.

2. It still doesn't help describe mixed vocabularies, unless we create a media type for every possible combination.

3. An agent would have no idea how to find information on that unregistered media type (not too big a deal; it could just ignore the media).