Microsoft “Oslo” MGraph – the next XML?

Microsoft’s upcoming “Oslo” modeling initiative is about tools and languages. MGraph is the piece within the language “M” that defines values, while MSchema is for schemas, functions and constraints, and MGrammar is for textual DSLs. “Oslo” is still in CTP (Community Technology Preview), and it will take some time until all concepts are available for production use.

By then, Microsoft plans to publish an open specification so that anyone who wants to can implement the “M” language. Their ambition is for it to become as widespread as XML is today.

Anyone can implement it. We want this approach wide spread on a variety of different platforms. We want it to be as broad as Xml is today.

What is MGraph?

MGraph at its base is not a language. It is a simple contract for storing structured data in the form of a labeled directed graph. This is a set of nodes where each node has an optional label and a set of successors, each of which may be a node or any other object.
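To make that contract a bit more tangible, here is a minimal Python sketch of such a graph. It is only my own illustration of the description above, not the Oslo API: the `Node` class and the `count_nodes` helper are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Set


@dataclass
class Node:
    """Hypothetical stand-in for an MGraph node (not the actual Oslo API)."""
    label: Optional[Any] = None                          # the label is optional
    successors: List[Any] = field(default_factory=list)  # nodes or any other objects


def count_nodes(item: Any, seen: Optional[Set[int]] = None) -> int:
    """Count the distinct nodes reachable from item; plain values are not nodes."""
    seen = set() if seen is None else seen
    if not isinstance(item, Node) or id(item) in seen:
        return 0
    seen.add(id(item))  # a directed graph may contain cycles, so remember visits
    return 1 + sum(count_nodes(s, seen) for s in item.successors)


# A labeled node with one node successor and one plain value successor.
graph = Node("label", [Node("otherLabel", ["value"]), "value"])
print(count_nodes(graph))
```

The `seen` set is there because nothing in the description rules out cycles; for purely tree-shaped data it could be dropped.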

The idea is that any structure can be exposed as an MGraph just by implementing the following interface, which is the core API behind MGraph.

MGraph as a language

What I care more about is what MGraph’s textual notation looks like and how it compares to XML.

On MSDN you can find a language specification covering MSchema and MGrammar, both of which use parts of MGraph, but in slightly different ways. Microsoft definitely plans to bring those pieces together.

Today MGraph is used for values in MSchema extent initializers as well as for the AST (Abstract Syntax Tree) productions in MGrammar.

The basic syntax of MGraph is very similar to JSON:

label {
    otherLabel { "value" },
    "value",
},
"value"

As mentioned previously, a successor can be either a node or a value. A value is just written directly, while a node is split into a label followed by its comma-separated successors within curly braces.

The same data as an XML fragment would look like this:
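Those two rules are simple enough to sketch as a small serializer. This is only a rough Python illustration: the `(label, successors)` tuple representation and the `render` function are my own assumptions, and the exact whitespace of the output is a guess rather than anything the specification prescribes.

```python
import json


def render(item):
    """Render a (label, successors) tuple or a plain value as MGraph-like text."""
    if isinstance(item, tuple):
        # A node: its label, then its comma-separated successors in curly braces.
        label, successors = item
        inner = ", ".join(render(s) for s in successors)
        return f"{label} {{ {inner} }}"
    if isinstance(item, bool):
        return "true" if item else "false"
    if isinstance(item, str):
        return json.dumps(item)  # borrow JSON quoting for text values
    return str(item)             # numbers are written directly


graph = ("label", [("otherLabel", ["value"]), "value"])
print(render(graph))
```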

<label>
    <otherLabel>value</otherLabel>
    value
</label>
value

One of the major differences is that MGraph doesn’t distinguish between attributes and elements. The way XML is used today, everyone chooses between attributes and elements according to personal taste anyway.

Typed values

The next big difference is that values are not just strings but typed. Some of the intrinsic types are Text, Logical and Number.
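As a rough illustration of what "typed" means here, the following Python sketch classifies a value by the intrinsic type names mentioned above. The mapping from Python types to Text, Logical and Number is my own assumption, and `intrinsic_type` is a hypothetical helper.

```python
def intrinsic_type(value):
    """Return the name of the M intrinsic type a Python value would map to."""
    if isinstance(value, bool):          # check bool first: it is a subclass of int
        return "Logical"
    if isinstance(value, (int, float)):
        return "Number"
    if isinstance(value, str):
        return "Text"
    raise TypeError(f"no intrinsic type assumed for {type(value).__name__}")


for value in ["John Smith", 24, True]:
    print(repr(value), "is", intrinsic_type(value))
```

In XML all three values would arrive as strings, and the reader would have to know from a schema or by convention how to interpret them.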

Escaped Labels

While XML elements and attributes are restricted to QNames, a label in MGraph can be any object. The way this is expressed in the textual syntax is not finalized yet, but in MGrammar productions more complex strings are defined with an id function.

id("some label xyz")
{
    true,
    id("another node") { "value" }
}
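A serializer would need to decide when a label has to be escaped. Here is a small Python sketch of that decision. The rule for what counts as a "simple" label (letters, digits and underscores, not starting with a digit) is purely my assumption, and `format_label` is a hypothetical helper; as said above, the real syntax is not finished.

```python
import json
import re

# Assumed pattern for labels that can be written bare.
SIMPLE_LABEL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def format_label(label):
    """Write simple labels bare and wrap everything else in id("...")."""
    if SIMPLE_LABEL.fullmatch(label):
        return label
    return f"id({json.dumps(label)})"  # JSON quoting escapes embedded quotes


print(format_label("person"))
print(format_label("some label xyz"))
```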

Ordered and unordered successors

To make mapping to relational structures easier, successors are unordered by default. To impose an order, each successor has to be wrapped in an integer-labeled node.

{
    0 { "value1" },
    1 { "value2" }
}

Alternatively, this can be expressed with brackets instead of braces. In “M” jargon this is called a sequence.

[
    "value1",
    "value2"
]
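The equivalence between the two notations can be sketched in a few lines of Python. The `(label, successors)` tuples and both function names are my own illustration, not part of “M”.

```python
def to_indexed_nodes(values):
    """Expand a sequence into integer-labeled nodes, one per element."""
    return [(index, [value]) for index, value in enumerate(values)]


def to_sequence(nodes):
    """Recover the ordered values, whatever order the nodes arrive in."""
    return [successors[0] for _, successors in sorted(nodes, key=lambda n: n[0])]


nodes = to_indexed_nodes(["value1", "value2"])
print(nodes)

# Successors are unordered, so simulate a store that hands them back reversed.
print(to_sequence(list(reversed(nodes))))
```

Because the position travels with each element as its label, the order survives even when the underlying store (a relational table, for example) does not preserve it.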

Single successor nodes, or labeled values

A named value in MGraph is just a labeled node with a single value successor. The equals sign is just syntactic sugar for better readability and writability. In “M” jargon this is called an Entity, but the name is subject to change; record structure might be a better name.

person
{
    name = "John Smith",
    age = 24
}

is equivalent to

person
{
    name { "John Smith" },
    age { 24 }
}
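In other words, the entity notation can be desugared mechanically. A minimal Python sketch, where a dictionary plays the role of the `name = value` pairs and `desugar` is a hypothetical helper of mine:

```python
def desugar(label, fields):
    """Turn name = value pairs into labeled nodes with a single value successor."""
    return (label, [(name, [value]) for name, value in fields.items()])


person = desugar("person", {"name": "John Smith", "age": 24})
print(person)
```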

Better than XML?

XML is great, mostly because it can be read by almost every system, not because its syntax is so nice. It was never meant for the purposes it is used for today, either: it is a markup language for adding metadata to text.

But what XML is broadly used for today is configuration files, transport messages and even internal DSLs. For this kind of information, which has more structuring elements than data, XML is way too verbose.

Therefore I think MGraph with its tight syntax has the potential to become a great and broad alternative.

What do you think?

Comparing XML, JSON and MGraph

Comparison of MGraph, JSON and XML using the Google Maps geo-code of my home address.
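For a rough feel of the size difference on a much smaller example, this Python sketch serializes the person record from above as XML and JSON and compares the character counts with a hand-written MGraph string. It is not the geo-code comparison itself, the MGraph text is typed by hand because no parser is involved, and the numbers obviously depend on whitespace and on how the XML is modeled.

```python
import json
import xml.etree.ElementTree as ET

record = {"name": "John Smith", "age": 24}

# XML: one child element per field (attributes would be an equally valid choice).
root = ET.Element("person")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
xml_text = ET.tostring(root, encoding="unicode")

# JSON: the standard library's default formatting.
json_text = json.dumps({"person": record})

# MGraph: written by hand in the entity notation used in this post.
mgraph_text = 'person { name = "John Smith", age = 24 }'

sizes = {"XML": len(xml_text), "JSON": len(json_text), "MGraph": len(mgraph_text)}
for name, size in sizes.items():
    print(f"{name}: {size} characters")
```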


22 Responses

I think there are some nonconformities in the XML example; I think WordPress has inserted some smilies. I hope you haven’t counted these characters.

MGraph seems to be a cool alternative to XML and JSON: it’s small and has stereotypes. But whether it becomes widespread is the big question.
The support for XML and JSON is IMHO too big, with XML in the business application domain and JSON in the Web/AJAX domain, so I think it will be very difficult for Microsoft to reach a critical mass of installations.

I think two nice features that XML has but MGraph is missing are:
1) You can specify in the XML file itself which XSD schema should be used to verify it. In practice, XML documents can be self-validating.
2) Namespaces – which lets you mix tags and attributes from different purposes. That is the key to XML’s strong composability.

It would be nice to write a library that can convert from XML to MGraph and back…

Foremost, thanks for the good intro to MGraph. I never understood MGraph up until I saw your blog. So thanks a lot.

On the other hand, I don’t think MGraph will replace JSON or XML, just because both XML and JSON came as open standards and MGraph and MSchema will stay in the Microsoft domain, even if MS publishes them as an open standard. It will have acceptance similar to WordML compared to OpenOffice.org, just because it came from MS.

Also, your counts for JSON and MGraph could have been the same except for the double quotes around each “Name” used by JSON vs. MGraph. I have not tried, but using quotes around “Name”(s) might allow spaces to be embedded (maybe I should try it tomorrow!).
I do agree XML is very, very wordy for representing simple repeatable data. I almost created my own representation of metadata markup some time ago, like YAML, but did not spend enough time to take it further. I am sure the following representation would be faster for parsers to consume.

With all due respect, I think you’re missing the big picture behind Oslo’s M et al. In a nutshell, here’s how the world is going to change w/ respect to software development:

Goal 1. metamodel disparate implementations.

Goal 2. close the gap between functional specification (the WHAT) and technical specification (the HOW).

Goal 3. put problem definition and solution specification into the hands of non-techies.

And, the X10 productivity will be realized. In fact, that’s rather reserved for vs. the overwhelming majority of conventional paradigm and practice. X1000 or more is doable, and that’s not hype. Since I’m finding very little on concrete examples, I’m thinking of blogging some. My shop’s done this 10+ years and getting an out-of-the-box toolkit is welcome, because maintaining our tools and focusing on the end deliverable turns into X2 or more work!

Here’s a brief rundown:

With Goal 1 you get the ability, for example, to dataflow entities from a web page to the backend database. A data entity can pass through client-side markup and script implementation, to server-side execution implementation, to backend database implementation. If you’ve scrunched it all into a single metamodel, you can deal with it much easier in every development lifecycle aspect: conception to production support.

With Goal 2, you get into “automated specifications” that are (a) explicit, (b) concrete in terms of requirements, use cases, etc. And, (c) automated so that you can *parse* them into *implementation* (as well as round-trip implementation back into specifications).

With Goal 3, you get out of the imperative programming paradigm and into the declarative modeling paradigm. M et al. is not yet another way to do XML, JSON, etc. Rather, it is a way to *declare* the building blocks of a model. That gets parsed into metadata (see Goal 1) and from there you move into Goal 2. The data gets applied to a *pre-existing framework*, which results in a full-blown production-quality app being produced by someone who has little-to-no tech skills but understands his/her problem domain very well.

With Goal 4, M will be used to define zillions of DSLs, at first. Then, you’ll see an industry shakedown into standardized DSLs for about every problem space out there. And, these won’t be anything like programming languages. They’ll be declarative in nature, coupled with graphical representations (or generated from that in many cases).

This is where Oslo is headed. Microsoft is mum on the generator portion, the part providing the pre-built frameworks into which declarative data modes are applied. But, it’s not a new idea in the least bit.

I’ve been digging into MGraph a bit further, and there are two very positive points:
1. It’s very easy to parse (like JSON, unlike XML)
2. There are representations for typed native types (unlike both JSON and XML… ok for XML with typing namespaces, but it becomes really a mess to parse and becomes extremely verbose).

It has been mentioned before, but instead of always pointing to XML, advocates of MGraph had better compare it to YAML, which is much more similar. This omission makes the developers of M look either ignorant or blind, neither of which makes MGraph look very sophisticated.