My comments on the ETRM 4.0 draft

This was my response to the call for public comments on the Information Technology Division’s (ITD) Enterprise Technical Reference Model (ETRM) 4.0 draft.

I’d like to write to you as a long-time Massachusetts resident and taxpayer. My employer (IBM) will likely submit their own comments, but I’d like to offer you my own personal views on the ETRM 4.0 draft.

I am proud of the Commonwealth’s tradition of openness in government, enshrined in our Public Records Law and Open Meeting Law. As James Madison wrote, “A popular government, without popular information, or the means of acquiring it, is but a prologue to a farce or a tragedy. A people who mean to be their own governors must arm themselves with the power which knowledge gives them.” So access to government documents, now and for posterity, is critical for public oversight and participation in government, as well as for preserving our heritage. Now that we’ve moved into the digital age, access to government documents requires that these documents be made available in a format that all Commonwealth residents can read. So the move toward open documents formats, as called for in the ETRM, is laudable. A citizen must never be dependent on any single vendor for the software needed to read their government’s documents.

However, I am concerned at the proposed addition of Ecma Office Open XML (OOXML) to the list of acceptable document formats. As you may have heard, OOXML is currently undergoing review by ISO/IEC JTC1 for possible approval as an ISO standard. As part of this review, technical committees in standards bodies around the world are reviewing OOXML and appraising its suitability as an International Standard. As a participant in the US committee reviewing OOXML, INCITS V1, I had the opportunity to review the text of the OOXML specification and to discuss it with others. I am sorry to report that I found the OOXML specification to be full of errors and omissions. Of course, no technical document is perfect. But this one, in particular, is of far greater length (more than 6,000 pages) and of far lower quality than any I have seen before. If it has advanced this far in the ISO process it is because of vendor pressure, not because of technical merit.

What is the problem with a buggy standard? Interoperability suffers. That is the problem. There is no doubt that if everyone in the Commonwealth used Microsoft Office 2007 on Windows Vista, their interoperability would be good. But as soon as we admit choice in applications and operating systems, then interoperability will only occur when all sides follow a common standard. So the technical quality of a standard (accuracy, comprehensiveness, level of detail, consistency, etc.) is directly proportional to the level of interoperability achievable and the cost to achieve it.

The ISO ballot on OOXML will not end until September 2nd, after which a resolution process to fix defects in the text of the standard will take at least an additional 6-18 months. That is, of course, if OOXML gains ISO approval, something which is not certain at this point. So I would recommend a cautious approach, and wait for the ISO process to conclude, or conduct your own independent technical evaluation of the OOXML specification to confirm its technical quality before adding OOXML to your list. Ask other vendors: Is this something you can implement? Ask yourself: Will this truly give the Commonwealth the interoperability and choice that you desire? These are important questions to ask.

Finally, I’d note that the ETRM also calls out OpenDocument Format (ODF) as an acceptable format. ODF was approved by ISO last year. So why do we need OOXML? I personally think that the complexity of document exchange and translation in a multi-format world would take us back to the confusion and frustration of the early 1990’s when we all juggled WordStar, WordPerfect, Word and WordPro files, and could collaborate only poorly. Better to push for a single unified/harmonized standard document format for personal productivity applications, much as we have a single standard (HTML) for web pages.

I’ll leave you with a quote from Tim Berners-Lee, the inventor of the web, from an interview he gave with David Berlind from ZDNet when Berners-Lee was recently in Boston receiving a Lifetime Achievement Award from the Massachusetts Innovation & Technology Exchange.

Berners-Lee said:

It was the standardization around HTML that allowed the web to take off. It was not only the fact that it is standard, but the fact that it’s open and the fact that it is royalty-free.

So what we saw on top of the web was a huge diversity and different business which are built on top of the web given that it is an open platform.

If HTML had not been free, if it had been proprietary technology, then there would have been the business of actually selling HTML and the competing JTML, LTML, MTML products. Because we wouldn’t have had the open platform, we would have had competition for these various different browser platforms, but we wouldn’t have had the web. We wouldn’t have had everything growing on top of it.

So I think it very important that as we move on to new spaces … we must keep the same openness that we had before. We must keep an open internet platform, keep the standards for the presentation languages common and royalty free. So that means, yes, we need standards, because the money, the excitement is not competing over the technology at that level. The excitement is in the businesses and the applications that you built on top of the web platform.

I believe we want to ensure the same qualities in document formats. We want competition and choice among vendors, applications and services, but not among standards. If we compete on standards, then no one wins.

Claiming that ODF will prevent a slide back into “the confusion and frustration of the early 1990’s when we all juggled WordStar, WordPerfect, Word and WordPro files, and could collaborate only poorly” is a tad disingenuous.

The reason why document exchange problems fell by the wayside is because Office displaced all of its competitors. We did not only have one standard; we had one implementation. Hence [almost] no compatibility problems!

What happens if ODF succeeds in loosening the Office juggernaut? We will move from one format + one implementation to one format + many implementations.

Such proliferation will *inevitably* lead to hitches and glitches in document exchange. (Even today, no browser fully supports the open standards XHTML/CSS.) Given how much more complex ODF is–and the fact that no implementation supports it fully and with few bugs–document exchange problems will *increase* if it becomes the standard.

I’m not saying ODF is bad, or that monopolies are good, but spreading ODF could take us back to the very bad days not only of “WordStar, WordPerfect, Word and WordPro” but also Mosaic, Netscape, and Internet Explorer.

Certainly, there are two dimensions to the question: the format and the application. The ETRM, as a reference architecture, is specifying the formats, not the applications. The choice of applications is a separate one.

Glitches certainly will occur with multiple applications, and vendors will fix them. It is a well-known engineering problem with well-known solutions. Look at network protocols, TCP/IP, SMTP, POP3, etc. Not many glitches today.

But certainly having multiple applications with multiple formats will have more glitches than having multiple applications with a single format, right? I’m aware of no argument that suggests that having more formats improves interoperability. If that were indeed true, then why not have 3, 4 or 5 formats? Heck, if having more formats improves interoperability, then let’s have a hundred!

I agree with you that interoperability is maximized with a single format and a single application. But it is better to have a choice of applications and then to choose one, rather than to have no choice and be stuck with a single application by default. That way you can simultaneously maximize interoperability within the department, as well as maximize some other desirable, such as features, cost, support, ease of use, etc.

Your example of web browsers is an interesting one. The problem there wasn’t really the complexity of the standards, but the unwillingness of Microsoft to follow web standards. With a monopoly you can kill most standards by mere neglect. Luckily Firefox came along and brought a little competition to the market. Improved web standards support in Internet Explorer soon followed.

It’s true that if you have a single format and a single implementation then, almost by definition, you have no compatibility problems. There’s nothing to be compatible with.

That is, as long as you have that implementation; if you use a different operating system, or an operating system 20 years from now that doesn’t run that “single” implementation, you are out of luck. And in any case, we don’t have a “single” implementation even now, because MS keeps updating Office and changing the file format. And I can’t begin to tell you how many problems I’ve had transferring files between MS Office for Mac to Office for Windows.

Moreover, Massachusetts (and others) have already decided that being locked into a “single” implementation by a single vendor is unacceptable for public documents. This debate is already over: it has already been decided that multiple implementations are desirable, so the only decision is whether to have multiple implementations of multiple open standards or multiple implementations of one open standard.

…At least, to the extent that you consider OOXML to be fully “open” or fully implementable by multiple vendors.

I concur with Rob’s point down here, except to add that a UNIVERSALLY INTEROPERABLE version of ODF is even more preferable.

By this I mean an ODF or ODEF that is designed to handle differences in file contents which arise from implementation differences. I believe this can be done with a strong enough design & consortium which is not conflicted by business relationships of marketing office suite software implementations.

A successful ODF/ODEF is truly vendor-neutral. The trouble we are having is that the vendors are defining the standard and they are all doing an inadequate job because the profit motive interferes with the public interest in having a genuine universally interoperable document format.

I think you are referring to the OpenDocument Foundation, not to be confused with the ODF Alliance or the OpenDocument Fellowship, or for that matter, the Oregon Department of Forestry. The Foundation is a small but vocal nonprofit, run by Gary Edwards and friends. The ODF Alliance, on the other hand, has over 300 members, including IBM and Sun.

I guess I’d ask the Foundation, what level of interoperability would you expect between ODF 1.0 and OOXML 1.0, considering that ODF was designed, created, reviewed, edited and standardized years before standards work even started on OOXML? I think it is odd to suggest that ODF (and Sun) are solely responsible for interoperability with something that came out later.

I’d also ask, if Sun is trying to prevent interoperability with Office, then why did Sun write an Office Plugin for ODF? That doesn’t sound like something you do if you are trying to prevent interoperability with Office, does it?

I have as much tinfoil in my hat as the next guy, but I just don’t see this conspiracy.

Luc: as a now-independent member of the TC, I also don’t see the conspiracy Sam et al are (in my view quite carelessly) claiming. It’s a simple and convenient argument, but I just don’t think it’s supported by any facts.

Stephen: Your statement, “And I can’t begin to tell you how many problems I’ve had transferring files between MS Office for Mac to Office for Windows” is a case-in-point of what I mean.

Mac Office is actually a different implementation with a different codebase than Windows Office.

If one company with enormous resources can’t even guarantee interoperability across the same format in its own implementations, what are the odds that multiple, often cash-strapped and short-staffed (i.e. open source) outfits can do better?

Or think about it this way:
– SVG became a standard in 2001.
– ODF became a standard in 2006.
– SVG is a part of ODF.
– AFAIK nobody has yet produced a complete implementation of SVG.

Six years have passed and yet SVG is still very poorly supported. This does not bode well for ODF. If developers can’t implement the part, how in the world are they going to do the whole?

This is not necessarily the case. While many open source projects are in fact staffed by only a few individuals, many of the more successful ones are in fact capable of matching or exceeding the perhaps large, but definitely limited, resources of comparable closed-source efforts. “Open source” does not necessarily imply that everyone is working in their spare time for free. There are numerous examples of long-lived free and open source projects which have received considerable contribution from commercial vendors who see a business advantage in improving them by collaboration instead of doing everything from scratch on their own. One such example is gcc, and many others surely exist. I am not an expert on this, but I doubt gcc is the only free and open source project to have received such attention.

Funnily enough, Massachusetts has not adopted the ISO ODF standard but the ‘interim’ v1.1 OASIS version.

Also, we have now been waiting two years for an important upgrade to ODF in the form of formulas, and we still have to wait for a full implementation of the current specs.

Rob himself claimed here on this blog that because of its smaller size and reuse of existing standards ODF would be easier to implement than OOXML. But so far this limited size has not led to full implementations, and with v1.2 maybe coming this year (??) it is unlikely that we will see any full implementations in 2008 or even in 2009.

It is very strange for a legislative body to set a standard with two formats that have not proven themselves in real life. Why not just set a requirement that Office documents be stored in XML?

I don’t think it odd that someone would pick the latest standardized version of ODF. Why do you think it odd?

As for ‘complete implementations’ of a standard, consider that there are two things that may cause a vendor to implement a subset: 1) a full implementation is not needed by their customers, or 2) the standard does not specify the features in enough detail to allow a complete implementation. I hope you appreciate that the difference between these two is important. In the one case, a vendor chooses to implement only a portion of the standard, while in the other case a vendor is unable to implement it fully, even if he wanted to.

My criticism of OOXML is that it cannot be fully implemented because that standard is incomplete, inconsistent and incorrect.

Also, note that ITD is not a legislative branch. These are unelected professionals working for the executive branch, a part of Massachusetts government that recently changed political parties when a new governor was elected.