The OOXML Compatibility Pack

Just saw something worth noting. I was on a machine running Office XP and tried to open an Office Open XML (OOXML) formatted document. I don’t know why I tried that, but I did.

Word was smart enough to put up the following dialog:

Now, that is something I hadn’t seen before. I think we all knew that Microsoft was planning a compatibility pack for enabling OOXML on Office 2003 and Office XP. But my 2002 version of Office XP knows about OOXML? I guess this wisdom must have come down in a previously downloaded Office patch.

In any case, if you click Yes, you are directed to this page where you are offered a download of “Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats (Beta 2)”. I had the prerequisites, which included Windows XP SP2 and Office XP SP3. So downloading a few file conversion filters should be simple and small, right?

Well, simple, but not so small. I was surprised to see that the converters download was 43MB. That seems a bit large. In comparison, you can download a complete copy of OpenOffice.org, with included support for ODF documents and the Office binary formats, and the entire product is only a 93MB download. The 0.2 ODF Add-in for Word is only 1MB in size. So why does adding OOXML support to Office XP require a 43MB download?

In any case, once it is downloaded and installed, the integration with Office appears seamless. You can open OOXML files from Windows Explorer by double-clicking on them, you can browse and load them as expected from the File Open dialog in Office, and you can re-save files in OOXML format via File Save. You can create a new document and save it as OOXML, and you can even configure Word XP so the OOXML formats are the default format for all saved documents in Word. In fact, you can do all of those things that the Microsoft-supported ODF Add-in is not doing.

As reported earlier, the Microsoft support for ODF puts this ISO standard at a distinct disadvantage, providing no shell integration, removing it from its expected place in the File/Open and File/Save menus, and preventing users from making it the default format in Office.

So, let’s update the file format support matrix:

| Criterion | DOC Format in OpenOffice | ODF Format in Word 2007 | OOXML Format in Word XP |
| --- | --- | --- | --- |
| 1. Format supported in default install | Yes. | No. Requires a download and install of separate, unsupported Add-in. | No, but you are prompted to download a free converter pack the first time you attempt to open an OOXML file. |
| 2. File Open integration | Yes. | No. ODF is not listed in the default File Open dialog and doing a Control-O will not show ODF documents. However, ODF import is available in a separate menu item elsewhere in the menu system. | Yes. |
| 3. Save new document integration | Yes. | No. In fact no ODF save ability exists in the current version of the Add-in. There is a placeholder for the ODF save operation, though it is on its own menu, and would not be shown when doing a simple Control-S to save a new document. | Yes. |
| 4. Can be made the default format | Yes. | No. Although other non-Microsoft formats, such as “Plain Text”, can be made the default format, ODF cannot. | Yes. |
| 5. Simple round-tripping | Yes. | No. When an ODF document is loaded, its name is automatically changed and it is made read-only. So loading sampler.odt results in Word having a read-only version of sampler_tmp.docx. Attempting a simple Control-S to save will give an error. | Yes. |
| 6. Shell integration | Yes. | No. | Yes. |

I tip my hat to Microsoft for the way they have provided OOXML support in earlier versions of Office. Aside from the size of the download, the process was simple and the integration was seamless. That’s the way it should be. But what makes them think that customers using ODF format would want anything less than this? The fact that they’ve been able to integrate OOXML so well only increases the shame in having integrated ODF so poorly.

I see no reason why the support of MS Office for ODF should be similar to the support for their own formats. Users of MS Office are better off with a format that supports all of the functionality. ODF support is only useful for compatibility reasons. So I find it very logical that the support is via a plugin, similar to support for other document formats like PDF or XPS. It seems you are of the opinion that all Office programs should support ODF as a native file format, but guess what, some parties think their format is better for them and will only offer limited support. Live with it!!

You know what is better off for Office users? If so, I am awed by your prescience. Personally I would not presume.

However, some organizations have already spoken for themselves and said that they want to use ODF. Some have also said that they want to continue to use MS Office at the same time. So for them, the Add-in support being offered by Microsoft is clearly inadequate.

“Live with it!!” you say? Remember, these are paying Office customers. They would just prefer to use an open file format. They deserve far more than scorn.

ODF does not support full MS Office functionality. It would not be prudent to use that format unless you would also limit your users in using MS Office functionality. So that is why MS Office users are better off with Office Open XML.

There is no real advantage in using ODF over Office Open XML. I have followed your blog for a while and also those of some OOXML supporters, and except for some nitpicking on minor issues, both formats have their own merits.

The clear merit for OOXML is compatibility with all of MS Office functionality including that of former formats. ODF does not offer any advantage in combination with MS Office. The fact that some organisations have said they want to use ODF with MS Office is probably due to concerns about the openness of the OOXML format, which are still often raised as arguments for ODF, but those issues are all in the past now that MS has given its covenant not to sue.

THE most important reason for using ODF is when you plan to use OpenOffice as your Office suite, as it is basically built on the format of that suite. The ODF format will be good for supporting the functionality of that suite.

As each format is open enough, you should let your choice be led not by the format but by the functionality you need from your Office applications. At this moment the functionality offered by MS Office seems the better option, especially if you need spreadsheet or presentation functionality, where I find OOo to be weak.

OOo is certainly an alternative with its own merits (I like the standard PDF support), and if it fits your need, why not. It is definitely cheaper than MS Office.

Certainly the format is not a merit to me as long as there is a method of converting the format which there clearly will be.

You say that “The clear merit for OOXML is compatibility with all of MS Office functionality including that of former formats.” Are you quite sure of this? I have not heard Microsoft make this statement, and I would be surprised if they did. I’ve heard them say that alternatives like ODF did not support the full range of Office functionality, but I have never heard them say that OOXML does indeed support the full range of Office functionality.

I think there is a good reason for this. My reading of the OOXML specification leads me to believe that OOXML does not represent everything that Office can do. In particular, OOXML does not specify macros or scripting, both important Office features. OOXML leaves these features unspecified, as amorphous blobs.

No doubt, Office 2007 can handle these features, and will find some way of saving them to the document, but it will be by using binary blobs that are not explained in OOXML.
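To make the "binary blob" point concrete, here is a minimal Python sketch, not anything taken from the specification. It relies on the fact that an OOXML file is a ZIP package, and on the assumption that opaque binary payloads such as VBA projects are stored as parts with names ending in `.bin` (for example `word/vbaProject.bin`). The helper names `list_binary_parts` and `make_sample_package` are hypothetical, and the sample package is a placeholder, not a valid Word document.

```python
import io
import zipfile


def list_binary_parts(package):
    """Return the names of parts in a ZIP package that end in .bin.

    `package` may be a path or a file-like object. This is only a
    name-based heuristic (an assumption for illustration); it does not
    parse or validate the parts themselves.
    """
    with zipfile.ZipFile(package) as zf:
        return sorted(n for n in zf.namelist() if n.lower().endswith(".bin"))


def make_sample_package():
    """Build a tiny in-memory ZIP that mimics a macro-enabled file layout.

    The contents are placeholders, not a real document.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("word/document.xml", "<document/>")
        zf.writestr("word/vbaProject.bin", b"\x00opaque")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(list_binary_parts(make_sample_package()))
```

The XML parts of such a package can be read by any XML tool, while a part flagged this way is just a run of bytes unless its format is documented somewhere else.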

In any case, as individuals we regularly choose based purely on features and price (if that is a constraining factor). That’s fine. But organizations, small and large, and government agencies, etc., would be in a horrible mess if every individual in their organization, or even every department, made their IT decisions in a completely uncoordinated fashion. Diversity and competition are good to have in the marketplace, but you don’t necessarily want that within an organization. So, IT standards emerge and corporations and government agencies make choices and adopt standards. So, the format war may be something that the average person will never care about, but to those who set IT policy, it is a very important question.

Honestly, I don’t see these IT policy decisions being made on the basis of feature checklists, like the feature-wars we had between spreadsheets and word processors in the early 1990’s. I was around back then, working on Lotus SmartSuite, and all companies competing then were tossing in features of marginal merit (my opinion) to each release to feed those checklist-hungry editors of magazine reviews. Art Borders in Word? Talking Paperclips? The hangover from that feature orgy was a decade of security flaws in Office.

For the average user, today almost any modern word processor has more features than they need. If anything, I see users migrating to the simplicity and easy-access of web-based alternatives. 95% of the time I can manage with the word processing facilities in Blogger, MoinMoin or email. The other 5% require OpenOffice.org.

Finally, you mention that ODF was based on the formats from OpenOffice.org. I won’t dispute the fact that ODF came from OpenOffice.org, but I do think that this is merely a statement of historical fact, not a limitation. By your same argument, you could say that the most important thing about HTML is it is based on the format that NCSA Mosaic used. But HTML and ODF have moved out beyond their original applications and their spread continues.

From the Ecma 1.4 draft: “The goal is to enable the implementation of the Office Open XML formats by the widest set of tools and platforms, fostering interoperability across office productivity applications and line-of-business systems, as well as to support and strengthen document archival and preservation, all in a way that is fully compatible with the large existing investments in Microsoft Office documents.”

So compatibility with MS Office documents, which essentially represent Office features, is mentioned as a main goal for the OOXML format.

I am not exactly sure how VBA script is added, but I did notice that OOXML in its standard uses different filenames for macro-enabled files so you can easily identify potentially dangerous script files.
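As a rough illustration of that filename convention, here is a small Python sketch. The extension lists reflect the Office 2007 naming scheme for Word, Excel and PowerPoint documents and templates, and the function name `macro_status` is hypothetical. The extension is only a hint, since a file can be renamed, so this is not a security check.

```python
from pathlib import PurePath

# Office 2007 extensions for macro-enabled and macro-free documents/templates.
MACRO_ENABLED = {".docm", ".dotm", ".xlsm", ".xltm", ".pptm", ".potm"}
MACRO_FREE = {".docx", ".dotx", ".xlsx", ".xltx", ".pptx", ".potx"}


def macro_status(filename):
    """Classify a filename by extension alone (case-insensitive)."""
    ext = PurePath(filename).suffix.lower()
    if ext in MACRO_ENABLED:
        return "macro-enabled"
    if ext in MACRO_FREE:
        return "macro-free"
    return "unknown"


if __name__ == "__main__":
    for name in ("report.docx", "budget.XLSM", "sampler.odt"):
        print(name, "->", macro_status(name))
```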

VBA script might be a part of .NET and therefore does not need separate documentation, except on how to embed a .NET object.

Currently I find both formats somewhat lacking in information about embedding. Basically, I have not seen any limitation in ODF on what you can embed in the format, but there is not much information on how to handle that embedded data. I think both formats should take care to describe the behaviour more exactly, as this is a major new source for introducing viruses.

Surely, having a goal is not the same thing as the achievement of the goal. I’d like to hear Microsoft state unequivocally that the OOXML specification which will be voted on by Ecma will specify the storage, the syntax and the semantics of macros, scripts and OLE embeddings.

I encourage anyone who is considering OOXML as a solution to their problems to ask the same question, “Does OOXML specify the storage, syntax and semantics of macros, scripts and OLE embeddings”? If the answer is not simply “Yes”, then we do not have a complete and open specification.

If Microsoft wants to claim that OOXML is an “open standard” and at the same time say that it is fully compatible with the “large existing investments in Microsoft Office documents”, then they will have problems if things like macros are not specified. Where do you think this large investment comes from? It isn’t just text and styles. A large part of the investment is in script automation. If that part is not specified, then the value proposition flies out the window.

I’d like to see the scripting formats included in the Ecma specification, reviewed by Ecma’s community process, with a patent covenant, etc. References to proprietary specifications on MSDN are insufficient. You can’t make an open standard by merely referring to a stack of proprietary technologies.

Sorry, I’ve been on the road, at the OpenOffice.org conference in Lyon, and now taking a few days off for a mini-vacation in Paris. Finally got a wireless connection. As I’ve said before, all comments go into a moderation queue so I can remove the ones which are spam or simply obscene.

In any case, I only see two comments from you in the queue, so if there is a third one, please resend.

To your point on CLI: for OOXML to specify something which is compatible with “billions of existing documents”, the scripting interfaces need to be specified, by inclusion or by reference. This means the storage formats, the language syntax and semantics, the Office object model, etc. As you know, the CLI only specifies the language type system and the virtual machine.

1. “It seems you are of the opinion that all Office programs should support ODF as a native file format” … MS Office has the ability to save as RTF or even ASCII format. When saving to these formats, all pictures and lots of formatting information are lost. Why does Microsoft not include the ODF export at the same location as the RTF and ASCII export? I find only the answer Rob gave us: trying to exclude ODF as much as possible. If MS has a technical or usability reason, I would really like to hear it.

1. “If you could articulate a single benefit that ODF provides over Open XML for the “average user”, perhaps more people would be persuaded.” The most important benefit for the average user is that ODF is used as the native format by many office applications, while OOXML is used natively only by MS Office. The second advantage is that it is standardized, so the documents created today will be fully readable in 20 or 50 years. OOXML documents that contain macros will most likely not be as stable over the long term as ODF documents with macros. The third advantage is that there are several applications available at zero license cost which natively use ODF. The average Asian or African user currently steals MS Office to be compatible with that ugly “MSOffice97 de-facto standard” in the industry. Not stealing MS Office is currently no option for a small welding company in a Chinese village, because it could not communicate with its business partners effectively enough.

The ODF creators explicitly and repeatedly begged MS to join the ODF specification process. MS refused to participate, which leaves us with two XML standards: one which is completely new, and one which reuses several established standards. Microsoft deliberately messed up an opportunity to standardize the office application file formats; at least that is what it looks like from the outside. It is quite comparable to what happened to the HD-DVD standard. It got messed up by some companies who thought “we can make a thing which has 5% more storage and is 10% faster”. WOW! Really an advance! What they are ending up with now is a market which is no market at all. Nobody buys HD drives unless they can read and write whatever is in widespread use. The rest of the economy is paying billions for this underuse of technical ability.