An ODF/OOXML File Format Timeline

I suppose the downside of a blog post containing only a picture is that there is nothing for anyone to quote. So here are a few themes that struck me while putting this chart together:

Microsoft once made file format information on the binary formats readily available, in fact encouraged programmers to use the binary formats. But then around 1999 they reversed course, and eliminated such documentation. At the time, working at Lotus, I had no idea what motivated this change. It was only years later, when Microsoft internal memos were released in cases like Comes v. Microsoft, that the full picture emerged. The file format was viewed by Microsoft as a strategic tool, used to support the overall Microsoft platform, not the user. The format was designed to preserve their vendor lock-in. The availability of the file format documentation to competitors was limited, as a matter of corporate policy. So this reminds us that just because something is documented and available today does not prevent Microsoft from changing their mind at a later point and removing the documentation, failing to update it with new releases, or making it available only under a more restrictive license. Since Ecma owns the OOXML specification, as well as the future maintenance of it, any belief in the long-term openness of this format depends on your trust of Microsoft’s future behavior in this area.

Like any durable goods monopoly (and few things are as durable as software) Microsoft’s largest competitor is their own install base. Microsoft has made many attempts at moving beyond the binary formats in the past, with Office 2000, Office XP and Office 2003. But in each case it failed. These were all false starts and abandoned attempts. So we should look for signs that OOXML is actually Microsoft’s real direction and not another false start or dead end. My guess is that OOXML is merely a transitional format, much like Windows ME was in the OS space, a temporary hybrid used to ease the transition from 16-bit to the 32-bit platform that would eventually come (Windows 2000). Microsoft doesn’t want to support all of the quirks of their legacy formats forever. That just leads to bloated, fragile code, more expensive development and support costs. They would rather have clean, structured markup, like ODF. But the question is, how do you get there? The answer is straightforward: First, eliminate the competition. Second, move users in small steps, promising the comfort of continuity and safety. Third, once you have eliminated competition and have the users on the OOXML format that no one but Microsoft fully understands, then you may have your will of them. For example, introduce a new format that drops support for legacy formats and force everyone to upgrade. They are pretty much doing this already on the Mac by dropping support for VBA in the next version of the Mac Office. Even a cursory look at OOXML shows that it was not designed for long-term use, even by Microsoft. So the question I have is, what is the real format that they are going toward?

Microsoft, after pretty much ignoring document standards for over a decade, suddenly got religion in late 2005 and rushed whatever they had on hand into Ecma. Remember, just months earlier they had recommended the Office 2003 Reference Schemas to Massachusetts for official use. I’m certainly glad Massachusetts did not fall for that by putting their resources on another dead format in the Microsoft format graveyard. OOXML was not designed to be a standard. It is just a proprietary specification that Microsoft has dumped, at the last minute, into ISO’s lap, in an attempt to translate their market domination into a standards imprimatur in order to further cement their market domination. It is a win-win situation for them. Either they have an effective monopoly in office applications and an ISO standard, or they have an effective monopoly in office applications. Nice situation for them either way.

I seem to recall that OLE2.0 came with the OLEFS, binary format of MSword and the like fully undocumented, in about 1992/93; the first place it got described was some joint MS/HP proposal for an alternative image format to JPEG. I don’t remember the date of the latter.

No, this time they will stick. It’s just that MSOOXML is not the same as EOOXML and will probably go through several versions as a result of service packs and product releases to ease customers into “keeping up to date”.

Microsoft has already sown the seeds of the VBA replacement, called Visual Studio for Applications (VSA). It was released in 2005 with the .Net framework 2.0. Look for a deprecation of VBA with the next release of Office (14).

So, we have a dominant vendor of standalone office productivity software (Microsoft Office).

We have a secondary vendor (Lotus with SmartSuite). I think Lotus (a division of IBM, now) would be perfectly happy never to sell another copy of SmartSuite; in fact I think if someone like Lenovo were to approach IBM, there’s a good chance IBM would sell off that business and we’d have Lenovo SmartSuite, growing in China.

We have another vendor, Sun Microsystems, with StarOffice. I don’t think Sun make much of their revenue from StarOffice; it’s really so that they can get office productivity software going on SparcStations under Solaris, to give them (and their clients) some choices. Sun mainly sell ‘engineering services’ nowadays; warranties for Java.

And OpenOffice. Anyone can download that, any time they want, from http://www.openoffice.org/ ; source code and all; and do whatever they like with it. No charge.

It feels like when you’re in a plane, taxiing for take-off. On the runway, you need the wheels down. They are like Microsoft Office and Lotus SmartSuite. Faster, faster, faster, ‘Rotate’, tip the flaps, you’re up in the air and flying. Climb to cruising altitude, point to where you want to go. That’s what planes are for. The bit on the ground was just how you get started.

Charles said “Microsoft has already sown the seeds of the VBA replacement, called Visual Studio for Applications (VSA). It was released in 2005 with the .Net framework 2.0. Look for a deprecation of VBA with the next release of Office (14).”

No, Microsoft cannot infuriate their install base.

There is way too much money in existing VBA stuff embedded in Word/Excel/Powerpoint documents right now that if there is one thing you can bet, it’s that it’s here for a looooonng time. And that no half-assed technology such as VSTO is going to replace it anytime soon. Sure, two or three people at Microsoft want to do that, but for instance VSTO assemblies live outside documents, in other words this stuff is not allowed by the suits in the IT department. Remember Charles, power users in the enterprise are a minority.

I think Lotus (a division of IBM, now) would be perfectly happy never to sell another copy of SmartSuite;

Well, if IBM decides that the SmartSuite source tree is much too good to simply dump, and they were willing to put it on Sourceforge under the Common Public License or some other OSI-approved license, I’m sure there would be quite a few people – including myself – who would be willing to take IBM up on the offer.

Among other things, it would open up the SmartSuite file formats so they could be supported more fully by other office suites – a complaint that I’ve read every now and then.

And SmartSuite would be adapted to use ODF – yet another contender that would make a hash of the Microsoft contention that ODF is OpenOffice.org under another name.

It would even support a contention I’ve made repeatedly to Microsoft and others, that once such-and-such a company has gotten rid of such-and-such a software product, it should turn it over to its fan-base; instead of competing with its installed base the way Microsoft currently does, it would use it to debug its previous product/s and then to move in new directions.

It seems like I might have commented on your thought somewhere else before, but I can’t find it.

At this point in their lifecycle, posting the SmartSuite products as open source is not really possible. IBM has considered it in the past. The base problem is that there is too much licensed technology/software that is part of SmartSuite… and in some cases, the companies that wrote them are gone or so diluted that getting permission / agreement to open source that stuff is legally or otherwise difficult.

I understand the desire and wish that it could be different. This is Rob’s blog and he worked on the suite, so he might have other data.

Echoing what Ed said, making a commercial product, especially an old one, open source is a huge undertaking from the IP perspective. With a new code base, it is easier. For example, I worked on Lotus XSL a few years ago, and donating that to Apache to make Xalan was simple because we could easily demonstrate that it was 100% original code. But with something like SmartSuite, it has a lot of 3rd party code for which we would need to secure permission, some from companies that are no longer around.

Similarly, when a TV show is released in syndication or on DVD, they need to renegotiate the music rights with the composers and artists for any music used in the show. In some cases, like WKRP in Cincinnati, the musical changes required in the DVD versions were extensive.

> No, this time they will stick. It’s just that MSOOXML is not the same as EOOXML and will probably go through several versions as a result of service packs and product releases to ease customers into “keeping up to date”.

“3.2.7 ext (Extension) Each ext element contains extensions to the standard SpreadsheetML feature set. Parent Elements: extLst (§3.2.10). Child Elements: Any element from any namespace. Subclause: n/a. Attributes: uri (URI): A token to identify version and application information for this particular extension. The possible values for this attribute are defined by the XML Schema token datatype. [end of subclause, no more information given!!!!]”

“5.1.2.1.14 ext (Extension) This element [of type CT_OfficeArtExtension] specifies an extension that is used for future extensions to the current version of DrawingML. This allows for the specifying of currently unknown elements in the future that will be used for later versions of generating applications. … Attributes: uri (Uniform Resource Identifier): Specifies the URI, or uniform resource identifier that represents the data stored under this tag. The URI is used to identify the correct ‘server’ that can process the contents of this tag. The possible values for this attribute are defined by the XML Schema token datatype. [end of subclause, what ‘server’????!!!]”
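To make the concern in these two quoted subclauses concrete, here is a minimal Python sketch of what a third-party reader can actually do with an ext element: read the uri token and list the child elements, without any way of knowing what they mean. The sample document, the urn:example:... values and the v:secret element are invented for illustration, and the namespace URI in MAIN_NS is an assumption about the SpreadsheetML namespace that the quoted text does not itself state.

```python
import xml.etree.ElementTree as ET

# Assumed SpreadsheetML namespace; not given in the quoted subclauses.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Hypothetical fragment: the uri value and the vendor element are made up.
SAMPLE = f"""<extLst xmlns="{MAIN_NS}">
  <ext uri="urn:example:hypothetical-extension">
    <v:secret xmlns:v="urn:example:vendor">opaque payload</v:secret>
  </ext>
</extLst>"""


def list_extensions(xml_text):
    """Return (uri, [child tags]) for every ext element in the fragment."""
    root = ET.fromstring(xml_text)
    found = []
    for ext in root.iter(f"{{{MAIN_NS}}}ext"):
        # The children may come from any namespace, so a generic reader
        # can only record their names; it cannot interpret them.
        children = [child.tag for child in ext]
        found.append((ext.get("uri"), children))
    return found


extensions = list_extensions(SAMPLE)
```

In this sketch the uri is just a string handed back to the caller. Nothing in the quoted text tells a consumer how to map it to behavior.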

I’ve also heard another IBMer, whose name I don’t recall, make a similar point about IBM OS/2 2.x and later: that releasing it as Open Source would leave huge chunks missing.

Which is what happened with Netscape and Mozilla, and which was resolved quite quickly in the case of encryption, if I remember correctly.

As far as missing rights holders go, I’ve had a bit of experience with that myself. I know a bit of how frustrating it can be, when the person concerned seems to have vanished from the face of the earth. Which is partly why I think such cases should be declared abandoned and in the public domain – rather like an abandoned and derelict vessel in busy shipping lanes can earn up to near its full value in salvage fees and maritime lien.

Awkward phrase. The usual phrase is “then you may have your way with them”, connoting sexual rapaciousness. Other choices might be “then you may bend them to your will”, or “then you may exercise your will over them”.

It is certainly an archaic construct, though it rings true in my ear. A notable use, in the first person, and one that lingers in my ear is Paul Scofield’s Thomas More in the movie “A Man for All Seasons”, after the play by Robert Bolt, where More, on trial for treason, finds that his case is lost, by perjured testimony against him. He says to the court, “I am a dead man. You have your will of me.”

It is hard to make generalities about the Lotus binary file formats. Remember, Freelance Graphics and WordPro were both acquisitions and each came with their own binary formats, with different design principles. The 1-2-3 format was a record-based format, a repeated set of opcode types, lengths and records. Freelance Graphics was a serialization of C-language structures. These were not the internal runtime data structures, but special data structures which were used only for writing the file format. The runtime structures, which did evolve from release to release, would be carefully copied into these persistence structures for saving. And WordPro used Bento structured storage, which was a compound container format designed by Apple.
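For readers who have not worked with a record-based format of the kind described for 1-2-3, here is a minimal Python sketch of reading a stream of opcode, length and payload records. The field widths (16-bit opcode and 16-bit length, little-endian) and the sample opcodes are assumptions made for illustration, not a statement of the published 1-2-3 layout.

```python
import struct

# Assumed header layout: 16-bit opcode, 16-bit payload length, little-endian.
HEADER = struct.Struct("<HH")


def iter_records(data: bytes):
    """Yield (opcode, payload) pairs from a buffer of back-to-back records."""
    offset = 0
    while offset + HEADER.size <= len(data):
        opcode, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        payload = data[offset:offset + length]
        if len(payload) < length:
            raise ValueError("truncated record payload")
        offset += length
        yield opcode, payload
    if offset != len(data):
        raise ValueError("trailing bytes after last record")


# Hypothetical sample stream: three records with made-up opcodes.
sample = (
    HEADER.pack(0x0000, 2) + b"\x04\x04"
    + HEADER.pack(0x00FF, 3) + b"abc"
    + HEADER.pack(0x0001, 0)
)
records = list(iter_records(sample))
```

Because each record carries its own length, a reader built this way can step over opcodes it does not recognize and still find the next record.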

The 1-2-3 file format was published in book form and the Freelance format was available on request. I don’t believe that the WordPro format was ever documented.

Sorry, but Microsoft has a long history of doing this. Internet Explorer for Mac was discontinued. Why? Because of Apple’s Safari program. Windows Media Player was discontinued. Why? Because of Apple’s QuickTime. Virtual PC, a program by Connectix, was bought out by Microsoft and is being discontinued. Why? Because of the Mac’s Bootcamp VM software. As Microsoft realizes that they can’t control the market, they eliminate the support, the competition, and the software. As they continue to grow in the Open Linux world, they will eventually develop software that will be licensed to them, so they can eliminate that too. The two words (user friendly & Free) are hated by Microsoft.

Previous events to mention would be:
– TeX and LaTeX and troff: Knuth, Lamport and Kernighan were very aware their systems were document interchange formats
– SGML in 1986
– HTML in 1991
– The variously incompatible .DOC versions. This had a major effect within Microsoft as they realised that .DOC wasn’t used as an intermediary between keyboard and paper, but was an interchange format.
– Boeing’s dissatisfaction with .DOC and continued use of FrameMaker, which set off the end times within Microsoft of .DOC as a future format.