Microsoft prepares developers for ODF in Office 14

In a move designed to eliminate surprises, Microsoft has posted a detailed guide to its planned implementation of OpenDocument Format in the next edition of Office on an interoperability Web site it launched last March.

By designating point-by-point how it intends to implement elements of the ODF 1.1 standard in Word, Excel, and other components of the future edition of the suite still, for now, named "Office 14," Microsoft may quite literally be seizing the initiative. Specifically, by publishing this guide ahead of its own documentation of how it will implement Open XML -- the internationally standardized derivative of the XML-based format Office 2007 already put in motion -- the company appears to be taking public steps to document what could easily become the most widely deployed ODF-supporting application come next year.

In so doing, it could end up setting the standard, if you will, for following the standard, playing its opponents' game to its own advantage.

For example, what does it mean for an inline element to be "anchored to" a paragraph? In Microsoft Word with the old OOXML format, the anchor for such an element often appears in the upper left corner of its paragraph, but the element can also be anchored to the "text" someplace in the middle.

This morning's detailed implementation document sets forth the rule that the company plans to follow: "Within text documents, images, embedded objects and other drawing objects may be anchored to a paragraph, to a character, or as a character," states Rule 5.8, under the heading "Inline Graphics and Text Boxes." "If they are anchored to a paragraph, they appear within a paragraph at an arbitrary position. If they are anchored to or as a character, they appear within a paragraph at exactly the character position they are anchored to or as."
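The distinction in Rule 5.8 can be illustrated with a minimal Python sketch. It is only a model of the rule as quoted above, not Microsoft's implementation; the `Anchor` class, the anchor-kind strings, and the `resolve_position` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Anchor:
    """Hypothetical model of how an object is attached to a paragraph."""
    kind: str                         # "paragraph", "char", or "as-char"
    char_index: Optional[int] = None  # only meaningful for "char" / "as-char"


def resolve_position(anchor: Anchor) -> Optional[int]:
    """Return the character position the object is tied to.

    None means the rule leaves the position within the paragraph open,
    so the layout engine is free to place the object.
    """
    if anchor.kind == "paragraph":
        # Anchored to the paragraph: position inside it is arbitrary.
        return None
    if anchor.kind in ("char", "as-char"):
        # Anchored to or as a character: position is exactly that character.
        if anchor.char_index is None:
            raise ValueError("character anchors need a char_index")
        return anchor.char_index
    raise ValueError(f"unknown anchor kind: {anchor.kind}")


floating = resolve_position(Anchor("paragraph"))
pinned = resolve_position(Anchor("as-char", char_index=42))
```

In this sketch, `floating` carries no fixed position while `pinned` is tied to one specific character, which is the whole of the difference the rule draws.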

Here's another quandary: How do you represent tracked changes to a document? In a Word file, with "Track Changes" turned on, you can effectively delete any length of text from one point to the next; the word processor will display the text that's deleted from the final printout, but usually in a different color (often red) with underscoring. But let's say the deleted passage starts in a table, and proceeds outside the table to encompass a span of ordinary paragraphs, in the middle of which is a bulleted list. That's three or perhaps four elements that are being deleted, not just one, depending on how you structure the deletion in XML.

Rule 4.3 demonstrates how Microsoft intends to handle this dilemma with ODF: It will create an independent XML element structure that details the audit information about the changes being made -- for instance, who made the changes and when. That's a self-contained element, and all it contains are the properties for the responsible party, and an ID tag for the change itself. Next, Office 14 will embed an XML tag inline in the text where the change (such as a deletion) begins, and another tag at the point where it ends, with both tags containing the ID that links them to the properties block.
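A rough sketch of that structure, built with Python's standard `xml.etree.ElementTree` module, might look like the following. The element and attribute names here (`changed-region`, `change-info`, `change-start`, `change-end`, `change-id`) are simplified stand-ins chosen for illustration and should not be read as the exact ODF schema or as Microsoft's output; the author name and timestamp are made up.

```python
import xml.etree.ElementTree as ET


def build_tracked_change(change_id, author, timestamp):
    """Build a properties block plus start/end markers that share one ID.

    All element names are illustrative assumptions, not the real schema.
    """
    # Self-contained block holding only the audit information and the ID.
    region = ET.Element("changed-region", {"id": change_id})
    info = ET.SubElement(region, "change-info")
    ET.SubElement(info, "creator").text = author
    ET.SubElement(info, "date").text = timestamp

    # Empty inline markers dropped into the text where the change
    # begins and ends; they carry nothing but the linking ID.
    start = ET.Element("change-start", {"change-id": change_id})
    end = ET.Element("change-end", {"change-id": change_id})
    return region, start, end


def markers_match(region, start, end):
    """Check that both inline markers point back at the properties block."""
    return region.get("id") == start.get("change-id") == end.get("change-id")


region, start, end = build_tracked_change(
    "ct1", "A. Reviewer", "2008-12-16T09:00:00"
)
print(ET.tostring(region, encoding="unicode"))
print(ET.tostring(start, encoding="unicode"))
print(ET.tostring(end, encoding="unicode"))
```

Because the two markers are independent empty elements rather than a wrapper around the changed text, a single change can begin inside a table and end in a later paragraph without breaking the nesting of the XML around it.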

The document has to address all types of implementation issues, especially in light of the fact that ODF -- despite its status as a "standard" -- chose to implement certain elements in arbitrary, not previously standardized, ways. In a blog post last August, Microsoft senior product manager Doug Mahugh demonstrated a few instances where ODF's arbitrary choices and Microsoft's equally arbitrary choices for OOXML resulted in situations where one implementation could not map to the other equivalently.

"In simple cases, it isn't a problem for Word to preserve document structure and semantics when saving an ODF file," Mahugh wrote. "For example, a document heading can be saved with a heading style that has an associated outline level.

"In more complex cases we preferred a neutral approach when saving to ODF rather than implying semantics that the user did not intend," he continued. "For example, in Word one can color code the bullets in a bulleted list by applying a color attribute to the paragraph character for the list item. Word can persist that attribute when saving to OOXML, but ODF does not have the concept of paragraph characters with attributes. If we were to apply the color attribute to the paragraph style that would cause the entire list item to take on the color, and this might imply more than the user meant. So we choose to drop the bullet color, rather than color the whole list item."

In other words, Microsoft chose not to change the color of the text when translating a bullet list item whose text is one color but whose bullet is another, into a format which perceives both the bullet and the text as bound by the same characteristics.
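The trade-off Mahugh describes can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Word's actual conversion code: the `WordListItem` and `OdfListItem` classes and both function names are hypothetical, and the model assumes the target format has a single color per list item that governs bullet and text together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WordListItem:
    """Source item: bullet color is stored separately from text color."""
    text: str
    text_color: Optional[str] = None
    bullet_color: Optional[str] = None


@dataclass
class OdfListItem:
    """Target item: one color applies to bullet and text alike (assumed)."""
    text: str
    paragraph_color: Optional[str] = None


def save_neutral(item: WordListItem) -> OdfListItem:
    # The approach described in the quote: keep the text as the user
    # colored it and drop the separate bullet color.
    return OdfListItem(text=item.text, paragraph_color=item.text_color)


def save_promoting_bullet_color(item: WordListItem) -> OdfListItem:
    # The rejected alternative: push the bullet color onto the whole
    # paragraph, which recolors the text as a side effect.
    color = item.bullet_color or item.text_color
    return OdfListItem(text=item.text, paragraph_color=color)


item = WordListItem("Review draft", text_color="black", bullet_color="red")
neutral = save_neutral(item)
promoted = save_promoting_bullet_color(item)
```

Here `neutral` loses the red bullet but leaves the text black, while `promoted` would keep the red at the cost of turning the entire list item red.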

This morning's product of the Document Interoperability Initiative (DII) is not intended to explain how Microsoft plans to translate ODF, however. Rather, as Mahugh first revealed to us last May, the company intends to support ODF as one of the user's choices for a default format in the new Office 14, so this set of guidelines will detail how Microsoft chooses to implement ODF 1.1. When a similar set of guidelines is published for Open XML -- in the coming weeks, a Microsoft spokesperson told BetaNews this afternoon -- individuals may be able to compare implementations to gauge the road ahead for those who will be taking on the task of translation between formats.

"Microsoft is taking a comprehensive approach toward interoperability and believes that for true interoperability all vendors must be good stewards by participating in the maintenance of standards, must be transparent in their implementation of standards, and must collaborate with others across the industry," reads the spokesperson's statement to BetaNews this afternoon. "This is important to help customers achieve the interoperability they need to be successful."

UPDATE The first official implementation of ODF in an Office app in advance of Office 14 will actually be in Office 2007 Service Pack 2, which a Microsoft spokesperson told BetaNews this afternoon is slated for no later than April.