I created a test document in each of those editors and then loaded each test document in each of the other editors. I showed what worked, what didn’t, and made some suggestions on how interoperability could be improved. I found only two notable failures, when the Microsoft/CleverAge Add-in for Excel loaded KSpread and Symphony documents. The other scenarios I tested were OK:

Created In

CleverAge

Google

KSpread

Symphony

OpenOffice

Sun Plugin

Read In

CleverAge

OK

OK

Fail

Fail

OK

OK

Google

OK

OK

OK

OK

OK

OK

KSpread

OK

OK

OK

OK

OK

OK

Symphony

OK

OK

OK

OK

OK

OK

OpenOffice

OK

OK

OK

OK

OK

OK

Sun Plugin

OK

OK

OK

OK

OK

OK

I lot has happened in the two months since I did that analysis. Several of the applications I tested have been updated:

I haven’t been able to get the release candidate of KOffice installed, so I’m still including KSpread 1.6.3 in my tests, but for the rest I have created new test files in each editing environment, saved them to ODF format and then loaded the resulting documents into each of the other editors. From these test documents I was able to perform 42 different test combinations.

I’ll explain a bit more how I tested, then give you the table of results, and finally make some observations and recommendations.

The test scenario I used was a simple wedding planner for a fictional user, Maya, who is getting married on August 15th. She wants to track how many days are left until her wedding, as well as track a simple ledger of wedding-related expenses. Nothing complicated here. I created this spreadsheet from scratch in each of the editors, by performing the following steps:

Enter the title in A1 “May’s Wedding Planner” and increased font size to 14 point.

Enter formula = TODAY() in B3 and set US style MM/DD/YY date format/

Enter the date of the wedding as a constant in cell B4, also setting date format.

Added simple calculations on cells B6-B8, to calculate days, weeks and months until the wedding.

A11 through E16 is a simple ledger of the kind that is done thousands of times a day by spreadsheet users everywhere. Once you have the formula set up in column E (Balance = previous balance + credits – debits) then you can simply copy down the formula to the new row for each new entry.

CleverAge appears to have heeded the advice from my earlier blog post and now correctly processes KSpread and Symphony spreadsheets. This is great news and they deserve credit for that work. But this is a small bit of good news in a table that now shows awful lot of red. Let’s see if we can figure this out.

First, some combinations that worked previously, when I tested two months ago, are now not working:

Symphony 1.3 beta hangs when attempting to load the spreadsheet created with the CleverAge 3.0 ODF Add-in. Symphony 1.1 also hangs when trying to load that same spreadsheet. However both versions of Symphony work fine when loading the CleverAge 2.5 spreadsheet from two months ago. The CleverAge document appears to be valid, so my guess is that this is a bug in the Symphony 1.3 beta. I’ll pass this document on to the Symphony development team to see what they say.

KSpread 1.6.3 does not read formulas from OpenOffice 3.01 documents. KSpread had no problems with OO 2.4 documents. The problem appears to be that OpenOffice 3.01, by default, writes out documents according to the ODF 1.2 draft which puts formulas in the OpenFormula namespace. But KSpread is expecting them in the legacy namespace. The result is that spreadsheet formulas are dropped when the document is loaded in KSpread.

In a similar way, Sun’s new ODF Plugin writes out documents according to the ODF 1.2 draft. KOffice is unable to handle these files. This also causes problems for Google Spreadsheets and the Microsoft/CleverAge Plugin for Excel, which report errors “We were unable to upload this document” and “The converter failed to open this file”.

The new entry to the mix is Microsoft Office 2007 SP2, which has added integrated ODF support. Unfortunately this support did not fare well in my tests. The problem appears to be how it treats spreadsheet formulas in ODF documents. When reading an ODF document, Excel SP2 silently strips out formulas. What is left is the last value that cell had, when previously saved.

This can cause subtle and not so subtle errors and data loss. For example, in the test document I presented above, the current date is encoded using the TODAY() spreadsheet function. If the formulas are stripped, then this cell no longer updates, and will return the wrong value. Similarly, if Maya tries to continue her ledger of expenses by copying the formula cells from column E down a row, this will cause incorrect calculations, since there is no longer a formula to copy, so she would just be copying the prior balance. In general, SP2 converts an ODF spreadsheet into a mere “table of numbers” and any calculation logic is lost.

In the other direction, when writing out spreadsheets in ODF format, Excel 2007 SP2 does include spreadsheet formulas but places them into an Excel namespace. This namespace is not what OpenOffice and other ODF applications use. It is not the ODF 1.2 namespace. It isn’t even the OOXML namespace. I have no idea what it is or what it means. Not every ODF application checks the namespace of formulas when loading documents, but the ones that do reject the SP2 documents altogether. And the ones that do not check the namespace try and fail to load a formula since it is syntactically different than what they expected. The applications essentially display a corrupted document that is shows neither the formula nor the value correctly. For example, a SP2 document, loaded in MS Office using the Sun ODF Plugin looks like this:

Similar corruption occurs when loading the Excel 2007 SP2 spreadsheet into KSpread, Symphony and OpenOffice. Google doesn’t import the document at all.

I must admit that I’m disappointed by these results. This is not a step forward compared to where we were two months ago. This is a big step backwards. Spreadsheet interoperability is not hard. This is not rocket science. Everyone knows what TODAY() means. Everyone knows what =A1+A2 means. To get this wrong requires more effort than getting it right. It is especially frustrating when we know that the underlying applications support the same fundamental formula language, or something very close to it, and are tripped up by lack of namespace coordination. Whether it is accidental or intentional I don’t know or care. But I cannot fail to notice that the same application — Microsoft Excel 2007 — will process ODF spreadsheet documents without problems when loaded via the Sun or CleverAge plugins, but will miserably fail when using the “improved” integrated code in Office 2007 SP2. This ain’t right.

I have some suggestions for how to move things forward again. There will be a lot less red on the above table if two simple changes are made:

Sun should write out formulas in ODF 1.1 format, using the legacy “oooc” namespace prefix that the other vendors are using. Remember, the other vendors are using that namespace specifically for compatibility with OO’s ODF documents. This is the current convention. To unilaterally switch, without notice or coordination, to a new namespace, is not cool. When ODF 1.2 is an approved standard, then we all can move there in a coordinated fashion, to cause users minimal inconvenience. But the above table clearly shows the confusion that results if this move is not coordinated. I know OO 3.01 has an option to save in ODF 1.0/1.1 format. IMHO, this setting should be the default. I’m not sure if the Sun Plugin has a similar configuration option, but I hope it does.

In addition to writing out compatible formulas as per the above comments on the Sub Plugin, Microsoft should remove the code in SP2 that causes it to reject every other vendor’s spreadsheet documents. Give the user a warning if you need to, but let them have the choice.

Finally, let me try to anticipate and debunk some of the counter-arguments which might be raised to argue against interoperability.

First, we might hear that ODF 1.1 does not define spreadsheet formulas and therefore it is not necessary for one vendor to use the same formula language that other vendors use. This is certainly is true if your sole goal is to claim conformance. If your business model requires only conformance and not actually achieving interoperability, then I wish you well. But remember that conformance and interoperability are not mutually exclusive options. An application can be conformant to a standard and also be interoperable, if you use the legacy formula namespace and syntax. So the desire to be conformant is not an excuse for not also being interoperable, or at least not a valid excuse. One might also wryly note that Microsoft has several Directors of Interoperability, not Directors of Minimal Conformance, and they workshops are called Document Interoperability Initiatives, not Minimal Conformance Initiatives. The difference between minimal conformance and interoperability is well illustrated in these tests.

Remember, it is not particularly difficult or clever to to take an adverse reading of a standard to make an incompatible, non-interoperable product. Take HTML, for example. It does not define the attributes of unstyled (default) text. So I could create a perfectly conformant browser implementation that makes all default text be 4-point Zapf Dingbats, white text on a white background. It would conform with the standard, but it would be perfectly unusable by anyone. If you try hard enough you can create 100% conformant, but non-interoperable, implementations of almost most standards. Standards are voluntary, written to help coordinate multiple parties in their desires for interoperability. Standards are not written to compel interoperability by parties who do not wish to be interoperable.

(A side point is that SP2’s implementation of ODF spreadsheets does not, in fact, conform to the requirements of the ODF standard, but that is another story, for another blog post.)

We might also hear concerns that supporting other vendors’ ODF spreadsheet formulas cannot be done because this formula language is undocumented. The irony here is that the formula language used by OpenOffice (and by other vendors) is based on that used by Excel, which itself was not fully documented when OpenOffice implemented it. So an argument, by Microsoft, not to support that language because it is not documented is rather hypocritical. Excel supports 1-2-3 files and formulas and legacy Excel versions (back to Excel 4.0) neither of which have standardized formula languages. Why are these supported? Also, the fact that the Microsoft/CleverAge add-in correctly reads and writes the legacy ODF formula syntax shows not only that it can be done, but that Microsoft already has the code to do it. The inexplicable thing is why that code never made it into Excel 2007 SP2.

We’ll probably also hear that 100% compatibility with legacy documents is critical to Microsoft users and that it is dangerous to try to save Excel formulas into interoperable ODF formulas because there is no guarantees that OpenOffice or any other ODF application will interpret them the same as Excel does. So one might try to claim that Microsoft is protecting their customers by preventing them from saving interoperable spreadsheet formulas. But we should note that fully-licensed Microsoft Office users have already been creating legacy documents in ODF format, using the Microsoft/CleverAge ODF Add-in. These paying Microsoft Office customers will now see their existing investment in ODF documents, created using Microsoft-sanctioned code, get corrupted when loaded in Excel 2007 SP2. Why are paying Microsoft customers who used ODF less important than Microsoft customers who used OOXML? That is the shocking thing here, the way in which users of the ODF Add-in are being sacrificed.

If you are cynical, you might observe that if Excel 2007 SP2 allowed Microsoft/CleverAge ODF Add-in formulas to work correctly, then SP2 would need to allow all vendors’ formulas to work, since the other vendors are using the same legacy namespace. The only way for Microsoft to make their legacy ODF documents work and to exclude other vendors would be (hypothetically) to specifically look in the document for the name of the application that created the document, and allow their ODF Add-in but reject OpenOffice, etc. IANAL, but I think something like that would look very, very bad to competition authorities. So the only way out, if your goal (hypothetically) is to avoid interoperability, is to sacrifice your existing Office customers who are using the Microsoft/CleverAge ODF Add-in. It serves them right for not sticking to the party line in the first place. This’ll teach ’em good.

Of course, I am not that cynical. I was taught to never assume malice where incompetence would be the simpler explanation. But the degree of incompetence needed to explain SP2’s poor ODF support boggles the mind and leads me to further uncharitable thoughts. So I must stop here.

As I mentioned before, this is a step backwards. But it is just one step on the journey. Let’s look forward (and move forward). This is just code. Code can be fixed. We know exactly what is needed to have good interoperability of spreadsheet formulas. In fact most of the code already exists for this. The only thing we need now is to actually go do it and not get too far ahead, or lag too far behind from the other implementations. This is more a question of timing and coordination than hard technical problems.

The only way for Microsoft to make their legacy ODF documents work and to exclude other vendors would be to specifically look in the document for the name of the application that created the documentThis should be simple to test with a text editor, change the name of the application to match one that works and test that.

I’m an open source software advocate and developer. I furiously followed the ISO approval of Microsoft’s OOXML file format. But let me say that discovering that all interoperable Open Document implementations rely on fixed namespace prefix crap makes me very sad.Fixing namespace prefixes means describing file formats at xml syntax level: so, why not fixing the character encoding or even the whitespace? What does mean being interoperable, being smarter or being dumber? And who’s the dumbest here?There’ll be always someone who leverages xml complexities to establish market dominance if software industry doesn’t mature.

If ODF really were about interoperability of spreadsheet docuemnts, it would require

(a) a single namespace for formulas (perhpas OpenFormula?)(b) an absolute minimum of implementation defined behaviour for functions (so clearly not OpenFormula)(c) A maximum of supported functions (with a clear indication of which functions are beyond that support.)

“First, we might hear that ODF 1.1 does not define spreadsheet formulas and therefore it is not necessary for one vendor to use the same formula language that other vendors use. This is certainly is true if your sole goal is to claim conformance.”

What is the point of making standards if the implementation cannot be based on them? How on earth can anyone use a spreadsheet standard that doesn’t define formulas?

“So an argument, by Microsoft, not to support that language because it is not documented is rather hypocritical.”

This is true, but it is also a very, very bad excuse. The question is, why wasn’t it documented then?

“These paying Microsoft Office customers will now see their existing investment in ODF documents, created using Microsoft-sanctioned code, get corrupted when loaded in Excel 2007 SP2.”

This is true, but isn’t that a problem of ODF that simply isn’t mature enough yet?

@Anonymous, I’m assuming Office does not do that kind of hack, for the very reason it is so easy to detect.

@Morten, If your goal is conformance without interoperability then you can achieve that with almost any standard, ODF included. If your goal is interoperability and conformance then you can do that with almost any standard, including ODF. The fact is that many vendors are already interoperable with ODF spreadsheets today. Even Microsoft’s own ODF Add-in for Office was interoperable and conformant. So why the whiplash change in SP2 to break interoperability?

@Enrico, That is just me, being lazy and not typing out the entire namespace URI. The standard does not rely on equivalence of the namespace prefix. However, I have noticed that some implementations seem to check only the prefix and not resolve the it to the URI.

@Andreas,

The latest draft of ODF 1.2 does require the use of OpenFormula for conforming spreadsheet documents. I agree with you in hoping that we can reduce the number of implementation-defined behaviors in OpenFormula.

@Tim,

From the user’s perspective, I don’t think they really care what ODF 1.1 did or did not specify. They know that Microsoft provided an ODF Add-in for Office, sold the press on this, played it up as a big interoperability initiative, milked it for all it was worth, and then came out with SP2 support for ODF that broke all of these customers spreadsheets. Microsoft doesn’t need to justify their choices with me. They need to justify it with their own paying MS Office customers. Perhaps I should be happy? These customers will now find that their spreadsheets are more compatible with IBM Lotus Symphony than with MS Office.

@Eomon, Yes I have tested other document types as well and there are some interesting things to report there as well. But this blog post was already getting rather long, so I made this one deal with spreadsheets only.

For example, in a table with the name SampleTable the cell in column 34 and row 16 is referenced by the cell address SampleTable.AH16. In some cases it is not necessary to provide the name of the table. However, the dot must be present. When the table name is not required, the address in the previous example is .AH16.Note: I added the emphasis.

The formulas in the MS Office 2007 file do not include the dot in the cell reference, so they’re invalid, regardless of the formula language.

Isn’t this very simple? Both ODF and OOXML allow a vendor to wrap own extensions to ODF within a namespace to protect other applications from their alien data. This is a good feature that improves interoperability when used right since the other applications can create custom loaders to also support the extension.

The problem here is really that Microsoft wrapped everything in their own namespace instead of only those parts that cause trouble for other applications.

I have heard that the ODF interoperability TC has decided to define two conformance classes. One that allows extension and one that does not. This is a good idea, but Microsofts (and SUNs) messing with namespaces show that interoperability also require a commitment to not use extensions when a basic alternative is present. I think that one of Oasis TCs (I don’t know which) need to add such rule.

Funny thing is, the entire right column will probably be fixed in a matter of weeks, if not days, by a large crowd of agile and competent open source developers. The small quirks of Excel-poisoned ODF seem easy enough to accept in a benevolent file importer.If Microsoft wishes to follow the intent of the standard instead of just its letter, they too can release a quick update to fix at least part of the bottom row. If they do, we can possibly attribute this to bottomless incompetence, but if they don’t, I will assume it is a sign of contemptuous malice.

A great quote from Microsoft’s 2008 press release announcing that SP2 would support ODF:

“Microsoft recognizes that customers care most about real-world interoperability in the marketplace, so the company is committed to continuing to engage the IT community to achieve that goal when it comes to document format standards. It will work with the Interoperability Executive Customer Council and other customers to identify the areas where document format interoperability matters most, and then collaborate with other vendors to achieve interoperability between their implementations of the formats that customers are using today.”

What every happened to “real-world interoperability in the marketplace”?

Doug explains there why Microsoft does not intent to be compatible with the OOo formulas:# dmahugh said on February 20, 2009 12:13 PM:[…] One interesting area is spreadsheet formula syntax — since ODF has never had such a syntax in any published version of the spec, implementers have used other standardized formula languages such as the one in IS29500 (as is allowed in the current ODF spec). […]

Doug then reacts to a comment by Jesper:# dmahugh said on February 21, 2009 2:38 PM:[…] Jesper, if I understand your point correctly, you’re saying that you don’t mind if documents containing IS29500 formula markup are unable to ever be conformant to “pure ODF 1.2” […] My view is that this would needlessly punish users of ODF who currently save formulas in the only ISO-standard formula markup language available for this purpose. […]

This triggers the following response from Jesper:# Jesper Lund Stocholm said on February 23, 2009 3:10 AM:[…] Seriously, Doug … please don’t play the “reuse exisiting ISO-standards or users will be punished”-card.

Funny thing is, the entire right column will probably be fixed in a matter of weeks, if not days, by a large crowd of agile and competent open source developers. The small quirks of Excel-poisoned ODF seem easy enough to accept in a benevolent file importer.”

Ahh, so it’s HTML all over again is it? Is that what anybody really wants? If people spend their time doing this, then most of the effort that went into ODF will have been for nothing. I hope they refuse to let M$ dictate their file format, rather than be fools and bend to it for so-called `interoperability’.

And .. didn’t they state early on that they were only going to do 1.0? Why is it surprising in any way that 1.2-format data doesn’t work?

As a professional user I am absolutely dissatisfied with software developers which cause havoc in the field of exchanging data. We have wars on this planet. We have hunger, illness, pollution and lot of other problems to solve. It is absolutely inacceptable that we are confronted with these kind of computer problems and I believe programmers that are causing these problems are acting no less criminal than hedge-fond-managers or banksters – we must go ahead with evolution and fight the narrowminded, primitive company-driven view on software that we need to solve problems today with a much stronger approach. People who are causing problems and not solving them are criminals and should go to guantanamo.

I have to be a bit of a defender here, if you are trying to get ODF support out the door and do not want to add the complexity of ensuring that you have compatible formulas, and you know that they are all compatible excell specific formulas as you just happen to be the excel team, then wrapping them in the excel name space is quite likely behavior. I just hope that they do the next phase of the work quickly and get it out soon.

“From the user’s perspective, I don’t think they really care what ODF 1.1 did or did not specify. They know that Microsoft provided an ODF Add-in for Office”

And that is what they have: An ODF 1.1 add-in. The issue is not ODF 1.1 compatibility, it is OpenOffice compatibility. Naturally it is a shame that OpenOffice doesn’t support ODF1.1 but it’s their problem. It is also a shame that ODF 1.1 is not used by many else, but it doesn’t surprise anyone since as a standard it is totally useless for spreadsheets.

Seeing that I really admire you and respect your work, I’ll leave my own comments about SP2 to my own blog.

I just want to contribute with your text about Rob’s Mary (if we can call her that way), and her experience with MS Office 2007 and ODF… It goes that way:

“…And Mary needs to update her spreadsheet with more updated prices, and as a good and modern bride (yep, the lady uses ODF), she always carry that precious spreadsheet on her pen drive.

Away from her computer and seeing that MSOffice 2007 now supports ODF, Mary uses a friend’s machine with MSOffice 2007 to update her spreadsheet.

She open the document and fell very happy and comfortable when she sees “everything there”, update a previous imputed price, and gets noticed that no new calculations are made… Seeing that she needs to give the machine back to her friend, she thinks “OK, I’ll solve that latter, let me just save my update to guarantee that I don’t lose the new price”… And she save (overwriting the original file) and close Excel.

Latter on, when she arrive at home, she opens the document again and discover that all her formulas have magically disappeared !!!

Crying and feeling sad about it, she thinks “Oh dear God, what my future husband will think about me, now that I was screwed by someone else….”.

They have already introduced OpenFormula, support for RDF/RDFa XML metadata formats and the addition of digital signature support. OpenFormula defines, for the first time in ODF, the format of spreadsheet formulas.

I have to be a bit of a defender here, if you are trying to get ODF support out the door and do not want to add the complexity of ensuring that you have compatible formulas, and you know that they are all compatible excell specific formulas as you just happen to be the excel team, then wrapping them in the excel name space is quite likely behavior.
I just hope that they do the next phase of the work quickly and get it out soon.