Notes from ODF Plugfest in Granada, Day One

The ODF Plugfest is a conference whose goal is to achieve the maximum interoperability between competing applications, platforms and technologies in the area of digital document sharing, and to promote the OpenDocument format (ODF). This page, like the others that will follow on this website, is a short technical summary, primarily aimed at developers, of what happened during the first day of the conference. Later next week I’ll also post a non-technical summary of the whole event at the Stop.

My own talk, which you can download at mfioretti.com, was about a problem I first saw in 2006/2007: how do you prevent proprietary components from “polluting” open container formats like ODF?

Alberto Barrionuevo explained how Opentia contributed to the Spanish National Interoperability Framework (NIF). This is very interesting work that could and should be replicated in other countries: Opentia took the full text of the 2007 Spanish law about e-government and translated it into a mathematical format in order to match, as unambiguously as possible, each legal requirement to equivalent technical ones.

The result is a huge spreadsheet that implements a finite state machine and tells you whether, and to what extent, each of about 550 protocols, file formats and programming languages is admissible for e-government usage in Spain (for the record, about 80% of them pass the test, at least partially).

Rob Weir of IBM summed up the status of the next version of the standard. ODF 1.2, which is almost done, will be divided into three parts: one for the core schema, one for the container and one for OpenFormula (do you remember that the first generation of ODF-compliant spreadsheet suites lacked formula compatibility? This should fix that problem for good). New features will include digital signatures, support for RDF capabilities (see below) and native tables in presentation slides. An interoperability demonstration of ODF 1.2 will take place at the OOoCon conference in Budapest next September. Rob also mentioned that anybody can send in suggestions for the version of the standard after that, which should include things like modularization, web profiles, enhanced SVG support and XForms integration. You can answer OASIS calls for public comments, join the OASIS ODF TC, or implement ODF 1.2 and send feedback.
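To make the “container” part a bit more concrete: an ODF file is a ZIP package whose first entry is an uncompressed `mimetype` file, followed by XML files such as `content.xml` and `META-INF/manifest.xml`. The sketch below only illustrates that layout with Python’s standard library. The function names are my own, and the XML bodies are placeholders, so the result is a skeleton and not a document that an office suite would open.

```python
import io
import zipfile

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def build_skeleton_package():
    """Build an in-memory ZIP laid out like an ODF package (placeholder content only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The mimetype entry is written first and stored without compression.
        package.writestr("mimetype", ODT_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        # Placeholder bodies: a real package carries complete XML here.
        package.writestr("content.xml", "<placeholder/>", compress_type=zipfile.ZIP_DEFLATED)
        package.writestr("META-INF/manifest.xml", "<placeholder/>", compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_members(data):
    """Return the entry names of a package, in the order they were stored."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


def read_mimetype(data):
    """Return the declared media type of a package."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.read("mimetype").decode("ascii")


if __name__ == "__main__":
    data = build_skeleton_package()
    print(list_members(data))
    print(read_mimetype(data))
```

The same `list_members` call works on the bytes of any real ODF file, which is a quick way to see what a given application actually put inside the container.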

Later on, Rob also introduced another theme that could and should get a lot of attention in the next few years, and that is also somewhat related to what I said in my own talk: ODF metadata and interoperability. What should happen if your editing software loads an ODF file containing metadata that it doesn’t understand? Should it preserve or ignore those metadata?

In some cases the answer is easy: metadata that behave like visual attributes, e.g. the bold tag, where the attribute (boldness) is separate from the data or content it applies to, can be removed or ignored without any real damage. Go beyond that, and interoperable metadata are much harder to achieve.
What if, for example, a document is pasted into another one that has a different value for the same metadata, such as the one indicating who in the organization is responsible for approving that text? An even more interesting case is “Should I digitally sign a document that contains metadata I can’t see?”
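The preserve-or-ignore question can be sketched in a few lines of Python. This is only a toy model under my own assumptions: metadata are plain key/value pairs, the hypothetical editor understands just `title` and `author`, and the `approver` key stands for the "who approves this text" example. Nothing here reflects how any real ODF application behaves.

```python
# Hypothetical: the only metadata keys this imaginary editor understands.
KNOWN_KEYS = {"title", "author"}


def edit_and_save(metadata, changes, preserve_unknown):
    """Return the metadata as the imaginary editor would write them back."""
    saved = {}
    for key, value in metadata.items():
        # Unknown keys survive the round trip only if the editor preserves them.
        if key in KNOWN_KEYS or preserve_unknown:
            saved[key] = value
    for key, value in changes.items():
        # The editor can only modify keys it understands.
        if key in KNOWN_KEYS:
            saved[key] = value
    return saved


def conflicting_keys(target, pasted):
    """Return the keys present in both documents with different values."""
    shared = target.keys() & pasted.keys()
    return sorted(key for key in shared if target[key] != pasted[key])


if __name__ == "__main__":
    original = {"title": "Draft contract", "author": "Ann", "approver": "Legal dept."}
    print(edit_and_save(original, {"title": "Final contract"}, preserve_unknown=True))
    print(edit_and_save(original, {"title": "Final contract"}, preserve_unknown=False))
    print(conflicting_keys(original, {"title": "Draft contract", "approver": "Finance"}))
```

With `preserve_unknown=False` the approver silently disappears after a single edit, and `conflicting_keys` shows that a paste can bring in a second, contradictory approver. Neither policy is obviously right, which is the whole point of the discussion.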

Another interesting moment of the day was when Jos van den Oever presented the work on RDF support for ODF 1.2 in KOffice. Generally speaking, RDF should help to add to the text enclosed in a document information about the meaning of that text, in a format that is directly and easily readable by a computer. Consider, for example, a sentence like “Paris is hot”. If it’s in plain text, it’s very difficult for a computer to understand, even by looking at the rest of the document, whether it means that temperatures in the French capital are high or that Paris Hilton (or the mythological Trojan prince?) is sexy. Adding an RDF data triple to the Paris string that labels it as a city would eliminate that ambiguity. RDF could deeply change our definition of “document”. If all your files were tagged in this way, it would become much easier (and more portable) to ask your computer questions like “find me all the cases discussed by my law firm that involved unemployment law, but not in its home town”.
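For readers who have never seen one, a triple is just a (subject, predicate, object) statement. Here is a minimal sketch of the “Paris” example using plain Python tuples instead of a real RDF library. All the identifiers under `http://example.org/` are made up for illustration, and this is not the vocabulary or the storage format that ODF 1.2 or KOffice use.

```python
# Hypothetical namespace and identifiers, for illustration only.
EX = "http://example.org/"

# Each entry is one (subject, predicate, object) statement.
TRIPLES = {
    (EX + "doc1#paris", EX + "label", "Paris"),
    (EX + "doc1#paris", EX + "type", EX + "City"),
    (EX + "doc2#paris", EX + "label", "Paris"),
    (EX + "doc2#paris", EX + "type", EX + "Person"),
}


def subjects_labelled(label, wanted_type):
    """Return the subjects that carry the given label and the given type."""
    labelled = {s for s, p, o in TRIPLES if p == EX + "label" and o == label}
    typed = {s for s, p, o in TRIPLES if p == EX + "type" and o == EX + wanted_type}
    return sorted(labelled & typed)


if __name__ == "__main__":
    print(subjects_labelled("Paris", "City"))
    print(subjects_labelled("Paris", "Person"))
```

Both strings read “Paris” to a human, but the type triple lets the program tell the city in one document from the person in the other without guessing from context.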

The end goal is a whole desktop that can use RDF, where the user can read and write documents with RDF and directly cut and paste them between applications. The current work on KOffice will allow it to show the user all the RDF triples and the corresponding text strings in a document, and to use those data to check and show locations on digital maps, or to export appointments or phone numbers straight into calendars or address books.

Day one of the ODF Plugfest ended with a presentation of several interesting tools that automatically generate, convert or analyze all kinds of ODF files:

ASPOSE.WORD, a .NET and Java library for document processing in ODF and many other formats in cloud, single server or desktop environments

lpOD, an even more interesting (for my personal use, that is) library to create ODF documents. lpOD is already usable in Python (Perl and Ruby will follow) and comes with an online cookbook

ODFDom Library, another library, in Java this time, written to create, access and manipulate ODF files without requiring detailed knowledge of the ODF specification

(note to all Plugfest participants and all other readers who want to comment or add something to this page: I’ll enable comments from anonymous users as soon as I can configure the corresponding spam filters, but in the meantime please register or send me an email at mfioretti, at nexaima dot net. And if you want to translate this page, just let me know)