An Invitation to the Apache ODF Toolkit

Perhaps overlooked in all the excitement generated by the move of OpenOffice.org to Apache was the fact that a parallel move is occurring with the ODF Toolkit. A few weeks ago we submitted a proposal to Apache to start a new project based on the Java components that were until then hosted by the ODF Toolkit Union. This was done after consulting with the ODF Toolkit community and getting approval from the ODF Toolkit Union’s Steering Committee. The proposal was recently reviewed, voted on, and approved by Apache. So now we have the Apache ODF Toolkit project in the Apache Incubator.

So what is this project and what is it good for?

This project consists of Java libraries and tools for working with ODF documents. Not editors, not viewers, not anything with a user interface. These are not end-user tools. These are tools for developers who need to write programs that read, write or manipulate ODF documents. These tools do not require that you have any ODF editor installed. They operate directly on the files. So they are ideal for running on a server, for things like report generation, information extraction, document validation, conversion, etc. We have a page of demos that gives a good idea of the range of things possible with the ODF Toolkit.
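To make “operate directly on the files” concrete, here is a rough sketch that pulls paragraph text out of a text document using only the Java standard library. It is not the ODF Toolkit's API. The class `OdfTextPeek` and its methods are hypothetical names made up for this illustration, and the sketch assumes only that an ODF document is a ZIP package whose body lives in `content.xml`. The `minimalOdt` helper builds a throwaway fixture, not a valid ODF file (it has no `mimetype` or manifest entry).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper, not part of the ODF Toolkit. */
public class OdfTextPeek {
    // Namespace URIs used by ODF for office and text elements.
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Returns the text content of every text:p element in the package's content.xml. */
    public static List<String> paragraphs(InputStream odfPackage) {
        List<String> result = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(odfPackage)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (!e.getName().equals("content.xml")) {
                    continue;
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Copy the entry into memory so the XML parser cannot close the ZIP stream.
                byte[] xml = zip.readAllBytes();
                Document dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
                NodeList nodes = dom.getElementsByTagNameNS(TEXT_NS, "p");
                for (int i = 0; i < nodes.getLength(); i++) {
                    result.add(nodes.item(i).getTextContent());
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse content.xml", ex);
        }
        return result;
    }

    /** Test fixture only: a ZIP holding a one-paragraph content.xml. The text must not contain XML markup. */
    public static byte[] minimalOdt(String text) {
        String content = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<office:document-content xmlns:office=\"" + OFFICE_NS + "\""
            + " xmlns:text=\"" + TEXT_NS + "\" office:version=\"1.2\">"
            + "<office:body><office:text><text:p>" + text + "</text:p></office:text></office:body>"
            + "</office:document-content>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("content.xml"));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] odt = minimalOdt("Hello World!");
        System.out.println(paragraphs(new ByteArrayInputStream(odt)));
    }
}
```

Even this small sketch cuts corners. ODF encodes repeated spaces, tabs and line breaks as elements such as `text:s`, `text:tab` and `text:line-break`, which `getTextContent()` silently drops. Details like these are what a toolkit handles so you don't have to.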

The ODF Toolkit is important because it enables innovation on top of ODF. By analogy, look at HTML. At one point, the web consisted mainly of hand-authored documents at a handful of academic and government websites. If that were all there was to the web, it would not have been very interesting. What made the web the platform it is today were the technologies that enable server-side generation of web pages from database queries, and the services that analyze web pages and extract and aggregate information. Google was made possible because HTML was an open standard that could be programmatically understood. PHP was possible because HTML was an open standard that could be programmatically written.

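ODF, unlike the previous generation of binary document formats, is also an open standard. You can read and write ODF documents freely. But writing the code to understand the nitty-gritty of the ODF format is a considerable task. The ODF Toolkit makes this easy for Java programmers. How easy? With the Simple API, a “hello world” text document comes down to roughly three calls: TextDocument.newTextDocument() to create the document, addParagraph("Hello World!") to add the text, and save("HelloWorld.odt") to write it out.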

Other tasks, like changing styles, combining presentation slide decks, searching and replacing text in a document, and extracting text from a document, are also simple. More examples that give a flavor of the ODF Toolkit are in the “cookbook”.

But along with the “Simple API”, the ODF Toolkit has the ODFDOM layer. This layer lets you get to every part of an ODF document, at the finest-grained level. Some tools out there give you only a high-level API but then leave you hanging if you want to do something more complicated. Not so with the ODF Toolkit. If you want to drill down and adjust the line spacing of a bullet list in a footnote, you can do it.

These components enable innovation on top of ODF, innovation that thinks “outside the editors” and “beyond office”.

So how do you get involved? If you want to help with the project then I invite you to sign up on the project’s development mailing list. And if you have questions about using the ODF Toolkit, but don’t want the additional email traffic from the dev list, then you can sign up for the users list. Of course, I’ve signed up for both lists. I hope I’ll see you there!

It seems that this toolkit might be the OpenOffice answer to VBA scripting. I think that one obstacle to switching to OpenOffice for power users of Excel has been the limited ability to write scripts for Calc.