GullFOSS
OpenOffice.org Engineering at Sun
 
 
 
 
Wednesday, 15 Nov 2006
Why OpenOffice.org uses OpenDocument
Michael Brauer

A lot has been written about OpenOffice.org, OpenDocument and ECMA Office Open XML these days, and some of you may wonder whether OpenOffice.org might switch to ECMA Office Open XML as its native file format. The answer from my personal perspective is a clear “no”. There are several reasons for this.

I would like to start explaining why the answer is “no” by stating a basic principle that has accompanied the OpenOffice.org XML project (which, as most of you know, I lead) since its formation in 2000: The documents that our users and customers create with OpenOffice.org belong to them, not us. They, not us, must be able to read or process them in the near, far, and very far future. And they, not us, must have the choice to use whatever application they want to do so. And they, not us, have to bear the consequences if this is not the case.

For this reason, the XML project decided back in 2000 to develop an XML file format for office applications that ensures interoperability and long-term access to documents, and that may be used by other office applications as well. To achieve these goals we decided to develop a new file format from scratch, based on existing standards. And we decided against the two other options we had, either of which could have saved us a lot of specification and implementation effort: to turn our binary formats into XML formats by making XML elements and attributes from the structures we had in the binary file format, or to dump OpenOffice.org's internal data structures into XML. We rejected both options because we felt they were inappropriate for achieving our goals. I did not analyze Office Open XML myself, but I believe those who did, and they actually found dumps of internal data structures in Office Open XML. It therefore does not meet our own goals, or at least does not do so to the degree OpenDocument does.

To further advance the basic principle that office documents belong to the users, Sun and others founded the OASIS OpenDocument Technical Committee (TC) in 2002, with the purpose of creating an open standard for office application file formats. There are many definitions of open standards (Sun's, which I fully support, can be found here). For OpenOffice.org, two aspects were of particular interest:

  1. open source community members must be able to join;

  2. the work must be done in public.

OASIS and the OASIS technical committee rules met and still meet these requirements. There is a low-cost individual membership, and for those for whom even this is not an option, there are mandatory public reviews. Furthermore, all meeting minutes, e-mail communications and drafts must be made available to the public. Why are these two aspects so important? Because the OpenOffice.org XML file format was chosen as the basis of the OpenDocument file format. This meant that work on the OpenOffice.org XML file format itself was abandoned, and instead a file format was developed in an OASIS technical committee on the basis of the OpenOffice.org XML file format. It goes without saying that we could not ignore the requirements of the open source community here. What is the situation at ECMA? Well, the publication requirements of the ECMA process seem to be far less strict than those of OASIS, but if you like, you may compare the very precise OASIS Technical Committee rules with those of ECMA yourself. In any case, the OpenOffice.org XML project feels very comfortable with its collaboration with the OpenDocument TC.

I now come to the last and maybe the essential reason why ECMA Office Open XML is not an option as OpenOffice.org's native file format: the charter of ECMA TC45. It says: “The goal of the Technical Committee is to produce a formal standard for office productivity applications within the Ecma International standards process which is fully compatible with the Office Open XML Formats.” As I understand it, this means that ECMA TC45 standardizes a file format that has to be compatible with Microsoft's Office Open XML formats. This is hugely different from what the OASIS OpenDocument TC does: creating a vendor-neutral office application file format for use in arbitrary office applications. Regardless of whether ECMA TC45's charter allows others to contribute to the ECMA Office Open XML formats (my reading of the charter is that this is not possible if a contribution introduces any incompatibilities with the Microsoft Office Open XML file formats), it is not what we want for OpenOffice.org. For the reason given above, we want to use a native file format that is application-neutral. One may argue now (and some actually do so) that the OASIS OpenDocument TC does the same for OpenOffice.org XML. That's not the case. The OpenDocument TC did in fact decide to use OpenOffice.org XML as the basis for its work, because it had already proven its value in real life, but that's it. There is not a single word in the OpenDocument TC's charter saying that the resulting specification has to be compatible with OpenOffice.org XML, and there never was one. Anything else would in fact have contradicted our initial goals: to develop a file format that can be used by different applications as a native file format and that gives users the freedom to choose the application they would like to use. And, yes, OpenOffice.org XML and OpenDocument v1.0 are not compatible.


Posted by Michael Brauer on 15 Nov 2006

Comments

Chris Ward said: Microsoft's XML is rather like a developed typewriter; when you want a new paragraph you say 'carriage-return, carriage-return, tab'. ISO ODF XML is rather like a developed dictaphone; when you want a new paragraph you say 'new paragraph'. This difference in what is important between the 2 formats goes to the heart of the matter. With ISO ODF XML, you express what you mean; and the meaning is preserved when you go to a different application, or to a future development of the same application. But the exact layout on the page may differ, and you may not have the controls to make it look particularly pretty on the page. With Microsoft XML, you express how it should look on the page. Often, you accidentally use more elaborate controls than you meant to; and it is this 'use of more elaborate controls' that gives compatibility problems with other applications and other versions of the same application. The meaning in your document ends up 'locked up', unable to be extracted for use in anything else. But you can dicker around with the document to make it look very pretty on the page. For me, being an engineer, the meaning is the important thing; 'dictaphone' was an advance over 'typewriter'. So keep up the good work, with ISO ODF XML.

Posted by Chris Ward on November 16, 2006 at 12:44 AM CET #
