GullFOSS
OpenOffice.org Engineering at Sun
 
 
 
 
Sunday, 31 Dec 2006
High entry barriers for new OpenOffice.org developers
Jens-Heiner Rechtien

Starting development for OpenOffice.org is said to be hard. It's an application with on the order of 10^7 lines of code, a huge number of third-party dependencies and a multi-platform build system. The code base has quite a history and contains decade-old stuff right next to brand new state-of-the-art code. Some starting difficulties for new developers are just to be expected.

Most of the difficulties have their origin in the complexity of the code. The OpenOffice.org developers can modularize and document the code base, and this will lower the entry barrier for new developers. In fact they have been doing this for years. But no amount of work will make the OpenOffice.org code base easy and intuitively understandable for new developers. It's just not reasonable to expect this.

But there are other reasons why the entry barrier for starting development on OpenOffice.org is high, and we can do something about them. Kai Backman has pointed out that the long OpenOffice.org build time is also a deterrent for potential new developers. And he went a long way to prove that much improvement is possible, especially on the Windows platform. He was right, and so in 2006 much work went into shortening the build time. Some of the changes were:

  1. The dependency generator now parses all source and header files only once per directory.
  2. Important improvements for our make utility dmake.
  3. Removing old cruft from the makefile system, vastly reducing the number of shell-outs, which are known to be expensive on Windows.
  4. Introduction of a pre-compiled header system.
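
The first item in the list above can be illustrated with a small sketch. This is not the OpenOffice.org dependency generator; it is a minimal Python illustration of the general idea, assuming local headers are pulled in with `#include "..."` and live in the same directory as the sources. The class name `DirectoryScanner`, its `files_parsed` counter and the `*.cxx` file pattern are hypothetical choices made for this example. Each file is read and parsed at most once per directory, and the cached result is reused for every source file that reaches it.

```python
import re
from pathlib import Path

# Matches local includes such as:  #include "foo.hxx"
# System includes (#include <stdio.h>) are deliberately ignored in this sketch.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


class DirectoryScanner:
    """Hypothetical helper: collects include dependencies for one directory."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.direct = {}        # cache: file name -> list of directly included names
        self.files_parsed = 0   # how many files were actually read and parsed

    def direct_includes(self, name):
        # Parse a file only on first request; later requests hit the cache.
        if name not in self.direct:
            path = self.directory / name
            if path.is_file():
                text = path.read_text(errors="replace")
                self.files_parsed += 1
            else:
                text = ""       # header outside this directory: not followed here
            self.direct[name] = INCLUDE_RE.findall(text)
        return self.direct[name]

    def closure(self, name):
        # Walk the include graph iteratively; 'seen' also guards against cycles.
        seen = set()
        stack = [name]
        while stack:
            for inc in self.direct_includes(stack.pop()):
                if inc not in seen:
                    seen.add(inc)
                    stack.append(inc)
        return seen

    def scan(self):
        # Map every source file in the directory to all headers it depends on.
        sources = sorted(p.name for p in self.directory.glob("*.cxx"))
        return {src: self.closure(src) for src in sources}
```

Without the cache, a header shared by many source files would be re-read once for each of them; with it, the number of parsed files is bounded by the number of files in the directory.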

Many of these changes are already integrated into the SRC680 main code line; some will follow soon. All the changes together have the potential to reduce the Windows build time for a single-language build to about one third of what it was at the beginning of 2006. Many thanks to Hans-Joachim Lankenau (ause, hjs), Volker Quetschke (vq), Kai Backman (kaib) and Martin Kretschmer for this work.

Other platforms with more efficient file systems, shells, tools or compilers do not profit as much as the Windows platform, but build times have improved there as well. And there is another way to improve build times which works on any platform: remove old cruft from the code base.

The module binfilter is a prime candidate for such janitorial work, and several child workspaces have been dedicated to reducing the number of unnecessarily compiled code lines. Here are some statistics for 2006:

                                    SRC680 m177    hr38
Disk usage (source only)            148 MBytes     75 MBytes
Files                               6896           4998
Lines                               3.53 million   1.71 million
Compiled files (wntmsci10)          2253           1827
Comp. time (wntmsci10, f. local)    3:58:30        2:35:59
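
To put the binfilter figures above into relative terms, here is a small Python sketch that computes the percentage reduction for each row. The numbers are copied from the statistics; the helper names `to_seconds` and `reduction_percent` are made up for this example.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def reduction_percent(before, after):
    """Relative reduction from 'before' to 'after', in percent, one decimal."""
    return round(100.0 * (before - after) / before, 1)


# (SRC680 m177, hr38) pairs taken from the statistics in the post.
FIGURES = {
    "disk usage (MBytes)": (148, 75),
    "files": (6896, 4998),
    "lines (millions)": (3.53, 1.71),
    "compiled files": (2253, 1827),
    "compile time (s)": (to_seconds("3:58:30"), to_seconds("2:35:59")),
}

reductions = {
    name: reduction_percent(before, after)
    for name, (before, after) in FIGURES.items()
}

for name, pct in reductions.items():
    print(f"{name}: reduced by {pct}%")
```

The line count drops by roughly half, while the number of compiled files falls by about 19% and the compile time by about 35%.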

Two child workspaces are still in the queue: bfshrink01 by Caolan McNamara, which removes an incredible 900+ unused methods from binfilter, and hr41, which removes another 700+ files. Together they will bring binfilter (the largest code module in OOo) to a somewhat more manageable size.

Posted by Jens-Heiner Rechtien on 31 Dec 2006

Comments:

10^7 million lines of code? that sounds a lot like a typo

Posted by nacho on December 31, 2006 at 04:30 PM CET #

What does binfilter do? Will these projects result in compiled versions of Open Office running faster?

Posted by Mark on January 02, 2007 at 06:00 AM CET #

Nacho, you are right, it's a typo. It's 'order of 10^7 lines of code', of course.

Posted by Jens-Heiner Rechtien on January 02, 2007 at 10:25 AM CET #

Mark, module binfilter contains the old binary filters for older StarOffice versions. These filters are used only when an old (pre-XML) StarOffice document is opened. The mentioned changes have no noticeable effect on the normal day-to-day performance of OpenOffice.org/StarOffice. All mentioned changes affect the build time, which is important for developers but not particularly interesting for users of OpenOffice.org.

Posted by Jens-Heiner Rechtien on January 02, 2007 at 10:41 AM CET #

What are the dates of the two builds you're comparing? Because if you really have managed to halve the code size inside a year, this is pretty impressive.

Posted by Andrew Brown on January 02, 2007 at 09:23 PM CET #

Andrew, the child workspace (hr38) was integrated at the end of October last year and is compared to a milestone build (m177) from July 06. But the feat is not nearly as impressive as it might look, due to the special nature of module binfilter. Some years ago the OpenOffice.org developers decided to move the support for the old binary filters, which was scattered all over the source tree of the OpenOffice.org applications (writer, calc, impress, draw, math), into a separate module. It's much easier now to refactor the source tree without fear of breaking compatibility with the obsolete binary formats, which heavily depend on internal structures. The separation was done by code duplication and stripping, and afterwards the code in binfilter was frozen. Stripping the source code was the hard part, obviously. After making certain that the stripped code could still handle all old documents, the plan was to do a thorough cleanup, but for some reason we never got around to it. But the old and unneeded stuff was still a burden for the build system, so doing some janitorial work here helped to speed up build and handling time. Please note that much of the obsolete stuff was only a burden for the build system, but not for the compiled product.

Posted by Jens-Heiner Rechtien on January 03, 2007 at 10:47 AM CET #

Corrected the embarrassing number of code lines typo.

Posted by Jens-Heiner Rechtien on January 10, 2007 at 01:48 PM CET #
