GullFOSS
OpenOffice.org Engineering at Sun
 
 
 
 
Wednesday, 27 Sep 2006
Performance Improvement on Loading/Storing Writer Documents
Frank Meies

Welcome to the readers of the GullFOSS blog. The Writer team wants to inform you regularly about current development topics. My name is Frank Meies, and I have been a member of the Writer team since 2001.

My first blog entry is about the performance improvements achieved by introducing automatic text and paragraph styles in the Writer core. This is the second major performance improvement, following the implementation of “word count during idle time”, which already gave us significant performance gains when storing large documents (see http://www.openoffice.org/issues/show_bug.cgi?id=64985).

What are “automatic styles”? Let's assume some of your text is formatted using the “bold” and “underline” attributes. For this formatting, you can find an automatic style in the content.xml file of your ODT file:

<style:style style:name="T1" style:family="text">
<style:text-properties style:text-underline-style="solid" fo:font-weight="bold"/>
</style:style>

with all bold, underlined text portions referring to the automatic text style T1.
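To make the reference mechanism concrete, here is a minimal Python sketch that looks up which automatic style each text portion refers to. The XML fragment is hand-written for illustration: only the T1 definition is taken from the example above, while the surrounding elements and the two `text:span` portions are assumptions about what a matching content.xml could look like.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespaces used by the fragment below.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}

# Illustrative fragment (assumption): two portions share the style T1.
FRAGMENT = """\
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0">
  <office:automatic-styles>
    <style:style style:name="T1" style:family="text">
      <style:text-properties style:text-underline-style="solid" fo:font-weight="bold"/>
    </style:style>
  </office:automatic-styles>
  <office:body>
    <office:text>
      <text:p>A <text:span text:style-name="T1">first</text:span> and a
        <text:span text:style-name="T1">second</text:span> portion.</text:p>
    </office:text>
  </office:body>
</office:document-content>
"""


def qname(prefix, local):
    """Build the '{namespace}local' name that ElementTree uses internally."""
    return "{%s}%s" % (NS[prefix], local)


def automatic_text_styles(root):
    """Map each automatic style name to its text properties."""
    styles = {}
    for style in root.iterfind("office:automatic-styles/style:style", NS):
        props = style.find("style:text-properties", NS)
        styles[style.get(qname("style", "name"))] = (
            dict(props.attrib) if props is not None else {}
        )
    return styles


def span_styles(root):
    """List (text, style name) for every text:span in document order."""
    return [
        (span.text, span.get(qname("text", "style-name")))
        for span in root.iter(qname("text", "span"))
    ]


root = ET.fromstring(FRAGMENT)
styles = automatic_text_styles(root)
spans = span_styles(root)
```

Both portions resolve to the single T1 entry, so the formatting is stored once in the file no matter how many portions use it.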

So while the file format uses automatic styles, the Writer core unfortunately did not. If two distinct text portions in your document were formatted bold and underlined, each portion was associated with its own attribute set, and both sets contained a “bold” and an “underline” attribute. Storing a document therefore had to be performed in two passes: the first pass iterates over the text content to collect all applied automatic text/paragraph styles; the second pass exports the text content and establishes the link between attributed text/paragraphs and the appropriate automatic style.
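The two-pass scheme can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Writer export code: the function names, the plain-dict attribute sets, and the T1, T2, ... naming are all hypothetical.

```python
def attrs_key(attrs):
    """Order-independent, hashable key for one attribute set."""
    return tuple(sorted(attrs.items()))


def export_two_pass(portions):
    """portions: list of (text, attrs), where every portion owns its own dict."""
    # Pass 1: walk the whole content once just to collect the distinct
    # attribute combinations and give each one an automatic style name.
    style_names = {}
    for _text, attrs in portions:
        if attrs:
            key = attrs_key(attrs)
            if key not in style_names:
                style_names[key] = "T%d" % (len(style_names) + 1)

    # Pass 2: walk the content again and link each portion to its style.
    content = []
    for text, attrs in portions:
        name = style_names[attrs_key(attrs)] if attrs else None
        content.append((text, name))

    styles = {name: dict(key) for key, name in style_names.items()}
    return styles, content


# Two portions with equal formatting but separate attribute sets.
portions = [
    ("first", {"font-weight": "bold", "underline": "solid"}),
    ("plain", {}),
    ("second", {"underline": "solid", "font-weight": "bold"}),
]
styles, content = export_two_pass(portions)
```

The extra walk over the content and the comparison of attribute sets by value in pass 1 are the costs that grow with large, heavily attributed documents.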

With the Writer core changed so that two text portions (or two paragraphs) with the same attributes share the same attribute set, collecting automatic styles while storing the document becomes unnecessary. This results in a massive performance improvement, especially for large, heavily attributed documents. Using automatic styles in the Writer core also has a positive effect while loading a document: instead of setting, e.g., the two attributes “bold” and “underline” for a text portion, only the automatic style containing these two attributes has to be set.
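The idea of sharing attribute sets can be illustrated with a small pool that hands out one object per distinct attribute combination. This is a sketch only: `AttrSetPool`, its methods, and the choice to assign the style name when the set is first requested are assumptions made for the example and do not describe the actual classes in the Writer core.

```python
class AttrSetPool:
    """Hands out one shared, immutable attribute set per distinct combination."""

    def __init__(self):
        self._by_key = {}

    def get(self, attrs):
        """Return the shared (name, attribute set) pair for these attributes."""
        key = frozenset(attrs.items())
        shared = self._by_key.get(key)
        if shared is None:
            # First time this combination is seen: create and name it.
            shared = ("T%d" % (len(self._by_key) + 1), key)
            self._by_key[key] = shared
        return shared

    def styles(self):
        """All automatic styles known so far, keyed by name."""
        return {name: dict(key) for name, key in self._by_key.values()}


def export_one_pass(portions):
    """Each portion already points at its shared style, so one walk suffices."""
    return [(text, style[0] if style else None) for text, style in portions]


pool = AttrSetPool()
bold_underline = {"font-weight": "bold", "underline": "solid"}
portions = [
    ("first", pool.get(bold_underline)),
    ("plain", None),
    ("second", pool.get(dict(bold_underline))),  # equal attributes, same object back
]
content = export_one_pass(portions)
```

Because equal formatting is resolved to one shared object when it is applied, storing no longer needs a separate collection pass, and loading can attach a whole set with a single lookup instead of setting each attribute individually.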

Here are some results of my performance measurements. We compare the current OOo 2.0.4 to the child workspace (cws) swautomatic01 (based on OOo 2.0.4), which implements the usage of automatic styles in the Writer core. Note that the results depend heavily on the document content:


Perf. improvement OOo 2.0.4 -> swautomatic01

Document                                   | Loading | Storing
1000 paragraphs, many character attributes | 32.9 %  | 47.0 %
1000 paragraphs, many paragraph attributes | 16.3 %  | 24.9 %
1000 paragraphs, no attributes             |  0.0 %  | 18.5 %
OpenDocument specification                 |  7.1 %  | 25.3 %

That's all for today. Stay tuned for more interesting news from the Writer team.


Posted by Frank Meies on 27 Sep 2006

Comments

Andrew Z said: That's a creative solution. Are automatic styles coming in 2.1? Is there an issue number or link yet?

Posted by Andrew Z on September 27, 2006 at 09:08 PM CEST #

Frank Meies said: Hi Andrew,
the cws swautomatic01 is scheduled for OOo 2.1, see http://www.openoffice.org/issues/show_bug.cgi?id=65476
Regards, Frank

Posted by Frank Meies on September 28, 2006 at 08:35 AM CEST #
