GullFOSS
OpenOffice.org Engineering at Sun
 
Wednesday, 27 Sep 2006
Performance Improvement on Loading/Storing Writer Documents
Frank Meies

Welcome to the readers of the GullFOSS blog. The Writer team wants to inform you regularly about current development topics. My name is Frank Meies, and I have been a member of the Writer team since 2001.

My first blog entry is about the performance improvements achieved by introducing automatic text and paragraph styles in the Writer core. This is the second major performance improvement, following the implementation of “word count during idle time”, which already gave us significant performance gains when storing large documents (see http://www.openoffice.org/issues/show_bug.cgi?id=64985).

What are “automatic styles”? Let's assume some of your text is formatted using the “bold” and “underline” attributes. For this formatting, you can find an automatic style in the content.xml file of your ODT file:

<style:style style:name="T1" style:family="text">
<style:text-properties style:text-underline-style="solid" fo:font-weight="bold"/>
</style:style>

All bold, underlined text portions then refer to the automatic text style T1.
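To make the relationship between a style definition and the text that uses it concrete, here is a minimal Python sketch. It assumes a tiny hand-written content.xml: the surrounding document wrapper and the two example spans are made up for illustration, and the helper names `automatic_text_styles` and `spans_using` are hypothetical. Only the T1 style itself is taken from the snippet above.

```python
import xml.etree.ElementTree as ET

# ODF namespace URIs used in content.xml
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}

# Hand-written sample (assumption): a real content.xml holds much more.
SAMPLE = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0">
  <office:automatic-styles>
    <style:style style:name="T1" style:family="text">
      <style:text-properties style:text-underline-style="solid" fo:font-weight="bold"/>
    </style:style>
  </office:automatic-styles>
  <office:body><office:text>
    <text:p>Plain, <text:span text:style-name="T1">first</text:span> and
      <text:span text:style-name="T1">second</text:span>.</text:p>
  </office:text></office:body>
</office:document-content>"""


def automatic_text_styles(root):
    """Map each automatic text style name to its text properties."""
    styles = {}
    for st in root.findall("office:automatic-styles/style:style", NS):
        if st.get(f"{{{NS['style']}}}family") != "text":
            continue
        props = st.find("style:text-properties", NS)
        # Strip the "{namespace}" prefix from each attribute name.
        attrs = {} if props is None else {
            key.split("}")[1]: value for key, value in props.attrib.items()
        }
        styles[st.get(f"{{{NS['style']}}}name")] = attrs
    return styles


def spans_using(root, style_name):
    """Return the text of every span that refers to the given style."""
    key = f"{{{NS['text']}}}style-name"
    return [span.text for span in root.iter(f"{{{NS['text']}}}span")
            if span.get(key) == style_name]


root = ET.fromstring(SAMPLE)
print(automatic_text_styles(root))
print(spans_using(root, "T1"))
```

Both spans point at the single T1 definition, so the attributes are written once no matter how often the formatting is used.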

So whereas the file format uses automatic styles, the Writer core unfortunately did not. If two distinct text portions in your document were formatted bold and underlined, each portion was associated with its own attribute set, and both sets contained a “bold” and an “underline” attribute. Storing a document therefore had to be performed in two passes: the first pass iterated over the text content to collect all applied automatic text and paragraph styles, and the second pass exported the text content, establishing the link between attributed text or paragraphs and the appropriate automatic style.
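The two-pass export can be sketched roughly as follows. This is an illustration in Python, not the Writer core's C++ code: the data model (a list of `(text, attribute dict)` pairs) and the function name `export_two_pass` are assumptions made for this example.

```python
def export_two_pass(portions):
    """Export portions that each carry their own private attribute dict.

    portions: list of (text, attrs) pairs, attrs being a dict of strings.
    Returns (styles, body): the automatic styles and the styled text.
    """
    # Pass 1: walk the whole content to collect the distinct attribute sets
    # and give each one an automatic style name (T1, T2, ...).
    names = {}
    for _text, attrs in portions:
        key = frozenset(attrs.items())
        if key not in names:
            names[key] = f"T{len(names) + 1}"

    # Pass 2: walk the content again and link every portion to its style.
    body = [(text, names[frozenset(attrs.items())]) for text, attrs in portions]
    styles = {name: dict(key) for key, name in names.items()}
    return styles, body


portions = [
    ("first", {"font-weight": "bold", "text-underline-style": "solid"}),
    ("other", {"font-style": "italic"}),
    # Same formatting as "first", but stored in a separate attribute set.
    ("second", {"text-underline-style": "solid", "font-weight": "bold"}),
]
styles, body = export_two_pass(portions)
print(styles)
print(body)
```

The cost of the first pass grows with the amount of attributed text, which is why heavily formatted documents suffer most.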

With the Writer core changed so that two text portions (or two paragraphs) with the same attributes share the same attribute set, collecting automatic styles while storing the document becomes unnecessary. This results in a massive performance improvement, especially for large, heavily attributed documents. Using automatic styles in the Writer core also has a positive effect when loading a document: instead of setting, for example, the two attributes “bold” and “underline” on a text portion, only the automatic style containing these two attributes has to be set.
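The sharing idea can be illustrated with a small pool that hands out one object per distinct attribute set. This is a sketch under stated assumptions, not the actual Writer implementation: the class `AttrSetPool`, its methods, and `export_single_pass` are hypothetical names chosen for this example.

```python
class AttrSetPool:
    """Hypothetical pool holding one shared object per distinct attribute set."""

    def __init__(self):
        self._shared = {}

    def intern(self, attrs):
        """Return the shared attribute set equal to attrs, creating it if needed."""
        key = frozenset(attrs.items())
        # setdefault returns the already stored object if an equal one exists.
        return self._shared.setdefault(key, key)

    def automatic_styles(self):
        """Name every shared set (T1, T2, ...) in order of first use."""
        return {shared: f"T{i}"
                for i, shared in enumerate(self._shared.values(), start=1)}


def export_single_pass(pool, portions):
    """Export portions whose attribute sets come from the pool."""
    # The styles are known from the pool; the text is not scanned to find them.
    names = pool.automatic_styles()
    styles = {name: dict(shared) for shared, name in names.items()}
    body = [(text, names[shared]) for text, shared in portions]
    return styles, body


pool = AttrSetPool()
first = pool.intern({"font-weight": "bold", "text-underline-style": "solid"})
second = pool.intern({"text-underline-style": "solid", "font-weight": "bold"})
print(first is second)  # both portions now share one attribute set
print(export_single_pass(pool, [("first", first), ("second", second)]))
```

Because equal formatting always resolves to the same object, the set of automatic styles is available at any time, and setting a style on load is a single reference assignment rather than one operation per attribute.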

Here are some results of my performance measurements. We compare the current OOo 2.0.4 to cws swautomatic01 (based on OOo 2.0.4), which implements the usage of automatic styles in the Writer core. Note that the results heavily depend on the document content:


Performance improvement, OOo 2.0.4 -> swautomatic01:

Document                                      Loading   Storing
1000 paragraphs, many character attributes    32.9 %    47.0 %
1000 paragraphs, many paragraph attributes    16.3 %    24.9 %
1000 paragraphs, no attributes                 0.0 %    18.5 %
OpenDocument specification                     7.1 %    25.3 %

That's all for today. Stay tuned for more interesting news from the Writer team.


Posted by Frank Meies on 27 Sep 2006 | Comments [2]

Comments:

That's a creative solution. Are automatic styles coming in 2.1? Is there an issue number or link yet?

Posted by Andrew Z on September 27, 2006 at 09:08 PM CEST #

Hi Andrew,
the cws swautomatic01 is scheduled for OOo 2.1, see http://www.openoffice.org/issues/show_bug.cgi?id=65476
Regards, Frank

Posted by Frank Meies on September 28, 2006 at 08:35 AM CEST #
