GullFOSS
OpenOffice.org Engineering at Sun
 
 
 
 
More Flickr photos tagged with openoffice

Today's Page Hits: 971

Locations of visitors to this page
« OpenOffice.org 3.0... | Main | New: OOo-Dev 2.3... »
Thursday, 21 Jun 2007
Another 12857% Speed Improvement
Eike Rathke

With integration of CWS dr54 a quite small but very effective change regarding updates of references to external documents when loading a spreadsheet will become available. In a customer's test case document this brought down loading time from over 15 minutes to 7 seconds, which is by factor 128 or more.

Some details: imagine a spreadsheet with cells listening to large ranges of external references such that there are 35000 formula fragments listening to an identical range (A) of 50000 cells, another 9000 formula fragments listening to an identical range (B) of 50000 cells, and yet another 9000 formula fragments listening to yet another identical range (C) of 50000 cells. After having loaded the document the user is asked whether to update data of the external references. If answering yes the external data is loaded and the relevant parts replace the already existing cached data.

The replacement has the consequence that for each changed cell content the change is broadcasted to the listening cells, and although a resulting action is taken only if the target cell wasn't already notified, the overhead of the broadcasting mechanism was responsible for the massive slowdown. To stick with the numbers from above for simplicity:

Broadcaster Actions

A: 50000 * 35000 = 1,750,000,000
B: 50000 * 9000 = 450,000,000
C: 50000 * 9000 = 450,000,000
-------------
2,650,000,000

 

So, even if for range A after the first cell change was broadcasted all listening formula fragments were dirty, quite significant portions of the algorithm were still ran through for the remaining 1,749,999,999 notifications. Same for ranges B and C. This is where I placed the scalpel to cut things off and introduced a caching mechanism that remembers ranges formulas are listening to when broadcasted during a certain action. This gives:

Broadcaster Actions

A:     1 * 35000 =        35,000
B: 1 * 9000 = 9,000
C: 1 * 9000 = 9,000
-------------
53,000
Cache Lookups
A: 49999 *     1 =        49,999
B: 49999 * 1 = 49,999
C: 49999 * 1 = 49,999
-------------
149,997

 

Wow! Great!

tags:

Posted by Eike Rathke on 21 Jun 2007  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[8]

Comments

Kami said: Impressive... I hope you will find more point to speed up Calc. Thanks.

Posted by Kami on June 22, 2007 at 07:09 AM CEST #

OOoForum said: That will be better to attach this spreadsheet file. I could to benchmark myself

Posted by OOoForum on June 22, 2007 at 10:10 AM CEST #

Eike said: The spreadsheet is a customer's document of confidential nature, I can't publish it. Otherwise I would had stated an issue's number as well that had a document attached. However, given the description of the ranges in the blog entry such a document is quite easily constructed.

Posted by Eike on June 25, 2007 at 12:57 PM CEST #

Nick said: I am not sure what you mean by external references and does it have special impact here. Does it have the performance improvement to the sheets which refer the cells of same sheet also. What I understand is, any changes to cell will broadcast the notification to all cells, which were not necessary and changes to this algorithm is done to cut short the notification to affected cells only. Please clarify on the same.

Posted by Nick on June 25, 2007 at 01:48 PM CEST #

Eike said: References to external documents look like =SUM('file:///home/you/mydoc.ods'#$Sheet1.A1:A50000) for example. See also the online help under "references;to cells in another document". And no, a cell change did not lead to all cells being notified, just those affected by the change. However, in the example given, an update of the internal range that caches 'file:///home/you/mydoc.ods'#$Sheet1.A1:A50000 lead to separate notifications of the referencing formula cell for each of the modified cells in that range. That broadcast action now is remembered and not repeated for subsequent changes in the very same range until the update finishes.

Posted by Eike on June 26, 2007 at 10:24 AM CEST #

Jack Fuller said: Is this change now implemented in OO 2.2.1?

Posted by Jack Fuller on June 29, 2007 at 08:34 PM CEST #

Eike said: No, it is implemented for OOo2.3

Posted by Eike on June 29, 2007 at 09:00 PM CEST #

Joe said: If you're looking for more places to speed up Calc - try CSV import. Excel takes ~3 seconds to open a 3MB file, and Calc takes what feels like 30 seconds.

Posted by Joe on July 01, 2007 at 07:14 PM CEST #

Post a Comment:
Comments are closed for this entry.
« OpenOffice.org 3.0... | Main | New: OOo-Dev 2.3... » GullFOSS