GullFOSS
OpenOffice.org Engineering at Sun
 
Friday, 09 May 2008
Updating XSLT based filters – enhancing XHTML export
Jogi Sievers
Svante Schubert (co-lead of the XML project) has finished an update for the XHTML export filter. On 14 April the CWS reached "ready for QA" status. In the meantime, Svante and Mathias Bauer also discussed updating the XSLT-based filters via an extension [issue 88270], so that they become independent of the update cycles of the whole office application. So the iTeam decided to expand the scope of the CWS DEV300/xsltfilter09, and I am really happy that we did.

One week later, on 21 April, the CWS was ready for QA again. With the integration of CWS xsltfilter09 into the DEV300 master workspace, OpenOffice.org will get some really great enhancements for the XHTML export and for the other XSLT-based filters (e.g. DocBook XML).

Great work Mathias and Svante!

Posted by Jogi Sievers on 09 May 2008

Wednesday, 12 Sep 2007
Rebasing your CWS, fast!
Jens-Heiner Rechtien

Some months ago we started to look for a replacement for CVS, our current SCM (Software Configuration Management) tool. Progress has been slow on this matter, for a number of reasons. In short, we have not yet decided which tool will be the new one. I'll present the current state of the discussion in more detail next week at OOoCon 2007 in Barcelona.

Meanwhile we continue to use CVS, with its perceived problems, mainly the lack of performance. To recapitulate, CVS is slow because it stores changes on a file-by-file basis. To find the difference between one OOo milestone and another, CVS has to check all files involved, and given the number of files in the OOo source this is a time-consuming task. The simple act of tagging a milestone changes every live archive inside the OOo source code repository. Tag or branch operations over the whole OOo source code tree literally move gigabytes of data on the CVS server.

The CWS system copes with the CVS performance characteristics by restricting the tag operation to just the modules the developer needs in the CWS. Depending on the modules and the load on the server, this takes from a few seconds to a few minutes. Not nice, but bearable.

A CWS which is open for more than just a few days eventually needs to be rebased to a newer milestone. And this is the point where it starts to hurt. Our rebase tool, cwsresync, retrieves the changes between the current and the new milestone and then applies a number of CVS operations to every file which has changed between the milestones. For a long-running CWS with a number of modules added, this can be many thousands of files. Since cwsresync up to now relied on the rather inflexible CVS command line client for doing the job, it had to do the CVS operations file by file. If, say, 1000 files needed to be touched during a rebase, cwsresync would start the cvs client 1000-2000 times during the "cwsresync -m" step and typically about 3000 times for the "cwsresync -c" step. Each time, the client has to open a connection to the CVS server, authenticate, bear the network lag, etc. Timings showed that every run of the command line client takes about 10s on average, summing up to more than 8 hours for the "cwsresync -c" step alone. Since 1000 files is not even a particularly large number for a rebase, OOo developers experienced cwsresync runs which took days.
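The arithmetic behind the "more than 8 hours" figure can be written down as a tiny cost model. This is only an illustrative sketch: the function name and its parameters are made up here, and the inputs are the rough averages quoted above (about 3 client runs per file for "cwsresync -c", about 10s per run).

```python
def estimated_seconds(files: int, runs_per_file: int, seconds_per_run: float = 10.0) -> float:
    """Total wall-clock time when every CVS operation starts a fresh client.

    Hypothetical helper for illustration; not part of cwsresync.
    """
    return files * runs_per_file * seconds_per_run


# "cwsresync -c" on 1000 files: about 3 client runs per file, about 10 s each.
commit_seconds = estimated_seconds(files=1000, runs_per_file=3)
commit_hours = commit_seconds / 3600

print(f"{commit_seconds:.0f} s = {commit_hours:.1f} h")  # prints "30000 s = 8.3 h"
```

The model is linear in the number of files, which is why a rebase touching several thousand files ends up taking days.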

It's very easy to see that the cost of starting the CVS client and opening a connection to the server totally dominates the time needed for a rebase. Why not do some bookkeeping about which file falls into which category (binary, added, removed, etc.) and then batch the CVS operations? Here the inflexibility of the command line client comes into play: the error handling in particular was very hard to get right. I feared the irrecoverable mangling of a CWS if someone used newer CVS clients or servers where small details of the error reporting had changed, so I dropped this approach for the initial release of cwsresync.

But with SRC680 m227 I was (finally!) able to get a long-promised and much improved cwsresync up and running which does just this. The new cwsresync is built around an old pet project of mine called PCVSLib, a native Perl implementation of a CVS client library right on top of the CVS protocol. PCVSLib took a number of ideas from the NetBeans CVS client library, which I would like to gratefully acknowledge here.

PCVSLib allows very fine-grained control over CVS operations, so cwsresync can now work with batches of operations and opens only one connection to the CVS server per module. And this certainly shows in the benchmarks!
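The bookkeeping idea (sort files into categories, then batch per module) can be sketched roughly as follows. This is not PCVSLib or cwsresync code, which is written in Perl. The file names, the category labels and the plan_batches helper are all hypothetical, and the sketch only counts connections, it does not talk to a CVS server.

```python
from collections import defaultdict

# Hypothetical bookkeeping: (module, path, category) for each file touched by a rebase.
changes = [
    ("vcl", "file_a.cxx", "added"),
    ("vcl", "file_b.cxx", "removed"),
    ("vcl", "file_c.cxx", "merged"),
    ("vcl", "file_d.cxx", "added"),
    ("sw", "file_e.cxx", "merged"),
]


def plan_batches(changes):
    """Group files by module, then by category, so each module needs one connection."""
    plan = defaultdict(lambda: defaultdict(list))
    for module, path, category in changes:
        plan[module][category].append(path)
    # Convert to plain dicts so the result is easy to inspect.
    return {module: dict(categories) for module, categories in plan.items()}


plan = plan_batches(changes)

per_file_connections = len(changes)  # old approach: at least one client start per file
batched_connections = len(plan)      # new approach: one connection per module

print(plan["vcl"]["added"])
print(per_file_connections, "vs", batched_connections)
```

With batching, the number of connections grows with the number of modules in the CWS rather than with the number of changed files.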

Example: Quite old CWS, based on SRC680 m203:

Module vcl with 229 new files, 200 removed files, 2 merged files, 339 moved tags.

Time needed for "cwsresync -m m228 vcl":

  cwsresync with PCVSLib:              1 min 14s
  cwsresync with command line client:  26 min 43s


Now, "cwsresync -m" was always the fast part. For the above example, "cwsresync -c" takes only a minute or so with the new cwsresync, against an estimated (229+200+2+339)*3*10s = 23100s for the old cwsresync, that is, more than 6h for just one module.
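For readers who want to check the numbers, here is the same calculation spelled out, together with the speedup of the measured "cwsresync -m" step. It is just a sketch of the arithmetic: the variable names are made up, and the factor of 3 client runs per item and the 10s per run are the rough estimates from above.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'min:s' timing to seconds."""
    return minutes * 60 + seconds


# Measured "cwsresync -m m228 vcl" timings from the example.
new_merge = to_seconds(1, 14)    # cwsresync with PCVSLib
old_merge = to_seconds(26, 43)   # cwsresync with command line client
merge_speedup = old_merge / new_merge

# Estimated old "cwsresync -c": about 3 client runs per touched item, about 10 s per run.
touched = 229 + 200 + 2 + 339    # new + removed + merged + moved tags
old_commit_estimate = touched * 3 * 10

print(f"-m speedup: {merge_speedup:.1f}x")
print(f"old -c estimate: {old_commit_estimate} s = {old_commit_estimate / 3600:.1f} h")
```

This prints a speedup of about 21.7x for the "-m" step and 23100 s (about 6.4 h) for the old "-c" step.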

Other changes to cwsresync include better detection of certain "alert" conditions and, hopefully, more readable output.
 

Posted by Jens-Heiner Rechtien on 12 Sep 2007
