GullFOSS
OpenOffice.org Engineering at Sun
 
 
 
 
More Flickr photos tagged with openoffice

Today's Page Hits: 1729

Locations of visitors to this page
Main | Next page »
Tuesday, 21 Jul 2009
Saving sheets separately: Preliminary results
Niklas Nebel

Some months ago, I did an experiment with creating XML elements only for changed cells. That approach has two drawbacks:

So now I tried a similar approach, based on sheets instead of cells. It consists of the following:

So far, I have done the "automatic styles" part only for cell styles. With this version, I did some performance measurements using the old George Ou example, which contains text, numbers and dates on 16 sheets. The time for saving depends on how many sheets were modified:

Time for Saving

The red line is the measured time in the CWS, based on milestone m49. Later milestones contain separate optimizations for saving (see here, here and here in the wiki), m52 is shown in yellow. These optimizations will still help for the sheets that aren't copied, so after rebasing the expected result should be something like the green line. Compared to m52, that's half the time if only one sheet was modified.

Handling of other style types (especially text and shape) is still missing, but shouldn't affect the time for simple-cell-content-only documents. Results for other types of documents will follow. Implementation is ongoing in CWS "calcsheetdata", and there's also a wiki page. Stay tuned.

tags:

Posted by Niklas Nebel on 21 Jul 2009  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[3]

Friday, 08 May 2009
Handling sheets separately, part 1: Row Heights
Niklas Nebel

One of the possible approaches to improving Calc load performance is to separate the processing of individual sheets. The basic idea is that often a file with several sheets is loaded, but only some of the sheets are actually used for editing. A part of processing that can well be separated is the updating of automatic row heights, at least if no shapes have to be adjusted. With the implementation in CWS dr70, row height updates after loading are now limited to sheets with shapes, and the active sheet from the view settings. Other sheets are updated when the row heights are needed (activating the sheet, or printing). There is a progress bar in this case, so the user sees the reason for the delay.

screen shot

The resulting improvement (in time before the active sheet is shown and ready for editing) depends a lot on the nature of the file. The chart below shows relative CPU time for a single-sheet file, the old George Ou example, a file with only date cells on many sheets, and the file from issue 85178.

performance chart

There is also a wiki page on the subject. Possible next steps are:

tags:

Posted by Niklas Nebel on 08 May 2009  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this

Wednesday, 29 Apr 2009
Going Faster with Less Fuel
Eike Rathke

Recently, as task of the OpenOffice.org Performance Project, I was working on refactoring Calc spreadsheet area broadcasters that are used to notify formula cells listening to ranges such as A1:A256 whenever a cell's value is changed in that range. The results are overwhelming, I didn't even expect such a big improvement.

When loading test case documents (admittedly heavily constructed for this purpose and rarely to be encountered in the wild) with lots of formulas referring different areas in a certain constellation, load times went down to 55%-63% of the original implementation, accompanied by memory consumption down to 75%. Recalculating a change that involves 65536 respectively 131072 different ranges formula cells are listening to went down to 49% respectively 29%. Additionally, time to close the larger document went down from ~10s to ~1s. Savings may be less in real life, because almost no document stresses the implementation as the test cases do, the improvement is there though.

The change is in CWS calcperf04, targeted to OOo3.2 and currently ready for QA.

For details, numbers and links to test case documents please visit the Refactoring Area Broadcasters wiki page.

tags:

Posted by Eike Rathke on 29 Apr 2009  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[2]

Monday, 23 Mar 2009
Solver for Nonlinear Programming
Andreas Schneider

Just released (well, actually already last Friday) has been an extension for Calc which adds two new solver engines.

These engines are based on Evolutionary Algorithms and allow solving nonlinear programming models (like curve fitting as shown in the example below). To use them (instead of the OpenOffice.org Linear Solver that already ships with OOo) you simply have to select them in the options page of the Solver Dialog. For linear problems it is still advised to use the OOo Linear Solver since it will be considerably faster for these problems. Also if you want to know which of the solvers to use, you should consult the documentation of this new extension which gives a lot insight and also explains all the options and parameters that are available. Finally it also includes an example on how to use these solvers from within a macro.

So for everyone who always wanted to solve some nonlinear problem: the time has come to use OOo for it. ;-)

The solver in action.

tags:

Posted by Andreas Schneider on 23 Mar 2009  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[7]

Friday, 27 Feb 2009
An experiment with incremental saving in Calc
Niklas Nebel

Just to have some numbers about what is possible, I did an experiment with incremental saving – generating the XML elements only for changed cells.

Result 1 Result 2 Result 3

You can read about the details in the wiki.

tags:

Posted by Niklas Nebel on 27 Feb 2009  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[2]

Friday, 28 Nov 2008
Working on Calc Performance Bottlenecks
Eike Rathke

Users sometimes rightly complain about the time it takes to load and calculate complex spreadsheet documents. Recently I worked on profiling documents to identify performance bottlenecks. The first outcome is summarized on the specific bottlenecks wiki page. You will like to hear that the Zaske case how I call it, where Excel needed 1.2s and Calc 24s to update results, is solved and Calc now, in a CWS, is on a level with Excel.

The next weeks, except for getting some CWSs ready for QA and pending work done for the ODF formula subcommittee, I'll mainly focus on changing implementation and do further profiling. I'm quite sure there are several opportunities for improvement left..

tags:

Posted by Eike Rathke on 28 Nov 2008  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[2]

Monday, 17 Nov 2008
Some Calc features for 3.1 from RedFlag 2000
Niklas Nebel

It was a great pleasure to meet RedFlag 2000's Calc developers at OOoCon in Beijing. Of course the successful collaboration continues, resulting in some cool new features:

The formula features are in CWS "odff05", the other features in CWS "calc47", both scheduled to go into 3.1.

tags:

Posted by Niklas Nebel on 17 Nov 2008  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[16]

Friday, 25 Jul 2008
Implementation of the OpenDocument Format Formula specification in OpenOffice.org 3.0
Eike Rathke

[back-blogging, the act of blogging about things that happened in the past]

Most of my time last year and this year went into the OASIS OpenDocument Format Formula specification (aka ODFF aka OpenFormula) and its implementation for the OOoCalc spreadsheet application.

For OOo3.0 the highlights are:

Differentiation between the table:formula attribute's oooc: namespace prefix and the new of: namespace prefix of the OpenDocument >= v1.2 file format.

Function names stored in an OpenDocument file are defined by the OASIS OpenDocument formula subcommittee, and especially names of functions that are implemented as Add-Ins and part of the basic function set differ from those that are stored and expected by earlier releases of OOo that wrote the programmatic name, such as com.sun.star.sheet.addin.Analysis.getDuration that now is simply DURATION.

As that increased the set of formula languages to be supported to 3 (native UI visible, old file format and new ODFF names) plus different cell reference conventions (A1 vs. R1C1 vs. ODFF's bracketed [.A1] schema), parts of the formula compiler's tokenizer and stringizer were rewritten to use different grammar sets.

It now is even possible to use English formulas in a non-English UI, but since that is an undocumented, experimental and unsupported feature, the advanced curious adventurous user could go to Environment Variables (wiki) and look for OOO_CALC_USE_ENGLISH_FORMULAS ;-)

The : colon range operator is fully implemented.

You're already used to the range operator in cell range expressions such as A1:B2, specifying a fixed cell range reference. New is that now the colon is a binary operator taking one reference argument on each side, forming a range encompassing the area of its arguments. Sounds complicated? The area of A1:B2 is of course top left cell A1 to bottom right cell B2. Here are some examples:

If a defined name NAMEDCELL is defined to cell reference C3, A1:NAMEDCELL returns the range reference A1:C3, similar if OTHERCELL is defined to cell reference D4, OTHERCELL:NAMEDCELL returns the range reference C3:D4. Note that the order of arguments does not matter, the range is always justified from top left to bottom right.

This works also for ranges as arguments, e.g. A1:B2:C3 returns a range reference of A1:C3, respectively NAMEDCELL:NAMEDRANGE returns A1:C3 if NAMEDCELL is defined to C3 and NAMEDRANGE is defined to A1:B2.

Functions that return a cell reference can be used as an argument to the range operator, for example if A1 contains the string "C3", B2:INDIRECT(A1) returns the range B2:C3.

See also the feature announcement mail.

The ~ tilde reference concatenation operator.

The reference concatenation operator or range list operator concatenates cell references. This is also called a union operator, but it is not a union set in the mathematical sense, duplicates are not eliminated. ODFF defines that to be the tilde character, but also the somewhat awkward syntax of another spreadsheet application is supported, where a range list is identified by an extra pair of surrounding parentheses and ranges are separated by the parameter separator. Hence the expressions A1:B2~C3:D4 and (A1:B2;C3:D4) are identical, though the latter is transformed to read (A1:B2~C3:D4) after input.

The AREAS() and INDEX() functions evaluate the number of ranges if a range list was passed.

Functions that expect a range reference parameter and where the order of elements does not matter in evaluation now accept a range list as argument as well. For details see the feature announcement mail.

Arbitrary characters in sheet names.

Sheet names now can consist of almost all characters, except those that one other spreadsheet application does not allow, for interoperability reasons. However, reading a sheet name containing those characters from an OpenDocument file is possible, just using them when giving a sheet a new name is not. For details and when you have to enclose the name in single quotes for cell reference expressions see the feature announcement mail.

Kudos to Caolan and Kohei for providing the patches.

ADDRESS() and INDIRECT() support the additional A1/R1C1 parameter.

As specified in ODFF for interoperability, the ADDRESS() and INDIRECT() functions now support the parameter that switches between A1 and R1C1 address style. For details see the feature announcement mail.

ISERROR(undefinedname) results in TRUE.

The ISERROR() function now can be used to check for undefined names, missing Add-In functions and the like. See feature announcement mail.

Enhancements to various spreadsheet functions.

Many implementation details changed for compliance with ODFF and interoperability. Places to check are the schedule of finished implementations in the wiki and a list of related issues and the list of feature announcement mails.

Kudos to Kohei and Lvyue for patch contribution.

Of course there is more to come for OOo3.1 and OOo3.2, stay tuned.

tags:

Posted by Eike Rathke on 25 Jul 2008  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[3]

Wednesday, 02 Apr 2008
Calc Usability Improvements in OpenOffice.org 2.4
Frank Loehmann
We founded a team to improve the usability of Calc last year. First usability improvements made by this team are visible in OOo 2.3. The new OOo 2.4 release contains even more usability improvements for Calc. This article list those improvements and links to specifications, further articles and videos:

1. Convert text to columns

With this feature CSV data inside cells can be transformed into columns directly. It uses the already existing Text import dialog and transforms CSV data inside cells into multiple columns. Example: A1 holds '1,2,3,4". After using this feature, 1, 2, 3 and 4 are split over cells.

Start this feature with the menu Data>Text to Columns.

Article Issue: 4040 Spec

2. Insertion Mode for Cells

Cells in spreadsheet can be moved, copied, or linked Select one or more columns or rows, or a range of cells, and hold down Alt while moving the selection with the mouse. When the mouse button is released, other cells/columns/rows are moved left or down, to insert the selection. Before this feature, target cells always were overwritten.

Article Issue: 7180 Spec

3. Enhanced Data Input

Enter key returns to the column where the input started, one row below Entering data row wise is made easier. The Enter key works like a carriage return-line feed on a type writer and allows to return to the column where the input started, but just a row below the current row.

Video Issue: 15546 Spec

4. Enhanced Formula Input

Formula input: "+" and "-" can also be used to start Most professional spreadsheet users do use the number pad for inputting data to save time. So it makes sense to allow to enter a formula not just by entering a "=" or hitting the Function button in the Formula tool bar. Therefore "+" and "-" can also be used to start entering a formula. This improves the usability and the speed when creating formulas in a spreadsheet. Furthermore this can be used as a little calculator.

Issue: 20496 Spec

5. Individual zoom level per sheet

Each sheet in a spreadsheet document can now have its own zoom level.

Issue: 24372 Spec

6. Improved AutoFilter Behavior

Choices are clearer grouped and based on result of filtering in other columns The AutoFilter function allows to filter data in a spreadsheet by one criteria per column. This feature allows to run a quick analysis of data. Two things are new: there is a clearer grouping of static filter choices. And if in one column a filter is set, the filters in the other columns only show the options that are available in the visible rows.

Issue: 27745 Spec

7. Improved DataPilot

Manual Sorting It's now possible to rearrange items in a DataPilot field via Drag&Drop or Copy&Paste. The default sorting for new tables is "Ascending", it is changed to "Manual" when items are moved.

Issue: 32307 Spec

Double-click in Datapilot cell provides calculation data of that cell When a cell within the data field is double-clicked, it inserts a new sheet containing a subset of rows from the original data source that constitutes the result data displayed in that cell. For instance, when the data field function is selected to be SUM, then the number that is shown in the data field cell must be identical to the sum of all the data field values in the constituent rows displayed in the inserted sheet.

Issue: 57030 Spec

8. Improved Print Dialog

The Print dialog in Calc makes it easier to choose what part of the spreadsheet has to be printed. The option 'print only selected sheets' is now on that dialog.

Furthermore the 'Sort' option is defaulted to on. This makes it easier to print multiple copies of a document.

Issue: 82071 Spec

9. Improved Print Preview

PageUp and PageDown keys scroll continuously through the print preview. Ctrl + PgUp/PgDn scroll to top of next/previous page.

Issue: 7269

tags:

Posted by Frank Loehmann on 02 Apr 2008  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this  |  Comments[4]

Tuesday, 04 Mar 2008
R4Calc - Integration of R into Calc
Niklas Nebel

With this year's Google Summer of Code on the horizon, it's time to mention an extension that started as a Summer of Code project last year: R4Calc, the integration of R into Calc, done by Wojciech Gryc, now also linked from the extension repository. It can put results into cells, or return a graph image. It contains a set of pre-defined tools, and it can easily be extended using simple text files. For example, this definition

 __GUI:
DIALOG <-Dialog(100,100,115,55,Symbol Bar Plot)
LBL1 <-Label(5,5,50,10,Chart Data:)
ARR1 <-ArrayField(60,5,50,10)
LBL2 <-Label(5,20,50,10,Chart Title:)
TXT1 <-TextField(60,20,50,10)
RUN <-RunButton(80,35,30,15)
__CALL:
library(plotrix)
PLOT: symbolbarplot({$ARR1}, main="{$TXT1}")

 is all you need to show this dialog

 Dialog screenshot

and create the output shown below.

Result screenshot 

tags:

Posted by Niklas Nebel on 04 Mar 2008  |  PermaLink |  Bookmark to Delicious To Delicious |  Digg this Digg this

Main | Next page » GullFOSS