Slight changes...
I hereby happily announce that starting this Friday, I'll be part of the great OpenOffice.org team at Novell. Being fully home-assigned and saving the ~2 hours commute every day, I can now hopefully push my hidden gsl agenda even more.
Posted at 09:13AM Jan 28, 2008 by thorsten in OpenOffice.org |
Don't trust your tools, if they are software...
While hunting down broken images in the pdf import, I came across a very weird behaviour, that was seemingly caused by plain C stream IO: every once in a while, streams that were explicitely opnened in binary mode converted 0x0A into 0x0D. One-on-one substitution, not the usual 0x0A to 0x0A 0x0D lineend conversion, notably. I wasted about a day trying to track that down, suspecting everything from crazy filesystem flags, to symbol hijacking during link time (i.e. somebody sneaking in own fwrite/putc/whatever into the binary). Laboriously ruled out one after the other.
Finally, I did what I should have done in the first place: I used a different hex editor. And lo and behold, the trusty midnight commander showed all the files with correct 0x0A. Caught in the act and proven guilty: emacs' hexl-mode.
Well, kind of. It turned out to be a rather unlucky combination of things going wrong: emacs with its buffer encoding scheme, which tries to be smart and detect the actual encoding - if there's a 0x0A in the file before a 0x0D, emacs assumes raw-text-unix and leaves line ends alone. If there's a 0x0D in the file before any 0x0A, emancs assumes raw-text-mac, and converts all assumed lineends to 0x0D. And since hexl-mode seems to work on top of an existing buffer (and never loads the file directly from disk), this resulted in the described mess.
So, to save yourself some trouble: don't trust your tools. And make sure to have files you want to work on with hexl-mode match something in file-coding-system-alist's content, that enforces binary reads (elc, png, jpeg, ar, gz etc. for my setup).
Posted at 06:00PM Nov 28, 2007 by thorsten in OpenOffice.org |
GSoC 2007 miscellanea
Being honoured to mentor Shane Matthews on his GSoC 2007 project Impress OpenGL rendered transitions, I today received the mentor gift - thank you Google for the tshirt!
On related news, the resulting CWS got recently merged to HEAD, so those of the adventurous kind can now build this as an optional feature: just issue "build ENABLE_OPENGL=TRUE" in the slideshow module to be blessed with an ogltrans shared lib. And thanks to Rene, after CWS configure22 is integrated, one can even give "--enable-opengl" on the configure line. After that, register the ogltrans lib into OOo via unopkg, and use this presentation file to see what others are missing.
This could even be shipped as a separate extension after 2.4 is out...
Posted at 11:05AM Nov 23, 2007 by thorsten in OpenOffice.org |
A Dead Parrot Learns to Fly
After an impromptu dead parrot sketch at the OOoCon ("the beamer works" - "no, it's dead" - "bah, it's not dead, it just needs a reboot"... (you might look forward to the still-missing video)), I had a rather rough ride with the proverbial last 10 percent of the browser functionality - I'm still not 100% happy with the representation in OOo's Calc, and the xcs schema parser leaves something to desire. But surely the thing is now ready for config hackers to play with it - grab your copy here:
09e22fc6d8d9f9bfd3dc7cf838ce6590 ooconfig.oxt
(for those of you who don't know what I'm talking about: OoConfig is a Python extension, providing a tool quite similar in spirit to Mozilla's about:config page - a way of tweaking the complete (even hidden) set of configuration items. See also the wiki for further information)
Posted at 10:47PM Oct 15, 2007 by thorsten in OpenOffice.org | Comments[5]
A hacker's way of app-bundling
Just found this. Pretty cool way of packaging an application to be completely relocatable (though without desktop integration, barring further hacks). Something for OOo Portable?
Posted at 12:02PM Aug 08, 2007 by thorsten in OpenOffice.org |
Germany leads in installed photovoltaic power generation
Via Russel Coker of Debian fame, the Boston Globe reports that 55 percent of the world's installed photovoltaic power is in: Germany. Yay! With my solar roof contributing, if even only the tiniest amount...
Posted at 09:02AM Aug 01, 2007 by thorsten in Personal |
Intel opensourced their Thread Building Blocks
Seems that Intel has set their TBB lib free - which solves a few of the basic issues outlined here (I've commented a bit in the context of c++).
Posted at 10:31AM Jul 24, 2007 by thorsten in OpenOffice.org |
OOo my Threading (3)
Okay, what does all of this really buy us, right now? Lets assume we want to speed up Calc's way of parsing and calculating cell values. Given the following dependency tree (cell 1 refers to the result of cell4, cell 6 to the result of cell2, and so forth):
parse_cell_4 parse_cell_5
| |
parse_cell_1 parse_cell_2 parse_cell_3
| | |
-------------- parse_cell_6 -----------------
The partial ordering of cell evaluation equivalent to this tree is as follows:
parse_cell_4<parse_cell_1 parse_cell_5<parse_cell_3 parse_cell_1<parse_cell_6 parse_cell_2<parse_cell_6 parse_cell_3<parse_cell_6
Wanting to employ what we've talked about earlier, to parallelize this calculation, one would need a static expression (i.e. one fixed at compile time):
as_func(
as_func(
parse,
as_func(
parse,
4),
1),
as_func(
parse,
2),
as_func(
parse,
as_func(
parse,
5),
3))
(with as_func denoting that the argument should be
evaluated lazily, and parse being a Calc cell parsing unary function,
expecting the cell number as its sole parameter).
Now, having the formula dependencies fixed in the code isn't of much value for a spreadsheet, thus we need to handle this a bit more dynamically. Futures and Actions in principle provide the functionality that we need here, only that there's one slight catch: the pool of threads processing the Actions or Futures might actually be too small to have all cells resolved, that in turn are referenced from other ones (with circular cell references being the extreme example. But see below). This is because forcing a Future or Action to evaluate will block the thread requesting that value, which will eventually lead to starvation, unless there's at least one more thread active to process cell values other Actions are waiting for.
Of course, one could remedy this by using N threads when
dealing with N cells. But that would get out of hand
pretty quickly. Alternatively, the cell parsing function can be split
into two parts: the first generates a parse tree, thus extracting the
referenced cells (depending on the heap allocation overhead, one could
also throw away the parser outcome, except for the cell
references. But OTOH, with a decent thread-local allocator, the full
parsing could happen concurrently. YMMV). Given the list of references
for each cell, one again gets the partial ordering over the value
calculation:
vector< vector< int > > intermediate;
parallel_transform( cells.begin(), cell.end(),
intermediate.begin(),
&parse_step_1 );
For each cell, this step yields a vector of preconditions (other cells, that need to be calculated before). Pushing the actual cell value calculation functions into a job queue, and handing it the dependency graph (represented by the individual cell's references) generated above, we arrive at a fully dynamic version of the concurrent cell parsing example:
int i=0;
job_queue queue;
vector< double > results(intermediate.size(),0.0);
transform( intermediate.begin(),
intermediate.end(),
results.begin(),
bind( &job_queue::add_job,
ref(queue),
bind( &parse_step_2,
ref(results),
ref(cells),
var(i)++ ),
_1 ));
queue.run();
This looks weird, but peeking at the prototypes of the involved functions might help clear things up:
/** adds a job functor
@param functor
Functor to call when job is to be executed
@param prerequisites
Vector of indices into the job queue, that must be processed
strictly before this job. Permitted to use not-yet-existing
indices.
*/
template job_queue::add_job( Func functor,
vector const& prerequisites );
and
/** calculates cell value
@param io_results
In/out result vector. Receives resulting cell value, and is used
to read referenced cell's input values.
@param content
Cell content
@param cell_index
Index of cell to process
*/
parse_step_2( vector& io_results,
string const& content,
int cell_num );
This job queue can then decide, either globally or via policies, what to do in various situations:
At any rate, fine-tuning this to various hardware, operating systems and deployment settings is much easier than for manual thread creations. Plus, given a few (falsifiable) attributes of the functions called, it's also much safer.
Posted at 01:59AM Nov 13, 2006 by thorsten in OpenOffice.org | Comments[2]
OOo my Threading (2)
I've already talked a bit about status quo of threading in OOo, and listed some largely useful stuff that others have been doing to remedy the problems shared-state multi-threading poses.
Why not going further? If there's a way to know that a function is pure, or that an object is thread-safe, then it would be nice to have automatic parallelization employed. The underlying issue that needs to be controlled here are races - prohibiting unserialized access to shared state. So, either a function is pure, i.e. does not depend on shared state at all, or the called subsystem takes care of itself against concurrent modification (this is most easily achieved by the environment concept of the UNO threading framework: the UNO runtime implicitely serializes external access to thread-unsafe appartements).
Although C++ provides no portable ways to express those concepts on a language level, for pure functions, there's a way of using a limited subset of lambda expressions, that inhibit access to shared state on the syntax level. And it's perfectly possible to mark objects (or even subsets of a class' methods) to be thread-safe. One straight-forward way to do this are specializations of UNO interface references, i.e. ones that denote thread-safe components.
Given all of this, we can form statements that contain:
So, in fact, a runtime engine could reason about which subexpressions
can be executed concurrently, and which must be serialized. If you
treat method calls as what they are, e.g. implicitely carrying a
this pointer argument, a possible data flow graph might
look like this:
new object1 new object2
| |
+->object1::methodA +->object2::methodA
| |
+------------------------------>+->object2::methodB(object1)
|
v
object1::methodC
new object3
|
+->object3::methodA
That is, the this pointer is carried along as a target
for modifications, and as soon as two methods have access to the same
object, they need to be called sequentially. This does not apply for
UNO interface calls or objects that are tagged as thread-safe, of course. To be specific, a forest of data flow trees can be generated, which
defines a partial ordering over the subexpressions. If neither
exp1<exp2 nor exp1>exp2 can be deduced
from this ordering, those two subexpressions can be executed in
parallel. Really basic stuff, that compiler optimizers do, as well -
only that plain C/C++ doesn't provide that many clues to safely
parallelize. From the example above, it is obvious that
object3::methodA can be executed concurrently to all other
methods, that object1::methodC must be execute strictly
after object2::methodA, and that
object1::methodA and object2::methodA can
also be executed concurrently.
Okay, this is largely crack-smoking. But there is something to be made of it. Stay tuned.
Posted at 09:50AM Nov 04, 2006 by thorsten in OpenOffice.org |
OOo my Threading...
This starts where Stephan's OOoCon talk left off - by taking his problem statement and maybe a gist of the proposed concepts into OOo's core (which happens to be C++ instead of Haskell. Which some folks might bemoan, but still, ~90% of our code consists of that language).
To recap, the free lunch is over for us application software developers, i.e. either OOo gets seriously multi-threaded soon, or we stop adding features and only optimize for speed, or OOo will get significantly slower within the next few years. But as you, the gurus and yours faithfully have noticed by now, becoming concurrent with osl::MutexGuard, osl::Thread & friends is tedious, error-prone, and neither race nor dead-lock free for a project of OOo's size. With the expressive power of a general programming language, it is even theoretically impossible to verify against these properties (for a sufficiently large project. OOo probably satisfies this constraint).
So, Stephan already showed what's possible in other languages. To boot, the trick is to raise the level of abstraction above digging-in-the-dirt, hand-made, explicit, and ugly locking or thread fiddling. Whereever possible, parallelism should be detected either implicitely, or at least denoted declaratively (instead of manually telling the machine what to do). John Ousterhout has a nice writeup about that. It's basically another instance of the lack of transparency and ease of use that is so ingrained with day-to-day tasks in C++. For stuff like object lifetime management, serialization and so on, boost provides transparent solutions. For large-scale concurrency management, the UNO threading framework does. For local, statement-level parallelization, there ain't much - or is there?
In fact, there actually are several attempts to extend C++ in that direction. To name a few:
From a coarse inspection, Boost.Act seems appealing (TBB really isn't an option for OOo, no?) - boost is high-quality anyway, and I think that the Action concept is a richer one than Futures.
Basically, the advanced approaches listed above employ declarative semantics, i.e. one no longer needs to create threads by hand, assign a thread function, and manually serialize access to shared state. Instead, one simply states that a loop body shall be executed in parallel:
parallel_transform( src.begin(), src.end(), dest.begin(), bind(_1 * 42) );
This multiplies each element in src with 42, and stores
the result in dest. How many threads are created, and
where they are taken from is completely up to the implementation - a
typical threading runtime on a machine with N cores will
hold a static pool of N threads, each of them receiving a
chunk of the parallel_transform workload from above. In
the old way of doing multi-threading, this behaviour would potentially
be spread over the whole code base (making it a cross-cutting
concern, just like explicit method guarding is today - with thousands
of osl::MutexGuard(m_aMutex) invocations all over the
place).
One of the crucial points of doing stuff like the above in a safe way
is to avoid hidden access to shared state. Access to the container can
be serialized by parallel_transform, but if the called
functor modifies shared objects concurrently, all bets are off. Thus,
the listed frameworks usually have some safe-guards in place, most of
them using a way of explicitely annotating free-standing functions and
classes to be thread-safe (note that this tagging of thread-safeness
can easily be achieved by means of the UNO
threading framework as well - Kay already has different
uno::Reference types on his list, to distinguish between
thread-safe and thread-unsafe components during compile-time). Another
important concept noteworthy here are pure functions,
i.e. functions that, given the same input parameters, always return
the same result, no matter where or when called (related, but not
exactly the same are referentially
transparent functions, which could also be exploited by an
optimizing parallel execution runtime). It's always safe to call pure
functions on disjunct parameters concurrently with no extra
synchronization effort. A safe (because syntactically limited) way of
defining pure functions is via boost's bind or
lambda library. There, expressions that modify shareable state could
simply be banned from the available syntax (they aren't - but it's
possible to do that).
Need an example? Good, take the standard one, that so naturally asks
for parallelization, that it's usually been sped up via SIMD in the past (you
might have noticed that those parallel_something
algorithms principally apply the same instructions (the functor to
run) in parallel to all container elements - that's SIMD), namely operating on image
pixel:
// color-convert an image from RGB to CMYK color space
parallel_transform(src_img.begin(),
src_img.end(),
dst_img.begin(),
bind(&rgb2cmyk,_1,cref(conversion_parms));
This is the archetype of inherent parallelism, having a huge amount of data (image pixel), that all need to have the same operation applied to. Disregarding memory bandwidth for the moment, this snippet scales linearly with the number of available cores.
Note the conversion_parms parameter to the bound
function, which can pose problems if rgb2cmyk silently
const_casts it to something mutable. This is the dark
corner of C/C++'s non-enforcable constness, and therefore, a minimal
amount of developer discipline is necessary: if objects or functions
employ magic behind their public interfaces (on-demand creation of
data, buffering, caching, copy-on-write), this still needs to be made
thread-safe explicitely. Fortunately, for each of the aforementioned
items, there's a thread-safe helper available in OOo (use rtl::Static
for late-initialization of static data, boost::shared_ptr
for lazy initialization of class members, caching and buffering, and
cow_wrapper
for copy-on-write).
To be continued.
Posted at 10:02PM Oct 25, 2006 by thorsten in OpenOffice.org |
First impression: Draw/Impress on cairocanvas
After much prep work (kudos to you, Armin!), things start to look like it's working:
(if you inspect this carefully, you'll see the difference between the preview image on the left side, which is still rendered the old way, and Draw's edit pane content to the right, which now gets displayed via cairocanvas).
Gory details, live demo, and plans how to move on at this year's OOoCon. If you can't wait: OOoWiki has a brief hacker's guide.
Posted at 04:00PM Sep 01, 2006 by thorsten in OpenOffice.org |
The graphics brethren: basegfx and basebmp
Started in late 2002, OOo's module basegfx is now a pretty mature collection of data structures and algorithms for graphics processing. The set of exported headers sport abstract data types such as:
- points, vectors, boxes, ranges, bezier curves
- matrices
- polygons and collections of polygons ("poly-polygons")
- 1D, 2D, 3D
- integer and double-precision floating point
In addition, especially the polygon ADTs are shipped with a set of associated algorithms (provided as free functions, instead of member functions):
- a poly-polygon clipper (comparable to what gpc is doing)
- a poly-polygon triangulator (convert a polygon to triangles, which can be fed to a GPU)
- a function to generate stroked paths from a given polygon
- plus a lot of basic, but very useful stuff (like svg im/export, scan conversion, arc and circle generation, etc.)
What's missing, though, is everything related to bitmaps. That's where basebmp comes into play, started a few weeks ago and set to provide for pixel processing, what basegfx provides for the symbolic graphics primitives.
I've initially started this work by using AGG, but quickly found out that adding pixel formats other than integer true color ones is pretty hard (Because the code is largely undocumented. Because the assumption that pixel values can be modified by shift left and right operations, and are PODs (i.e. non-class types) is all over the place. And because basically, a way to somehow get a hold to the famous extra level of indirection is missing. Which, as usual, would have solved this software engineering problem). Instead of starting from scratch, we've based this on vigra, a framework for generic image processing, which provides an elegant way to add 'special processing' before an actual pixel gets assigned a value.
In order to really get to the point here, I first need to sketch the way generic bitmap processing works: it's the idea of separating data access and the actual bitmap manipulation algorithm, using C++ templates. The concept is borrowed from the STL - bitmap pixel can be accessed by (two-dimensional) iterators, which in a very basic sense behave like a pointer:
while( i!=end )
*i++ = rand();
If i is a bitmap iterator, the pixel will be set to
random values with this algorithm. The type of i can be chosen freely,
and so can the actual pixel format of the bitmap. If one wants to
e.g. add palette lookup to this algorithm (or scale the random values
into the appropriate color range) without changing the algorithm's code, either extremely ugly hacks
involving proxy objects are necessary (because operator*() is expected
to return an lvalue, which gets assigned the new value directly), or
one needs a direct way to 'see' both the iterator and the new value
prior to assignment:
while( i!=end )
acc.set(i++, rand());
The acc variable is a pixel accessor in vigra lingo,
which takes the iterator and the new value, and accesses the
corresponding pixel. Now, adding palette lookup, scaling, blending,
mask operations and much more is as easy as providing a different
accessor to the algorithms is. Plus, it's completely orthogonal to
both the iterator implementation, and the algorithm used. That means,
you define a set of iterators (for packed-pixel images, for r8g8b8
images, for r5g6r5 images), a set of accessors (for true color pixel,
for palette lookup, for blitting through a binary mask), and a set of
algorithms. And you combine all those freely, i.e. you have the
cartesian product of variations to choose from.
Posted at 02:00AM Jul 17, 2006 by thorsten in OpenOffice.org |
Solar roof: update
Guess it's about time to post some update to my photovoltaic roof thingy. The generator is now running for 6 months, without any downtime, and produced ~700 kWh during that period. Right now, that's about 10 percent below the estimated yield (based on a simulation, which in turn is based on the mean from 20 years of local weather recordings), and has saved the athmosphere around 88 kg of CO2 exhaust (well, not really. First of all, the generator has to earn the energy it has used up for producing and mounting it. This might take some years).
Currently, with the May providing a cloudless sky one day after the other, we're reaping 20 to 25 kWh per day (could be even more, if it wasn't so damn hot for the season - thermal effects in the panels now cancel even more of the electrons).
All in all: feels really good.
Posted at 11:28PM May 06, 2006 by thorsten in Personal |
Source code grokking, 3rd installment
Following my habit to write something about source code analysis every few weeks, here comes the next installment under that topic: Initially set off by KaiB's pain with OOo's build slowness, I remembered some rules from John Lakos' 'Large-Scale C++ Software Design' tome - and to first off assess the state of affairs regarding include dependencies, found this:
- DEPS, a tool to extract dependencies (e.g. includes) from source code
- cinclude2dot, a tool plotting a project's include dependencies as a GraphViz dot graph
- Bloof, an infrastructure for processing of version control data, and
- StatCVS, which extracts statistical HTML reports from a CVS repository. (thanks Pavel for pointing me to it).
Posted at 01:07AM Apr 24, 2006 by thorsten in OpenOffice.org |
cairocanvas is integrated!
Thanks to Radek's tireless work, CWS cairocanvas is now finally integrated (into the 2.0.3 code line). Took about eight months to get there, and now provides a nicely anti-aliased slideshow experience, with even the alpha effects running at a decent speed.
The downside: you need a fairly recent X server and cairo 1.0.x on your system.
Posted at 12:37PM Feb 28, 2006 by thorsten in OpenOffice.org | Comments[1]