OOo my Threading (2)
I've already talked a bit about the status quo of threading in OOo, and listed some largely useful stuff that others have been doing to remedy the problems that shared-state multi-threading poses.
Why not go further? If there's a way to know that a function is pure, or that an object is thread-safe, then it would be nice to have automatic parallelization employed. The underlying issue that needs to be controlled here is races, i.e. unserialized access to shared state must be prohibited. So, either a function is pure, i.e. does not depend on shared state at all, or the called subsystem protects itself against concurrent modification (this is most easily achieved by the environment concept of the UNO threading framework: the UNO runtime implicitly serializes external access to thread-unsafe apartments).
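To make the pure-function case concrete, here is a minimal sketch. It assumes C++11's std::async as the execution mechanism (nothing OOo or UNO provides), and sumOfSquares and parallelSum are made-up names. Since sumOfSquares takes its argument by value and touches no shared state, two calls cannot race, so they can safely overlap:

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical pure function: the result depends only on the argument.
// The vector is taken by value, so no other thread can alias it.
inline long sumOfSquares(std::vector<int> values)
{
    return std::accumulate(values.begin(), values.end(), 0L,
                           [](long acc, int v) { return acc + static_cast<long>(v) * v; });
}

// Evaluates two independent pure calls concurrently and combines the results.
inline long parallelSum(const std::vector<int>& a, const std::vector<int>& b)
{
    // std::launch::async requests a separate thread; std::async copies 'a'.
    std::future<long> first = std::async(std::launch::async, sumOfSquares, a);

    // Meanwhile, the second call runs on the calling thread.
    const long second = sumOfSquares(b);

    // get() blocks until the asynchronous call has finished.
    return first.get() + second;
}
```

Calling parallelSum with two vectors gives the same result as calling sumOfSquares twice in sequence and adding up; only the scheduling differs. On some toolchains this needs a thread library at link time (e.g. -pthread).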
Although C++ provides no portable way to express those concepts on the language level, for pure functions there's a way of using a limited subset of lambda expressions that inhibits access to shared state on the syntax level. And it's perfectly possible to mark objects (or even subsets of a class' methods) as thread-safe. One straightforward way to do this is to use specializations of UNO interface references, i.e. ones that denote thread-safe components.
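A rough sketch of what such a marker could look like in plain C++. Nothing here is actual UNO API: ThreadSafeTag, is_thread_safe, ThreadSafeRef, Counter and PlainBuffer are all invented for illustration, and a real UNO reference specialization would look different. The idea is only that the reference type itself documents (and enforces at compile time) that the target was tagged thread-safe:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <type_traits>
#include <utility>

// Hypothetical marker: a component opts in by deriving from this tag.
struct ThreadSafeTag {};

// Trait a scheduler could query to decide whether calls may overlap.
template <typename T>
struct is_thread_safe : std::is_base_of<ThreadSafeTag, T> {};

// Hypothetical reference type that only accepts tagged components.
template <typename T>
class ThreadSafeRef
{
    static_assert(is_thread_safe<T>::value,
                  "component is not tagged as thread-safe");
public:
    explicit ThreadSafeRef(std::shared_ptr<T> p) : mpComponent(std::move(p)) {}
    T* operator->() const { return mpComponent.get(); }
private:
    std::shared_ptr<T> mpComponent;
};

// Example component that guards its own state with a mutex.
class Counter : public ThreadSafeTag
{
public:
    void increment() { std::lock_guard<std::mutex> aGuard(maMutex); ++mnValue; }
    int  value() const { std::lock_guard<std::mutex> aGuard(maMutex); return mnValue; }
private:
    mutable std::mutex maMutex;
    int                mnValue = 0;
};

// Untagged component: ThreadSafeRef<PlainBuffer> would not compile.
class PlainBuffer
{
public:
    int size = 0;
};
```

Note that the tag is a promise by the component's author, not something the compiler can verify; the static_assert merely keeps untagged types out of ThreadSafeRef.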
Given all of this, we can form statements that contain:
So, in fact, a runtime engine could reason about which subexpressions
can be executed concurrently, and which must be serialized. If you
treat method calls as what they are, i.e. as implicitly carrying a
this pointer argument, a possible data flow graph might
look like this:
new object1                     new object2
|                               |
+->object1::methodA             +->object2::methodA
|                               |
+------------------------------>+->object2::methodB(object1)
|
v
object1::methodC

new object3
|
+->object3::methodA
That is, the this pointer is carried along as a target
for modifications, and as soon as two methods have access to the same
object, they need to be called sequentially. This does not apply to
UNO interface calls or objects that are tagged as thread-safe, of course.
To be specific, a forest of data flow trees can be generated, which
defines a partial ordering over the subexpressions. If neither
exp1<exp2 nor exp1>exp2 can be deduced
from this ordering, those two subexpressions can be executed in
parallel. Really basic stuff, that compiler optimizers do as well -
only that plain C/C++ doesn't provide that many clues to safely
parallelize. From the example above, it is obvious that
object3::methodA can be executed concurrently with all other
methods, that object1::methodC must be executed strictly
after object2::methodA, and that
object1::methodA and object2::methodA can
also be executed concurrently.
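The partial-ordering check can be sketched in a few lines. This is only an illustration under assumptions: the Call enum, DataFlowGraph and buildExampleGraph are invented names, and the edges encode one reading of the diagram above (in particular, that object1::methodC waits for object2::methodB because both touch object1). Two calls may run in parallel exactly when neither is reachable from the other:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical identifiers for the calls in the diagram above.
enum Call : std::size_t {
    NewObject1, NewObject2, NewObject3,
    Obj1MethodA, Obj2MethodA, Obj2MethodB, Obj1MethodC, Obj3MethodA,
    CallCount
};

class DataFlowGraph
{
public:
    DataFlowGraph() : maSuccessors(CallCount) {}

    // Records that 'before' must complete before 'after' may start.
    void addEdge(Call before, Call after) { maSuccessors[before].push_back(after); }

    // True if 'to' is reachable from 'from', i.e. from < to in the partial order.
    bool precedes(Call from, Call to) const
    {
        std::vector<bool> aSeen(CallCount, false);
        std::vector<Call> aStack{from};
        while (!aStack.empty()) {
            const Call aCurrent = aStack.back();
            aStack.pop_back();
            for (Call aNext : maSuccessors[aCurrent]) {
                if (aNext == to)
                    return true;
                if (!aSeen[aNext]) {
                    aSeen[aNext] = true;
                    aStack.push_back(aNext);
                }
            }
        }
        return false;
    }

    // Neither a < b nor b < a can be deduced: the calls are unordered.
    bool canRunConcurrently(Call a, Call b) const
    {
        return a != b && !precedes(a, b) && !precedes(b, a);
    }

private:
    std::vector<std::vector<Call>> maSuccessors;
};

// Builds the example forest; the edge set is an assumed reading of the diagram.
inline DataFlowGraph buildExampleGraph()
{
    DataFlowGraph g;
    g.addEdge(NewObject1,  Obj1MethodA);
    g.addEdge(Obj1MethodA, Obj2MethodB);  // object1 is passed into methodB
    g.addEdge(NewObject2,  Obj2MethodA);
    g.addEdge(Obj2MethodA, Obj2MethodB);
    g.addEdge(Obj2MethodB, Obj1MethodC);  // methodC touches object1 again
    g.addEdge(NewObject3,  Obj3MethodA);
    return g;
}
```

With this graph, precedes(Obj2MethodA, Obj1MethodC) holds via methodB, while object3::methodA is unordered relative to everything in the first tree. A real engine would of course derive the edges from the expression itself instead of hard-coding them.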
Okay, this is largely crack-smoking. But there is something to be made of it. Stay tuned.
Posted at 09:50AM Nov 04, 2006 by thorsten in OpenOffice.org