Wednesday January 10, 2007

Multi-tiers

So the tiered jvm is working pretty well, but I want to see startup performance more on par with the client vm, so I have to make some changes to try and help it out. In the client vm the situation looks like this:

In the server vm the situation looks like:

In the tiered vm as it existed a few weeks ago the situation was:

The initial theory was that client compiler code was going to be fast enough that collecting full profile data wasn't going to be that bad. I've done some benchmarking with Alacrity and it generally isn't too bad (10% or so), although there are a couple of benchmarks where the impact is pretty severe.

So in the server vm the interpreter runs in two modes. The idea is to do the same for the client compiler and in effect add more tiers. Instead of a 3 speed transmission we go to 4 speeds to try and smooth out the startup performance differences I was seeing. Actually I've added an additional gear. The tiered system can actually create code that looks like:

Now the truth is that, as I'm currently running the system, I'm only using tiers > 1. The expectation is to use tier1 for special circumstances. For instance, if someone finds that a method miscompiles at tier 4 (server compiler), we have a way to allow them to specify that the method only reaches tier1. No sense in penalizing them further by collecting profile data we can't even use. Similarly, there are methods that for various reasons (resource limits typically) the server compiler can't compile. In that situation we're better off using tier1 instead of being trapped in the interpreter like the current server vm is.
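To make the tier1 idea a bit more concrete, here is a toy sketch of that kind of per-method cap. This is not HotSpot code: the class TierCapSketch, its method names, and the use of plain strings to identify methods are all made up for illustration. The only things taken from the description above are that tier 4 is the server compiler and that tier1 is the fallback for methods that are excluded by the user or that the server compiler can't handle.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in HotSpot.
public class TierCapSketch {
    static final int TIER1 = 1;        // compiled code that collects no profile data
    static final int TIER4_SERVER = 4; // server compiler

    // Methods a user has asked to keep away from the server compiler.
    private final Set<String> userCapped = new HashSet<>();
    // Methods the server compiler gave up on (resource limits, for example).
    private final Set<String> serverBailedOut = new HashSet<>();

    void capAtTier1(String method) { userCapped.add(method); }

    void recordServerBailout(String method) { serverBailedOut.add(method); }

    // Highest tier this method is allowed to reach.
    int topTierFor(String method) {
        if (userCapped.contains(method) || serverBailedOut.contains(method)) {
            // No point collecting profile data the server compiler will never use.
            return TIER1;
        }
        return TIER4_SERVER;
    }

    public static void main(String[] args) {
        TierCapSketch policy = new TierCapSketch();
        policy.capAtTier1("Foo.miscompiled");
        policy.recordServerBailout("Bar.tooBig");
        System.out.println(policy.topTierFor("Foo.miscompiled"));
        System.out.println(policy.topTierFor("Bar.tooBig"));
        System.out.println(policy.topTierFor("Baz.ordinary"));
    }
}
```

Either way the method still gets compiled code, rather than staying in the interpreter.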

So I added all these tiers and I tried it out. It was pretty disappointing: worse than the previous tiered system. It was clear that I was not able to control what compiles were happening and when. After some analysis it was clear what was going on. For various reasons the counters that are used for triggering compiles and for profiling information are partially shared. This was always a compromise at best in the other vms, but in tiered I found that I just couldn't reason about how changes to triggering would influence my profile results. I really wanted triggering data to be separate from profiling data.
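Here is a toy illustration of why sharing hurts. It is not how the HotSpot counters actually work (the real ones are only partially shared, and the details differ). The class names, the threshold, and the reset-on-trigger rule are all assumptions made up for the example. The point is just that when one counter does both jobs, any triggering decision that touches the counter also changes what the profile says.

```java
// Hypothetical illustration only; not the HotSpot counter implementation.
public class CounterSplitSketch {

    // One counter doing double duty: compile trigger and profile count.
    static final class SharedCounter {
        private final int threshold;
        private int count;

        SharedCounter(int threshold) { this.threshold = threshold; }

        // Returns true when a compile should be requested.
        boolean invoke() {
            count++;
            if (count >= threshold) {
                count = 0; // a triggering decision, but it also wipes the profile
                return true;
            }
            return false;
        }

        int profileCount() { return count; }
    }

    // Triggering and profiling kept in separate counters.
    static final class SplitCounters {
        private final int threshold;
        private int trigger;
        private int profile;

        SplitCounters(int threshold) { this.threshold = threshold; }

        boolean invoke() {
            profile++; // profile data is never touched by the trigger policy
            trigger++;
            if (trigger >= threshold) {
                trigger = 0;
                return true;
            }
            return false;
        }

        int profileCount() { return profile; }
    }

    public static void main(String[] args) {
        SharedCounter shared = new SharedCounter(1000);
        SplitCounters split = new SplitCounters(1000);
        for (int i = 0; i < 2500; i++) {
            shared.invoke();
            split.invoke();
        }
        System.out.println("shared profile count: " + shared.profileCount());
        System.out.println("split profile count:  " + split.profileCount());
    }
}
```

After 2500 calls the shared version reports a profile count of 500 while the split version reports 2500. Change the threshold and the shared profile changes with it, which is exactly the kind of coupling that makes tuning hard to reason about.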

This was kind of scary. I wasn't the first hotspot developer to see that this overloading made for hard to predict changes in behavior. The current system has been tuned over a long time to get the kind of performance we want. No one wanted to mess with it for fear of spending an inordinate amount of time getting the performance where we wanted with a saner system. Fortunately I'm not that smart so I'm changing it. :-) Actually I don't think I had much choice, but it was nice to know that others found the counters not entirely rational.

So I split the triggering mechanism out completely. Pretty much immediately I got back to where I was with the initial tiered system, and some benchmarks looked a little better. However, I had one benchmark that used to run in 5.5 to 6.0 seconds that was now taking more than 400 seconds! What was up with that?

[ I know the answer to that question and I'll answer it next time. I've left clues as to what the problem is. See if you can figure it out. No prizes though... ]

Jan 10 2007, 03:22:20 PM EST