At JavaOne earlier this month, there was a lot of talk about the Java SE Real-Time System 2.0, so I thought I would brag a little about the real-time performance of Solaris. That's right: you may think of Solaris for running huge databases or web services, but have you thought about Solaris for real-time? To make it a little more challenging, how does Solaris real-time work on a heavily loaded system? With a little help from a Solaris engineer, I set off to find out. If you want a more complete explanation, check out Jim's blog; he is the one to credit for this work.

First, we grabbed a Sun Fire server with four Opteron processors and loaded the latest Solaris engineering build. Then, just for fun, we started the nightly Solaris kernel build, which starts a few thousand compile jobs. To make sure we were keeping the CPUs busy, we also started a background job that throws 240 processes on the system. These background jobs ran at priority 0, so they only run when CPU cycles are available. This was good enough to generate a load average of over 3000 active jobs:

# uptime
8:05pm up 4 day(s), 16:09, 2 users, load average: 3689.85, 3590.41, 3248.41

Then, to test the real-time performance, we used a simple program we call latstat that measures interrupt latency, dispatch latency, kernel-user context switch latency, and total latency. The real-time task runs on CPU 0 in a single-CPU processor set with interrupts driven away, and waits for a high-level timer interrupt. It takes measurements of its progress from waking up until it returns to user land.

Here is the full output from the latstat tool. The best page to look at is the last one, which simply shows the delta between the time the interrupt was to be delivered and the time the application was running in user space. The graph covers over 300 million points. The mean is 4 microseconds and the worst case is 32 microseconds. Roughly 300 of the 300 million points fall in the range of 13 to 32 microseconds. While there is no hard definition of "real-time", operating systems that average more than 1000 microseconds of response time are generally not considered real-time, and "hard real-time" embedded operating systems like VxWorks generally get you into the 10 microsecond range.
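To make the "delta" idea concrete, here is a rough user-level sketch in Python. It is not the latstat code, and it cannot see interrupt or dispatch latency separately; it only records how late a sleeping process wakes up relative to the deadline it asked for, then reports the same kind of summary quoted above (mean, worst case, and a count of samples in the tail). The function names, the 1 ms period, and the sample count are assumptions made up for this illustration; the 13 microsecond tail threshold is borrowed from the figures above.

```python
import time
from statistics import mean


def measure_wakeup_latency_us(samples=200, period_s=0.001):
    """Record how late each timed wakeup is, in microseconds.

    Hypothetical helper for illustration only: it measures total
    user-visible lateness, not the per-stage breakdown latstat reports.
    """
    latencies = []
    for _ in range(samples):
        deadline = time.monotonic() + period_s  # when we want to be running
        time.sleep(period_s)                    # block until the timer fires
        woke = time.monotonic()                 # when we actually ran again
        latencies.append((woke - deadline) * 1e6)
    return latencies


def summarize(latencies_us, tail_threshold_us=13.0):
    """Summarize samples: count, mean, worst case, and size of the tail."""
    return {
        "samples": len(latencies_us),
        "mean_us": mean(latencies_us),
        "worst_us": max(latencies_us),
        # Samples at or above the threshold (13 us in the article's graph).
        "tail_count": sum(1 for x in latencies_us if x >= tail_threshold_us),
    }


if __name__ == "__main__":
    print(summarize(measure_wakeup_latency_us()))
```

Run from an ordinary timesharing process, a sketch like this will typically report much larger numbers than the ones above. Closing that gap is what the real-time scheduling class, locked memory, and a dedicated processor set are for.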

I'd challenge any commercial Linux distribution to beat those results; you can even use an unloaded system! I'd be happy to provide a copy of our latstat code if anyone is up to the challenge.

Comments:

How much memory is on the system? Is latstat statically linked? Is its address space locked before real execution begins? I'm just curious, given the load you put on the system and the results you got, what specific steps were followed in addition to running on a dedicated CPU/processor set? We run applications requiring soft real-time response on Solaris today and I'm always on the lookout for best practices on each version of Solaris. We currently run on Solaris 10. By the way, I'd be interested in the latstat source code; however, I won't be using it on Linux. --Marc

Posted by Marc Rocas on May 29, 2007 at 07:22 PM PDT #

Marc, you can find details on the test at http://blogs.sun.com/thejel/entry/realtime_benchmark_details but, quickly: the system had ~4GB of physical memory. The user-level program was not statically linked, but we did take steps to mitigate the impact of dynamic linking (though I did not use LD_BIND_NOW). I did lock down memory and made sure it was faulted in before use. I used the real-time scheduling class with a fairly low priority (but high enough) and the SCHED_FIFO scheduling behavior. I used a single-CPU processor set and drove interrupts away. The rest was just the way Solaris is designed to handle real-time processing and load.

Posted by James Litchfield on May 29, 2007 at 11:53 PM PDT #

[Trackback] Many people think of Linux as real-time unix. Or about hard real time operating systems like vxworks. Not so many people know: The plain Solaris is a really capable real-time operating system, too. James Litchfield made some interesting latency tests t...

Posted by c0t0d0s0.org on May 30, 2007 at 05:20 AM PDT #

This blog copyright 2009 by marchamilton