
Thursday August 10, 2006
Sun JDK 5.0_08 Is Now Available!
JDK 5.0_08 is now publicly available on java.sun.com! Another fine day for Sun Java Performance. This is our highest performing and most reliable release to date. We have demonstrated winning performance across Sun's server offerings, from x64 systems to CoolThreads servers and all the way up to the Sun Fire E25K.
Winning performance on the Sun Blade X8400, beating BEA JRockit on a comparable system! (Sun HotSpot result, BEA JRockit result)
Winning performance on the Sun Fire T1000 and Sun Fire T2000 (T1000 result, T2000 result)
Winning performance on the Sun Fire E25K (benchmark result)
SPECjbb2005 Sun Fire T1000 (1 chip, 8 cores) 60,323 SPECjbb2005 bops, 15,081 SPECjbb2005 bops/JVM; Sun Fire T2000 (1 chip, 8 cores) 74,365 SPECjbb2005 bops, 18,591 SPECjbb2005 bops/JVM; Sun Fire E25K (72-way, 72 chips, 144 cores) 1,387,437 SPECjbb2005 bops, 19,270 SPECjbb2005 bops/JVM; Sun Blade X8400 (8 cores, 4 chip, Solaris 10, Sun HotSpot 5.0_08) 121,228 SPECjbb2005 bops, 30,307 SPECjbb2005 bops/JVM; Fabric7 Q80 (8 cores, 4 chip, Microsoft Windows Server 2003, JRockit 5.0 P26.4.0) . SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 06/19/06 on www.spec.org.

Wednesday June 21, 2006
Sun Java and the Sun Fire E25K Raise the Bar on SPECjbb2005
The Sun Fire E25K and Sun J2SE 5.0_08 team up to demonstrate leadership on large servers running SPECjbb2005, increasing performance by 19.1% over our previous submission on the same hardware. Not bad for 6 months of performance work!
The 72-way Sun Fire E25K score is 1,387,437 SPECjbb2005 bops, 19,270 SPECjbb2005 bops/JVM. That is 11% faster than the 128-way Fujitsu PRIMEPOWER 2500 and more than five times IBM's fastest SPECjbb2005 result to date. The BMSeer once again beats me to the punch talking about SPECjbb2005 results; he/she (who is BMSeer anyway?) has a great piece talking about this result.
Required Disclosure Statement
SPECjbb2005 Sun Fire E25K (72-way, 72 chips, 144 cores) 1,387,437 SPECjbb2005 bops, 19,270 SPECjbb2005 bops/JVM, Fujitsu PRIMEPOWER 2500 (128 chips, 128 cores) 1,251,024 SPECjbb2005 bops, 39,095 SPECjbb2005 bops/JVM, IBM eServer p5 570 (8 chips, 16 cores, 16-way) 244,361 SPECjbb2005 bops, 30,545 SPECjbb2005 bops/JVM. SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 06/19/06 on www.spec.org.

Monday June 19, 2006
Sun Java vs. C#
Here's my latest round of platform performance comparisons using SciMark. This time I compare Java to C#, and once again Java performance is looking quite good. Thanks to Tony Zhang, another colleague of mine on the performance team, who ran the initial performance comparison a few months back and provided me the environment to re-run the tests with our latest JVMs.
The system under test was a 4 CPU Intel Xeon MP server (4x2.78 GHz, 8 cores, 3.87 GB memory) running Microsoft Windows 2003 Server and .NET 2.0. The CLR version under test according to SciMark was 2.0.50727.42. We used the SciMark 2.0 C# port found here. The HotSpot server compiler (-server) was used for both J2SE 5.0_08 and Java SE 6 b87. SciMark was run with the large data set (-large).

Also, I found the chart below in an interesting writeup showing similar performance comparisons with older versions of the JVM. I particularly like HotSpot's performance lead over JRockit.

Friday June 09, 2006
Sun Java is faster than C/C++ (Round 2)
I received a few comments on my previous blog entry saying the results were bogus since I used an old compiler. I quickly found another test system running Suse SLES 9 U2 with gcc 3.3.3 and repeated the test. If I get around to installing the latest Visual Studio I'll repeat the test there as well. The JVM versions are different as I wanted to quickly post the results.
Guess what, the results are a lot better! I ran this several times and it's quite repeatable. I appreciate comments, so please let me know your thoughts, especially if there are issues with the choice of gcc 3.3.3.
The system under test was a 2 x 3.0 GHz Intel Xeon MP system (4-core) running Suse SLES 9 U2 and gcc 3.3.3. The C code was compiled with full optimization as shown by the Makefile in the SciMark source package. This time no tuning parameters were used for either 5.0_08 or 6.0 b83.
Here's some output from /proc/cpuinfo:
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) MP CPU 3.00GHz
For background, here's the skinny on SciMark2. SciMark2 is a set of simple numerical kernels and its performance is directly related to the performance and quality of the generated code. The tests are single threaded and have little to no garbage collection overhead. In short, a great set of applications to compare statically compiled C code and dynamically compiled Java.
This time Java is 35% faster than C.

Here's a breakdown of the subtests. C is only ahead on Sparse MatMult by a small margin.

Anyone interested in seeing how the other JVM vendors look? Can JRockit or IBM beat C?
Sun Java is faster than C/C++
This is quite cool. Andy Johnson, a colleague of mine on the Java performance team, did a few performance tests comparing Java to native C. SciMark2 was used for the performance comparison. The system under test was a 2 GHz Pentium white box running Windows 2000 and using Microsoft Visual C/C++ 6. The C code was compiled with full optimization. The server compiler was used for both J2SE 5.0_07 and Java SE 6.
Scimark2 is a set of simple numerical kernels and its performance is directly related to the performance and quality of the generated code. The tests are single threaded and have little to no garbage collection overhead. In short, a great set of applications to compare statically compiled C code and dynamically compiled Java.
The chart below is quite revealing. Both charts are normalized to J2SE 5.0_07. Native C is only 3% faster than 5.0_07, and Java SE 6 pulls ahead of native C by 2%.

The following chart breaks the comparison down further. Remember SciMark2 is a composite benchmark and the overall score is a simple mean of the subtest MFLOPS scores. With that, Java is ahead in some cases and behind in others. Actually, Java is ahead in all cases except Sparse MatMult. Looks like we have something to look at for additional optimization.
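Since the composite is just an arithmetic mean, one weak kernel can be masked by strong ones, which is why the per-kernel breakdown matters. Here is a minimal sketch of the calculation. The class name is hypothetical and the MFLOPS values are placeholders, not numbers from these runs.

```java
// Hypothetical helper for illustration; not part of the SciMark2 package.
final class CompositeScore {

    // Arithmetic mean of the per-kernel MFLOPS scores.
    static double composite(double... kernelMflops) {
        if (kernelMflops.length == 0) {
            throw new IllegalArgumentException("need at least one kernel score");
        }
        double sum = 0.0;
        for (double score : kernelMflops) {
            sum += score;
        }
        return sum / kernelMflops.length;
    }

    public static void main(String[] args) {
        // Placeholder values standing in for FFT, SOR, Monte Carlo,
        // Sparse MatMult and LU. These are NOT measured results.
        double[] scores = {100.0, 200.0, 300.0, 400.0, 500.0};
        System.out.printf("Composite: %.1f MFLOPS%n", composite(scores));
    }
}
```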

Friday June 02, 2006
Java Performance Continues to Accelerate on Sun CoolThreads Technology
The performance of Java on Sun CoolThreads servers continues to be impressive. Our latest round of improvements has increased performance on SPECjbb2005 by 17% on the Sun Fire T1000 and T2000. If you thought the competitive positioning of these systems was impressive before, take a look at them now. The charts below represent the competitive landscape for the Sun CoolThreads servers and are by no means meant to be a complete comparison of all systems in the classes described below. If there are particular discrepancies that are annoying, please let me know. For more detailed information on the Sun Fire T1000 and T2000 and comparisons running competitive benchmarks, check out BMSeer's blog.
The first chart shows the competitive landscape for 1 RU servers. The Sun Fire T1000 shines compared to other systems in this space. The Sun Fire X4100 (powered by AMD Opteron CPUs) looks rather good as well.

The second chart shows the competitive landscape for 2 RU and 4 RU servers. The Sun Fire T2000 shows impressive performance against the competition in this space as well.

Now this is where the Sun Fire T1000 and Sun Fire T2000 truly excel. The first power performance graph shows a comparison based on performance per watt using the SPECjbb2005 bops metric. The data presented is limited to what I've gathered using the Sun Fire CoolThreads systems and what has been gathered on http://www.sun.com/coolthreads.
Here's another look at power performance using the SWaP metric. The SWaP metric is similar to performance / Watt, but includes system footprint as a part of the equation. The Sun Fire T1000 number is impressive. The light bulb next to my workbench in my basement uses more power than this server.

For those individuals who prefer a spreadsheet to charts, here is the same information as shown above.

Finally, this chart shows the performance difference between J2SE 5.0_06 and J2SE 5.0_08 on the same hardware, demonstrating a 17% increase in performance on both the Sun Fire T1000 and Sun Fire T2000. If we can improve performance by 17% in 6 months, wait until you see what Java SE 6 ("Mustang") can do.

Required Disclosure Statement:
SPECjbb2005 Sun Fire T1000 (1 chip, 8 cores) 51,528 SPECjbb2005 bops, 12,882 SPECjbb2005 bops/JVM submitted for review; SPECjbb2005 Sun Fire T2000 (1 chip, 8 cores) 74,365 SPECjbb2005 bops, 18,591 SPECjbb2005 bops/JVM submitted for review; Sun Fire X4100 (2 chips, 2 cores) 38,090 SPECjbb2005 bops, 19,045 SPECjbb2005 bops/JVM submitted for review; IBM eServer p5 550 (2 chips, 4 cores) 61,789 SPECjbb2005 bops, 61,789 SPECjbb2005 bops/JVM; IBM x346 (2 chips, 4 cores) 39,585 SPECjbb2005 bops, 39,585 SPECjbb2005 bops/JVM; IBM eServer p5 520 (1 chip, 2 cores) 32,820 SPECjbb2005 bops, 32,820 SPECjbb2005 bops/JVM; IBM eServer p5 510 (1 chip, 2 cores) 36,039 SPECjbb2005 bops, 36,039 SPECjbb2005 bops/JVM; Fujitsu Siemens RX220 (2 chips, 2 cores) 61,155 SPECjbb2005 bops, 30,578 SPECjbb2005 bops/JVM; Dell PE SC1425 (2 chips, 2 cores) 24,208 SPECjbb2005 bops, 24,208 SPECjbb2005 bops/JVM; Dell PE 850 (1 chip, 2 cores) 31,138 SPECjbb2005 bops, 31,138 SPECjbb2005 bops/JVM; Dell PE 2950 (2 chips, 4 cores) 64,288 SPECjbb2005 bops, 64,288 SPECjbb2005 bops/JVM. SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 6/02/06 on www.spec.org

Monday May 22, 2006
Sun Fire T2000 Blows Cold Air!!
This year at JavaOne we had a demo at the performance pod demonstrating Java SE performance and scalability. We had a Sun Fire T2000 with a 1.0 GHz UltraSPARC T1 processor and 8 GB RAM running Sun J2SE 5.0_06, J2SE 5.0_08, and Java SE 6.
Brian Doherty did the setup this year (thanks a lot Brian!). He spent the entire day on Monday fighting the networking issues on the JavaOne pavilion floor but was eventually able to get the demo working (but we had to buy our own USB to serial kit to do it). It was also quite cold in the building, and Brian didn't bring his jacket because of the 80 degree weather that day in San Francisco. So, just like any resourceful engineer working in a lab, Brian decided to warm his hands with the fan exhaust in the back of the Sun Fire T2000. Much to his surprise, the T2000 was blowing cold air!
Over the next few days on the show floor we put the system to the test. We ran SPECjbb2005 every day for 10 hours straight with the CPU fully consumed at 100%. Guess what? It still blew cold air. This was absolutely amazing, especially since my little laptop is about to burn my legs as I type this...
I find this incredible. At risk of being a bit annoying, I asked nearly everyone who stopped by our booth to put their hands by the fans and feel the air. I wasn't the only one amazed; many people wanted to see the CPU stats to be sure the system was running full tilt.
Very cool (literally).
Sun Java Performance: Here we come again
I love performance work. The sweet taste of knowing that your product is the fastest is like no other. Perhaps it is because I have a competitive personality, but beating the competition is a lot of fun. And you know what? Active competition between vendors on public Java benchmarks benefits customers. So without further ado, I'd like to announce our latest round of world record Java competitive benchmark results.
Sun J2SE 5.0_08, powered by the ripping fast Sun HotSpot JVM, has new world records running SPECjbb2005, improving our previous scores on the exact same hardware by a whopping 17%, and publishing the improved score in less than 6 months. See what I mean by sweet?
The BMSeer has a great piece on the new results, check it out here.
Be sure to check out the very popular press release here.
To top it off, performance is not Sun Java Software's highest priority. I'm sure you're well aware that performance optimization is my highest priority, but really it's not the top focus of the organization as a whole. Our primary foci are reliability and compatibility (but performance and scalability are not that far down the list). We would pass up a 20% performance gain at the drop of a hat if it imposed any reliability risk. I do mean any risk; as a performance guy I've butted heads with this ideology many times in the past. But you know what, in the end I agree, because that's what customers need.
Reliability is always first.
Brian Doherty, an esteemed colleague of mine, has often said, “The performance of a crashing JVM is zero”, and that's dead on. A close second is compatibility, but that's an easy one as it speaks to the core of what defines Java technology. I'm proud to say Sun has taken this to heart: we support more hardware and OS combinations than any other vendor.
Any JVM vendor can claim they are the “World's fastest JVM”. Competitive benchmarking is a lot of fun and is an opportunity to promote software and hardware performance. What's important is that your application is as fast as you need it to be, and it's so reliable that you don't have to think about it.

Friday March 10, 2006
Java Compatibility Call to Arms
Compatibility between Java implementations is critical to the success of the platform. It's the responsibility of the JRE vendor to ensure that any Java application will run. Yes, any Java application. After all, compatibility is a key ingredient of what makes Java. "Write Once Run Anywhere", right?
Apparently this isn't always the case. Here's an example of a compatibility issue identified on the Java.net Glassfish Project (https://glassfish.dev.java.net/servlets/ReadMsg?list=dev&msgNo=761). There are always bugs in software, and some of those bugs can break compatibility. It is of utmost importance that issues such as this are addressed in a timely manner.
This is where you come in. When testing Java software, whether it be new development, a purchase evaluation, or your tried and true back office application, please do the following. Run your application with your JVM of choice, but also test it against other JVMs running on the same platform. That's right, if you're running Sun's JVM, also test BEA JRockit and IBM JDK. Multiple Java implementations are available on Windows, Linux, and now Solaris SPARC. If any of the implementations show incorrect behavior or dare I say don't run at all, I implore you to send a note to the implementation's support channels and if possible file a bug. None of the Java vendors out there can possibly test enough Java applications, and in many ways we're relying on the users to let us know if something's broken.
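When you do send such a note, it helps to say exactly which implementation and version misbehaved. Below is a small sketch that prints the standard system properties identifying the running JVM. The class name and output format are illustrative only, while the property keys (java.vendor, java.vm.name, java.vm.version, os.name, os.arch) are standard Java system properties.

```java
// Illustrative helper: prints which JVM implementation is running so the
// details can be pasted into a bug report or forum post.
final class JvmIdentity {

    // Formats the identifying fields into a single line.
    static String describe(String vendor, String vmName, String vmVersion, String os) {
        return vendor + " | " + vmName + " " + vmVersion + " | " + os;
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name") + " " + System.getProperty("os.arch");
        System.out.println(describe(
                System.getProperty("java.vendor"),
                System.getProperty("java.vm.name"),
                System.getProperty("java.vm.version"),
                os));
    }
}
```

Running the same class under each JVM you test gives a one-line record to attach to the results.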
In the end, any Java application should run on any Java implementation. Hands down. No excuses. If you run into problems, have questions about Java performance, or identify compatibility issues running Sun's JVMs, in particular Java SE 6 (https://mustang.dev.java.net), please post a note on the java.net performance forum or feel free to send me a comment here. I would love to hear your compatibility successes, along with the issues seen with our competitors' JVMs.

We're very serious about performance, compatibility, and reliability of the Java platform. If a vendor is not doing well in this regard I would like to know about it so I can take steps to ensure the compatibility guarantees of that implementation.
Java SE Tuning Tip: Large Pages on Windows and Linux
Enabling large page support on supported operating environments can give a significant boost to performance. This is especially true for applications with large datasets or running with large heap sizes. Below is a summary of how to enable large pages on Solaris, Linux, and Windows. The text is largely from the HotSpot VM Options Page (http://java.sun.com/docs/hotspot/VMOptions.html), but I've had a lot of questions about this and thought it merited highlighting the information here. Stay tuned for a revamped HotSpot VM Options Page coming your way in the next few weeks.
Beginning with Java SE 5.0 there is a cross-platform flag for requesting large memory pages: -XX:+UseLargePages (on by default for Solaris, off by default for Windows and Linux). The goal of large page support is to optimize processor Translation-Lookaside Buffers.
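To confirm from inside a running application that the flag was actually passed, one option is to inspect the JVM's input arguments through the standard java.lang.management API. This is only a sketch, and the class name is hypothetical. It detects an explicit -XX:+UseLargePages on the command line; it does not tell you whether the platform default is on, or whether the OS really granted large pages.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: checks whether -XX:+UseLargePages was passed explicitly.
final class LargePageFlagCheck {

    // True only if the flag appears verbatim among the JVM arguments.
    static boolean explicitlyRequested(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UseLargePages");
    }

    public static void main(String[] args) {
        // getInputArguments() returns the options given to the JVM itself,
        // not the arguments passed to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("-XX:+UseLargePages passed explicitly: "
                + explicitlyRequested(jvmArgs));
    }
}
```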
A Translation-Lookaside Buffer (TLB) is a page translation cache that holds the most recently used virtual-to-physical address translations. The TLB is a scarce system resource. A TLB miss can be costly, as the processor must then read from the hierarchical page table, which may require multiple memory accesses. With a bigger page size, a single TLB entry can represent a larger memory range. That puts less pressure on the TLB, so memory-intensive applications may perform better.
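A rough way to see the effect is to compute the "TLB reach", the amount of memory the TLB can map at once: number of entries times page size. The sketch below assumes a 64-entry TLB purely for illustration. Real entry counts vary by processor and are not taken from this article.

```java
// Illustrative arithmetic only; the 64-entry TLB size is an assumed example.
final class TlbReach {

    // Memory covered by the TLB: one translation per entry, one page per translation.
    static long reachBytes(int tlbEntries, long pageSizeBytes) {
        return (long) tlbEntries * pageSizeBytes;
    }

    public static void main(String[] args) {
        int entries = 64;                     // assumed, not a measured value
        long smallPage = 4L * 1024;           // 4 KB page
        long largePage = 2L * 1024 * 1024;    // 2 MB page
        System.out.println("Reach with 4 KB pages: "
                + reachBytes(entries, smallPage) / 1024 + " KB");
        System.out.println("Reach with 2 MB pages: "
                + reachBytes(entries, largePage) / (1024 * 1024) + " MB");
    }
}
```

With the same number of entries, the larger page size covers 512 times as much memory, which is where the reduced miss rate comes from.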
However, please note that using large page memory can sometimes negatively affect system performance. For example, when a large amount of memory is pinned by an application, it may create a shortage of regular memory, cause excessive paging in other applications, and slow down the entire system. Also note that on a system that has been up for a long time, excessive fragmentation can make it impossible to reserve enough large page memory. When that happens, either the OS or the JVM will revert to using regular pages.
Operating system configuration changes to enable large pages:
Solaris
As of Solaris 9, which includes Multiple Page Size Support (MPSS), no additional configuration is necessary. If you're running 32-bit J2SE versions prior to J2SE 5.0 Update 5 on AMD Opteron hardware, additional tuning is necessary. Due to a bug in the HotSpot large page code, the default large page size when running the 32-bit x86 binary is 4 MB. Since 4 MB pages are not supported on Opteron, the large page request fails and the page size defaults to 8k. To get around this, explicitly set the large page size to 2 MB with the following flag:
-XX:LargePageSizeInBytes=2m
Linux
Large page support is included in the 2.6 kernel. Some vendors have backported the code to their 2.4-based releases. To check if your system can support large page memory, try the following:
# cat /proc/meminfo | grep Huge
HugePages_Total: 0
HugePages_Free: 0
Hugepagesize: 2048 kB
If the output shows the three "Huge" variables then your system can support large page memory, but it needs to be configured. If the command doesn't print out anything, then large page support is not available. To configure the system to use large page memory, one must log in as root, then:
1. Increase the SHMMAX value. It must be larger than the Java heap size. On a system with 4 GB of physical RAM (or less) the following will make all the memory shareable:
# echo 4294967295 > /proc/sys/kernel/shmmax
2. Specify the number of large pages. In the following example 3 GB of a 4 GB system are reserved for large pages (assuming a large page size of 2048k, then 3g = 3 x 1024m = 3072m = 3072 * 1024k = 3145728k, and 3145728k / 2048k = 1536):
# echo 1536 > /proc/sys/vm/nr_hugepages
Note the /proc values will reset after reboot so you may want to set them in an init script (e.g. rc.local or sysctl.conf). Also, internal testing has shown that root permissions may be necessary to get large page support on various flavors of Linux, most notably Suse SLES 9.
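The nr_hugepages arithmetic above is easy to get wrong by hand, so here is a minimal sketch that derives the page count from the Hugepagesize line of /proc/meminfo. The class and method names are hypothetical, and the parser assumes the line has the "Hugepagesize: 2048 kB" layout shown earlier.

```java
// Hypothetical helper reproducing the nr_hugepages calculation.
final class HugePageCount {

    // Parses a /proc/meminfo line such as "Hugepagesize: 2048 kB" into kB.
    static long parseHugePageKb(String meminfoLine) {
        String[] fields = meminfoLine.trim().split("\\s+");
        return Long.parseLong(fields[1]);
    }

    // Number of huge pages needed to cover reserveKb, rounded up.
    static long pagesNeeded(long reserveKb, long hugePageKb) {
        return (reserveKb + hugePageKb - 1) / hugePageKb;
    }

    public static void main(String[] args) {
        long hugePageKb = parseHugePageKb("Hugepagesize: 2048 kB");
        long reserveKb = 3L * 1024 * 1024; // 3 GB expressed in kB
        System.out.println("nr_hugepages = " + pagesNeeded(reserveKb, hugePageKb));
    }
}
```

For the 3 GB example with 2048 kB pages this yields the same 1536 used in the echo command.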
Windows
Only Windows Server 2003 supports large page memory. In order to use it, the administrator must first assign additional privilege to the user who will be running the application:
1. Select Control Panel -> Administrative Tools -> Local Security Policy
2. Select Local Policies -> User Rights Assignment
3. Double click "Lock pages in memory", add users and/or groups
4. Reboot the machine
As always, every application is different and true performance is always defined by each individual running their own application. If you run into problems or have questions about Java performance, visit the java.net performance forum or feel free to send me a comment.

Monday February 27, 2006
High Performance Java on Sun CoolThread Servers
Back in December when Sun's CoolThread Servers were announced, I wrote a similar blog entry comparing the Sun Fire T1000 and T2000 SPECjbb2005 scores to our competitors' SPECjbb2005 scores on 1U, 2U, and 4U systems. Below is updated data, along with space and power data using the SWaP metric. The Sun Fire T1000 scores are phenomenal! All were run with Sun J2SE 5.0_06 with HotSpot JVM technology. Interested in finding out for yourself? Go here to try a Sun Fire T2000 free for 60 days.
Take a look at the chart below. The Sun Fire T2000 surpasses all other competition in the 2U and 4U space.

How are these results comparable? It's simple: compare the raw throughput SPECjbb2005 bops score. One may ask: "How can you compare an 8 core / 32 thread box to a 4 core / 8 thread Power 5+?". It's easy. Chip and core counts are steadily becoming irrelevant. What really matters is how much work (throughput) a system can achieve and how much that system is going to cost to run. This includes lab space, power, and cooling costs.
Below is a system comparison using SWaP, the Space, Watts and Performance metric. The SWaP metric is defined as follows:
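For reference, Sun has described SWaP publicly as performance divided by the product of space and power: SWaP = Performance / (Space x Power). A minimal sketch of that calculation follows. The class name is hypothetical, space is assumed to be in rack units and power in watts, and the 300 W figure is a placeholder rather than a measured value. The 63,378 bops score is the Sun Fire T2000 result from the disclosure below.

```java
// Illustrative SWaP calculation; the wattage input below is an assumed example.
final class Swap {

    // SWaP = performance / (space * power).
    static double swap(double performance, double rackUnits, double watts) {
        if (rackUnits <= 0 || watts <= 0) {
            throw new IllegalArgumentException("space and power must be positive");
        }
        return performance / (rackUnits * watts);
    }

    public static void main(String[] args) {
        double bops = 63378.0;   // SPECjbb2005 bops for the Sun Fire T2000
        double rackUnits = 2.0;  // the T2000 is a 2U system
        double watts = 300.0;    // placeholder, NOT a measured power draw
        System.out.printf("SWaP: %.2f%n", swap(bops, rackUnits, watts));
    }
}
```

Because space and power sit in the denominator, a smaller or more frugal box scores higher at equal throughput.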

How about scalability? Here's a good example of how the Sun Fire T2000 and the UltraSPARC T1 processor scale from 1 to 32 threads. Each SPECjbb2005 warehouse is a new thread. Throughput steadily increases as new threads are added, peaking at 32.

Fine print SPEC disclosure:
SPECjbb2005 Sun Fire T1000 (1 chip, 8 core, 32 threads) 51,540 bops, 12,885 bops/JVM, Sun Fire T2000 (1 chip, 8 core, 32 threads) 63,378 bops, 15,845 bops/JVM, IBM eServer p5 520 (2 chips, 2 cores, 4 thread) 32,820 bops, 32,820 bops/JVM, IBM eServer p5 510 (2 chips, 2 cores, 4 thread) 32,820 bops, 32,820 bops/JVM (referenced on IBM benchmark website), AMD Tyan white box (2 chips, 4 cores, 4 thread) 44,574 bops, 44,574 bops/JVM, IBM eServer p5 550 (4 chips, 4 cores, 4 thread) 61,789 bops, 61,789 bops/JVM . SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 27, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Thursday February 23, 2006
Sun Fire E25K and J2SE 5.0_06 SPECjbb2005 World Record
The Sun Fire E25K running J2SE 5.0_06 now holds the overall world record running SPECjbb2005!
Hot off the presses, here's the new world record result: 1,164,995 SPECjbb2005 bops, 32,361 SPECjbb2005 bops/JVM. This result beats the recently announced result from Fujitsu for the PRIMEPOWER 2500 with SPARC64 V.
Once again, Sun's world class enterprise server architecture, the UltraSPARC IV+ processor, and Sun J2SE 5.0_06 with HotSpot JVM technology team up to prove world class performance and scalability with the SPECjbb2005 benchmark. Very, very impressive.
As a designer and developer of this benchmark I found it hard to envision the day where the SPECjbb2005 bops score would breach 1 million. The day is here and much sooner than I could have ever anticipated. These are exciting times for Java performance (and there are more performance optimizations coming soon!).
Stay tuned for more information on this latest world record. The BMSeer has an excellent competitive overview of this result; the price performance of the Sun Fire E25K is quite impressive compared to our competition $$ (add an extra $ for IBM). (Hey BMSeer, next time you won't beat me to the punch announcing our latest SPECjbb2005 world record!!)
Fine print SPEC disclosure:
SPECjbb2005 Sun Fire E25K (72-way, 72 chips, 144 cores) 1,164,995 SPECjbb2005 bops, 32,361 SPECjbb2005 bops/JVM submitted for review, Fujitsu PRIMEPOWER 2500 (128 chips, 128 cores) 1,157,619 SPECjbb2005 bops, 72,351 SPECjbb2005 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 23, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Wednesday February 22, 2006
Sun HotSpot J2SE 5.0_06 Crushes BEA JRockit Running SPECjbb2005
(The following is a resubmission of a blog entry from February 10, 2006 with a few comments and edits. Changes are noted below.)
Looks like our friends from BEA JRockit are at it again. Take a look at the following blog entry from BEA.
http://dev2dev.bea.com/blog/hstahl/archive/2006/01/new_specjbb2000_1.html
First, SPECjbb2000 is a 5 year old retired benchmark. Its time has passed and SPECjbb2005 is its replacement. BEA loves to talk about SPECjbb2000; they obviously spent a lot of time optimizing for SPECjbb2000. The problem with JRockit is that it is optimized just for SPECjbb2000. If time was spent on optimizations for the real world they'd be able to maintain their competitive position with SPECjbb2005, right? The same applies for any other competitive benchmark (SPECjappserver2004, Scimark, and so on). The reality is much different: SPECjbb2000 is a special case for JRockit and performance gains there don't pan out in the real world.
One more comment on SPECjbb2000. As I stated above, the benchmark was retired at the beginning of January. Which JVM ended on top? Reading the BEA blog you'd assume it was BEA JRockit. Sun HotSpot J2SE 5.0_06 closed this benchmark as the final world record holder. Now let's move on; SPECjbb2000 is over.
BEA JRockit tried to spin their current competitive situation in the best possible light, omitting many results that did not suit their smoke and mirrors argument.
First, BEA positioned a fully configured 32-way, 32-core, 32-thread Itanium2 system against a partially configured 16-way, 32-core, 32-thread Sun Fire 6900 in an attempt to highlight JVM performance. These are completely different hardware platforms and any attempt to highlight JVM performance alone using these results is inaccurate. Comparing these results does give insight on throughput and scaling capacity, but the comparison is at a system level and only demonstrates a JVM's capacity to fully utilize the underlying hardware platform. When comparing fully configured mid-sized enterprise systems regardless of the platform, the Sun Fire 6900 (24-way, 48-core, 48-thread) beats the JRockit result hands down.
342,578 SPECjbb2005 bops, 28,548 SPECjbb2005 bops/JVM (Sun Fire E6900 with Sun JVM)
322,719 SPECjbb2005 bops, 40,340 SPECjbb2005 bops/JVM (Fujitsu PRIMEQUEST 480 with JRockit)
Also, please review the SPECjbb2005 results page. A quick scan will show that Sun HotSpot holds the record for single and multi-instance results, more than doubling BEA's single JVM result, and tripling BEA's multi-instance result. Funny how BEA forgot to mention these results.
http://www.spec.org/jbb2005/results/jbb2005.html
TWO (2) JVMs on a 4 core box. They even use 2 JVMs on a 2-core box. That's absolutely ridiculous. Why would anyone choose to do this? The only reason is they can't beat HotSpot running a single JVM and have difficulty scaling this benchmark on small 2 and 4 core systems. HotSpot could easily beat these multi-instance results, but chances are we won't submit multi-instance SPECjbb2005 on configurations that don't match customer deployments.
(Author's note: Since hindsight is always 20/20, the following is more specific than the above paragraph.)
Now onto the AMD based SPECjbb2005 results referred to in the BEA blog. I'm embarrassed for BEA because they had to use these results to talk about performance. Their 2-way, 2-core result uses TWO (2) JVMs on a 4 core box. They even use 2 JVMs on a 2-core box. That's absolutely ridiculous. Why would anyone choose to do this? The only logical reason is they can't beat HotSpot running a single JVM and have difficulty scaling SPECjbb2005 on small 2 and 4 core systems. HotSpot could easily beat these multi-instance results, but chances are we won't submit multi-instance SPECjbb2005 on configurations that don't match customer deployments.
Here are the latest 2 and 4 core single instance SPECjbb2005 submissions on a Sun Fire X4200 running Windows, Linux, and Solaris.
49,097 SPECjbb2005 bops, 49,097 SPECjbb2005 bops/JVM (Sun Fire X4200 running Solaris 10 x64)
47,437 SPECjbb2005 bops, 47,437 SPECjbb2005 bops/JVM (Sun Fire X4200 running Windows 2003 Server)
43,076 SPECjbb2005 bops, 43,076 SPECjbb2005 bops/JVM (Sun Fire X4200 running Red Hat EL 4)
Fine print SPEC disclosure:
SPECjbb2005 Sun Fire X4200 on Solaris 10 (2 chips, 4 cores, 4 threads) 49,097 bops, 49,097 bops/JVM, SPECjbb2005 Sun Fire X4200 on Windows 2003 Server (2 chips, 4 cores, 4 threads) 47,437 bops, 47,437 bops/JVM, SPECjbb2005 Sun Fire X4200 on Red Hat EL 4 (2 chips, 2 cores, 2 threads) 43,076 bops, 43,076 bops/JVM, Fujitsu Limited PrimeQuest 480 (32 chips, 32 cores, 32 threads) 322,719 bops, 40,340 bops/JVM, SPECjbb2005 Sun Fire E6900 on Solaris 10 (24 chips, 32 cores, 32 threads) 342,578 bops, 28,548 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 22, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Friday February 17, 2006
Java Performance: Solaris 10 x86 vs. Linux
Solaris 10 screams running Java. Competitive benchmarks do a good job highlighting this; just take a look at the latest SPECjbb2005 and SPECjappserver2004 results.
I have noticed some fundamental differences in "Out of the Box" tuning when comparing Solaris and Linux. When running Java server applications, Solaris 10 default tuning is general purpose and tuned for moderate thread counts similar to a time shared system. This in many ways is an indication of the maturity of the platform. Linux, on the other hand, is specifically tuned for high thread counts and performance suffers when running low thread counts. A good example of this behavior can be seen comparing SPECjbb2005 results. Below are two results run on the exact same hardware, only differing the OS and minor JVM tuning (the heap tuning has minimal performance impact).
SPECjbb2005 on Sun Fire X4200 running Solaris 10 Update 1, 49,097 SPECjbb2005 bops, 49,097 SPECjbb2005 bops/JVM
SPECjbb2005 on Sun Fire X4200 running Red Hat EL 4, 43,076 SPECjbb2005 bops, 43,076 SPECjbb2005 bops/JVM
Running SPECjbb2005 on identical hardware with optimal tuning parameters, Solaris 10 is 14% faster than Linux. SPECjbb2005 on small x64 hardware runs only a moderate number of threads; in the above example the peak application thread count is 8.
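The percentage comes straight from the two bops scores. A minimal sketch of the calculation, using a hypothetical helper class, with the Sun Fire E25K 5.0_06 and 5.0_08 scores quoted elsewhere on this page added as a second example:

```java
// Hypothetical helper: relative throughput gain between two benchmark scores.
final class PercentGain {

    // How much faster newScore is than baseScore, in percent.
    static double percentFaster(double newScore, double baseScore) {
        if (baseScore <= 0) {
            throw new IllegalArgumentException("base score must be positive");
        }
        return (newScore - baseScore) / baseScore * 100.0;
    }

    public static void main(String[] args) {
        // Sun Fire X4200: Solaris 10 vs. Red Hat EL 4 (SPECjbb2005 bops).
        System.out.printf("Solaris vs. Linux: %.1f%%%n",
                percentFaster(49097, 43076));
        // Sun Fire E25K: J2SE 5.0_08 vs. 5.0_06 (SPECjbb2005 bops).
        System.out.printf("E25K 5.0_08 vs. 5.0_06: %.1f%%%n",
                percentFaster(1387437, 1164995));
    }
}
```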
What tuning can be applied when running high thread counts on Solaris 10 x86?
Here are two quick tuning steps you can try with your application.
1. If you're running many threads and performing socket I/O, try libumem.so.
When launching your application within a shell script, set the following environment variable.
LD_PRELOAD=/usr/lib/libumem.so;export LD_PRELOAD
2. Tune the Solaris scheduler.
Simple scheduler tuning can yield significant performance gains, especially with highly threaded, short-lived applications.
Try the FX scheduling class:
priocntl -c FX -e java class_name
Try the IA scheduling class:
priocntl -c IA -e java class_name
Every application is different and true performance is always defined by each individual running their own application. If you run into problems or have questions about Java on Solaris performance, visit the java.net performance forum or feel free to send me a comment.
Fine print SPEC disclosure:
SPECjbb2005 Sun Fire X4200 on Solaris 10 (2 chips, 4 cores, 4 threads) 49,097 bops, 49,097 bops/JVM, SPECjbb2005 Sun Fire X4200 on Red Hat EL 4 (2 chips, 2 cores, 2 threads) 43,076 bops, 43,076 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 17, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Wednesday February 15, 2006
Java SE 6 Beta is Released!
Hey Look, Java SE 6 ("Mustang") has gone Beta!
http://java.sun.com/javase/6/download.jsp
Huge performance improvements, slick client improvements (love the font smoothing!), and a plethora of other features make this our best beta release to date. Give it a try and let us know what you think.
As always, please let us know if you run into issues or regressions. Go to the
Java SE 6 Regressions Challenge Page if you identify a regression for a chance to win a
Sun Ultra 20 Workstation.
For performance issues and questions visit the
java.net performance forum.

Tuesday February 14, 2006
Java SE Tuning Tip: Server Ergonomics on Windows
J2SE 5.0 Server Ergonomics is not on by default on Windows. The basic reasoning here is that Windows is largely a client platform and automatic server tuning may negatively impact startup performance. We are revisiting this for Mustang, but for now do the following to enable server ergonomics on Windows:
1). Specify JVM tuning options equivalent to server ergonomics
java -server -Xmx1g -XX:+UseParallelGC
2). Check to make sure server ergonomics is enabled by checking the JVM version:
$ java -server -Xmx1g -XX:+UseParallelGC -version
java version "1.6.0-rc"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.6.0-rc-b69)
Java HotSpot(TM) Server VM (build 1.6.0-rc-b69, mixed mode)
If you see "Server VM", you're ready to test.
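The same check can be made from inside a running application by reading the standard java.vm.name system property. A minimal sketch; the class and method names are hypothetical, and matching on the text "Server VM" is an assumption based on the version banner shown above.

```java
// Hypothetical helper: reports whether the running JVM identifies itself as a Server VM.
class VmCheck {
    // Assumes the VM name contains "Server VM", as in the -version output above.
    static boolean isServerVm(String vmName) {
        return vmName != null && vmName.contains("Server VM");
    }

    public static void main(String[] args) {
        String vmName = System.getProperty("java.vm.name");
        System.out.println(vmName + " -> server VM: " + isServerVm(vmName));
    }
}
```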

Thursday February 02, 2006
SPECjbb2005: A Valid Representation of Java Server Workloads
I was reading some of the other blogs at Sun and noticed some entertaining comments on
BMSeer's blog. In particular the comments on the entry titled
Sun head-to-head wins again: SPECjbb2005.
Specifically the set of comments is from Robin (basspetersen@yahoo.com). Robin apparently works for or has a close association with HP. Hello Robin, I hope you are reading this. Robin doesn't feel that SPECjbb2005 represents real world Java server applications and workloads, mostly because it doesn't stress the network or I/O subsystems. I strongly disagree and feel that SPECjbb2005 is a valid representation of Java server workloads and has already had a significant impact on JVM and Java SE performance. Here are a few quotes from Robin's comments:
"It looks like HP is the only company smart enough to stay out of this benchmark game, with no relevance in the real world."
...
"JBB pretends to measure the server-side performance of Java runtime environments but it is not at all representative of a real workload. Running unrealistic workloads to measure performance is a disservice to customers."
This statement is a bit naive. SPECjbb2005 has significant features that highlight its relevance to real world workloads.
First, garbage collection is part of the measurement interval. SPECjbb2000 called a System.gc() before each measurement interval to ease the impact of GC on the score. This was somewhat necessary to have the benchmark scale back in 2000, but that is not the case now. Garbage collection is fully a part of this benchmark, and large GC pauses significantly impact benchmark scores.
Second, XML DOM L3 is part of the benchmark, with 20% of the workload in DOM tree creation and manipulation. Parsing is not included in order to avoid I/O bottlenecks.
Third, the benchmark must run with thread counts (warehouses) 2X the number of hardware threads on the system. A 4-way must run 8 warehouses. A 32-way must run 64 warehouses. When did managing 64 threads become trivial and not impacted by system performance?
Fourth, many of the optimizations and performance work that started with SPECjbb2005 had direct impact on customer and Java EE benchmark performance. Take a look at the latest SPECjappserver2004 world record.
BEA WebLogic Server 9.0 on Sun Fire T2000 Cluster running Sun J2SE 5.0_06
Sun's HotSpot J2SE 5.0_06 was the JVM for this benchmark result, the same JVM which currently holds many, many major performance records on SPECjbb2005. If performance optimizations targeted for SPECjbb2005 have direct impact on Java EE benchmarking, how again is SPECjbb2005 irrelevant?
"In my opinion HP does not want to give credit to a bad benchmark by publishing results. Why should they give you the satisfaction of jumping off the bridge after you? Clearly HP thinks the benchmark is not important."
HP was on the core development team of SPECjbb2005. Take a look at one of my
first blog entries announcing SPECjbb2005. Why would HP think a benchmark was not important or irrelevant when they put resources into the development of the benchmark?
Fifth, I/O and network were purposely left out of the benchmark to concentrate on JVM, OS, and hardware performance. The benchmark heavily stresses the memory subsystem with large Java heaps and high memory allocation counts. The OS needs to manage many threads and possibly many processes effectively for high performance. SPECjbb2005 stresses the JVM, OS, and memory; it is a complete system benchmark concentrating on Java server performance.
Lastly, I would like to see HP submit SPECjbb2005 numbers. Competition leads to innovation and performance optimization that benefits customers. Chances are HP is plugging away, working to improve their HotSpot implementation and preparing for the day they will submit a result.

Wednesday February 01, 2006
Three new SPECjbb2005 World Records: Sun Fire x64 Servers and J2SE 5.0_06

Wednesday January 25, 2006
Sun Fire E6900 and Hotspot dominate SPECjbb2005 under 32 CPUs
The Sun Fire E6900 (24 chip) takes the lead running SPECjbb2005 on configurations with 32 chips or less with a score of 342,578 bops. This score surpasses the previous high score of 322,719 bops run on a Fujitsu Prime Quest 480 (32 chip!).
Why is this result interesting? First, the Sun Fire E6900 surpasses all other competitors in this space, faster than the IBM P5 570 and the Fujitsu Prime Quest 480. Second, and most importantly to me, this is the first of many results that highlight the performance of Sun Hotspot J2SE 5.0_06. Today's a good day for Sun Java Performance.
SPEC Footnote:
SPECjbb2005 Sun Fire E6900 (24-way, 24 chips, 48 cores) 342,578 bops,
28,548 bops/JVM submitted for review, Fujitsu PRIMEQUEST 480 (32 chips,
32 cores) 322,719 bops, 40,340 bops/JVM, IBM eServer p5 570 (8 chips,
16 cores, 16-way) 244,361 bops, 30,545 bops/JVM. SPEC, SPECjbb reg tm
of Standard Performance Evaluation Corporation. Results as of 01/23/06
on www.spec.org.
SPECjbb2005: Single Instance vs. Multiple Instance Competitive Comparisons
SPECjbb2005 can be run in single and multiple-instance modes. Single instance is where one JVM runs the benchmark on a single system. Multiple instance is where
n JVMs run in parallel, with the benchmark load distributed between the separate JVM processes. SPECjbb2005 also has two equally important metrics: SPECjbb2005 bops (business operations per second), which is a measure of overall system throughput, and SPECjbb2005 bops/JVM, which is a measure of JVM performance and scalability.
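To make the relationship between the two metrics concrete, here is a small sketch. It assumes bops/JVM is simply the total bops divided by the number of JVM instances, rounded to a whole number; that matches the figures quoted on this blog (for a single instance run the two metrics are equal), but the exact rounding used by SPEC is an assumption, as are the class and method names.

```java
// Hypothetical helper relating the two SPECjbb2005 metrics.
class BopsPerJvm {
    // Assumption: bops/JVM = total bops / number of JVM instances, rounded.
    static long bopsPerJvm(long totalBops, int jvmInstances) {
        if (jvmInstances < 1) {
            throw new IllegalArgumentException("need at least one JVM instance");
        }
        return Math.round((double) totalBops / jvmInstances);
    }

    public static void main(String[] args) {
        // Single instance: both metrics are the same number.
        System.out.println(bopsPerJvm(49097, 1));
        // Multiple instance: a 4-JVM run is assumed here for illustration.
        System.out.println(bopsPerJvm(63378, 4));
    }
}
```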
Single instance results target hardware, OS, and highlight JVM performance and scalability. The multiple instance results target hardware, OS, JVM performance and scalability, and highlight total system throughput. Both single and multi instance configurations of SPECjbb2005 can provide a sense of hardware, OS, and JVM performance and scalability. However, single instance configurations put more focus on the throughput delivered by the JVM, whereas multi instance configurations put more focus on total throughput delivered by the system. When multiple instance configurations demonstrate higher throughput than single instance configurations, it's usually an indication that there's either a JVM limitation, such as maximum heap size or 64-bit JVM performance, or that there's some hardware architectural aspect of the system that multiple JVMs can take advantage of, such as a NUMA memory architecture.
A SPECjbb2005 performance comparison between two hardware platforms is a comparison of the highest bops score as a measure of overall system throughput. When comparing hardware platforms the comparison can be made regardless of the benchmark configuration, but it's important that you choose a configuration type that matches the way your system is deployed in production.
Most large MP servers with greater than 16 hardware threads are deployed with many, many JVM (or OS) instances, and customers are concerned with complete system throughput and scalability. The comparison is system throughput, not necessarily software component performance, but often JVM scalability is a factor considering each JVM must scale to 8 hardware threads or more. In this case the fastest results by hardware vendor A should be compared to the fastest results by hardware vendor B, with an eye to JVM scalability as measured by the bops/JVM metric.
Small x86 or x64 systems with 8 or fewer cores are not typically deployed with more than one JVM. Customers are concerned with total system throughput but also efficient system utilization by their Java server software and the JVM. The SPECjbb2005 single instance configuration is a good match for small systems with fewer than 8 hardware threads.
SPECjbb2005 multiple instance results should not be used to compare systems with fewer than 8 hardware threads, simply because those systems are not typically deployed in production in that fashion. It's the responsibility of the hardware and JVM vendor along with the benchmark submitter to hold the line on SPECjbb2005 configuration types and to ensure that the configuration type matches the system under test and, more importantly, how such systems are deployed in production.
JVM performance comparisons using SPECjbb2005 are a bit different. In this case JVM performance and scalability are the focus and are best demonstrated using the single instance SPECjbb2005 configuration. When comparing JVMs, multiple instance results can only be compared to other multiple instance results, and it's best if each result was run with the same number of JVM instances. Single instance SPECjbb2005 results on large SMP systems can help give insight into the performance capabilities of the JVM within a given instruction set and the potential scalability characteristics on other supported platforms.
The latest SPECjbb2005 scores can be found at
http://www.spec.org/jbb2005

Thursday January 19, 2006
Sun Hotspot Wins Best Java Virtual Machine
Sun J2SE has won
JDJ Reader's Choice Best Java Virtual Machine Award. Take a look, it's category #16.
Congratulations Java Software!

Wednesday January 18, 2006
Sun Hotspot SPECjbb2000 World Record
The last results for SPECjbb2000 have been accepted at SPEC and it's official: Sun Hotspot running on a Fujitsu PRIMEPOWER 2500 holds the
end all SPECjbb2000 World Record! As much as I personally disliked this benchmark (I've talked about it quite often), this result is more proof of the world class performance and scalability of Sun J2SE 1.5.0_06. Congratulations to Fujitsu Limited and the Sun Hotspot development team!

Monday January 09, 2006
SPECjbb2000 has finally retired
SPECjbb2000 has finally retired!
SPECjbb2005 has replaced SPECjbb2000 and the competitive landscape has changed drastically. Strangely, a particular JVM vendor who showed strong performance in SPECjbb2000 doesn't seem to do as well with SPECjbb2005. Hmmm.
Gone are the days when a stunt JVM can make broad claims in world record performance based on a 5 year old benchmark. No more risky lock elision optimizations for 30% gains and special object ordering and prefetching because GC is outside the measurement intervals. Good riddance I say!

Tuesday December 13, 2005
Sun's Hotspot JVM = Industry Leading Performance
Sun's Hotspot JVM continues to demonstrate industry leading performance. Here are just a few examples where Hotspot shines.
SPECjbb2005
Leading x64 on Opteron 2-core result; 27004 bops, 27004 bops/JVM;
Sun Fire X4100 and Sun Fire X4200
Leading x64 on Xeon 2-core result; 28,314 bops, 28,314 bops/JVM
Fujitsu Siemens Computers PRIMERGY TX300 S2
Leading x64 on Opteron 4-core result: 45,124 bops, 45,124 bops/JVM
Sun Fire X4100 and Sun Fire X4200
Best of class 1U result; Sun Fire T1000, 51,540 bops, 12,885 bops/JVM; Results under review.
Best of class 2U result; 63,378 bops, 15,845 bops/JVM;
Sun Fire T2000, powered by UltraSPARC T1
SPECjappserver2004
SPECjappserver2004 World Record
6 Sun Fire T2000 servers
SPECjappserver2004 Single J2EE Node World Record
1 Sun Fire T2000 server
SciMark
Top 3 submitted results;
running Solaris, Linux, and Windows
Please post comments and questions here or on the
java.net performance forum sharing your experiences running Hotspot. Yes, I'd love to hear success stories, but what is most important are those situations where performance wasn't what you expected. We are serious about Java performance here at Sun, and want to do what it takes to make every Java user satisfied with the performance of their application. We want to fix any and all performance issues you run into. We can and will continue to demonstrate industry leading performance, but what is most important is broad and reliable JVM performance, which is defined individually by every user's application.
Fine print SPEC disclosure:
SPECjbb2005 Sun Fire X4200 (2 chips, 2 cores, 2 threads) 27004 bops, 27004 bops/JVM, Fujitsu Siemens Computers PRIMERGY TX300 S2 (2 chips, 2 cores, 4 threads) 28,314 bops, 28,314 bops/JVM, Sun Fire X4200 (2 chips, 4 cores, 4 threads) 45,124 bops, 45,124 bops/JVM, Sun Fire T1000 (1 chip, 8 core, 32 threads) 51,540 bops, 12,885 bops/JVM submitted for review, Sun Fire T2000 (1 chip, 8 core, 32 threads) 63,378 bops, 15,845 bops/JVM. SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of November 30, 2005. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.
Sun's Hotspot JVM = Reliable Performance
Take a look at the latest SPECjappserver2004 World Record results. BEA Weblogic running on
Sun Fire T2000 servers powered by UltraSPARC T1 processors
and Sun J2SE 5.0_06. That's right, BEA's "Record setting Weblogic 9" set the world records running on Sun's Hotspot JVM.
SPECjappserver2004 World Record (Multi-Node)
SPECjappserver2004 World Record (2-Node)
But how can this be? Sounds like BEA Weblogic relies on the cool performance and reliability of Sun's Hotspot JVM to achieve their world record performance on SPECjappserver2004.

Tuesday December 06, 2005
New Java Performance Tuning Whitepaper
Check out our new
Java performance tuning whitepaper on java.sun.com.
This has been on the Java performance group's to do list for a very long time; thanks to Tom Marble for making this happen. There's nothing like the kickin' performance of the new
UltraSPARC T1 processor, the
Sun Fire T1000 and T2000 servers, and our latest update release
J2SE 5.0_06 to give the needed kick in the pants to put out a tuning guide. This is a work in progress so your
feedback is very much appreciated and needed. Thanks.
UltraSPARC T1 Screams Running Java
Sun has announced the new Sun Fire T1000 and T2000 servers today along with SPECjbb2005 benchmark results on these systems. What makes these results so special? They run the UltraSPARC T1 processor with 8 cores and 32 threads on a single chip. The performance of the UltraSPARC T1 systems easily surpasses performance on all other 1U, 2U, or 4U Systems. These results also leverage the high performance features in the newly released
J2SE 5.0_06.
Take a look at the chart below. The Sun T2000 surpasses all other competition in the 2U and 4U space. The 1U Sun Fire T1000 leads the 1U results.

How are these results comparable? It's simple: compare the raw throughput SPECjbb2005 bops score. One may ask: "How can you compare an 8 core / 32 thread box to a 4 core / 8 thread Power 5+?". It's easy. Chip and core counts are steadily becoming irrelevant. What really matters is how much work (throughput) a system can achieve, how much that system is going to cost to run, and how much lab space, power, and cooling it will require. Looking at the above results with this in mind clearly shows why Sun UltraSPARC T1 systems are separate from the pack. Sun Fire UltraSPARC T1 systems are much, much less expensive to run than their competitors. How about those Cool Threads!
Here's the details on the configurations compared above:

How about scalability? Here's a good example of how the Sun Fire T2000 and the UltraSPARC T1 processor scale from 1 to 32 threads. Each SPECjbb2005 warehouse is a new thread. Throughput steadily increases as new threads are added, peaking at 32.

Fine print SPEC disclosure:
SPECjbb2005 Sun Fire T1000 (1 chip, 8 core, 32 threads) 51,540 bops, 12,885 bops/JVM submitted for review, Sun Fire T2000 (1 chip, 8 core, 32 threads) 63,378 bops, 15,845 bops/JVM submitted for review, IBM eServer p5 520 (2 chips, 2 cores, 4 thread) 32,820 bops, 32,820 bops/JVM, AMD Tyan white box (2 chips, 4 cores, 4 thread) 44,574 bops, 44,574 bops/JVM, IBM eServer p5 550 (4 chips, 4 cores, 4 thread) 61,789 bops, 61,789 bops/JVM . SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of November 30, 2005. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Wednesday November 30, 2005
New SPECjbb2005 World Record - Java SE and UltraSPARC IV+
Sun has again achieved a new world record on SPECjbb2005, beating IBM's latest result by 1.5%. Take a look at the
press release. This result is using the newly released
J2SE 5.0_06.
IBM's score: 244,361 bops, 30,545 bops/JVM
Sun's score: 248,075 bops, 31,009 bops/JVM
Both the IBM and Sun results are 4 instance SPECjbb2005 runs. Each result has 32 "Processors" available to Java, as determined by the Java API java.lang.Runtime.availableProcessors(). The systems are equivalent from the view of Java, each has 32 hardware threads or "virtual" CPUs. How the 32 threads are implemented on each system is different, for example take a look at the
Sun Fire 6900 details.
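If you want to see how many "Processors" Java reports on your own system, the API mentioned above can be called directly. A minimal sketch; the class and method names are hypothetical.

```java
// Hypothetical example: prints the processor count as seen by the JVM.
class ProcessorCount {
    // Wraps the standard API call so the value can be reused elsewhere.
    static int hardwareThreads() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("Processors available to Java: " + hardwareThreads());
    }
}
```

The number printed counts hardware threads or "virtual" CPUs as the OS exposes them, not chips or cores.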
I like that IBM is actively submitting SPECjbb2005 results. Competition spurs innovation, which is a great thing for everyone, especially customers with enterprise Java server applications. It would be great to see other large system vendors step up and compete on SPECjbb2005. HP? Fujitsu? Try running with Sun J2SE 5.0_06, it's rather easy.
Fine print SPEC disclosure:
SPECjbb2005 Sun Fire E6900 (16-way, 16 chips, 32 cores) 248,075 bops, 31,009 bops/JVM submitted for review, IBM eServer p5 570 (8 chips, 16 cores, 16-way) 244,361 bops, 30,545 bops/JVM . SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 11/30/05 on www.spec.org unless otherwise noted.

Monday November 21, 2005
Java Heap Sizing: How do I size my Java heap correctly?
Proper heap sizing is key to good Java application performance. Whether running a client or server application, if your system is running low on heap and spending a lot of time with garbage collection you'll want to investigate adjusting your heap size. You also don't want to set your heap size too large and impact other applications running on the system. This edition of JVM Performance Tuning Basics will cover general heap and generation tuning steps, and new features in JDK 5.0.
h2. Java SE 5.0 Ergonomics
Ergonomics for servers was first introduced in Java SE 5.0. It has greatly reduced application tuning time for server applications, particularly with heap sizing and advanced GC tuning. In many cases, running on a server with no tuning options at all is the best tuning you can do.
Server ergonomics is enabled when running on a
server class machine on Solaris, Linux, and 64-bit Windows. It is disabled by default when running on 32-bit Windows.
Ergonomics does the following:
* Throughput garbage collector and Adaptive Sizing (-XX:+UseParallelGC)
* Initial heap size of 1/64 of physical memory up to 1Gbyte
* Maximum heap size of 1/4 of physical memory up to 1Gbyte
* Server runtime compiler (-server)
To enable server ergonomics on 32-bit Windows, use the following flags:
-server -Xmx1g -XX:+UseParallelGC (varying the heap size)
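The default heap sizes listed above are easy to work out for a given machine. Here is a minimal sketch of that arithmetic, assuming "up to 1Gbyte" means the value is capped at 1 GB; the class and method names are hypothetical, and this only mirrors the rules as stated in this post, not the JVM's actual code.

```java
// Hypothetical calculator for the ergonomics heap defaults described above.
class ErgonomicsDefaults {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    // Initial heap: 1/64 of physical memory, capped at 1 GB.
    static long initialHeapBytes(long physicalBytes) {
        return Math.min(physicalBytes / 64, GB);
    }

    // Maximum heap: 1/4 of physical memory, capped at 1 GB.
    static long maxHeapBytes(long physicalBytes) {
        return Math.min(physicalBytes / 4, GB);
    }

    public static void main(String[] args) {
        long[] machines = { 2 * GB, 8 * GB }; // example physical memory sizes
        for (long physical : machines) {
            System.out.println((physical / GB) + " GB machine: initial heap "
                    + (initialHeapBytes(physical) / MB) + " MB, max heap "
                    + (maxHeapBytes(physical) / MB) + " MB");
        }
    }
}
```

Note that any machine with 4 GB or more of physical memory ends up with the same 1 GB default maximum, which is one reason larger servers often still need an explicit -Xmx.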
h2. Identify how much Java heap your application needs.
There are several ways to identify how much heap your application is using. Java SE has a suite of monitoring tools such as
jstat and jconsole. There is Brian Doherty's
jvmstat tools, in particular visualgc. Then there is the tried and true -verbosegc and -XX:+PrintGCDetails. For this example I'll use -verbosegc.
For details on GC implementation and logging outputs, take a look here:
Details on Garbage Collection Tuning with Java SE 5.0
Examples of verbosegc and -XX:+PrintGCDetails.
h3. -verbosegc
The first step in investigating GC performance problems is looking at -verbosegc output. The following example is a server application with a heap size fixed at 80mb. The server compiler is specified and the default GC collectors are chosen. In this case J2SE 1.4.2 is running with the default serial GC collectors.
java -server -Xms80m -Xmx80m my.serverApp
[GC 55974K->35946K(79232K), 0.0269796 secs]
[GC 57834K->36306K(79232K), 0.0278222 secs]
[GC 58194K->36669K(79232K), 0.0264892 secs]
[GC 58557K->37044K(79232K), 0.0223606 secs]
[GC 58932K->37400K(79232K), 0.0262330 secs]
[GC 59288K->37803K(79232K), 0.0271792 secs]
[GC 59691K->38097K(79232K), 0.0283054 secs]
[GC 59985K->38516K(79232K), 0.0276064 secs]
[GC 60404K->38847K(79232K), 0.0244366 secs]
[GC 60735K->43570K(79232K), 0.0732041 secs]
[GC 65458K->56730K(79232K), 0.1476127 secs]
[Full GC 78618K->61524K(79232K), 0.8851303 secs]
[Full GC 79231K->61898K(79232K), 0.9426240 secs]
[Full GC 79231K->62263K(79232K), 0.9828957 secs]
[Full GC 79231K->59527K(79232K), 1.0334212 secs]
[Full GC 79231K->59906K(79232K), 0.9298369 secs]
[Full GC 79231K->60014K(79232K), 0.8833146 secs]
[Full GC 79231K->60124K(79232K), 0.8293863 secs]
[Full GC 79231K->59615K(79232K), 0.8944206 secs]
[Full GC 79231K->59679K(79232K), 0.9169885 secs]
[Full GC 79231K->59626K(79232K), 0.9366790 secs]
[Full GC 79231K->59697K(79232K), 0.8613183 secs]
[Full GC 79231K->59594K(79232K), 0.9114757 secs]
[Full GC 79231K->59654K(79232K), 0.9987619 secs]
[Full GC 79231K->59654K(79232K), 1.0146781 secs]
[Full GC 79231K->59661K(79232K), 0.9687409 secs]
The first 11 lines of verbose gc output are young generation garbage collections.
The first number is the amount of heap in use before the GC, and the second number is the amount in use afterwards. The third number is the overall heap size, and the fourth is the time spent during the GC operation.
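[GC 60404K->38847K(79232K), 0.0244366 secs]
# 60404K: heap in use before the GC
# 38847K: heap in use after the GC
# 79232K: overall heap size
# 0.0244366 secs: time spent in the GC operation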
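If you have a long log to go through, the four numbers can be pulled out of each line with a regular expression. The following is a minimal sketch that only handles the simple line format shown in this post; the class name and fields are hypothetical, and other collectors or flags such as -XX:+PrintGCDetails produce different output that it will reject.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any JDK tool.
class VerboseGcLine {
    // Matches lines such as: [GC 60404K->38847K(79232K), 0.0244366 secs]
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean fullGc;  // true for "Full GC" lines
    final long beforeK;    // heap in use before the collection, in KB
    final long afterK;     // heap in use after the collection, in KB
    final long totalK;     // overall heap size, in KB
    final double seconds;  // time spent in the collection

    private VerboseGcLine(boolean fullGc, long beforeK, long afterK, long totalK, double seconds) {
        this.fullGc = fullGc;
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.totalK = totalK;
        this.seconds = seconds;
    }

    // Parses one line of output; throws if the line is not in the expected format.
    static VerboseGcLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized GC line: " + line);
        }
        return new VerboseGcLine(
                m.group(1).equals("Full GC"),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Double.parseDouble(m.group(5)));
    }

    // Amount of heap reclaimed by this collection, in KB.
    long freedK() {
        return beforeK - afterK;
    }

    public static void main(String[] args) {
        VerboseGcLine gc = parse("[GC 60404K->38847K(79232K), 0.0244366 secs]");
        System.out.println("full=" + gc.fullGc + " before=" + gc.beforeK + "K after=" + gc.afterK
                + "K total=" + gc.totalK + "K freed=" + gc.freedK() + "K secs=" + gc.seconds);
    }
}
```

Tracking afterK from line to line is a quick way to spot the steady growth in the tenured generation discussed next.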
Note that the second number continues to grow during the first 11 GCs. This is an indication that many objects are being promoted to the old (tenured) generation. There are several reasons this may occur. The first is the obvious one: most objects continue to be live and are properly tenured. The second is that the young generation is not large enough to allow transient objects to die in the young generation.
Also note that eventually young GCs cease and only Full GC operations occur thereafter. When running the serial collector, there must be enough space in the tenured generation to allow full promotion of the young generation plus one survivor space. This is known as the
young generation guarantee. If there isn't enough space and the guarantee is not upheld, then only Full GCs occur.
Simply increasing the size of the heap slightly to 128mb is enough to uphold the young generation guarantee and avoids a majority of the Full GCs.
java -server -Xms128m -Xmx128m my.serverApp
[GC 109920K->75933K(126720K), 0.0533675 secs]
[GC 110877K->76916K(126720K), 0.0437944 secs]
[GC 111860K->77818K(126720K), 0.0490449 secs]
[GC 112762K->78812K(126720K), 0.0482215 secs]
[GC 113756K->79810K(126720K), 0.0444408 secs]
[GC 114754K->80759K(126720K), 0.0502736 secs]
[GC 115703K->81657K(126720K), 0.0435275 secs]
[GC 116601K->82629K(126720K), 0.0521527 secs]
[GC 117573K->83564K(126720K), 0.0443587 secs]
[GC 118508K->84501K(126720K), 0.0438583 secs]
[GC 119445K->85492K(126720K), 0.0556998 secs]
[GC 120436K->86412K(126720K), 0.0437702 secs]
[GC 121356K->87402K(126720K), 0.0478918 secs]
[Full GC 122346K->59749K(126720K), 0.9128712 secs]
[GC 94693K->63029K(126720K), 0.0415602 secs]
[GC 97973K->64037K(126720K), 0.0442277 secs]
[GC 98981K->65123K(126720K), 0.0538927 secs]
[GC 100067K->66058K(126720K), 0.0509740 secs]
[GC 101002K->66971K(126720K), 0.0529873 secs]
[GC 101915K->67931K(126720K), 0.0432661 secs]
[GC 102875K->68896K(126720K), 0.0468042 secs]
[GC 103840K->69864K(126720K), 0.0515457 secs]
[GC 104808K->70787K(126720K), 0.0435953 secs]
[GC 105731K->71789K(126720K), 0.0438197 secs]
[GC 106733K->72799K(126720K), 0.0520742 secs]
[GC 107743K->73692K(126720K), 0.0528108 secs]
[GC 108636K->74622K(126720K), 0.0531088 secs]
[GC 109566K->75533K(126720K), 0.0523352 secs]
[GC 110477K->76456K(126720K), 0.0532375 secs]
[GC 111400K->77423K(126720K), 0.0434274 secs]
[GC 112367K->78398K(126720K), 0.0435165 secs]
[GC 113342K->79417K(126720K), 0.0537748 secs]
[GC 114361K->80287K(126720K), 0.0432627 secs]
[GC 115231K->81156K(126720K), 0.0422614 secs]
[GC 116100K->82170K(126720K), 0.0427083 secs]
[GC 117114K->83087K(126720K), 0.0528816 secs]
[GC 118031K->84140K(126720K), 0.0488751 secs]
[GC 119084K->85046K(126720K), 0.0533192 secs]
[GC 119990K->86011K(126720K), 0.0542484 secs]
[GC 120955K->86914K(126720K), 0.0451865 secs]
[GC 121858K->87861K(126720K), 0.0435737 secs]
[GC 122805K->88683K(126720K), 0.0457787 secs]
[GC 123627K->89739K(126720K), 0.0540739 secs]
[Full GC 124683K->59854K(126720K), 0.9496498 secs]
h2. Next Topic: Young Generation Sizing

Tuesday October 18, 2005
SPECjbb2005 World Record - Hotspot and UltraSPARC IV+ Rock!
Sun has achieved a new world record on SPECjbb2005, beating IBM's latest result by 8%. Take a look at the press release
here.
IBM's score: 224,200 bops
Sun's score: 241,560 bops
SPECjbb2005 is the next generation Java server application benchmark. It
replaces SPECjbb2000 on January 4th, 2006. IBM has stepped up and submitted several benchmark results. This is fantastic. Where's BEA, HP, and Fujitsu? Dig yourselves out of the quagmire of SPECjbb2000 and submit results on a viable benchmark for a change.
SPECjbb2000 = Old benchmark, Invalid results for today's applications
SPECjbb2005 = New benchmark, developed to model today's applications
Where do you want your JVM and Hardware vendor concentrating their resources?