David Dagastine's Weblog
20060227 Monday February 27, 2006
High Performance Java on Sun CoolThread Servers
Back in December, when Sun's CoolThread Servers were announced, I wrote a similar blog entry comparing the Sun Fire T1000 and T2000 SPECjbb2005 scores to our competitors' SPECjbb2005 scores on 1U, 2U, and 4U systems. Below is updated data, along with space and power data using the SWaP metric.
The Sun Fire T1000 scores are phenomenal! All results were run with Sun J2SE 5.0_06 with HotSpot JVM technology. Interested in finding out for yourself? Go here to try a Sun Fire T2000 free for 60 days.
Take a look at the chart below. The Sun Fire T2000 surpasses all other competition in the 2U and 4U space.
How are these results comparable? It's simple: compare the raw throughput, the SPECjbb2005 bops score. One may ask: "How can you compare an 8 core / 32 thread box to a 4 core / 8 thread Power 5+?" It's easy. Chip and core counts are steadily becoming irrelevant. What really matters is how much work (throughput) a system can achieve and how much that system is going to cost to run. This includes lab space, power, and cooling costs.
Below is a system comparison using the Space, Watts and Performance (SWaP) metric. The SWaP metric is defined as follows:
SWaP = Performance / (Space x Power Consumption)
How about scalability? Here's a good example of how the Sun Fire T2000 and the UltraSPARC T1 processor scale from 1 to 32 threads. Each SPECjbb2005 warehouse is a new thread. Throughput steadily increases as new threads are added, peaking at 32.
Fine print SPEC disclosure: SPECjbb2005 Sun Fire T1000 (1 chip, 8 cores, 32 threads) 51,540 bops, 12,885 bops/JVM; Sun Fire T2000 (1 chip, 8 cores, 32 threads) 63,378 bops, 15,845 bops/JVM; IBM eServer p5 520 (2 chips, 2 cores, 4 threads) 32,820 bops, 32,820 bops/JVM; IBM eServer p5 510 (2 chips, 2 cores, 4 threads) 32,820 bops, 32,820 bops/JVM (referenced on IBM benchmark website); AMD Tyan white box (2 chips, 4 cores, 4 threads) 44,574 bops, 44,574 bops/JVM; IBM eServer p5 550 (4 chips, 4 cores, 4 threads) 61,789 bops, 61,789 bops/JVM.
SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 27, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.
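For readers who want to reproduce a SWaP comparison for their own systems, here is a minimal Java sketch of the arithmetic, assuming SWaP is performance divided by the product of space (rack units) and power (watts). The class and method names are made up for illustration. The bops score is the T2000 figure from the fine print above; the rack-unit and wattage inputs are hypothetical placeholders, not measurements from this entry.

```java
// Sketch of the SWaP arithmetic: performance / (space * power).
// SwapMetric and its method names are illustrative, not from any Sun tool.
final class SwapMetric {

    /** Returns performance divided by (rack units times watts). */
    static double swap(double performance, double rackUnits, double watts) {
        if (rackUnits <= 0 || watts <= 0) {
            throw new IllegalArgumentException("space and power must be positive");
        }
        return performance / (rackUnits * watts);
    }

    public static void main(String[] args) {
        double t2000Bops = 63_378;        // SPECjbb2005 bops from the fine print
        double hypotheticalRackUnits = 2; // ASSUMPTION: placeholder space value
        double hypotheticalWatts = 300;   // ASSUMPTION: placeholder power draw
        System.out.printf("SWaP = %.2f%n",
                swap(t2000Bops, hypotheticalRackUnits, hypotheticalWatts));
    }
}
```

Substitute the measured power draw and the actual rack height of each system before comparing values; a higher SWaP means more throughput per unit of space and power.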

Feb 27 2006, 10:16:14 AM EST Permalink

20060223 Thursday February 23, 2006
Sun Fire E25K and J2SE 5.0_06 SPECjbb2005 World Record
The Sun Fire E25K running J2SE 5.0_06 now holds the overall world record running SPECjbb2005! Hot off the presses, here's the new world record result: 1,164,995 SPECjbb2005 bops, 32,361 SPECjbb2005 bops/JVM. This result beats the recently announced result from Fujitsu for the PRIMEPOWER 2500 with SPARC64 V.
Once again, the combination of Sun's world class enterprise server architecture, the UltraSPARC IV+ processor, and Sun J2SE 5.0_06 with HotSpot JVM technology proves world class performance and scalability with the SPECjbb2005 benchmark. Very, very impressive.
As a designer and developer of this benchmark, I found it hard to envision the day when the SPECjbb2005 bops score would breach 1 million. The day is here, and much sooner than I could have ever anticipated. These are exciting times for Java performance (and there are more performance optimizations coming soon!). Stay tuned for more information on this latest world record.
The BMSeer has an excellent competitive overview of this result; the price performance of the Sun Fire E25K is quite impressive compared to our competition $$ (add an extra $ for IBM). (Hey BMSeer, next time you won't beat me to the punch announcing our latest SPECjbb2005 world record!!)
Fine print SPEC disclosure: SPECjbb2005 Sun Fire E25K (72-way, 72 chips, 144 cores) 1,164,995 SPECjbb2005 bops, 32,361 SPECjbb2005 bops/JVM submitted for review, Fujitsu PRIMEPOWER 2500 (128 chips, 128 cores) 1,157,619 SPECjbb2005 bops, 72,351 SPECjbb2005 bops/JVM.
SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 23, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.
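The two figures in each disclosure are related: assuming bops/JVM is simply the total bops divided by the number of JVM instances in the run, the instance count can be recovered from the published numbers. The short Java sketch below does that arithmetic for the two results quoted above; the class and method names are illustrative only.

```java
// Recover the implied JVM instance count from total bops and bops/JVM.
// ASSUMPTION: bops/JVM = total bops / number of JVM instances (rounded).
final class BopsPerJvm {

    static long impliedJvmCount(long totalBops, long bopsPerJvm) {
        if (bopsPerJvm <= 0) {
            throw new IllegalArgumentException("bops/JVM must be positive");
        }
        // Published bops/JVM values are rounded, so round the quotient too.
        return Math.round((double) totalBops / bopsPerJvm);
    }

    public static void main(String[] args) {
        // Figures copied from the fine print SPEC disclosure above.
        System.out.println("Sun Fire E25K JVMs:      "
                + impliedJvmCount(1_164_995, 32_361)); // 36
        System.out.println("PRIMEPOWER 2500 JVMs:    "
                + impliedJvmCount(1_157_619, 72_351)); // 16
    }
}
```

Under that assumption, a result whose bops and bops/JVM values are equal was run with a single JVM instance.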

Feb 23 2006, 10:58:57 PM EST Permalink

20060222 Wednesday February 22, 2006
Sun HotSpot J2SE 5.0_06 Crushes BEA JRockit Running SPECjbb2005
(The following is a resubmission of a blog entry from February 10, 2006 with a few comments and edits. Changes are noted below.)
Looks like our friends from BEA JRockit are at it again. Take a look at the following blog entry from BEA.
http://dev2dev.bea.com/blog/hstahl/archive/2006/01/new_specjbb2000_1.html
First, SPECjbb2000 is a 5 year old retired benchmark. Its time has passed and SPECjbb2005 is its replacement. BEA loves to talk about SPECjbb2000; they obviously spent a lot of time optimizing for it. The problem with JRockit is that it is optimized just for SPECjbb2000. If time had been spent on optimizations for the real world, they'd be able to maintain their competitive position with SPECjbb2005, right? The same applies for any other competitive benchmark (SPECjappserver2004, Scimark, and so on). The reality is much different: SPECjbb2000 is a special case for JRockit, and performance gains there don't pan out in the real world.
One more comment on SPECjbb2000. As I stated above, the benchmark was retired at the beginning of January. Which JVM ended on top? Reading the BEA blog you'd assume it was BEA JRockit. In fact, Sun HotSpot J2SE 5.0_06 closed this benchmark as the final world record holder. Now let's move on; SPECjbb2000 is over.
BEA JRockit tried to spin their current competitive situation in the best possible light, omitting many results that did not suit their smoke and mirrors argument.
First, BEA positioned a fully configured 32-way, 32-core, 32-thread Itanium2 system against a partially configured 16-way, 32-core, 32-thread Sun Fire E6900 in an attempt to highlight JVM performance. These are completely different hardware platforms, and any attempt to highlight JVM performance alone using these results is inaccurate. Comparing these results does give insight into throughput and scaling capacity, but the comparison is at a system level and only demonstrates a JVM's capacity to fully utilize the underlying hardware platform.
When comparing fully configured mid-sized enterprise systems regardless of platform, the Sun Fire E6900 (24-way, 48-core, 48-thread) beats the JRockit result hands down.
342,578 SPECjbb2005 bops, 28,548 SPECjbb2005 bops/JVM (Sun Fire E6900 with Sun JVM)
322,719 SPECjbb2005 bops, 40,340 SPECjbb2005 bops/JVM (Fujitsu PRIMEQUEST 480 with JRockit)
Also, please review the SPECjbb2005 results page. A quick scan will show that Sun HotSpot holds the record for single and multi-instance results, more than doubling BEA's single JVM result and tripling BEA's multi-instance result. Funny how BEA forgot to mention these results.
http://www.spec.org/jbb2005/results/jbb2005.html
[...] TWO(2) JVMs on a 4 core box. They even use 2 JVMs on a 2-core box. That's absolutely ridiculous. Why would anyone choose to do this? The only reason is they can't beat HotSpot running a single JVM and have difficulty scaling this benchmark on small 2 and 4 core systems. HotSpot could easily beat these multi-instance results, but chances are we won't submit multi-instance SPECjbb2005 on configurations that don't match customer deployments.
(Author's note: Since hindsight is always 20/20, the following is more specific than the above paragraph.)
Now on to the AMD based SPECjbb2005 results referred to in the BEA blog. I'm embarrassed for BEA because they had to use these results to talk about performance. Their 2-way, 2-core result uses TWO(2) JVMs on a 4 core box. They even use 2 JVMs on a 2-core box. That's absolutely ridiculous. Why would anyone choose to do this? The only logical reason is they can't beat HotSpot running a single JVM and have difficulty scaling SPECjbb2005 on small 2 and 4 core systems. HotSpot could easily beat these multi-instance results, but chances are we won't submit multi-instance SPECjbb2005 on configurations that don't match customer deployments.
Here are the latest 2 and 4 core single instance SPECjbb2005 submissions on a Sun Fire X4200 running Windows, Linux, and Solaris.
49,097 SPECjbb2005 bops, 49,097 SPECjbb2005 bops/JVM: Sun Fire X4200 running Solaris 10 x64
47,437 SPECjbb2005 bops, 47,437 SPECjbb2005 bops/JVM: Sun Fire X4200 running Windows 2003 Server
43,076 SPECjbb2005 bops, 43,076 SPECjbb2005 bops/JVM: Sun Fire X4200 running Red Hat EL 4
Fine print SPEC disclosure: SPECjbb2005 Sun Fire X4200 on Solaris 10 (2 chips, 4 cores, 4 threads) 49,097 bops, 49,097 bops/JVM, SPECjbb2005 Sun Fire X4200 on Windows 2003 Server (2 chips, 4 cores, 4 threads) 47,437 bops, 47,437 bops/JVM, SPECjbb2005 Sun Fire X4200 on Red Hat EL 4 (2 chips, 2 cores, 2 threads) 43,076 bops, 43,076 bops/JVM, Fujitsu Limited PrimeQuest 480 (32 chips, 32 cores, 32 threads) 322,719 bops, 40,340 bops/JVM. SPECjbb2005 Sun Fire E6900 on Solaris 10 (24 chips, 32 cores, 32 threads) 342,578 bops, 28,548 bops/JVM.
SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 22, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.

Feb 22 2006, 10:59:35 PM EST Permalink

20060217 Friday February 17, 2006
Java Performance: Solaris 10 x86 vs. Linux
Solaris 10 screams running Java. Competitive benchmarks do a good job highlighting this; just take a look at the latest SPECjbb2005 and SPECjappserver2004 results. I have noticed some fundamental differences in "Out of the Box" tuning when comparing Solaris and Linux. When running Java server applications, Solaris 10 default tuning is general purpose and tuned for moderate thread counts, similar to a time shared system. This in many ways is an indication of the maturity of the platform. Linux, on the other hand, is specifically tuned for high thread counts, and performance suffers when running low thread counts.
A good example of this behavior can be seen by comparing SPECjbb2005 results. Below are two results run on the exact same hardware, differing only in the OS and minor JVM tuning (the heap tuning has minimal performance impact).
SPECjbb2005 on Sun Fire X4200 running Solaris 10 Update 1: 49,097 SPECjbb2005 bops, 49,097 SPECjbb2005 bops/JVM
SPECjbb2005 on Sun Fire X4200 running Red Hat EL 4: 43,076 SPECjbb2005 bops, 43,076 SPECjbb2005 bops/JVM
Running SPECjbb2005 on identical hardware with optimal tuning parameters, Solaris 10 is 14% faster than Linux (49,097 / 43,076 is roughly 1.14). SPECjbb2005 on small x64 hardware runs only a moderate number of threads; in the above example the peak application thread count is 8.
What tuning can be applied when running high thread counts on Solaris 10 x86? Here are two quick tuning steps you can try with your application.
1. If you're running many threads and performing socket I/O, try libumem.so. When launching your application within a shell script, set the following environment variable:
    LD_PRELOAD=/usr/lib/libumem.so;export LD_PRELOAD
2. Tune the Solaris scheduler. Simple scheduler tuning can yield significant performance gains, especially with highly threaded, short lived applications.
Try the FX scheduling class:
    priocntl -c FX -e java class_name
Try the IA scheduling class:
    priocntl -c IA -e java class_name
Every application is different, and true performance is always defined by each individual running their own application. If you run into problems or have questions about Java on Solaris performance, visit the java.net performance forum or feel free to send me a comment.
Fine print SPEC disclosure: SPECjbb2005 Sun Fire X4200 on Solaris 10 (2 chips, 4 cores, 4 threads) 49,097 bops, 49,097 bops/JVM, SPECjbb2005 Sun Fire X4200 on Red Hat EL 4 (2 chips, 2 cores, 2 threads) 43,076 bops, 43,076 bops/JVM.
SPEC™ and the benchmark name SPECjbb2005™ are trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of February 17, 2006. For the latest SPECjbb2005 benchmark results, visit http://www.spec.org/osg/jbb2005.
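If you launch the application from a Java-based harness rather than a shell script, the two tuning steps in this entry can be combined with ProcessBuilder. This is only a sketch: the class name SolarisTunedLauncher and its build method are invented for illustration, the libumem path and the priocntl arguments are copied from the commands in this entry, and it only makes sense on a Solaris system where priocntl is on the PATH.

```java
// Sketch: start "java <mainClass>" under a chosen Solaris scheduling class
// with libumem preloaded. SolarisTunedLauncher is a hypothetical helper.
final class SolarisTunedLauncher {

    /** Builds (but does not start) the tuned launch command. */
    static ProcessBuilder build(String schedulingClass, String mainClass) {
        // Same argument order as shown in the entry: priocntl -c FX -e java class_name
        ProcessBuilder pb = new ProcessBuilder(
                "priocntl", "-c", schedulingClass, "-e", "java", mainClass);
        // Equivalent of: LD_PRELOAD=/usr/lib/libumem.so; export LD_PRELOAD
        // The variable is inherited by the java process that priocntl executes.
        pb.environment().put("LD_PRELOAD", "/usr/lib/libumem.so");
        pb.inheritIO(); // pass the child's stdout/stderr through to this process
        return pb;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: SolarisTunedLauncher <FX|IA> <main class>");
            return;
        }
        Process child = build(args[0], args[1]).start();
        System.exit(child.waitFor());
    }
}
```

Measure each scheduling class with your own workload before settling on one; as noted above, every application behaves differently.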

Feb 17 2006, 04:13:10 PM EST Permalink Comments [4]

20060215 Wednesday February 15, 2006
Java SE 6 Beta is Released!
Hey Look, Java SE 6 ("Mustang") has gone Beta! http://java.sun.com/javase/6/download.jsp Huge performance improvements, slick client improvements (love the font smoothing!), and a plethora of other features make this our best beta release to date. Give it a try and let us know what you think. As always, please let us know if you run into issues or regressions. Go to the Java SE 6 Regressions Challenge Page if you identify a regression for a chance to win a Sun Ultra 20 Workstation. For performance issues and questions visit the java.net performance forum.

Feb 15 2006, 09:55:39 AM EST Permalink

20060214 Tuesday February 14, 2006
Java SE Tuning Tip: Server Ergonomics on Windows
J2SE 5.0 Server Ergonomics is not on by default on Windows. The basic reasoning here is that Windows is largely a client platform and automatic server tuning may negatively impact startup performance. We are revisiting this for Mustang, but for now do the following to enable server ergonomics on Windows:
1. Specify JVM tuning options equivalent to server ergonomics:
    java -server -Xmx1g -XX:+UseParallelGC
2. Check that the server VM is in use by printing the JVM version:
    $ java -server -Xmx1g -XX:+UseParallelGC -version
    java version "1.6.0-rc"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.6.0-rc-b69)
    Java HotSpot(TM) Server VM (build 1.6.0-rc-b69, mixed mode)
If you see "Server VM", you're ready to test.
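The same check can be made from inside a running application, which is handy when the launch command is buried in a service wrapper. The sketch below reads the standard java.vm.name system property and looks for the "Server VM" text that appears in the version banner above; the class name ServerVmCheck is made up for this example, and matching on that substring is an assumption about how the VM names itself.

```java
// Sketch: report whether the current JVM identifies itself as a server VM.
// ServerVmCheck is an illustrative name, not part of the JDK.
final class ServerVmCheck {

    /** True when the VM name contains "Server VM", as in the -version banner. */
    static boolean isServerVm(String vmName) {
        return vmName != null && vmName.contains("Server VM");
    }

    public static void main(String[] args) {
        // java.vm.name is a standard system property,
        // for example "Java HotSpot(TM) Server VM".
        String vmName = System.getProperty("java.vm.name");
        System.out.println(vmName + " -> "
                + (isServerVm(vmName) ? "server VM in use" : "not the server VM"));
    }
}
```

Run it with the same options as your application (for example java -server -Xmx1g -XX:+UseParallelGC ServerVmCheck) so that it reports on the VM your application will actually get.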

Feb 14 2006, 10:44:05 AM EST Permalink Comments [2]

20060202 Thursday February 02, 2006
SPECjbb2005: A Valid Representation of Java Server Workloads
I was reading some of the other blogs at Sun and noticed some entertaining comments on BMSeer's blog, in particular the comments on the entry titled Sun head-to-head wins again: SPECjbb2005. Specifically, the set of comments is from Robin (basspetersen@yahoo.com). Robin apparently works for or has a close association with HP. Hello Robin, I hope you are reading this.
Robin doesn't feel that SPECjbb2005 represents real world Java server applications and workloads, mostly because it doesn't stress the network or I/O subsystems. I strongly disagree and feel that SPECjbb2005 is a valid representation of Java server workloads and has already had a significant impact on JVM and Java SE performance.
Here are a few quotes from Robin's comments:
"It looks like HP is the only company smart enough to stay out of this benchmark game, with no relevance in the real world." ... "JBB pretends to measure the server-side performance of Java runtime environments but it is not at all representative of a real workload. Running unrealistic workloads to measure performance is a disservice to customers."
This statement is a bit naive. SPECjbb2005 has significant features that highlight its relevance to real world workloads.
First, garbage collection is part of the measurement interval. SPECjbb2000 called System.gc() before each measurement interval to ease the impact of GC on the score. This was somewhat necessary to have the benchmark scale back in 2000; that is not the case now. Garbage collection is fully a part of this benchmark, and large GC pauses significantly impact benchmark scores.
Second, XML DOM L3 is part of the benchmark, with 20% of the workload in DOM tree creation and manipulation. Parsing is not included in order to avoid I/O bottlenecks.
Third, the benchmark must run with thread counts (warehouses) 2X the number of hardware threads on the system. A 4-way must run to 8 warehouses. A 32-way must run 64 warehouses.
When did managing 64 threads become trivial and not impacted by system performance?
Fourth, many of the optimizations and performance work that started with SPECjbb2005 had direct impact on customer and Java EE benchmark performance. Take a look at the latest SPECjappserver2004 world record.
BEA WebLogic Server 9.0 on Sun Fire T2000 Cluster running Sun J2SE 5.0_06
Sun's HotSpot J2SE 5.0_06 was the JVM for this benchmark result, the same JVM which currently holds many, many major performance records on SPECjbb2005. If performance optimizations targeted for SPECjbb2005 have direct impact on Java EE benchmarking, how again is SPECjbb2005 irrelevant?
"In my opinion HP does not want to give credit to a bad benchmark by publishing results. Why should they give you the satisfaction of jumping off the bridge after you? Clearly HP thinks the benchmark is not important."
HP was on the core development team of SPECjbb2005. Take a look at one of my first blog entries announcing SPECjbb2005. Why would HP think a benchmark was unimportant or irrelevant when they put resources into the development of the benchmark?
Fifth, I/O and network were purposely left out of the benchmark to concentrate on JVM, OS, and hardware performance. The benchmark heavily stresses the memory subsystem with large Java heaps and high memory allocation counts. The OS needs to manage many threads and possibly many processes effectively for high performance. SPECjbb2005 stresses JVM, OS, and memory; it is a complete system benchmark concentrating on Java server performance.
Lastly, I would like to see HP submit SPECjbb2005 numbers. Competition leads to innovation and performance optimization that benefits customers. Chances are HP is plugging away, working to improve their HotSpot implementation and preparing for the day they will submit a result.
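To make the third point concrete, here is a small Java sketch of the warehouse arithmetic described in this entry: the run must reach twice the number of hardware threads, so a 4-way runs to 8 warehouses and a 32-way runs to 64. The class WarehouseRamp and its methods are illustrative only and are not part of the benchmark kit; stepping the ramp one warehouse at a time, and using availableProcessors() as a stand-in for the hardware thread count, are assumptions made for the example.

```java
// Sketch of the warehouse ramp implied by the "2X hardware threads" rule.
// WarehouseRamp is a hypothetical helper, not SPECjbb2005 code.
final class WarehouseRamp {

    /** Highest warehouse (thread) count the run must reach. */
    static int maxWarehouses(int hardwareThreads) {
        if (hardwareThreads <= 0) {
            throw new IllegalArgumentException("hardwareThreads must be positive");
        }
        return 2 * hardwareThreads;
    }

    /** Warehouse counts from 1 up to the maximum, one step at a time (assumed). */
    static int[] ramp(int hardwareThreads) {
        int[] counts = new int[maxWarehouses(hardwareThreads)];
        for (int i = 0; i < counts.length; i++) {
            counts[i] = i + 1;
        }
        return counts;
    }

    public static void main(String[] args) {
        // ASSUMPTION: availableProcessors() approximates the hardware thread count.
        int hardwareThreads = Runtime.getRuntime().availableProcessors();
        System.out.println("hardware threads: " + hardwareThreads
                + ", warehouses to run: 1.." + maxWarehouses(hardwareThreads));
    }
}
```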

Feb 02 2006, 09:46:18 AM EST Permalink

20060201 Wednesday February 01, 2006
Three new SPECjbb2005 World Records: Sun Fire x64 Servers and J2SE 5.0_06
Sun Fire x64 Servers and J2SE 5.0_06 deliver 3 world records running SPECjbb2005 on Linux, Windows, and Solaris.
Sun Fire X4100 (First link dated 1/30/2006)
Sun Fire X4200 (First link dated 1/30/2006)
The results have just been approved at SPEC; here are the links to the results.
Sun Fire X4100 running Solaris 10 x86 (64-bit)
Sun Fire X4100 running Windows 2003 Server (64-bit)
Sun Fire X4100 running Red Hat EL 4 (64-bit)
Sun Fire X4200 running Solaris 10 x86 (64-bit)
Sun Fire X4200 running Windows 2003 Server (64-bit)
Sun Fire X4200 running Red Hat EL 4 (64-bit)

Feb 01 2006, 04:46:37 PM EST Permalink