BM Seer Unofficial thoughts from an anonymous Sun employee

Sun SPARC Enterprise T5120/T5220 World Record Single-JVM Single Chip Performance

Wednesday Oct 10, 2007

The Sun SPARC Enterprise T5120 and Sun SPARC Enterprise T5220, each with a 1.4 GHz UltraSPARC T2 processor, obtained the best single-JVM, single-chip results on the SPECjbb2005 server-side Java benchmark.

The Sun SPARC Enterprise T5120 and Sun SPARC Enterprise T5220 servers, each equipped with a single UltraSPARC T2 processor at 1.4 GHz, delivered a World Record single-JVM, single-chip result of 170153 SPECjbb2005 bops, 170153 SPECjbb2005 bops/JVM. The Sun SPARC Enterprise T5120 and the Sun SPARC Enterprise T5220 each consumed an average of 468 Watts of power to obtain this result.

The Sun SPARC Enterprise T5120 and T5220 servers beat all single-JVM results from Dell and HP, and all single-JVM results from IBM on 8 or fewer cores. These benchmarks are easy to run, and these are big companies, so why not publish on their latest systems? Frayed knot? :)

The Sun SPARC Enterprise T5120 and T5220 servers' single-JVM 64-bit SPECjbb2005 result is within 11% of the performance of the multi-JVM result, highlighting the flexibility of the UltraSPARC T2 and Sun HotSpot JVM technology.

The Sun T5220 server (single UltraSPARC T2) was within 3% of the performance of the multi-JVM 4-core 4.7GHz IBM p570 (POWER6) result of 175,474 SPECjbb2005 bops, 87737 SPECjbb2005 bops/JVM. The Sun T5220 server has 2.2x better power-performance and 4.3x better SWaP than the IBM 4-core p570.

The Sun T5220 server (single UltraSPARC T2) demonstrated 7% better performance than the Dell PowerEdge 6950 result of 159,382 SPECjbb2005 bops, 39846 SPECjbb2005 bops/JVM which used four 2.8GHz dual-core Opteron processors. The Sun T5220 server has 1.4x better power-performance and has 2.8x better SWaP.

The Sun T5120 server (single UltraSPARC T2) demonstrated 8% better performance than the 4-socket HP rx6600 (1.6 GHz Itanium 2 DC) result of 158174 SPECjbb2005 bops, 39544 SPECjbb2005 bops/JVM. The Sun T5120 server has 2.6x better power-performance and 18x better SWaP than the HP rx6600.

The Sun T5120 server (single UltraSPARC T2) demonstrated 2.1x better performance than the 2-socket HP rx2660 result of 80884 SPECjbb2005 bops, 80884 SPECjbb2005 bops/JVM. The Sun T5120 server has 2.5x better power-performance and has 5x better SWaP.

The Sun T5120 server (single UltraSPARC T2) demonstrated 52% better performance than the Dell PowerEdge 860 result of 112,092 SPECjbb2005 bops, 112092 SPECjbb2005 bops/JVM, which used a 2.4GHz quad-core Xeon processor. The Sun T5120 server delivers better performance in the same 1 RU rack space.

The Sun T5120 server (single UltraSPARC T2) demonstrated 1.9X better performance over the 2-core 4.7GHz IBM p570 (POWER6) result of 88,089 SPECjbb2005 bops, 88,089 SPECjbb2005 bops/JVM. The Sun SPARC Enterprise T5120 has 4.3x better power-performance and has 17x better SWaP.

The SWaP metric is a server efficiency ratio that combines system performance, power, and space consumption on a specific benchmark. (SWaP = Perf / [ Space (RU) x Watts ])

Power-performance is computed as watt/performance. Since power-performance is related to price/performance they are both calculated with performance in the denominator.
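As a worked check of the two definitions above, here is a minimal Python sketch that recomputes the ratios quoted for the Sun T5120 against the 2-core IBM p570. It uses only the bops, Watts, and RU figures given in this post; the function and variable names are illustrative and not part of any SPEC tooling.

```python
def swap(bops, rack_units, watts):
    """SWaP = Perf / (Space (RU) x Watts); bigger is better."""
    return bops / (rack_units * watts)


def watts_per_bop(bops, watts):
    """Power-performance as watt/performance; smaller is better."""
    return watts / bops


# Figures quoted in this post: (bops, rack units, average Watts).
sun_t5120 = (170153, 1, 468)
ibm_p570_2core = (88089, 4, 1040)

# Raw performance ratio: Sun bops divided by IBM bops.
perf_ratio = sun_t5120[0] / ibm_p570_2core[0]

# Power-performance advantage: IBM watts/bop divided by Sun watts/bop.
power_ratio = (
    watts_per_bop(ibm_p570_2core[0], ibm_p570_2core[2])
    / watts_per_bop(sun_t5120[0], sun_t5120[2])
)

# SWaP advantage: Sun SWaP divided by IBM SWaP.
swap_ratio = swap(*sun_t5120) / swap(*ibm_p570_2core)

print(f"performance ratio:           {perf_ratio:.1f}x")
print(f"power-performance advantage: {power_ratio:.1f}x")
print(f"SWaP advantage:              {swap_ratio:.1f}x")
```

Because watt/performance puts performance in the denominator, the advantage is the competitor's value divided by the Sun value, whereas for SWaP it is the other way around.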

SPECjbb2005 Performance Chart (ordered by performance)

bops : SPECjbb2005 Business Operations per Second (bigger is better)

System       Date   Chips, Cores, Threads  GHz       Type          bops    JVMs  bops/JVM
Sun T5120    10/07  1, 8, 64               1.4 GHz   US T2         192055  8     24007
Sun T5220    10/07  1, 8, 64               1.4 GHz   US T2         192055  8     24007
IBM p570     6/07   2, 4, 4                4.7 GHz   POWER6        175474  2     87737
Sun T5120    10/07  1, 8, 64               1.4 GHz   US T2         170153  1     170153
Sun T5220    10/07  1, 8, 64               1.4 GHz   US T2         170153  1     170153
Dell PE6950  1/07   4, 8, 8                2.8 GHz   Opteron DC    159382  4     39846
HP rx6600    11/06  4, 8, 16               1.6 GHz   Itanium 2 DC  158174  4     39544
Dell PE860   1/07   1, 4, 4                2.4 GHz   Xeon          112092  1     112092
IBM p570     6/07   1, 2, 2                4.7 GHz   POWER6        88089   1     88089
HP rx2660    1/07   2, 4, 4                1.6 GHz   Itanium 2     80884   1     80884
IBM p505Q    8/06   2, 4, 8                1.65 GHz  POWER5+       63544   2     31772

Complete benchmark results may be found at the SPEC benchmark website http://www.spec.org.

Benchmark Description

SPECjbb2005 (Java Business Benchmark) measures the performance of a Java-implemented application tier (server-side Java). The benchmark is based on order processing in a wholesale supplier application. The performance of the user tier and the database tier is not measured in this test. The metrics given are the number of SPECjbb2005 bops (Business Operations per Second) and SPECjbb2005 bops/JVM (bops per JVM instance).
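The relationship between the two metrics can be sketched in a few lines of Python: bops/JVM is the total bops divided by the number of JVM instances. The round-half-up rule below is an assumption chosen because it reproduces the figures in the chart; the helper name is hypothetical and not part of the benchmark kit.

```python
def bops_per_jvm(total_bops, jvm_count):
    """Average bops per JVM instance, rounded half up (assumed rounding rule)."""
    # Integer arithmetic avoids floating-point surprises with .5 cases.
    return (2 * total_bops + jvm_count) // (2 * jvm_count)


# (system, total bops, JVM count) taken from the performance chart in this post.
chart_rows = [
    ("Sun T5120, multi-JVM", 192055, 8),
    ("IBM p570, 4-core", 175474, 2),
    ("Sun T5120, single-JVM", 170153, 1),
    ("Dell PE6950", 159382, 4),
    ("HP rx6600", 158174, 4),
]

for system, total_bops, jvm_count in chart_rows:
    print(f"{system}: {bops_per_jvm(total_bops, jvm_count)} bops/JVM")
```

For a single-JVM run the two metrics are identical, which is why the 170153 result appears in both columns.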

Disclosure Statement:

SPECjbb2005 Sun SPARC Enterprise T5120 (1 chip, 8 cores) 192055 SPECjbb2005 bops, 24007 SPECjbb2005 bops/JVM, Sun SPARC Enterprise T5220 (1 chip, 8 cores) 192055 SPECjbb2005 bops, 24007 SPECjbb2005 bops/JVM, Sun SPARC Enterprise T5120 (1 chip, 8 cores) 170153 SPECjbb2005 bops, 170153 SPECjbb2005 bops/JVM, Sun SPARC Enterprise T5220 (1 chip, 8 cores) 170153 SPECjbb2005 bops, 170153 SPECjbb2005 bops/JVM, Sun SPARC Enterprise T5120/T5220 results submitted to SPEC for review, Dell PowerEdge 860 (1 chip, 4 cores) 112092 SPECjbb2005 bops, 112092 SPECjbb2005 bops/JVM, Dell PowerEdge 6950 (4 chips, 8 cores) 159382 SPECjbb2005 bops, 39846 SPECjbb2005 bops/JVM, HP rx2660 (2 chips, 4 cores) 80884 SPECjbb2005 bops, 80884 SPECjbb2005 bops/JVM, HP rx6600 (4 chips, 8 cores) 158174 SPECjbb2005 bops, 39544 SPECjbb2005 bops/JVM, IBM p570 (1 chip, 2 cores) 88089 SPECjbb2005 bops, 88089 SPECjbb2005 bops/JVM, IBM p570 (2 chips, 4 cores) 175474 SPECjbb2005 bops, 87737 SPECjbb2005 bops/JVM, IBM p505Q (2 chips, 4 cores) 63544 SPECjbb2005 bops, 31772 SPECjbb2005 bops/JVM, SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 10/08/2007 on www.spec.org.

Power References: The 2-core IBM p570 POWER6 system requires 4 RU, or 4 times the rack space of a Sun T5120, and consumes on average 1040 Watts of power. The 4-core IBM p570 POWER6 system requires 4 RU, or twice the rack space of a Sun T5220, and consumes on average 1040 Watts of power. IBM p6 570 2-core & 4-core power specifications from 80% of maximum reported power consumption published here, 06/07/07, posted here. The IBM p505Q POWER5+ system requires 1 RU of rack space and consumes on average 320 Watts of power. IBM p505 power specifications from 80% of maximum reported power consumption published in "Facts and Features Report", 03/27/06, posted here. The HP rx2660 server requires 2 RU of rack space and consumes on average 563+ Watts of power. HP rx2660 power consumption estimated by taking 70% of the maximum reported power dissipation, documented here on 03/23/07: Actual HP power specs here. The HP rx6600 server requires 7 RU of rack space and consumes on average 1163 Watts of power. HP rx6600 power consumption estimated by taking 70% of the maximum reported power dissipation, documented here on 03/23/07. The Dell PowerEdge 6950 requires 4 RU, or twice the rack space of a Sun T5220. The Dell PowerEdge 6950 power consumption from here. Prices based on publicly documented list prices.

Results Summary

Results
SPECjbb2005 bops: 170153
SPECjbb2005 bops/JVM: 170153
Reference Date: Oct 9, 2007
Systems: Sun SPARC Enterprise T5120, T5220
Total Number Processors: 1
Processor/GHz of Server: UltraSPARC T2 1.4 GHz
Operating System: Solaris 10 8/07
JVM: Java HotSpot(TM) 32-Bit Server, Version 1.6.0_04-p
If you want to know, I got most of this from internal documentation; if I had to figure all of this out and follow these rules, I'd go crazy. So many thanks to all of the Sun people that I plagiarize.

Comments:

Why is the number of JVMs not always equal to the number of cores?

Posted by f on October 11, 2007 at 04:08 AM PDT #

The number of JVMs is a software configuration choice; it doesn't have to match the hardware configuration. However, it does take a very good JVM to scale a single JVM across lots of cores and threads.

Posted by BM Seer on October 11, 2007 at 09:36 AM PDT #

Big fish in a small pond - 1 socket 1 JVM, or comparing to 4-14 month old competitive results. Try playing with the big boys - look at recent results, like 3Q2007. The third SPECjbb2005 on the 3Q list beats yours in power and performance in the same 1U space.

To use your nomenclature: The Dell PE1950 server (dual Xeon 5365), with a result of 238472 SPECjbb2005 bops, 59618 bops/JVM demonstrated 24% better performance than the multi-JVM Sun T5220 server result. With a lower power consumption of 455 watts, the PE1950 has 1.27x better power-performance and has 1.27x better SWaP than the Sun T5220.

Speaking of which, your nomenclature for comparisons is confusing and misleading. There are 2 valid forms to use:
1. Server A has 1.4x the performance of Server B
2. Server A has 40% better performance than Server B
Your form, Server A has 1.4x better performance than Server B, can be interpreted as either 140% better or 40% better. Stick with the other forms.
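The difference between the two unambiguous forms can be shown with a short Python sketch, using the Sun T5120 and HP rx2660 single-JVM results from the post as sample inputs. The function names are illustrative only.

```python
def times_the_performance(a_bops, b_bops):
    """Form 1: 'Server A has N x the performance of Server B'."""
    return a_bops / b_bops


def percent_better(a_bops, b_bops):
    """Form 2: 'Server A has P% better performance than Server B'."""
    return (a_bops / b_bops - 1.0) * 100.0


# Single-JVM results quoted in the post.
sun_t5120 = 170153
hp_rx2660 = 80884

ratio = times_the_performance(sun_t5120, hp_rx2660)
pct = percent_better(sun_t5120, hp_rx2660)

print(f"Form 1: {ratio:.1f}x the performance")
print(f"Form 2: {pct:.0f}% better performance")
```

The two forms always differ by exactly 100 percentage points, which is why a phrase that mixes them ("N x better") leaves the reader guessing which one is meant.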

For those who don't follow SPEC results, the reason hardware vendors don't publish single-JVM numbers is that hardware vendors are interested in showing off their hardware, not the scalability of the JVM they use. There is no big conspiracy theory as BMSeer implies. The only vendors that care about single vs multi JVM performance are the JVM software vendors: BEA, IBM and Sun.

Have you corrected Prof. Patterson on his incorrect (in your opinion) usage of performance/watt in the speech you linked to in your blog? Still haven't fixed SWaP to put watts on top? Keep tilting at windmills!

Still waiting for SPECfp_rate2006 results on any T2 processor. Maybe IBM marketing will buy one and publish it for you.

Posted by Xray on October 13, 2007 at 11:07 AM PDT #

XRay,

Clearly you just like to attack without facts. There are lots of results posted here that you simply chose to ignore.

So you pick the 1-chip, 1-JVM UltraSPARC T2 result (slower because it scales a single JVM more than any other 1- and 2-chip system). Then you compare it against a Xeon result that didn't even use the default BIOS settings (yes, to get a big increase they turned pre-fetch off); see AMD's posting on Xeon, which clearly pointed that out and was talked about in this blog.

First, you ignore the SPECweb2005 results, where a single UltraSPARC T2 beat FOUR-SOCKET QUAD-CORE XEONS by 22%; see http://blogs.sun.com/bmseer/entry/truly_outstanding_webserving_sun_sparc for details. Yep, the Sun SPARC Enterprise T5220 has 2.1x better power-performance and 4.1x better SWaP. Since you don't like that conclusion, I expect you to say next why SPECweb is flawed.

Second, SPECfp_rate2006 was posted on the T2 processor: a single-chip world record, beating POWER6 and quad-core Xeon.
http://blogs.sun.com/bmseer/entry/spec_cpu2006_ultrasparc_t2_exactly

Third, where is the quad-core Xeon on SPEComp? It's an easy benchmark to publish, or does that show a weak point in your favorite chip?
http://blogs.sun.com/bmseer/entry/ultrasparc_t2_specomp_floating_point

Fourth, where is the published Dell, HP, IBM, etc. actual MEASURED power on all of these benchmarks? Why do they only quote measured power on slower-GHz CPUs with small memory, but performance on the fastest-GHz, large-memory configs (also posted about here)?

Fifth, Sun also posted multi-JVM results; again, why are you so selective in your attacks? Your Xeon result is also posted here: http://blogs.sun.com/bmseer/entry/whole_cup_of_ultrasparc_t2

Sixth, watt/performance looks like $/perf, or do you just like big marketing numbers? The metric is constructed this way so that it points out that a Xeon with the latest power-saving features running at low utilization is 2x to 5x more wasteful than running a power-efficient processor like the UltraSPARC T2 at decent utilization.

Seventh, we compared against all results that were published. Why aren't there current results on all of the benchmarks for your favorite chip?

Eighth, XRay, by the way, in the past you argued against Woodcrest drawing 500 watts at a reasonable memory size. This has been well verified. Now everyone knows you were not basing your arguments on facts.

Disclosure Statement:

Sun SPARC Enterprise T5220 (8 cores, 1 chip) 37001 SPECweb2005, submitted to SPEC for review on October 8, 2007. HP ProLiant DL580 G5 (16 cores, 4 chips). 30261 SPECweb2005. SPEC, SPECint reg tm of Standard Performance Evaluation Corporation. Sun result submitted to SPEC, other results from www.spec.org as of 9/27/07. Sun SPARC Enterprise T5220/T5120 (UltraSPARC T2, 1 chip, 8 cores), 78.5 SPECint_rate2006, IBM p570 (POWER6, 1 chip, 2 cores), 60.9 SPECint_rate2006, HP DL380 G5 (X5365, 1 chip, 4 cores), 61.3 SPECint_rate2006, Sun SPARC Enterprise T5220 (UltraSPARC T2, 1 chip, 8 cores), 62.3 SPECfp_rate2006.
SPEC, SPECfp reg tm of Standard Performance Evaluation Corporation. Sun result submitted to SPEC, other results from www.spec.org as of 9/27/07. Sun SPARC Enterprise T5220/T5120 (UltraSPARC T2, 1 chip, 8 cores), 62.3 SPECfp_rate2006. IBM p570 (POWER6, 1 chip, 2 cores), 58.0 SPECfp_rate2006, Sun SPARC Enterprise T5220 (UltraSPARC T2, 1 chip, 8 cores), 62.3 SPECfp_rate2006. HP DL380 G5 (X5365, 1 chip, 4 cores), 38.8 SPECfp_rate2006. SPECweb2005. SPEC, SPECcpu, SPECfp, SPECint, SPECjbb, SPECweb, SPEComp are reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of Oct 9, 2007.

Posted by BM Seer on October 14, 2007 at 11:16 AM PDT #

Xray

Even IBM is seeing it as watt/performance, or watts/unit-of-work, as they say...
"can be used to calculate watts per unit"
http://www-03.ibm.com/press/us/en/pressrelease/22433.wss

When will Dell get this fact? Or, for that matter, HP? Intel? AMD?

I think Prof. Patterson was just being colloquial.

Posted by BM Seer on October 16, 2007 at 11:51 AM PDT #
