BM Seer Facts & Questions from an Anonymous Sun Source

World Record Single-Chip UltraSPARC T2 SPECint_rate2006 Performance with gccfss

Thursday Feb 14, 2008

World's fastest chip. The Sun SPARC Enterprise T5120 server, running at 1.4 GHz, delivered a world-record single-chip result of 83.9 SPECint_rate2006. Please remember that this is about system and chip performance, not about things inside a chip (like perf/transistor, perf/NAND-gate, perf/metal-layers, perf/thread, perf/bore, oops, perf/core, perf/silicon grain).

The Sun SPARC Enterprise T5120 using the GCC for SPARC Systems (gccfss) compiler topped all competitors' single-chip results, including beating the IBM p570 single-chip 4.7 GHz POWER6 result by 38%. IBM used its proprietary compiler, XL C/C++.

The Sun SPARC Enterprise T5120 using the GCC for SPARC Systems (gccfss) compiler beat the performance of the HP DL360 G5 with a single chip quad-core 3.16GHz Xeon X5460 by 15%.
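The percentages quoted above follow directly from the peak scores listed in the disclosure statement. A minimal sketch of that arithmetic (the function name `percent_faster` is only illustrative and is not part of any SPEC tool):

```python
def percent_faster(score: float, baseline: float) -> float:
    """Return how much higher `score` is than `baseline`, in percent."""
    return (score / baseline - 1.0) * 100.0


# Peak SPECint_rate2006 scores quoted in this post.
T5120_GCCFSS = 83.9
IBM_P570 = 60.9
HP_DL360_G5 = 73.0

if __name__ == "__main__":
    print(f"vs IBM p570:    {percent_faster(T5120_GCCFSS, IBM_P570):.0f}%")
    print(f"vs HP DL360 G5: {percent_faster(T5120_GCCFSS, HP_DL360_G5):.0f}%")
```

Rounded to whole percentages, this gives the 38% and 15% figures used in the text.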

The gccfss compiler lets developers use Sun's SPARC optimization tools along with the popular gcc coding conventions, delivering performance that was not previously possible without time-consuming code changes.

For more information on gccfss and how to get it, go to http://cooltools.sunsource.net/gcc/.

Sun also submitted results on the SPECfp_rate2006 benchmark suite using just a single disk. The Sun SPARC Enterprise T5120 server, running at 1.4 GHz, delivered a result of 62.1 SPECfp_rate2006.

This result was run on a single disk. The previously reported result used the electrical equivalence rule of SPEC, but the configuration used more disks than fit in a T5120. This result shows that the performance is comparable, regardless of the disk configuration.

SPEC CPU2006 Performance Charts - bigger is better, selected recent results, see www.spec.org for complete results

SPECint_rate2006

System              Processor      GHz   Chips  Cores  Threads  Peak  Base
T5120 (gccfss 4.2)  UltraSPARC T2  1.4   1      8      64       83.9  76.2
T5220 (gccfss 4.2)  UltraSPARC T2  1.4   1      8      64       83.2  75.6
T5120/T5220         UltraSPARC T2  1.4   1      8      64       78.5  73.0
T5220 (gccfss)      UltraSPARC T2  1.4   1      8      64       78.0  71.6
Asus P5E3           Intel QX9650   3.0   1      4      4        76.7  69.0
HP DL360 G5         Intel X5460    3.16  1      4      4        73.0  62.1
Asus P5E3           Intel QX6850   3.0   1      4      4        69.1  64.9
Dell T3400          Intel QX9650   3.0   1      4      4        68.8  61.4
IBM p 570           Power6         4.7   1      2      4        60.9  53.2
Fujitsu RX100       Intel X3210    2.13  1      4      4        54.4  48.0

SPECfp_rate2006

System            Processor      GHz   Chips  Cores  Threads  Peak  Base
T6320             UltraSPARC T2  1.4   1      8      64       62.3  58.1
T5120/T5220       UltraSPARC T2  1.4   1      8      64       62.3  57.9
T5120 (one disk)  UltraSPARC T2  1.4   1      8      64       62.1  57.9
IBM p 570         Power6         4.7   1      2      4        58.0  51.5
Intel Asus P5E3   Intel QX9650   3.0   1      4      4        52.0  49.9
Dell T3400        Intel QX9650   3.0   1      4      4        47.2  44.9
HP DL360 G5       Intel X5460    3.16  1      4      4        44.5  41.3

Results as of 12 Feb 2008 from www.spec.org.

Benchmark Description

SPEC CPU2006 is SPEC's most popular benchmark. It measures:

  • "Rate" - system performance of CPUs, memory, compiler
  • "Speed" - single thread performance of chip, memory, compiler; not intended to stress multi-core designs

The strategic metrics include:

  • SPECint_rate2006: throughput for 12 integer benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
  • SPECfp_rate2006: throughput for 17 floating point benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

There are "base" variants of both the above metrics that require more conservative compilation, such as using the same flags for all benchmarks.

    Disclosure Statement:

    SPEC, SPECint, SPECfp reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of 2/12/08. Sun SPARC Enterprise T5120 gccfss (UltraSPARC T2, 1 chip, 8 cores), 83.9 SPECint_rate2006. IBM p570 (POWER6, 1 chip, 2 cores), 60.9 SPECint_rate2006. HP DL360 G5 (Xeon X5460, 1 chip, 4 cores), 73.0 SPECint_rate2006. Sun SPARC Enterprise T5120 (UltraSPARC T2, 1 chip, 8 cores), 62.1 SPECfp_rate2006.

    Results Summary

    Results
    Reference Date: Feb 12, 2008
    System: Sun SPARC Enterprise T5120
    Processor: Sun UltraSPARC T2, 1.4 GHz
      83.9 SPECint_rate2006
      62.1 SPECfp_rate2006
    Software: Solaris 10, Sun Studio 12 Compiler gccfss


    UltraSPARC T2 World-Record Single-chip SPECint_rate2006

    Wednesday Jan 09, 2008

    Sun has set a single-chip SPECint_rate2006 world record using the gccfss compiler from Sun. For other benchmarks that show single-chip UltraSPARC T2 beating multi-chip servers see: http://www.sun.com/servers/coolthreads/t5220/benchmarks.jsp & http://www.sun.com/servers/coolthreads/t5120/benchmarks.jsp

    The Sun SPARC Enterprise T5220 server, running at 1.4 GHz, delivered a world record single chip result of 83.2 SPECint_rate2006.

    The Sun SPARC Enterprise T5220 using the GCC for SPARC Systems (gccfss) compiler topped all competitors' single-chip results, including beating the IBM p570 single-chip 4.7 GHz POWER6 result by 37%. IBM used its proprietary compiler, XL C/C++.

    The Sun SPARC Enterprise T5220 using the GCC for SPARC Systems (gccfss) compiler beat the performance of the HP DL360 G5 with a single chip quad-core 3.16GHz Xeon X5460 by 14%.

    The gccfss compiler lets developers use Sun's SPARC optimization tools along with the popular gcc coding conventions, delivering performance that was not previously possible without time-consuming code changes. For more information on gccfss and how to get it, go to http://cooltools.sunsource.net/gcc/.

    SPECint_rate2006 selected Performance (bigger is better)

    System              Processor      GHz   Chips  Cores  Threads  Peak      Base
    T5220 (gccfss 4.2)  UltraSPARC T2  1.4   1      8      64       83.2      75.6
    T5120/T5220         UltraSPARC T2  1.4   1      8      64       78.5      73.0
    T5220 (gccfss)      UltraSPARC T2  1.4   1      8      64       78.0      71.6
    Asus P5E3           Intel QX9650   3.0   1      4      4        76.7      69.0
    HP DL360 G5         Intel X5460    3.16  1      4      4        73.0      62.1
    Asus P5E3           Intel QX6850   3.0   1      4      4        69.1      64.9
    T5120/T5220         UltraSPARC T2  1.2   1      8      64       68.9 est  63.8 est
    Dell T3400          Intel QX9650   3.0   1      4      4        68.8      61.4
    IBM p 570           Power6         4.7   1      2      4        60.9      53.2
    Fujitsu RX100       Intel X3210    2.13  1      4      4        54.4      48.0

    Benchmark Description

    SPEC CPU2006 has two basic measures:

    • "Rate" - system performance of CPUs, memory, compiler
    • "Speed" - single-thread performance; not intended to stress multi-core designs

    The strategic metrics include:

    • SPECint_rate2006: throughput for 12 integer benchmarks derived from real applications such as perl, gcc, XML processing, and pathfinding
    • SPECfp_rate2006: throughput for 17 floating point benchmarks derived from real applications, including chemistry, physics, genetics, and weather.

    There are "base" variants of both the above metrics that require more conservative compilation, such as using the same flags for all benchmarks.

    Disclosure Statement:

    SPEC, SPECint reg tm of Standard Performance Evaluation Corporation. Sun result submitted to SPEC, other results from www.spec.org as of 1/8/08. Sun SPARC Enterprise T5220 gccfss (UltraSPARC T2, 1 chip, 8 cores), 83.2 SPECint_rate2006. Sun SPARC Enterprise T5220 (UltraSPARC T2, 1 chip, 8 cores), 78.5 SPECint_rate2006. IBM p570 (POWER6, 1 chip, 2 cores), 60.9 SPECint_rate2006. HP DL360 G5 (Xeon X5460, 1 chip, 4 cores), 73.0 SPECint_rate2006.

    Sun Configuration Details

    Results
    Reference Date: Jan 7, 2008
    System: Sun SPARC Enterprise T5220
    Processor: Sun UltraSPARC T2, 1.4 GHz
      83.2 SPECint_rate2006
    Software: Solaris 10, Sun Studio 12 Compiler gccfss


    IBM wrong at top of their voice and IBM JS22 power6 blade

    Thursday Nov 08, 2007

    Even IBM expert blogs continue to trumpet bad data. I pointed out the problems in the previous posting; please see BM Seer's "IBM JS22 Power6 blades not the performance you think".

    I realize that over-excited IBM marketing types may tend to overstate and make bad claims, but IBM's performance blog expert too?

      IBM's Stahl writes: Equally compelling is the analysis that one rack of IBM's new POWER6 processor-based blades is so powerful when virtualized that it can replace many non-virtualized racks of Sun's latest V490 servers, potentially saving a ton in energy costs.
    Interesting that in a previous posting she complained of an HP comparison because they used their own test. In the IBM JS22 comparison, IBM comes up with their own baseless 'Sun is three times worse' derating factor.

    I guess IBM uses something learned from Lucy in the Peanuts cartoon: "If you can't be right, .... be wrong at the top of your voice!"

    So in how many more blogs, ads, and other marketing material will we see this comparison? My guess is LOTS at the top of their voice!


    IBM JS22 Power6 blades not the performance you think

    Thursday Nov 08, 2007

    Why do I attack IBM a lot? It is not because they are a competitor (I don't attack all competitors); it is because IBM continues to play such funny games to confuse the marketplace.

    Here is a prime example of an unfair, slanted comparison from an IBM press release:

      Calculations show that one rack of IBM's new POWER6 processor-based blades is so powerful when virtualized that it can replace 23 non-virtualized racks of Sun's latest V490 servers, potentially saving more than $200,000 per year in energy costs alone. (3)

    Why didn't IBM compare against the Sun Blade T6320 (UltraSPARC T2) blades? IBM would have lost if they properly compared to Sun.

    IBM gamed the comparison again by pitting new servers against older Sun servers. IBM claimed that one rack of JS22 BladeCenter H systems can replace 180 V490s. IBM based its JS22 claim on a *3X* utilization advantage over the V490 that it pulled out of the air: the JS22 was assumed to be 60% utilized, while the V490 was assumed to be 20% utilized. A report from Alinean Consulting was cited as the source for the utilization comparisons (note: I guess you can pay for anything). This is the same fake 3x difference in utilization that IBM tried to pull in the POWER6 announcement, which BM Seer shot down in "IBM rewrites history, OK footnotes to clearly show bogus calculations".

    As a reminder, as always with IBM, read the written material very carefully: http://blogs.sun.com/bmseer/entry/careful_reading_shows_a_lot

    Here is the IBM footnote:

      (3) The number of IBM BladeCenter JS22 servers required to replace 180 Sun Fire V490 was calculated based on SPECint_rate2006 results. The V490 SPECint_rate2006 result is for a 2.1GHz system with 4 chips and 2 cores per chip. It has a result of 78.0. The V490 result can be found at www.spec.org. It is current as of October 23, 2007. The JS22 result for the same benchmark is for a 4.0GHz system with 2 chips and 2 cores per chip. It has a result of 84.7. That result was submitted on November 6, 2007. It will also be posted on www.spec.org. The cumulative capacity of these servers is estimated to be the SPECint_rate2006 result for one server multiplied by the number of servers. A virtualization factor of 3X was applied to the JS22 virtualization scenario using utilizations derived from studies conducted by Alinean available at http://www-935.ibm.com/services/us/cio/optimize/opt_wp_ibm_systemp.pdf. That is the utilization rate for the non-virtualized V490 is estimated to be 20% and the utilization rate for the virtualized JS22 is estimated to be 60%. Using these assumptions, the cumulative capacity of the 56 JS22 servers at 60% is greater than the cumulative capacity of the 180 V490 servers at 20% utilization.
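The footnote's arithmetic can be replayed to see how much of the claim rests on the assumed utilization rates. A rough sketch using only the numbers from the footnote (the function names are illustrative, and "capacity" is simply servers x score x utilization as the footnote defines it):

```python
import math

JS22_SCORE = 84.7  # SPECint_rate2006 per JS22, from the footnote
V490_SCORE = 78.0  # SPECint_rate2006 per V490, from the footnote


def capacity(servers: int, score: float, utilization: float) -> float:
    """Cumulative capacity as the footnote defines it."""
    return servers * score * utilization


def js22_needed(v490_count: int, js22_util: float, v490_util: float) -> int:
    """Smallest JS22 count whose capacity matches the given V490 fleet."""
    target = capacity(v490_count, V490_SCORE, v490_util)
    return math.ceil(target / (JS22_SCORE * js22_util))


if __name__ == "__main__":
    print("JS22 side:", capacity(56, JS22_SCORE, 0.60))
    print("V490 side:", capacity(180, V490_SCORE, 0.20))
    print("JS22 needed with the 3X factor:", js22_needed(180, 0.60, 0.20))
    print("JS22 needed at equal utilization:", js22_needed(180, 0.20, 0.20))
```

With the 60%/20% assumption the count comes out at 56 blades, matching the footnote; with the same utilization on both sides it takes 166 blades to match 180 V490s.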


    Sun SPARC Enterprise T6320 SPECint_rate2006 Single-Chip World Record

    Thursday Oct 11, 2007

    The Sun Blade 6000 chassis can run Solaris, Linux, Windows, and VMware running on single and multi-core processors by Sun, AMD, and Intel, in one chassis. It is a 10-blade, 10RU Sun Blade 6000 Chassis.

    Sun has announced single chip World Record results for SPECint_rate2006. This result was run on the Sun Blade T6320 blade module which uses the 1.4 GHz UltraSPARC T2 processor.

    The Sun SPARC Enterprise T5220 server, running at 1.4 GHz, beat all single chip results running SPECint_rate2006 with a result of 78.5.

    The Sun Blade T6320 system beats the best single IBM 4.7 GHz dual-core POWER6 processor result by 29%.

    The Sun Blade T6320 system beat the best published single 3 GHz Xeon quad-core by 28% on SPECint_rate2006.

    There are no single quad-core Opteron results published for SPECint_rate2006.

    SPEC SPECint_rate2006 Performance - bigger is better, selected recent results, please see www.spec.org for complete results.

    System         Processor      GHz   Chips  Cores  Threads  Peak  Base
    T6320          US T2          1.4   1      8      64       78.6  73.1
    T5120/T5220    US T2          1.4   1      8      64       78.5  73.0
    HP DL360 G5    Intel Xeon QC  3.0   1      4      4        61.3  53.8
    IBM p 570      Power6         4.7   1      2      4        60.9  53.2
    Fujitsu RX300  Intel Xeon     2.66  1      4      4        52.8  50.5
    Yes, the UltraSPARC T2 result differences between these platforms are in the noise; SPEC allows run-to-run variations. Notice that these results differ by only 0.127% to 0.137% (yes, near 1/10 of 1%).
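The 0.127% and 0.137% figures are the peak and base gaps between the T6320 and T5120/T5220 rows in the table above. A small sketch of how they can be checked (the helper name is illustrative):

```python
def relative_difference_pct(a: float, b: float) -> float:
    """Difference between two scores as a percentage of the smaller one."""
    low, high = sorted((a, b))
    return (high - low) / low * 100.0


if __name__ == "__main__":
    # T6320 vs T5120/T5220, peak and base SPECint_rate2006 from the table.
    print(f"peak: {relative_difference_pct(78.6, 78.5):.3f}%")
    print(f"base: {relative_difference_pct(73.1, 73.0):.3f}%")
```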

    Results as of 9 Jan 2008 from www.spec.org.

    Benchmark Description

    SPEC CPU2006 is made up of two suites of benchmarks, CFP2006 and CINT2006. CFP2006 targets floating-point performance, while CINT2006 targets integer performance.

    Each suite has two different measures. First is the CPU measure, which is the performance on the suite as a single stream. This can be either a single thread or automatic compiled parallel run. This measure is further defined by base and optimized runs. Base uses the same compiler flags for all kernels, where optimized is allowed to use different compiler flags for each kernel. Results are compared against a baseline system run that was standardized by SPEC.

    The second measure is Rate. It is a measure of how many CPU measures can be run at a time. Typically, it is run as n processes on n processors. It shows how well the same job mix can run on a system under some load. It also is run as a base and optimized set of results.
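As a simplified sketch of how a rate score is put together: for each benchmark, the rate is the number of copies times the reference time divided by the elapsed time, and the overall metric is the geometric mean of those per-benchmark rates. The benchmark names and timings below are made up for illustration and are not SPEC data; official runs also take the median of repeated runs, which this sketch skips.

```python
import math


def rate_ratio(copies: int, reference_seconds: float, elapsed_seconds: float) -> float:
    """Per-benchmark rate: copies x reference time / elapsed time."""
    return copies * reference_seconds / elapsed_seconds


def overall_rate(ratios: list[float]) -> float:
    """Geometric mean of the per-benchmark rates."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


if __name__ == "__main__":
    # Hypothetical (reference seconds, elapsed seconds) pairs, NOT real SPEC data.
    runs = {
        "bench_a": (9000.0, 4500.0),
        "bench_b": (8000.0, 8000.0),
        "bench_c": (6000.0, 1500.0),
    }
    copies = 4  # number of concurrent copies, also hypothetical
    ratios = [rate_ratio(copies, ref, elapsed) for ref, elapsed in runs.values()]
    print(f"overall rate: {overall_rate(ratios):.2f}")
```

The geometric mean keeps one unusually fast benchmark from dominating the overall score, which is why a single number can summarize the whole suite.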

    Disclosure Statement:

    SPEC, SPECint reg tm of Standard Performance Evaluation Corporation. Results as of 9 Jan 2008 from www.spec.org. Sun Blade T6320 (UltraSPARC T2, 1 chip, 8 cores), 78.6 SPECint_rate2006. IBM p570 (POWER6, 1 chip, 2 cores), 60.9 SPECint_rate2006. HP DL360 G5 (X5365, 1 chip, 4 cores), 61.3 SPECint_rate2006.

    Results Summary

    Results
    Reference Date: 9 Jan 2008
    System: Sun Blade T6320
    Processor: Sun UltraSPARC T2, 1.4 GHz
      78.6 SPECint_rate2006
    Software: Solaris 10, Sun Studio 12 Compiler


    UltraSPARC T2: more floating-point performance

    Tuesday Aug 07, 2007

    This posting has more about floating-point on the Sun UltraSPARC T2. The previous posting discussed SPECfp_2006 scores and the open-sourcing of the UltraSPARC T2 design.

    In the UltraSPARC T2 there are eight floating-point units that are well suited for scientific applications. Based upon preliminary runs the Sun UltraSPARC T2 processor at 1.4 GHz beats all single chip scores showing 14230(est)/15081(est) SPECompMbase2001/SPECompMpeak2001.

    How do these preliminary runs (we must use the term "estimated" by SPEC rules) compare to SPECompMbase2001/SPECompMpeak2001 scores?

    • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip IBM p520 POWER5+ 1.9GHz processor published result by 85%.
    • ...Sun is waiting for POWER6 4.7GHz results, maybe UltraSPARC T2 results will scare IBM from ever publishing a single-chip result?
    Benchmark description:

    The SPEComp benchmark is a test of the performance of 9 high-performance computing applications. It is used to compare the performance of shared memory servers. All C/C++ and Fortran applications in this suite use the OpenMP programming model, which provides a portable, scalable model for developing parallel applications for platforms ranging from the desktop to the supercomputer.

    The OpenMP Application Program Interface (API) supports multi-platform shared-memory parallel programming in C/C++ and Fortran on all architectures, from the largest Unix servers to the small Windows NT platforms.

    Disclosure statement:

    All UltraSPARC T2 SPEC CPU metrics quoted are from full “reportable” runs, but are nevertheless designated as “estimates” because they use preproduction systems. SPEC, and SPEComp registered trademarks of Standard Performance Evaluation Corporation. Sun UltraSPARC T2 1.4GHz (1 chip, 8 cores, 64 threads) 14230 (est)/ 15081 (est) SPECompMbase2001/SPECompMpeak2001. Competitive results from www.spec.org as of August 6, 2007. IBM p520 1.9GHz (1 chip, 2 cores, 4 threads) published 8141/8174 SPECompMbase2001/SPECompMpeak2001.


    Performance of the new Sun UltraSPARC T2

    Tuesday Aug 07, 2007

    Sun UltraSPARC T2 is an amazing chip and very fast! The UltraSPARC T2 features several industry firsts:

    • Eight cores and 64 threads
    • Integrated 10 GbE networking and I/O
    • Dedicated cryptographic and floating-point units per core
    • 10 cryptographic functions supported with hardware
    • open-source design: www.opensparc.net

    Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz beat all single-chip scores, showing 78.3 est. SPECint_rate2006. How do these preliminary runs (we must use the term "estimated" by SPEC rules) compare to other SPECint_rate2006 results?

    • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip IBM POWER6 4.7GHz processor published result by 29%.
    • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip estimated scores of the AMD Barcelona by 23%.
    • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip published scores of the 2.66GHz Intel X5355 (Clovertown) by 48%.
    Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz beat all single-chip scores, showing 62.3 est. SPECfp_rate2006. How do these preliminary runs (we must use the term "estimated" by SPEC rules) compare to other SPECfp_rate2006 results?
    • These Sun UltraSPARC T2 1.4GHz processor scores beat the best published single-chip IBM POWER6 4.7GHz processor result by 7%.
    • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip estimated scores of the AMD Barcelona by 11%.
    • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip published scores of the 2.66GHz Intel X5355 (Clovertown) by 66%.

    Performance per core doesn't matter and GHz doesn't matter; what matters is the number of cores, efficiency, and the design of the chip! Competitors are saying that UltraSPARC T2 is proprietary... this makes no sense. Both UltraSPARC T1 and UltraSPARC T2 are open-source designs (www.opensparc.net). You do not find the latest designs of Intel, AMD, or IBM available as open source.

    Disclosure Statement:

    All Sun UltraSPARC T2 SPEC CPU metrics quoted are from full “reportable” runs, but are nevertheless designated as “estimates” because they use preproduction systems. SPEC, SPECint, SPECfp registered trademarks of Standard Performance Evaluation Corporation. Sun UltraSPARC T2 1.4GHz (1 chip, 8 cores, 64 threads) 78.3 est. SPECint_rate2006, 62.3 est. SPECfp_rate2006. Competitive results from www.spec.org as of August 6, 2007. IBM POWER6 4.7GHz (1 chip, 2 cores, 4 threads) 60.9 SPECint_rate2006, 58.0 SPECfp_rate2006. AMD Barcelona 2.6 GHz (1 chip, 4 cores, 4 threads) 63.9 est. SPECint_rate2006, 56.3 est. SPECfp_rate2006. Barcelona estimates based upon "The Register" article stating 2.6GHz quad is 21% and 50% faster than Intel 2.66 system. Fujitsu RX300 Intel X5355 2.66 GHz (1 chip, 4 cores, 4 threads) 52.8 SPECint_rate2006, 37.5 SPECfp_rate2006.
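The Barcelona integer estimate in the disclosure can be reproduced by scaling the published Intel score by the speedup quoted from The Register. A rough sketch of that derivation, for the integer case only (the helper name is illustrative):

```python
def scaled_estimate(published_score: float, speedup_pct: float) -> float:
    """Estimate a score by applying a claimed percentage speedup to a published one."""
    return published_score * (1.0 + speedup_pct / 100.0)


INTEL_X5355_INT = 52.8  # published SPECint_rate2006, from the disclosure
T2_INT_EST = 78.3       # estimated SPECint_rate2006 for the UltraSPARC T2

# Barcelona is stated to be 21% faster than the Intel 2.66 GHz system on integer.
barcelona_int = round(scaled_estimate(INTEL_X5355_INT, 21.0), 1)
t2_lead_pct = (T2_INT_EST / barcelona_int - 1.0) * 100.0

if __name__ == "__main__":
    print(f"Barcelona SPECint_rate2006 estimate: {barcelona_int}")
    print(f"UltraSPARC T2 lead over that estimate: {t2_lead_pct:.0f}%")
```

This reproduces the 63.9 estimate and the 23% lead quoted in the bullets.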

    Reminder: The Niagara 2 score was obtained from a full "reportable" SPEC run, but is designated as an "estimate" because a pre-production system was used.

    ...more information on the UltraSPARC T2 later today.


    Sun Blade X6250 & Sun Studio 12 x86 World Record

    Wednesday Jun 13, 2007

    Sun Blade X6250 Delivers a pair of x86 SPEC CPU2006 integer performance World Records:

    Sun Blade X6250 (Dual-Core Intel Xeon 5160) running Solaris 10 and using the Sun Studio 12 compiler delivered the best x86 result for the SPECint2006 benchmark.

    Sun Blade X6250 (Dual-Core Intel Xeon 5160) using Solaris 10 and Studio 12 delivered an x86 4-core world record on SPECint_rate2006.

    The Sun Blade X6250 server had a SPECint2006 result of 21.0 and a SPECint_rate2006 result of 65.0. The advanced features of the freely available Sun Studio 12 compiler were critical for getting this level of performance on the Sun Blade X6250.

    The Sun Blade X6250 is only 3% slower than the peak SPECint2006 score of the very expensive new IBM POWER6 p570, which was recently announced. SPECint2006 is a single job stream, so let's now turn to comparing 4-thread results: in this case the Sun Blade X6250 is 7% faster than the peak SPECint_rate2006 score of the very expensive new IBM POWER6 p570 (both IBM and Sun at 4 threads). Oh, and remember that clock rate is no longer how you compare systems: the Sun Blade X6250 runs at 3 GHz and the IBM POWER6 at 4.7 GHz. CPU frequency is basically irrelevant; it is CPU and system architecture that matters!

    SPEC CPU2006 Landscape - bigger is better, selected recent results

    SPECint2006

    System                   Processor           GHz  Chips  Cores  Peak  Base
    IBM p570 (power6)        Power6              4.7  1      1      21.6  17.8
    Sun Blade X6250          Intel Xeon 5160     3.0  2      4      21.0  -
    Supermicro X7DB8+ board  Intel Xeon 5160     3.0  2      4      20.8  18.9
    Sun Ultra 40 M2          AMD Opteron 2222SE  3.0  2      4      16.1  -
    SPECint_rate2006

    System             Processor              GHz  Chips  Cores  Threads/Copies  Peak  Base
    Sun Blade X6250    Intel Xeon 5160        3.0  2      4      4               65.0  -
    Supermicro X7DB8+  Intel Xeon 5160        3.0  2      4      4               64.9  60.0
    IBM p570 (Power6)  Power6                 4.7  1      2      4               60.9  53.2
    Sun Ultra 40 M2    AMD Opteron 2222SE     3.0  2      4      4               60.4  -
    Fujitsu BX620 S3   Xeon 5160 (Woodcrest)  3.0  2      4      4               59.4  56.7

    Results as of 06 Jun 2007 from www.spec.org.

    Benchmark Description

    SPEC CPU2006 is made up of two suites of benchmarks, CFP2006 and CINT2006. CFP2006 targets floating-point performance, while CINT2006 targets integer performance.

    Each suite has two different measures. First is the CPU measure, which is the performance on the suite as a single stream. This can be either a single thread or automatic compiled parallel run. This measure is further defined by base and optimized runs. Base uses the same compiler flags for all kernels, where optimized is allowed to use different compiler flags for each kernel. Results are compared against a baseline system run that was standardized by SPEC.

    The second measure is Rate. It is a measure of how many CPU measures can be run at a time. Typically, it is run as n processes on n processors. It shows how well the same job mix can run on a system under some load. It also is run as a base and optimized set of results.

    Disclosure Statement:

    SPEC, SPECint, SPECfp reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org or from IBM public websites as of 6/06/07. Sun Blade X6250 (Intel Xeon 5160, 2chips/4cores, Solaris 10) 65.0 SPECint_rate2006; Sun Blade X6250 (Intel Xeon 5160, 2chips/4cores, Solaris 10) 21.0 SPECint2006; IBM System p 570 (POWER6, 1chip/1core, AIX 5L v5.3) 21.6 SPECint2006; IBM System p 570 (POWER6, 4 threads, 1chip/2cores, AIX 5L v5.3) 60.9 SPECint_rate2006.

    System Configuration

    Results
    Reference Date: Jun 06, 2007
    System: Sun Blade X6250
    SPEED: 16GB memory 8x2GB
    RATE : 32GB memory 8x4GB
    X6250 21.0 SPECint2006
    X6250 65.0 SPECint_rate2006
    Total Number Processors: 2 x Intel Xeon 5160
    Software: Solaris 10 11/06, Sun Studio 12 Compiler, MicroQuill's SmartHeap Library v7.4

    See Also

  • All Benchmark results on Sun Blade 6000 Blade Server

    TPC-C Reminder

    Monday Apr 30, 2007

    When Sun had the world record we said it was too simplistic and old, and that was years ago. TPC-C has problems, and IBM has heavily tuned for it like this. Why does IBM still point to this 14+ year old benchmark? Why do they avoid new benchmarks with the latest GHz full-system IBM p595 on:

    • SPECjbb2005?
    • SPECint_rate2006?
    • SPECfp_rate2006?
    • Linpack?
    • SPECint_2006?
    • SPECfp_2006?
    • ....the list goes on...
    Doesn't IBM want fair comparisons? I guess IBM would just be beaten by Sun in performance and $/perf so they want to avoid comparisons.

    It is funny that last year I egged HP on about SPECjbb2005, "why no results?" Someone commented that HP thinks it is a bad benchmark, so they won't publish on it. Now HP has the top result. Changed their tune?

    Notice how this is different: when Sun established a world-record TPC-C result, it told the world the benchmark was too simplistic, and it is sticking to that. The world has become a lot more complicated in the past 7 years and computing has evolved a lot, so we won't go back to something that was created 13 years ago. Sun never quotes the 23-year-old Dhrystone benchmark anymore either. :)

    The press and analysts overwhelmingly see TPC-E as the successor to the simplistic 14-year-old TPC-C.

    IBM's TPC-C "tuning"(?) that won't apply to anything in the real world

    June 2005 Interview with Bruce Lindsay (IBM Fellow) at http://www.sigmod.org/sigmod/record/issues/0506/p71-column-winslet.pdf

      "And the good news is that about 40-70% of the stuff we do in performance tuning actually ends up helping end users."

    This means that 30% to 60% of IBM's TPC-C tunings don't help users.

    Really, beyond the huge disk size of the large TPC-C results (which has a lot to do with TPC-C being 14 years old), the quote below points to tuning that is legal but seems a bit too "tricky" for my taste...

      "We get down to the level of worrying about the physical column order in the table so the reference columns are near each other, minimizing cache misses during fetching. This is feasible in the TPC-C benchmark because there are only five tables and only ten to fifteen columns in each table. In a more realistic application, where there are many more queries to be considered, the tables are typically much, much wider, in the 80 to 100 column range; and there are dozens if not thousands of tables. Then this kind of analysis is no longer practical." Bruce Lindsay, IBM Fellow

    For those who may not remember, IBM didn't even end the EOL'ed SPECint_rate2000 on a high note. See: http://www.spec.org/cpu2000/results/rint2000.html and search for "1644" and "1513"

    Various footnotes:

    "It's well-understood in the technical communities that TPC-C no longer represents current customer workloads since the transaction load that its models are made of are small, primitive and disconnected transactions. While this model was acceptable for the workloads of the late 1980s, it misses the mark..." Sun's World Record TPC-C press release, August 2000

    Disclosure Statement

    The TPC-C result referenced above was the fastest overall performance world record as of August 31, 2000. Sun Enterprise 10000 server (Starfire) running Sybase Adaptive Server Enterprise (ASE), 156,873.03 transactions per minute (tpmC), $48.81 price/tpmC, available February 28, 2001. A full disclosure report and executive summary are available through the TPC Web site located at www.tpc.org.


    Sun UltraSPARC IV+ tops POWER5+, SPEC CPU2006 2.1GHz/1.95GHz

    Wednesday Apr 04, 2007

    Sun is publishing SPECint_rate2006 results. SPECint_rate2006 is a standard benchmark with lots of results. Everyone is now publishing on this benchmark... but... IBM doesn't publish results on SPECint_rate2006. Are they avoiding looking slower than Sun? At many customer sites Sun is beating the performance of IBM Power5+ systems by a lot more than the comparisons we've shown in standard benchmarks. IBM avoids them, but Bull fortunately exposes some of the Power5+ performance issues.

    For those who may not remember, IBM didn't even end the EOL'ed SPECint_rate2000 on a high note: http://www.spec.org/cpu2000/results/rint2000.html, search for "1644" and "1513"

    The Sun Fire server family with the new UltraSPARC IV+ 2.1GHz and 1.95GHz processors continues to show good performance.

    • The Sun Fire E4900 running with 1.95GHz UltraSPARC IV+ processors eclipsed the performance of the 2.2GHz Power5+ Bull Escala PL1650R+ on the SPECint_rate2006 benchmark.
    • The Sun Fire E2900 running with 1.95GHz UltraSPARC IV+ processors delivered nearly the same performance as the 2.2GHz Power5+ Bull Escala PL1650R+, but only needed 24 jobs compared to 32 jobs for the Bull system on the SPECint_rate2006 benchmark.
    • This result uses the Sun Studio 12 compiler suite, which is designed to utilize the new cache structure of the US-IV+ processor.
    • The Sun Fire UltraSPARC family of servers continues to show its versatility, accepting its third major chip revision, starting with the single-core US-III, continuing with the dual-core US-IV and now with the advanced dual-core design of the US-IV+.
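One way to read the "24 jobs compared to 32 jobs" point is as throughput per concurrent job. A small sketch using the peak SPECint_rate2006 scores and job counts quoted in this post (the per-job figure is only an illustration, not a SPEC metric):

```python
# (peak SPECint_rate2006, concurrent jobs) as quoted in this post.
SYSTEMS = {
    "Sun Fire E4900": (220, 24),
    "Sun Fire E2900": (208, 24),
    "Bull Escala PL1650R+": (217, 32),
}


def per_job(score: float, jobs: int) -> float:
    """Throughput contributed by each concurrent job (illustrative only)."""
    return score / jobs


if __name__ == "__main__":
    for name, (score, jobs) in SYSTEMS.items():
        print(f"{name}: {per_job(score, jobs):.2f} per job")
```

By this rough measure each job on the Sun systems contributes roughly 8.7 to 9.2 points, against about 6.8 on the Bull system.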

    Competitive Landscape

    SPEC CPU2006 Performance Charts - bigger is better, selected recent results, please see www.spec.org for complete results.

    SPECint_rate2006

    System                  Processor  GHz   Chips  Cores  Threads  Peak  Base
    HP Integrity Superdome  Itanium 2  1.6   64     128    128      1648  1534
    Sun Fire E25K           US-IV+     1.95  72     144    144      1230  1120
    Sun Fire E25K           US-IV+     1.8   72     144    144      904   759
    Sun Fire E25K           US-IV+     1.95  48     96     96       833   762
    HP Integrity Superdome  Itanium 2  1.6   32     64     64       824   770
    Sun Fire E20K           US-IV+     1.95  36     72     72       608   556
    HP Integrity rx8640     Itanium 2  1.6   16     32     32       416   385
    Sun Fire E6900          US-IV+     1.95  24     48     48       410   372
    Sun Fire E20K           US-IV+     1.95  24     48     48       409   376
    Sun Fire E6900          US-IV+     1.95  16     32     32       288   261
    Sun Fire E4900          US-IV+     1.95  12     24     24       220   200
    Bull Escala PL1650R+    POWER5+    2.2   8      16     32       217   197
    HP Integrity rx8640     Itanium 2  1.6   8      16     16       208   193
    Sun Fire E2900          US-IV+     1.95  12     24     24       208   191
    Sun Fire V890           US-IV+     2.1   8      16     16       154   141
    Sun Fire X4600 M2       Opt 8218   2.6   8      16     16       135   114
    HP Integrity rx6600     Itanium 2  1.6   4      8      8        102   94.7
    Sun Fire V490           US-IV+     2.1   4      8      8        78.0  71.7

    Results as of 2 April 2007 from www.spec.org.

    Benchmark Description

    SPEC CPU2006 is made up of two suites of benchmarks, CFP2006 and CINT2006. CFP2006 targets floating-point performance, while CINT2006 targets integer performance.

    Each suite has two different measures. The first is the CPU measure, which is the performance on the suite run as a single stream. This can be either a single-threaded run or a run automatically parallelized by the compiler. This measure is further divided into base and optimized (peak) runs. Base uses the same compiler flags for all kernels, whereas optimized is allowed to use different compiler flags for each kernel. Results are expressed relative to a baseline system run that was standardized by SPEC.

    The second measure is Rate. It measures how many copies of the CPU measure can be run at a time. Typically, it is run as n processes on n processors. It shows how well the same job mix runs on a system under load. It is also run as base and optimized sets of results.
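
    As a rough illustration of the two measures described above, the sketch below combines per-benchmark ratios with a geometric mean, which is how SPEC CPU2006 suite scores are formed. The benchmark names and all timings are made up for illustration (they are not real results), the function names are hypothetical, and the rate formula (copies x reference time / elapsed time) is a simplified reading of the SPEC run rules rather than a replacement for them.

```python
import math


def speed_ratio(ref_seconds, run_seconds):
    """Ratio for a single-stream run: reference time over measured time."""
    return ref_seconds / run_seconds


def rate_ratio(copies, ref_seconds, elapsed_seconds):
    """Throughput ratio when `copies` concurrent copies finish in `elapsed_seconds`."""
    return copies * ref_seconds / elapsed_seconds


def geometric_mean(values):
    """Geometric mean, computed via logs to avoid overflow on long lists."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical reference and measured times in seconds (assumptions, not SPEC data).
REF = {"bench_a": 9000.0, "bench_b": 8000.0, "bench_c": 12000.0}
RUN = {"bench_a": 900.0, "bench_b": 1000.0, "bench_c": 1000.0}
COPIES = 8  # assumed: one copy per hardware thread, each taking the same time

speed_score = geometric_mean(speed_ratio(REF[b], RUN[b]) for b in REF)
rate_score = geometric_mean(rate_ratio(COPIES, REF[b], RUN[b]) for b in REF)

print(f"speed-style score: {speed_score:.2f}")
print(f"rate-style score:  {rate_score:.2f}")
```

    With these assumed numbers the rate-style score is exactly COPIES times the speed-style score, because every copy is assumed to run as fast as a single stream. On real hardware the copies compete for memory and cache, which is what the Rate measure is meant to expose.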

    Disclosure Statement:

    SPEC and SPECint are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 4/2/07. Sun Fire E4900 US-IV+ @1.95GHz (12 chips, 24 cores, 24 threads), 220 SPECint_rate2006. Bull Escala PL1650R+ Power5+ @2.2GHz (8 chips, 16 cores, 32 threads), 217 SPECint_rate2006.

    SPEC and SPECint are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 4/2/07. Sun Fire E2900 US-IV+ @1.95GHz (12 chips, 24 cores, 24 threads), 208 SPECint_rate2006. Bull Escala PL1650R+ Power5+ @2.2GHz (8 chips, 16 cores, 32 threads), 217 SPECint_rate2006.

    SPEC and SPECint are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 4/2/07. Sun Fire E25K US-IV+ @1.95GHz (72 chips, 144 cores), 1230 SPECint_rate2006. Sun Fire E25K US-IV+ @1.5GHz (72 chips, 144 cores), 904 SPECint_rate2006.
    Submitted Results: Sun Fire E25K/E20K (1.95GHz)
    SPECint_rate2006: 1230 (72 procs)
    SPECint_rate2006: 833 (48 procs)
    SPECint_rate2006: 608 (36 procs)
    SPECint_rate2006: 409 (24 procs)
    Submitted Results: Sun Fire E6900/E4900/E2900 (1.95GHz)
    SPECint_rate2006: 410 (24 procs)
    SPECint_rate2006: 288 (16 procs)
    SPECint_rate2006: 220 (E4900 - 12 procs)
    SPECint_rate2006: 208 (E2900 - 12 procs)
    Submitted Results: Sun Fire V890/V490 (2.1GHz)
    SPECint_rate2006: 154 (8 procs)
    SPECint_rate2006: 78 (4 procs)
    Software: Solaris 10, Sun Studio 12
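
    The submitted E25K/E20K results above invite a quick scaling check. The sketch below is only arithmetic on the four listed scores: it compares each one with perfect linear scaling from the 24-processor result. The function name is hypothetical, and the comparison mixes E20K results (24 and 36 procs) with E25K results (48 and 72 procs), so treat it as a rough indication rather than a controlled measurement.

```python
# SPECint_rate2006 scores copied from the E25K/E20K list above: (procs, score).
E25K_E20K = [(24, 409), (36, 608), (48, 833), (72, 1230)]


def scaling_efficiency(results):
    """Compare each result with perfect linear scaling from the first entry.

    A value of 1.0 means the score grew exactly in proportion to the
    processor count relative to the smallest configuration.
    """
    base_procs, base_score = results[0]
    return {
        procs: (score / base_score) / (procs / base_procs)
        for procs, score in results
    }


efficiency = scaling_efficiency(E25K_E20K)
for procs, eff in efficiency.items():
    print(f"{procs} procs: {eff:.3f} of linear scaling")
```

    All four values land within a few percent of 1.0, which is consistent with near-linear scaling of this benchmark across the configurations listed.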


    Power5+ now off the road?

    Saturday Mar 31, 2007

    IBM lacks Power5+ benchmarks on new and old workloads that everyone else is publishing on. Why are there no latest-GHz, full-system IBM p595 publications on:

    • SPECjbb2005?
    • SPECint_rate2006?
    • SPECfp_rate2006?
    • Linpack?
    • SPECint_2006?
    • SPECfp_2006?
    • ....the list goes on...
    Don't they want comparisons? I hear IBM bloggers still love TPC-C, so is the IBM p595 only suited for that very old (14-year-old) test? The press and analysts overwhelmingly see TPC-E as the successor to the simplistic 14-year-old TPC-C. Seven years ago, when Sun established a world record TPC-C result, Sun told the world the benchmark was too simplistic. It is good to see the rest of the industry catching up. Sun never quotes the 23-year-old Dhrystone benchmark anymore either. :)

    For those who may not remember, IBM didn't even end the EOL'ed SPECint_rate2000 on a high note: see http://www.spec.org/cpu2000/results/rint2000.html and search for "1644" and "1513". Since we're talking history, I should be clear and state that by "1513" I wasn't talking about the year in which Juan Ponce de Leon is definitely known to have sighted what is now the USA and claimed it for Spain. :)


    IBM too tricky for good of others?

    Thursday Feb 22, 2007

    Are IBM's TPC-C results not worthy of belief? Lots of unrealistic optimisations? You never know what you will find when you start searching the web. After yesterday's posting I started looking. Here is info from a June 2005 IBM interview (who knows what they've done since that doesn't benefit users?):
    http://www.sigmod.org/sigmod/record/issues/0506/p71-column-winslet.pdf

      "And the good news is that about 40-70% of the stuff we do in performance tuning actually ends up helping end users. " Bruce Lindsay, IBM fellow

    Ouch! Sun aims for benchmark tuning that end users actually use! Does this explain IBM's over-inflated TPC-C results?

      Q: "Is there any particularly sneaky but still totally legal aspect of TPC-C tuning that you would like to mention?"

      A: "Well, we do things that are very, what should I say? Intense. We get down to the level of worrying about the physical column order in the table so the reference columns are near each other, minimizing cache misses during fetching. This is feasible in the TPC-C benchmark because there are only five tables and only ten to fifteen columns in each table. In a more realistic application, where there are many more queries to be considered, the tables are typically much, much wider, in the 80 to 100 column range; and there are dozens if not thousands of tables. Then this kind of analysis is no longer practical." Bruce Lindsay, IBM fellow

    A good reason to make benchmarks messy and change them often. Is the reason IBM hasn't published SPECint_rate2006 results that they can't do the above?

    We were right with these past postings:
    http://blogs.sun.com/bmseer/entry/ibm_continues_to_abuse_and
    http://blogs.sun.com/bmseer/entry/selective_vision
    ...interesting...


    Promises, promises & IBM

    Thursday Feb 15, 2007

    IBM POWER6 info in a CNET article.

    They say, "The first Power6 systems, lower-end models, are due to arrive midway through 2007."

    So in the meantime, will IBM start publishing the benchmarks they've avoided on the IBM p5 595 POWER5+ any time soon? Or is it just too embarrassing to show SPECjbb2005, SPECint_rate2006, etc. results compared to Sun 1.8GHz US-IV+ systems?

    When do the high-end POWER6 systems start to show? Late 2007 or 2008?


    bucket-o-records SPEC CPU2006 Sun Blade X8420

    Thursday Jan 11, 2007

    The Sun Blade X8420 is 1.9x faster than the best Intel Woodcrest system on SPECint_rate2006 and 2.1x faster than the best Intel Woodcrest system on SPECfp_rate2006. The Sun Blade X8420 is also 22% faster than the 4-way dual-core Itanium2 system on SPECfp_rate2006.

    The Sun Blade X8420 delivered the best result with a SPECint_rate2006 score of 93.1, using the Solaris 10 and Studio 11 combo. The Sun Blade X8420 also delivered the best result among all x86 systems on the SPECfp_rate2006 benchmark, with a score of 87.3.

    SPEC CPU2006 Performance Charts (bigger is better, selected recent results)

    SPECint_rate2006

    System                Processor              GHz   Chips  Cores  Threads  Peak  Base
    Sun Blade X8420       AMD Opteron 8220       2.8   4      8      8        93.1  80.4
    Fujitsu CELSIUS R640  Xeon 5160 (Woodcrest)  3.0   2      4      4        50.3  48.8
    Sun Ultra 40 M2       AMD Opteron 2220SE     2.8   2      4      4        48.8  41.9
    HP DL585              Opteron 854            2.8   4      4      4        46.9  41.4
    Supermicro X7DBE      Xeon 5160 (Woodcrest)  3.0   2      4      4        --    45.2
    Sun Fire X4200        Opteron 285            2.6   2      4      4        42.8  37.8
    Fujitsu RX220         Opteron 280            2.4   2      4      4        40.0  35.7
    Sun Fire X4200        Opteron 256            3.0   2      2      2        26.4  23.1
    HP DL585              Opteron 854            2.8   2      2      2        25.2  22.3
    Dell PrecWork 380     Pentium EE             3.73  1      2      2        --    23.1
    HP DL380 G4           Pentium 4              3.8   2      2      2        --    20.9

    SPECfp_rate2006

    System            Processor                               GHz  Chips  Cores  Threads  Peak  Base
    Sun Blade X8420   AMD Opteron 8220                        2.8  4      8      8        87.3  82.5
    HP rx6600         Itanium2 dual-core                      1.6  4      8      8        71.4  69.1
    HP DL585          Opteron 854                             2.8  4      4      4        49.3  45.6
    FSC CELSIUS R640  Intel Xeon 5160 (Woodcrest), WinXP Pro  3.0  2      4      4        42.5  41.4
    Sun Fire X4200    Opteron 285                             2.6  2      4      4        38.1  36.0

    Results as of 09 Jan 2007 from www.spec.org.
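
    The headline ratios in this post can be re-derived from the peak scores in the two tables above. The sketch below is plain arithmetic on those published numbers; the variable and function names are hypothetical, and "best Woodcrest" is taken to mean the highest Woodcrest peak score listed in each table.

```python
# Peak scores copied from the SPECint_rate2006 and SPECfp_rate2006 tables above.
INT_X8420 = 93.1        # Sun Blade X8420
INT_WOODCREST = 50.3    # Fujitsu CELSIUS R640, Xeon 5160 (Woodcrest)
FP_X8420 = 87.3         # Sun Blade X8420
FP_WOODCREST = 42.5     # FSC CELSIUS R640, Xeon 5160 (Woodcrest)
FP_ITANIUM2 = 71.4      # HP rx6600, 4-way dual-core Itanium2


def times_faster(score, baseline):
    """How many times larger `score` is than `baseline`."""
    return score / baseline


def percent_faster(score, baseline):
    """Percentage by which `score` exceeds `baseline`."""
    return (score / baseline - 1.0) * 100.0


int_vs_woodcrest = times_faster(INT_X8420, INT_WOODCREST)
fp_vs_woodcrest = times_faster(FP_X8420, FP_WOODCREST)
fp_vs_itanium2 = percent_faster(FP_X8420, FP_ITANIUM2)

print(f"SPECint_rate2006 vs Woodcrest: {int_vs_woodcrest:.2f}x")
print(f"SPECfp_rate2006 vs Woodcrest:  {fp_vs_woodcrest:.2f}x")
print(f"SPECfp_rate2006 vs Itanium2:   {fp_vs_itanium2:.1f}% faster")
```

    Rounded to one decimal place, the two Woodcrest ratios match the 1.9x and 2.1x quoted above, and the Itanium2 comparison rounds to the quoted 22%.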

    Benchmark Description

    SPEC CPU2006 is made up of two suites of benchmarks, CFP2006 and CINT2006. CFP2006 targets floating-point performance, while CINT2006 targets integer performance.

    Each suite has two different measures. First is the CPU measure, which is the performance on the suite as a single stream. This can be either a single thread or automatic compiled parallel run. This measure is further defined by base and optimized runs. Base uses the s