BM Seer Facts & Questions from an Anonymous Sun Source

HP's real story about power6

Thursday Apr 17, 2008

HP puts out various "real stories" about competitors. They have an updated one about the new IBM power6 systems. http://h71028.www7.hp.com/ERC/cache/107848-0-0-0-121.aspx

I'll try to comment on some of them:

    Fact 1 they state:
      IBM software experts have admitted that software already tuned for out-of-order version of POWER is, “no [sic] so good for in-order power6 processor.” “Maximizing Application Performance on POWER” IBM Linux on POWER GCC Team Lead, April 19, 2007, page 8, SW_Summit_gcc_and_tool_chain_Peter.pdf
    If you don't understand the issues with "out-of-order", what you can take away is that not every technology that you hear hyped by vendors will give you a true advantage when you look at whole system performance on real applications.

    Facts 2 & 3: show that adding GHz doesn't add delivered performance but does add a disproportionate number of watts. IBM p 595 (POWER6): 27,500 watts max for 64 cores. 27,500 watts / 64 cores = about 430 watts per POWER6 core.

      The max rated system electrical load for the POWER6-595 server has increased nearly 5000 watts over the POWER5-p595 for the same number of processors.(ENUS108-257)
    Then they go on to compare multi-threaded server chips with a 1-job benchmark. I don't know why they didn't compare on the server benchmark SPECint_rate2006; maybe when you test these as servers you see real differences. Let's look at the latest 2-chip results for Itanium2, power6, and UltraSPARC T2 Plus:

    A 2-chip Sun SPARC Enterprise T5240 server, running the UltraSPARC T2 Plus processor at 1.4 GHz, beat the 2-chip IBM 4.7GHz POWER6-based p570 by 29% on the SPECint_rate2006 benchmark, and also beat the 2-chip HP 1.66GHz Itanium-based Integrity rx2660 by 2.5 times on the SPECint_rate2006 benchmark.

    Fact 4: shows IBM is raising its software prices.

    Fact 5: HP states that AIX 6.1 will be needed to more fully exploit POWER6, and then asks: how many ISV applications are certified for AIX 6.1?
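
The watts-per-core figure in Facts 2 & 3 is simple division, and a minimal sketch makes the rounding visible. The function and variable names are illustrative only. The result is whole-system rated power spread evenly over the cores, not a measurement of the processor alone.

```python
# Figures quoted in Facts 2 & 3 above for the IBM p 595 (POWER6).
MAX_SYSTEM_WATTS = 27_500  # maximum rated system electrical load
CORES = 64


def watts_per_core(system_watts: float, cores: int) -> float:
    """Spread the maximum rated system load evenly across the cores."""
    return system_watts / cores


p595_watts_per_core = watts_per_core(MAX_SYSTEM_WATTS, CORES)
print(f"{p595_watts_per_core:.1f} watts per POWER6 core")
```

The exact quotient is a little under 430, which is the rounded figure used in the text.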

For more on the latest SPECint_rate 2006 results see: http://blogs.sun.com/bmseer/entry/2_chip_spec_cpu2006_rate

For more on prices on small 4-core IBM: http://blogs.sun.com/bmseer/entry/some_ibm_power6_actual_prices

I haven't seen anything on IBM p 595 power6 prices 64-core 5GHz, if you have any pointers please post in the comments.

Disclosure statement:
SPEC, SPECint reg tm of Standard Performance Evaluation Corporation. Sun result submitted to SPEC, other results from www.spec.org as of 4/7/08. Sun SPARC Enterprise T5240 (UltraSPARC T2 Plus, 2 chips, 16 cores), 157 SPECint_rate2006; IBM p 570 (POWER6, 2 chips, 4 cores), 122 SPECint_rate2006, HP Integrity rx2660 (Itanium2, 2-chip, 1.66GHz/18MB), 62.8 SPECint_rate2006.
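
The 29% and 2.5 times comparisons follow directly from the three SPECint_rate2006 scores in the disclosure statement. The sketch below redoes that arithmetic; the dictionary keys and the helper name are illustrative, and only the scores come from the disclosure.

```python
# SPECint_rate2006 scores quoted in the disclosure statement above.
results = {
    "Sun T5240 (UltraSPARC T2 Plus)": 157.0,
    "IBM p 570 (POWER6)": 122.0,
    "HP rx2660 (Itanium2)": 62.8,
}


def advantage(winner: float, loser: float) -> float:
    """Return how many times larger the winning score is."""
    return winner / loser


sun = results["Sun T5240 (UltraSPARC T2 Plus)"]
vs_ibm = advantage(sun, results["IBM p 570 (POWER6)"])
vs_hp = advantage(sun, results["HP rx2660 (Itanium2)"])

print(f"Sun vs IBM: {(vs_ibm - 1) * 100:.0f}% higher")
print(f"Sun vs HP:  {vs_hp:.1f} times")
```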

[11] Comments

HP watts up with no specification?

Friday Mar 21, 2008

HP DL 580 G5 = 387 watts at 100% target load

Reading the "HP ProLiant DL580 G5 server posts highest 4P result on the new SPECpower_ssj2008(TM) benchmark" brochure leaves one quite confused. HP does not specify processor GHz or memory size. Why? Take note, SPEC members: you need to require that these be clearly specified in the future, or you will just encourage confusion. An HP DL580 G5 4P QC 1.86GHz Xeon 16GB uses 387 watts at 100% target load!

...or... HP DL 580 G5 = 942 watts at average (uncomparable test)

IBM points to a Principled Technologies paper (by the way, who commissioned that paper?). That paper clearly specifies the memory size, processor GHz, and measured average wattage. An HP DL580 G5 4P QC 2.93GHz Xeon 64GB uses 942 average watts!

I do not present these wattage numbers for comparison, as they are different tests. But what are the real watts of a DL580 G5? Clearly HP isn't telling us what we need to know.

I'd suggest SPEC require better disclosure of information and clearly show effects of processor GHz and memory size. MEMORY SIZE makes a HUGE difference in watts. Wake up world! Again my plea to add power measurements and power-performance metrics to all performance benchmarks at full utilization.

SPEC Disclosure statement

SPECpower_ssj2008: HP ProLiant DL580 G5 (4-chip QC Xeon L7345 1.86GHz, 16GB), 546 overall ssj_ops/watt, 359523 ssj_ops and 387 watts at 100% target load, 255512 ssj_ops and 359 watts at 70% target load, and 71409 ssj_ops and 294 watts at 20% target load. SPEC, SPECpower reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of 12/11/07.

note: if anyone has corrections to the above disclosure, let me know in comment below as quickly as you can and I will correct it immediately. I don't attend SPEC meetings so I don't know all of the rules, but I try my best to write disclosures correctly.
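
The three load levels quoted in the disclosure are enough to sketch how efficiency falls off as load drops. The code below divides ssj_ops by watts at each level; the names are illustrative. It does not try to reproduce the 546 overall ssj_ops/watt figure, because the overall metric also covers load levels (and active idle) that are not quoted here.

```python
# (target load, ssj_ops, average watts) as quoted in the SPEC disclosure above.
load_levels = [
    ("100%", 359_523, 387),
    ("70%", 255_512, 359),
    ("20%", 71_409, 294),
]


def ops_per_watt(ssj_ops: int, watts: int) -> float:
    """Throughput delivered per watt at one load level."""
    return ssj_ops / watts


per_level = {load: ops_per_watt(ops, watts) for load, ops, watts in load_levels}

for load, value in per_level.items():
    print(f"{load:>4} target load: {value:7.1f} ssj_ops/watt")
```

At the 20% target load the system still draws about three quarters of its full-load watts (294 of 387) while delivering about a fifth of the throughput, which is why the load level behind any quoted wattage matters.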

[10] Comments

HP targets some of POWER6's Issues and hype...

Saturday Jul 28, 2007

Sun isn't the only one to see issues with POWER6 marketing; see what HP wrote at: http://h71028.www7.hp.com/ERC/cache/107848-0-0-0-121.html?ERL=true

IBM focuses so much on processor technical details that it makes you wonder what they missed at the system level. Why does everyone else focus on what a full system can do while IBM focuses on what a single core can do?

[2] Comments

Beware the Ides... of May

Tuesday May 15, 2007

Careful reading is often required when vendors make claims. I've lumped some of the bad comparisons in the industry into some general classifications. I'll perfect these classifications at a later date.

"Pay no attention to the man behind the curtain."

    Here vendors point to peak numbers even though it is easy to actually measure delivered bandwidths.

    Action: Ask for delivered bandwidths on memory and IO.

"Don't accept analysis rotten to the core."

    Here vendors avoid pointing at delivered performance results on modern benchmarks (you know the ones developed in the past 5 years) and looking at complete systems. They instead construct ratios to things such as cores, threads etc -- AND ignore that everyone has cores and threads that are implemented completely differently and at very different cost structures. Everyone's cores cost completely differently.
    Note: See my comments on the pre-Jurassic TPC-C benchmark. If you think that Sun is avoiding current tests, TPC-C isn't current.

    Action: Ask for system performance and system cost, for example: ignore all performance per core comparisons.

"Don't believe in paper tigers."

    Rush out the prototype, then promise to deliver the world (with no promised date). Take a long time, and avoid showing results on current projects.

    Action: Ask for certified delivery dates. If they get dodgy during the discussion, then they don't have it.

"Don't think things in a magnifying glass are as big as real things."

    This comes in a couple of forms:
    First, use small systems to deliver some performance and then imply undelivered bigger systems to be world beaters.
    Second, claim a new feature that has a huge benefit, show it on one test -- then make it sound like it is also a huge benefit for everything.

    Action: Read carefully, don't make assumptions, ask questions, share your analysis with others. If you see a low-level performance measurement, ask how it really affects real workloads -- get nervous if they say "your mileage will vary." Press for numbers.

"Don't believe 'A' implies 'C'. 'Sea' implies 'B', therefore 'A' implies 'B'."

    (ok not as poetic, I'll work on this)
    Report on one small or huge configuration (whichever gives you the particular advantage), then talk about another configuration so the reader thinks that both are the same. An example of this: one vendor used a small configuration to measure watts, then in the next sentence talked about a big configuration, so if you weren't careful you would think the wattage rating applied to both.

    Action Item: Listen and read very carefully. Ask for config details. Don't make assumptions.

Might be fun to use this "Sieve of BM Seer" to rate any product announcement.
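
As a rough sketch of what rating an announcement with the sieve could look like, the snippet below counts how many of the five warning signs apply. Everything here is hypothetical: the one-line summaries paraphrase the classifications above, the function names are invented for illustration, and the example flags are made up rather than taken from any real announcement.

```python
from typing import List, Sequence

# Hypothetical one-line paraphrases of the five classifications above.
SIEVE = [
    "Quotes peak numbers instead of delivered bandwidth",
    "Argues per-core or per-thread ratios instead of system performance and cost",
    "Promises future systems without certified delivery dates",
    "Extrapolates from a small system or a single test",
    "Mixes configurations so one measurement seems to cover another",
]


def sieve_hits(flags: Sequence[bool]) -> List[str]:
    """Return the warning signs that an announcement triggers."""
    if len(flags) != len(SIEVE):
        raise ValueError("one flag per sieve entry is required")
    return [item for item, hit in zip(SIEVE, flags) if hit]


def sieve_score(flags: Sequence[bool]) -> int:
    """Count the triggered warning signs; a higher count means read more carefully."""
    return len(sieve_hits(flags))


# Made-up example: an announcement that quotes peak numbers and
# extrapolates from a single test.
example_flags = [True, False, False, True, False]
for hit in sieve_hits(example_flags):
    print("Beware:", hit)
print("Score:", sieve_score(example_flags), "of", len(SIEVE))
```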

I'm thinking of evolving this and having a regular "Beware the Ides of xxxx" article and see how it develops. Check back on the ides of each month for the next installment.
(ides = 15th)


Java performance: Sun Fire E25K UltraSPARC IV+ Beats HP Superdome Itanium 2

Tuesday May 08, 2007

Sun leads the way, beating Itanium2 and POWER5+ (by a lot):

  • Sun Fire E25K with dual-core US-IV+ beats the HP Superdome with dual-core Itanium 2.
  • Sun Fire E25K is 6.4 times faster than the fastest IBM POWER5+ p5 570 result (2.2GHz, 16 cores) of 326,651 bops. Note: The largest IBM p5 595 has only 4 times as many POWER5+ cores. IBM has not published this benchmark on its largest systems. Why does IBM keep avoiding comparison to Sun on accepted standard benchmarks like SPECjbb2005?
  • Sun Fire E25K 1.95GHz US-IV+ also beats the Fujitsu PRIMEPOWER 2500 2.08GHz SPARC64 V by 67%.
The Sun Fire E25K with 1.95GHz US-IV+ set a World Record for systems with 72 or fewer chips, achieving 2,105,264 SPECjbb2005 bops and 29,240 SPECjbb2005 bops/JVM on the SPECjbb2005 benchmark.

The 6.0_02 version of the Java HotSpot(TM) 32-Bit Server VM showed a 27% improvement over the 6.0 version on the SPECjbb2005 benchmark. The Sun Fire E25K result used Solaris 10.

SPECjbb2005 Performance (ordered by performance; bops = SPECjbb2005 Business Operations per Second, bigger is better)

                      Processors                                Performance
System          Date  (Chips, Cores, Threads)  GHz   Type       bops       JVMs  bops/JVM
Sun Fire E25K   5/07  72, 144, 144             1.95  US-IV+     2,105,264  72    29,240
HP Superdome    9/06  64, 128, 128             1.6   Itanium 2  2,054,864  32    64,215
Fujitsu PP2500  3/06  128, 128, 128            2.08  SPARC64 V  1,251,024  32    39,095
IBM p5 570      1/06  8, 16, 32                2.2   POWER5+    326,651    8     40,831

Sun results have been submitted to SPEC for review and are on track for publication.
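
The ratios quoted in this post can be re-derived from the table. The sketch below recomputes bops/JVM, the E25K-to-IBM ratio, and the JVM improvement. It assumes that the second Sun Fire E25K entry in the disclosure statement (1,657,274 bops) is the earlier run on the 6.0 JVM; the post implies this but does not say so outright. All names are illustrative.

```python
# (system, SPECjbb2005 bops, JVM instances) from the table above.
rows = [
    ("Sun Fire E25K", 2_105_264, 72),
    ("HP Superdome", 2_054_864, 32),
    ("Fujitsu PP2500", 1_251_024, 32),
    ("IBM p5 570", 326_651, 8),
]

bops = {name: score for name, score, _ in rows}
bops_per_jvm = {name: score / jvms for name, score, jvms in rows}

# How many times faster the E25K is than the IBM p5 570 result.
e25k_vs_ibm = bops["Sun Fire E25K"] / bops["IBM p5 570"]

# Assumption: this disclosure entry is the earlier E25K run on the 6.0 JVM.
E25K_PREVIOUS_BOPS = 1_657_274
jvm_gain_pct = (bops["Sun Fire E25K"] / E25K_PREVIOUS_BOPS - 1) * 100

for name in bops:
    print(f"{name:<15} {bops[name]:>9,} bops  {bops_per_jvm[name]:>8,.0f} bops/JVM")
print(f"E25K vs IBM p5 570: {e25k_vs_ibm:.1f} times")
print(f"6.0_02 vs 6.0 JVM:  {jvm_gain_pct:.0f}% improvement")
```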

Benchmark Description

SPECjbb2005 (Java Business Benchmark) measures the performance of a Java implemented application tier (server-side Java). The benchmark is based on the order processing in a wholesale supplier application. The performance of the user tier and the database tier are not measured in this test. The metrics given are number of SPECjbb2005 bops (Business Operations per Second) and SPECjbb2005 bops/JVM (bops per JVM instance).

Disclosure Statement:

SPECjbb2005 Sun Fire E25K (72 chips, 144 cores, 1.95 GHz) 2,105,264 SPECjbb2005 bops, 29,240 SPECjbb2005 bops/JVM submitted for review; Sun Fire E25K (72 chips, 144 cores, 1.95 GHz) 1,657,274 SPECjbb2005 bops, 23,018 SPECjbb2005 bops/JVM; HP Itanium Superdome (64 chips, 128 cores, 1.6 GHz) 2,054,864 SPECjbb2005 bops, 64,215 SPECjbb2005 bops/JVM; Fujitsu PRIMEPOWER 2500 (128 chips, 128 cores, 2.08 GHz) 1,251,024 SPECjbb2005 bops, 39,095 SPECjbb2005 bops/JVM; IBM eServer p5 570 (8 chips, 16 cores, 2.2 GHz) 326,651 SPECjbb2005 bops, 40,831 SPECjbb2005 bops/JVM. SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation. Results as of 5/8/07 on www.spec.org

Certified Results
Performance: 2,105,264 SPECjbb2005 bops
  29,240 SPECjbb2005 bops/JVM
Reference Date: May 8, 2007
Systems: Sun Fire E25K
Processor/GHz: 72 US-IV+ 1.95 GHz
Operating System: Solaris 10
JVM: Java HotSpot(TM) 32-Bit Server, Version 6.0_02


Sun SPARC beats IBM and HP systems on SAP SD standard SAP ERP 2005 application benchmark

Tuesday Apr 03, 2007

The Sun Fire E6900 has great performance on the SAP SD standard SAP ERP 2005 application benchmark as of 04/02/07. The 24-processor Sun Fire E6900 with 1.95 GHz UltraSPARC-IV+ achieved 6160 users on the two-tier SAP Sales and Distribution (SD) standard SAP ERP 2005 application benchmark (24 processors, 48 cores, 48 threads).

  • The 24-processor Sun Fire E6900 beat the 16-processor IBM p5-570 POWER5+ by 12%.
  • The 24-processor Sun Fire E6900 beat the 16-processor HP Integrity Superdome Itanium2 dual-core by 10%.
  • Effective 08/31/06, a new SAP R/3 version (ECC 6.0) and kernel (7.00) are required to run the SAP-SD 2-Tier benchmark. The new version is a bit more heavyweight than the previous version (ECC 5.0), so older results have a performance advantage.

SAP-SD 2-Tier Performance Table (#users is perf metric)

System                                     OS / Database           Users  SAP ERP/ECC Release  SAPS    SAPS/Proc  Date
Sun Fire E6900                             Solaris 10              6160   2005 / 6.0           30,820  1,284      03-Apr-07
  24xUS-IV+ @1.95GHz                       Oracle 10g
  96 GB
HP Integrity Superdome-16                  Windows Server 2003 DE  5600   2005 / 6.0           28,200  1,762      18-Dec-06
  16xDual-Core Intel Itanium 2 @1.6GHz     SQL Server 2005
  256 GB
IBM p5 570                                 AIX 5.3                 5520   2004 / 5.0           27,670  1,729      25-Jul-06
  16xPOWER5+ @2.2GHz                       DB2 UDB 8.2.2
  128 GB
Fujitsu PRIMEQUEST 480                     SuSE LES9               5000   2004 / 5.0           25,050  783        11-May-06
  32xIntel Itanium 2 @1.6GHz               Oracle 9i
  256 GB
Unisys Enterprise Server Model ES7000/one  Windows Server 2003 DE  4884   2005 / 6.0           24,570  1,536      19-Dec-06
  16xDual-Core Intel Itanium 2 @1.6GHz     SQL Server 2005
  256 GB

Complete benchmark results may be found at the SAP benchmark website http://www.sap.com/benchmark.

SAP has specified that the Benchmark Users metric is the only metric to be used for public comparisons. However, Benchmark Users can be traded off with response time in performance tuning, and so comparing Line Items per Hour or SAPS is a better way to compare the actual power of systems.
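
To see how much the choice of metric matters, the sketch below compares the Sun Fire E6900 with each other system on both Benchmark Users and SAPS, using the figures from the table above. The helper and variable names are illustrative only.

```python
# (system, benchmark users, SAPS, processors) from the table above.
sap_rows = [
    ("Sun Fire E6900", 6160, 30_820, 24),
    ("HP Integrity Superdome-16", 5600, 28_200, 16),
    ("IBM p5 570", 5520, 27_670, 16),
    ("Fujitsu PRIMEQUEST 480", 5000, 25_050, 32),
    ("Unisys ES7000/one", 4884, 24_570, 16),
]


def pct_ahead(a: float, b: float) -> float:
    """Percentage by which a exceeds b."""
    return (a / b - 1) * 100


_, sun_users, sun_saps, _ = sap_rows[0]

# For each competitor: (Sun lead on users in %, Sun lead on SAPS in %).
comparison = {
    name: (pct_ahead(sun_users, users), pct_ahead(sun_saps, saps))
    for name, users, saps, _ in sap_rows[1:]
}

for name, (by_users, by_saps) in comparison.items():
    print(f"{name:<26} users: +{by_users:4.1f}%   SAPS: +{by_saps:4.1f}%")
```

On these numbers the two metrics agree on the ordering, and the margins differ by less than a percentage point.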

Benchmark Description

The SAP Standard Application SD (Sales and Distribution) Benchmark is a two-tier ERP business test that is indicative of full business workloads of complete order processing and invoice processing, and demonstrates the ability to run both the application and database software on a single system. The SAP Standard Application SD Benchmark represents the critical tasks performed in real-world ERP business environments.

SAP is one of the premier world-wide ERP application providers, and maintains systems on the various SAP products.

Example Disclosure Statement:

Two-tier SAP Sales and Distribution (SD) standard SAP ERP 2004/2005 application benchmark: Sun Fire E6900 (24-way, 24 processors, 48 cores, 48 threads) 24 x 1.95 GHz UltraSPARC IV+, 96GB memory, 6,160 SD benchmark users, 1.99 sec. avg. response time, Cert#2007023, Oracle 10g database, Solaris 10; HP Integrity Superdome-16 (16-way, 16 processors, 32 cores, 64 threads) 16 x 1.6 GHz Dual-Core Intel Itanium 2 9050, 256GB memory, 5,600 SD benchmark users, 1.91s avg resp time, Cert#2006090, SQL Server 2005, Windows Server 2003 Datacenter Edition; Unisys Enterprise Server Model ES7000/one (16-way, 16 processors, 32 cores, 64 threads) 16 x 1.6 GHz Dual-Core Intel Itanium 2 9050, 256GB memory, 4,884 SD benchmark users, 1.93s avg resp time, Cert#2006091, SQL Server 2005, Windows Server 2003 Datacenter Edition; IBM System p5 570 (16-way, 16 processors, 16 cores, 32 threads) 16 x 2.2 GHz POWER5+, 128 GB memory, 5,520 SD benchmark users, 1.97s avg resp time, Cert#2006044, DB2 UDB 8.2.2, AIX 5.3; Fujitsu PRIMEQUEST 480 (32-way, 32 procs, 32 cores, 32 threads) 32 x 1.6 GHz Intel Itanium 2, 256 GB memory, 5,000 SD benchmark users, 1.97s avg resp time, Cert#2006023, Oracle 9i, SuSE Linux Enterprise Server 9; SAP, R/3, mySAP reg TM of SAP AG in Germany and other countries. More info www.sap.com/benchmark.

Results Summary

Certified Results
Performance: 6,160 benchmark users
Server: Sun Fire E6900
Processors: 24 x 1.95 GHz UltraSPARC IV+ 32MB L3 Ecache
Memory: 96 GB
Operating system: Solaris 10
Database S/W: Oracle 10g
SAP S/W: SAP ECC 6.0
SAP Certification: #2007023
Storage: Sun StorEdge 3510 and 6140

...more to come today, keep checking back.

Note: Sun has always called the socket the processor; IBM in the past several years started calling the core the processor. Also note that IBM cores are designed completely differently from Sun's, so comparing on a per-core basis has MANY problems. Please see: http://blogs.sun.com/bmseer/entry/not_comparing_e25k_p595


HP's tuning on TPC-C

Wednesday Feb 21, 2007

OK, I've found some information on HP's Itanium TPC-C tuning; now to find IBM's info. HP tried hard to get a good TPC-C result, but IBM must have done a lot more on TPC-C. ...and this is after IBM did a lot to tune SPECint_rate2000 for Power5+. This is covered in http://blogs.sun.com/bmseer/entry/judging_by_the_wrong_things.

But clearly HP did a lot on TPC-C, though it seems like you really need to do a lot at a low level to get good database performance for Itanium2. Also I'm not buying the comment that Itanium2 was beaten by IBM because the CPU was not the bottleneck -- HP did lots to improve CPU performance.

...some questions after reading "Squeezing performance out of Itanium":

  • Do you have to have a PhD in Chip design and Compiler technology to tune your database? :)
  • No improvement going from a 400GB to a 600GB SGA... and a 2x improvement going from 600GB -> 1000GB. Lots of expensive memory only pays off when you get near 1TB of memory. The latest TPC-C result by HP prices memory at more than 2.2 million dollars?
  • What about "Out of the Box" performance? -20% without profile feedback optimisation and half the performance without profile and 1TB memory.

[3] Comments