BM Seer Unofficial thoughts from an anonymous Sun employee

another useless unrealistic uber-simplistic TPC-C result

Thursday Jun 12, 2008

The IBM Power 595 reached over 6 million tpmC on the TPC-C benchmark, but IBM avoids single-system TPC-H like the plague. Why? Why didn't IBM measure and publish the server watts actually used on this benchmark? Did that 4TByte of memory flame their power meters?

    {postscript: an IBM blogger says it is Sun speaking and then points to this blog. No, these are the BM Seer's opinions (yes, I am a Sun employee), and they don't necessarily represent Sun or Sun's management. I'm glad Sun doesn't post on the above-mentioned benchmark. It is worthless. Sun publishes on most benchmarks, I'd say more than IBM. To see a huge list of very reasonable benchmarks avoided by IBM on the power6 servers, see: blogs.sun.com/bmseer/entry/they_tried_to_make_ibm}

It is no secret that, in my opinion, the 16-year-old TPC-C benchmark has been worthless for at least a decade. The problem isn't that TPC-C is old but that it does not represent databases today (did it even then?).

Has IBM just optimized solely for TPC-C on hyper-expensive cores? Their engineers basically admit extreme benchmark optimization: http://blogs.sun.com/bmseer/entry/careful_reading_shows_a_lot http://blogs.sun.com/bmseer/tags/tpc-c

It is simplistic, small, and encourages silly configs. Even honest people in IBM admitted a year ago that it is losing relevance: ftp://ftp.software.ibm.com/eserver/benchmarks/wp_TPC-E_Benchmark_022307.pdf

Even IBM admits in the paper above, "TPC-C configurations do not reflect typical client configurations." They go on to say, "Ease of partitioning: Unrealistically easy". Also, all referential integrity for every table is turned OFF!

"The TPC-C benchmark is comprised of 5 stored procedure calls: New-Order, Payment, Delivery, Order-Status and Stock-Level." see this Microsoft blog from over a year ago. FIVE, Five, really only five - a huge server doing only 5 very-very simple things on 9 tables. No one in the world has a database that looks like this - it is really useless.

IBM and other vendors keep pushing TPC-C for bragging rights. They spend a huge effort telling customers that they need it.

What's next? IBM re-hyping other ancient benchmarks like Dhrystones as the most relevant benchmark for POWER6?

Disclosure Information:

IBM Power 595 (5 GHz, 32 chips, 64 cores, 128 threads) with IBM DB2 9.5 TPC-C result of 6,085,166 tpmC ($2.81/tpmC, configuration available 12/10/08). Results as of 6/10/08, see www.tpc.org. TPC-C, TPC-H, TPC-E are trademarks of the Transaction Processing Performance Council (TPC).

Comments:

As imperfect as some benchmarks are, the reality is that there are a lot of incompetent people making purchasing decisions based on them, so I would suggest publishing results on them and playing by the same rules: use unrealistic configs, BIOS hacks, whatever it takes to have the best scores... it is a lot easier... in parallel you can continue criticizing them and educating customers... Sun's goal should be to sell as much hardware as possible... start selling to everyone!

Posted by Zoltan Farkas on June 13, 2008 at 06:53 AM PDT #

Every time I see your comments on IBM, HP, etc., it is like telling everybody that only SUN is the best in the world. If you want to say that the IBM p595 is bad, why don't you ask SUN to prove that their T Series, M Series or whatever can kill IBM or Itanium on TPC-C results? IBM and HP are daring enough to show their TPC-C results even if you think they are not relevant anymore, but can SUN show a TPC-C result that is better than IBM, HP or anybody? Looking at SUN's financial results, especially the last Q3 result, it is very clear SUN is losing much of its market share in the high-end server market. Where is ROCK, which was supposed to be able to kill IBM POWER or HP Itanium? I guess SUN is having very deep trouble launching the ROCK server, or perhaps it will never be launched. This perception in the market is more valid than whatever you want to say about TPC-C being irrelevant.

Posted by dono on June 15, 2008 at 08:47 PM PDT #

I second Dono - rather than criticize the power consumption of IBM servers, why not show that Sun can provide higher performance per $ than IBM? As an investor I'm equally disappointed about the Rock processor delays. It looks from the outside like Marc Tremblay and his team bit off more than they could chew with features like TM and scout threads. So how about telling us when Sparc64 Jupiter is going to replace Olympus and comparing this to the big blue iron?

Posted by Kevin Hutchinson on June 15, 2008 at 11:22 PM PDT #

OK (as always, I speak for myself, a Sun employee who offers my own opinions and does not officially represent the views of my employer): TPC-C results are meaningless! No valid comparisons can be made with them!
It has been over-done and means nothing. Here are the oddities:
http://blogs.sun.com/bmseer/entry/judging_by_the_wrong_things
http://blogs.sun.com/bmseer/entry/critical_reading_absolutely_required_for

TPC-C is invalid just like Dhrystones are. A benchmark that isn't meaningless for data warehousing is TPC-H - why don't you also post on IBM sites demanding single-system IBM TPC-H results? For a list of many other benchmarks IBM avoids: http://blogs.sun.com/bmseer/entry/they_tried_to_make_ibm

History repeating itself, 16-year old TPC-C looks like the 24-year old Dhrystone benchmark:
http://en.wikipedia.org/wiki/Dhrystone
"Using Dhrystone as a benchmark has many pitfalls: it features unusual code that is not usually representative of real-life programs. It is also susceptible to compiler optimizations."

"Dhrystone is no longer at all useful for performance measurement of systems, as the compiler, operating system and its small code size (allowing it to fit in cache) make it representative of a smaller and smaller subset of programs. Even its author has held this view for a long time now. [1]"

Again, read the comments from IBM/Microsoft in the original posting on TPC-C.

Also I'm asking IBM to stop gaming silly comparisons, look at this:
http://blogs.sun.com/jsavit/entry/no_there_isn_t_a
in response to:
http://www.ibm.com/developerworks/blogs/page/InsideSystemStorage?entry=yes_jon_there_is_a#comments

Finally, I'm not criticising the power consumption of IBM servers, I'M ASKING FOR REAL MEASURED DATA - it is simple transparency that IBM avoids!

Posted by BM Seer on June 16, 2008 at 10:06 AM PDT #

Zoltan, my personal view is that Sun wants customers to judge systems by real data, so they are not fooled by invalid results. It is about serving the customer, setting good expectations, and making sure customers are making valid purchasing decisions. It is a better long-term strategy in my mind than trying to fool customers with invalid data.

Also kindly remember that Sun criticised TPC-C *EIGHT YEARS AGO*, when Sun was setting the top world record. No, we did not make this comment out of sour grapes, as they say.

"It's well-understood in the technical communities that TPC-C no longer represents current customer workloads since the transaction load that its models are made of are small, primitive and disconnected transactions. While this model was acceptable for the workloads of the late 1980s, it misses the mark..."
http://www.sun.com/smi/Press/sunflash/2000-08/sunflash.20000831.1.html

You'll also notice the Aug 2000 press release said, "Customer workloads nowadays require a more ad hoc workload than the TPC-C specifies."

The world record TPC-C result referenced above was an overall performance world record as of August 31, 2000. Sun Enterprise 10000 server (Starfire) running Sybase Adaptive Server Enterprise (ASE), 156,873.03 tpmC, $48.81 price/tpmC, available February 28, 2001. A full disclosure report and executive summary are available through the TPC Web site located at http://www.tpc.org.

Posted by BM Seer on June 16, 2008 at 10:15 AM PDT #

>my personal view is that sun wants customers to judge systems by real data, so they are not fooled by invalid results.

Yeah, SUN makes BIOS hacks for more realistic results in SPEC benchmarks :)
Why does SUN make SPEC benchmarks, which also don't represent customer workloads? All vendors have criticised TPC-C ...

Posted by Triffids on June 17, 2008 at 12:57 AM PDT #

Agree with you that TPC-C is an unrealistic benchmark, however your competitors publish results on this benchmark for systems not even available yet, and customers can take purchasing decisions based on them...

I remember the reason why BMW joined the Formula 1 racing years ago, it was because Mercedes was chewing away their market share by participating in the contest..

A lot of customers chose Mercedes over BMW, based on results of cars that have nothing to do with the cars they actually bought.

In the end BMW joined the race ...

It's a race, I believe having your brand displayed at the top even if the race is unrealistic is like free advertisement...

Posted by Zoltan Farkas on June 17, 2008 at 10:22 AM PDT #

TPC-C is an unrealistic benchmark. I don't believe customers should be making purchasing decisions based on TPC-C; they will just be fooling themselves. There are so many other benchmarks that are much more realistic for judging system performance (my opinion, as people keep unfairly accusing me of speaking for my employer - Sun).

Zoltan, I don't really see your Formula 1 racing comparison. A better analogy would be Mercedes testing Formula 1 cars on a sod track and challenging BMW to do the same, with BMW replying that running Formula 1 on a sod track is a silly way to see how fast a Formula 1 car goes or how well it works.

BIOS hacking (a non-default BIOS) used only to improve a specific benchmark over-inflates the perception of the product. No customer I know will hack their BIOS for one workload, since most servers in a datacentre run many applications (some of which a hacked BIOS improves and others that it slows down by a lot). It almost reminds me of the Formula 1 floor device and the rear wing separator issue last year:
http://www.formula1.com/news/headlines/2007/8/6569.html

If BIOSes have to be hacked differently for every benchmark, what does that say about a server's general performance? Not good, in my opinion.

Posted by BM Seer on June 18, 2008 at 10:54 AM PDT #

>I don't believe customers should be making purchasing decisions based on TPC-C, they will just be fooling themselves.

Customers make decisions based on various benchmarks: SAP-SD, OEBS, SPEC, and TPC-C is one from the list. The only ones fooling themselves are SUN's employees, because customers understand the situation with SPARC64 systems and their perspectives on TPC-C.

Posted by Triffids on June 19, 2008 at 04:26 AM PDT #

Good point: customers should use a variety of SPEC, TPC, and ISV benchmarks, for example:
- SPECjAppServer (2-node and multi-node; hey, where are the complete power6 results?)
- SPECjbb,
- SPECint_rate,SPECfp_rate (use server benchmarks on servers!)
- SAP-SD,
- SAS,
- LHS,
- Siebel (see links below for IBM & US T2)
- TPC-H (single server),
- TPC-E on power6
...and measured power on everything!

HOLD IT!!!! There are more and IBM should be providing results on power6 so customers could judge performance: Here is what IBM does NOT publish power6 results on:
* no STREAM Bandwidth p570 power6 results
* no TPC-H single-system Power6 results
* no TPC-E single-system Power6 results on Oracle or IBM's own DB2
* no SPECweb2005 Power6 results
* no SPECjAppServer2004 Power6 as database server
* no SPECmail Power6 results
* no SPECompL2001 Power6 results
* no Lotus Domino R6iNotes Power6 results
* no cryptography performance results on Power6
* no *measured* power-performance on _any published benchmarks_
* no consolidation overheads on Power6, (Solaris Zones=~0%)
etc.

SPEC, SPECint, SPECweb, SPECfp, SPECjAppServer, SPECmail, SPEComp are registered trademarks of the Standard Performance Evaluation Corporation. TPC-E, TPC-H, QphH, $/QphH are trademarks of the Transaction Processing Performance Council (TPC). More info: http://www.tpc.org. Lotus Domino more info: www.notesbench.org
Siebel information posted on Oracle's website: http://www.oracle.com/apps_benchmark/doc/Sun_Siebel8_10000_PSPP_On_Solaris.pdf
http://www.oracle.com/apps_benchmark/doc/IBM_Siebel8_7000_PSPP_On_AIX_POWER6%20Final.pdf

Posted by BM Seer on June 19, 2008 at 10:26 AM PDT #
