BM Seer Unofficial thoughts from an anonymous Sun employee

TPC-C Reminder

Monday Apr 30, 2007

When Sun had the world record, we said TPC-C was too simplistic and old, and that was years ago. TPC-C has problems, and IBM has heavily tuned for it like this. Why does IBM still point to this 14+ year old benchmark? Why do they avoid publishing results for the latest GHz full-system IBM p595 on newer benchmarks such as:

  • SPECjbb2005?
  • SPECint_rate2006?
  • SPECfp_rate2006?
  • Linpack?
  • SPECint_2006?
  • SPECfp_2006?
  • ....the list goes on...
Doesn't IBM want fair comparisons? I guess IBM would just be beaten by Sun in performance and $/perf so they want to avoid comparisons.

It is funny that last year I egged HP on about SPECjbb2005, "why no results?" Someone commented that HP thinks it is a bad benchmark, so they won't publish on it. Now HP has the top result. Changed their tune?

Notice how this is different: when Sun established a world-record TPC-C result, it told the world the benchmark was too simplistic, and it is sticking to that position. The world has become a lot more complicated in the past 7 years and computing has evolved a lot, so we won't go back to something that was created 13 years ago. Sun never quotes the 23-year-old Dhrystone benchmark anymore either. :)

The press and analysts overwhelmingly see TPC-E as the successor to the simplistic 14-year-old TPC-C.

IBM's TPC-C "tuning"(?) that won't apply to anything in the real world

June 2005 Interview with Bruce Lindsay (IBM Fellow) at http://www.sigmod.org/sigmod/record/issues/0506/p71-column-winslet.pdf

    "And the good news is that about 40-70% of the stuff we do in performance tuning actually ends up helping end users."

This means that 30% to 60% of IBM's TPC-C tunings don't help users.

Really, beyond the huge disk size of the large TPC-C results (which has a lot to do with TPC-C being 14 years old), the quote below points to tuning that is legal but seems a bit too "tricky" for my taste...

    "We get down to the level of worrying about the physical column order in the table so the reference columns are near each other, minimizing cache misses during fetching. This is feasible in the TPC-C benchmark because there are only five tables and only ten to fifteen columns in each table. In a more realistic application, where there are many more queries to be considered, the tables are typically much, much wider, in the 80 to 100 column range; and there are dozens if not thousands of tables. Then this kind of analysis is no longer practical." Bruce Lindsay, IBM Fellow

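To make the column-order trick concrete, here is a minimal sketch of my own (not IBM's code, and not how any particular database lays out rows). It assumes fixed-width columns, a row that starts on a cache-line boundary, and 64-byte cache lines. The column names and widths are invented purely for illustration. It counts how many distinct cache lines have to be touched to fetch the referenced columns under two physical orderings.

```python
CACHE_LINE = 64  # bytes; assumed cache-line size


def cache_lines_touched(layout, referenced):
    """Count distinct cache lines that hold any referenced column.

    layout: list of (column_name, width_in_bytes) in physical order.
    referenced: set of column names the query actually reads.
    Assumes the row starts at offset 0 on a cache-line boundary.
    """
    lines = set()
    offset = 0
    for name, width in layout:
        if name in referenced:
            first = offset // CACHE_LINE
            last = (offset + width - 1) // CACHE_LINE
            lines.update(range(first, last + 1))
        offset += width
    return len(lines)


# Hypothetical row: four hot 8-byte columns separated by cold 56-byte columns.
scattered = [
    ("id", 8), ("cold_a", 56),
    ("balance", 8), ("cold_b", 56),
    ("credit", 8), ("cold_c", 56),
    ("ytd", 8), ("cold_d", 56),
]
referenced = {"id", "balance", "credit", "ytd"}

# Same columns, with the referenced ones moved next to each other at the front.
# sorted() is stable, so relative order within each group is preserved.
clustered = sorted(scattered, key=lambda col: col[0] not in referenced)

print(cache_lines_touched(scattered, referenced))  # 4
print(cache_lines_touched(clustered, referenced))  # 1
```

With a handful of narrow tables you can work this out by hand for every query, which is Lindsay's point: it stops being practical once tables are 80 to 100 columns wide and there are many more queries to consider.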
For those who may not remember, IBM didn't even end the EOL'ed SPECint_rate2000 on a high note. See: http://www.spec.org/cpu2000/results/rint2000.html and search for "1644" and "1513"

Various footnotes:

"It's well-understood in the technical communities that TPC-C no longer represents current customer workloads since the transaction load that its models are made of are small, primitive and disconnected transactions. While this model was acceptable for the workloads of the late 1980s, it misses the mark..." Sun's World Record TPC-C Press release, August 2000

Disclosure Statement

The TPC-C result referenced above was the fastest overall performance world record as of August 31, 2000: Sun Enterprise 10000 server (Starfire) running Sybase Adaptive Server Enterprise (ASE), 156,873.03 transactions per minute (tpmC), $48.81 price/tpmC, available February 28, 2001. A full disclosure report and executive summary are available through the TPC Web site located at www.tpc.org.

Comments:

You seem fond of pointing out the age of benchmarks and implying that a benchmark's age matters (14+ years for TPC-C, 23 years for Dhrystones), yet the Linpack benchmark dates to 1979: http://www.netlib.org/utk/people/JackDongarra/faq-linpack.html#_Toc27885709

Posted by rick jones on May 01, 2007 at 08:46 AM PDT #

The issue is that Linpack solves a full system of equations, and that problem hasn't gotten any more complex since '79. Equation solving is a function that is the basis of applied mathematics, and it is still key to certain applications. TPC-C has 5 simplistic transactions on 9 tables; this database test is a 'joke' in comparison to complex modern database transactions. It is so simplistic that it is easy to trick and artificially make your system performance look good. TPC-D and Dhrystones got so tricked out that at the end computers weren't even doing the functions, they were just printing the answer! TPC-H was created to get around all of the tricks that benchmarkers were applying to TPC-D. As real workloads get more complex, our benchmarks must follow them or become useless in evaluating system performance.

Posted by BM Seer on May 01, 2007 at 10:31 AM PDT #

Along the lines of keeping benchmarks up to date and relevant, I think LDAP throughput would make a nice target for cross-platform comparisons. In particular, since OpenLDAP already runs well on all vendor platforms, and can be set up with nearly identical software configurations on each machine, it would be ideal for highlighting the relative strengths of each hardware platform.

Posted by Howard Chu on May 01, 2007 at 03:33 PM PDT #

While some of the newer benchmarks are probably closer to being relevant in the real world, isn't the choice of which benchmark to promote or ignore just a marketing decision? If that's true, then IBM is probably waiting until the p6 chips are out and will do some damage to the existing newer results at that time.

Posted by John H on May 03, 2007 at 02:39 PM PDT #

From what you are saying, then, you believe that POWER5+ is no longer competitive with Sun and that IBM needs to wait for POWER6 to have a chance to be competitive.

Posted by BM Seer on May 04, 2007 at 09:57 AM PDT #
