BM Seer Unofficial thoughts from an anonymous Sun employee

IBM continues funny configurations on benchmarks.

Tuesday Oct 16, 2007

IBM just published a somewhat funny TPC-H result. IBM used 32 systems (each a 4-core POWER6 p570) in a clustered TPC-H configuration. Some funny things:

  • Why 32 4-core systems instead of clustering eight 16-core systems? Both configurations are built from the same 4RU 4-core unit.
  • Why no IBM POWER6 single-system TPC-H results? No stand-alone 16-core, 8-core, or 4-core POWER6 results? Maybe they just don't stand on their own when you look at the performance?
  • Why did they use 96 x 36.4GB drives on 31 systems and 96 x 73.4GB drives only on server14?
  • Why did they use the smallest 36.4GB drives for most drives in the system?
  • If the un-discounted server hardware price for thirty-two 4-core systems is $7,042,378 USD, what is the price of one 4-core system?
  • If the un-discounted server hardware price is $7,042,378 USD for 128 cores, what is the un-discounted price per core when configured in a 4-core system with 32GB?
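The last two questions are plain division, so here is a minimal sketch of the arithmetic. The dollar figure and system counts come from the list above; the variable names are only illustrative.

```python
# Figures quoted in the post; names are illustrative, not from any IBM or TPC document.
server_hw_price_usd = 7_042_378   # un-discounted server hardware price
systems = 32                      # number of p570 nodes in the cluster
cores_per_system = 4

total_cores = systems * cores_per_system
price_per_system = server_hw_price_usd / systems
price_per_core = server_hw_price_usd / total_cores

print(f"total cores: {total_cores}")
print(f"price per 4-core system: ${price_per_system:,.2f}")
print(f"price per core: ${price_per_core:,.2f}")
```

That works out to roughly $220,074 per 4-core system and about $55,019 per core, before any discounts.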

Disclosure statement

IBM TPC-H 10000GB result on the IBM System p6 570 of 343,551.2 QphH@10000GB ($32.89 USD/QphH@10000GB, avail. 4/15/2008) on a 32-node cluster of 4-core p570 (each with 2 POWER6 4.7 GHz processor chips, 4 cores, 8 threads) and 32GB of memory per node running DB2 Warehouse 9.5 on AIX 5L V5.3. Total disk capacity was 110,489.27 GB in an IBM TotalStorage DS4800 storage subsystem (using 36.4GB drives on 31 nodes and 73.4GB drives on server 14) and 10 Gigabit Ethernet for cluster interconnect. Source: http://www.tpc.org; Results current as of 10/15/07.
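For anyone who wants to put the disclosure numbers side by side, here is a rough sketch that derives throughput per core and the total priced configuration implied by the published $/QphH. It assumes the price/performance metric is total system price divided by QphH, and since the published $/QphH is rounded, the implied total is only approximate. The variable names are illustrative.

```python
# Figures from the disclosure statement above; names are illustrative.
qphh = 343_551.2            # QphH@10000GB
usd_per_qphh = 32.89        # published price/performance (rounded)
total_cores = 32 * 4        # 32 nodes x 4 cores

qphh_per_core = qphh / total_cores
# Assumption: $/QphH = total system price / QphH, so multiply back out.
implied_total_price_usd = qphh * usd_per_qphh

print(f"QphH per core: {qphh_per_core:,.2f}")
print(f"implied total system price: ${implied_total_price_usd:,.0f}")
```

This gives about 2,684 QphH per core and an implied total of roughly $11.3 million for the whole priced configuration.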

Comments:

[Trackback] When you read disclosures about somewhat strange benchmark configurations, it would be nice to mandate explanations of benchmark configuration. Why did you opt for this configuration? For example: "As part x of benchmark y is disk-bound, the most feasi...

Posted by c0t0d0s0.org on October 17, 2007 at 07:55 AM PDT #

Interesting that you don't mention any analysis of the actual results! Just to put things in perspective, it delivered > 3x the closest result from a Sun configuration, with fewer cores and significantly better price/performance.

It uses the building block approach defined with the IBM Balanced Warehouse, and the configuration follows the principles / best practices developed over the years.

Posted by An IBMer on October 17, 2007 at 10:30 AM PDT #

Yeah, perspective. Instead of running one IBM p570 POWER6 system on any smaller-size TPC-H that has lots of results, which would show the world exactly what *one* system does and costs, IBM would rather show a TPC-H result with a 2008 availability date and refer to a 2-year-old Sun server (4 processor generations ago)?

Also are there scalability & latency problems with 16-core p570? Why were the 4-core building blocks not just hooked into the 16-core p570?

So the IBM balanced warehouse design specifies that server14 has twice the storage of the other nodes? I guess that requires high-priced IBM pro-service people that get paid loads more than I do to understand that.

Posted by BM Seer on October 17, 2007 at 11:58 AM PDT #

Clearly IBM's goal was to double the performance per core of the HP Itanic Montecito result.

"Why 32 4-core systems instead of clustering eight 16-core systems?"

Most recent IBM TPC-H results have been horizontally scaled. Clearly IBM DB2's partitioning ability exceeds its scalability. There is only one reason IBM does anything in a benchmark configuration, and that is to maximize performance per core. The partitioning capability of IBM DB2 likely derives from the old Informix XPS product, which IBM acquired many years ago.

"Why no IBM Power6 single-system TPC-H results?"

My guess is this is an architecture thing. IBM could probably replicate the architecture into a smaller system, but my guess is IBM would not get the per-core results it desires.

"Why did they use 96 x 36.4GB drives on 31 systems and 96 x 73.4GB drives only on server14?"

36 GB drives are cheaper. TPC benchmarks have a price/performance component. But 73 GB drives stream data faster. Deduction: The node with 73 GB drives is doing something different. It could be that IBM is using an architecture similar to Sybase IQ, where one node handles all writes. A 73 GB drive should be able to stream writes faster than a 36 GB drive.

"If the un-discounted server hardware price for thirty-two 4-core systems is $7,042,378usd, what is the price of one 4-core system?"

$220,074.31. About 7X what a T5220 with similar throughput costs.

"If the un-discounted server hardware price of $7,042,378usd, and 128 cores, what is the un-discounted price per core when configured in a 4-core system with 32GB?"

Too much!

Posted by Mark on October 17, 2007 at 04:23 PM PDT #

TPC-H is an embarrassingly parallelizable benchmark, because that is the nature of data warehousing.

Why increase the system price unnecessarily by including high-speed links?

Posted by ATMx on October 19, 2007 at 07:37 PM PDT #
