Finally, proof that POWER6's 300 GB/s peak bandwidth is meaningless
Wednesday Aug 29, 2007
An IBM Hot Chips presentation finally revealed what I posted 3 months ago: IBM's iTunes POWER6 comparison was misleading. In their press release, IBM added together things that can't be added, creating a big number that means nothing.
-
IBM Press Release said:
"Even more impressive, the processor bandwidth of the POWER6 chip -- 300 gigabytes per second -- could download the entire iTunes catalog in about 60 seconds -- 30 times faster than HP's Itanium."
IBM MIXED maximum theoretical peaks and added them together to make an unfair comparison with HP Itanium. By the way, all of these peaks are guaranteed not to be reached by real applications (and certainly not by iTunes, as IBM states). Here is my posting from 24-May as written; I've used strikes to correct the posting (not many corrections). Below, "B" = bytes, "wr" is write, and "rd" is read. In bold I've added the numbers they revealed in the presentation. IBM only shows this for 4.7GHz; expect less on 3.5GHz and 4.2GHz (the POWER6 clock speeds it seems customers are getting).
- L3 cache: 16B wr + 16B rd, for a total of 32B x 2.5GHz = 80GB/s
  this is a max of 40GB/s in each direction
- memory: 8B wr + 16B rd, for a total of 24B x 2.5GHz = 60GB/s
  (75GB/s when even adding in address lines)
  peak no more than 40GB/s memory read bandwidth
  peak no more than 20GB/s copy bandwidth (limited by write)
- IO: 4B wr + 4B rd, for a total 8B x 2.5GHz = 20GB/s (but for addr and data)
- chip to chip: 3x links of 8B in + 8B out, clock rate? 50GB/s between 4 chips (8 cores)
- node to node: 2x links of 8B in + 8B out, clock rate? 80GB/s for configurations with more than 4 chips (8 cores)
It is funny that on slide 7 of their Hot Chips paper the numbers add up to 305GB/s, yet they say "Total=300 GB/s". Ah well, they probably can't even meet some of their peaks.
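The arithmetic behind the list above can be checked with a few lines of Python. This is only a sketch: the figures are the ones quoted in this posting, while the dictionary and the helper name `bus_peak_gbs` are made up here for illustration and do not come from IBM's slides.

```python
# Peak figures in GB/s as quoted in the list above (4.7GHz part).
PEAKS_GBS = {
    "L3 cache": 80,
    "memory (incl. address lines)": 75,
    "IO": 20,
    "chip to chip": 50,
    "node to node": 80,
}

CLOCK_GHZ = 2.5  # bus clock used in the list above


def bus_peak_gbs(bytes_per_clock, clock_ghz=CLOCK_GHZ):
    """Peak GB/s of a bus that moves bytes_per_clock bytes every cycle."""
    return bytes_per_clock * clock_ghz


mem_read = bus_peak_gbs(16)   # 16B rd
mem_write = bus_peak_gbs(8)   # 8B wr
# A copy has to read and write every byte, so the slower side limits it.
mem_copy = min(mem_read, mem_write)

total = sum(PEAKS_GBS.values())

if __name__ == "__main__":
    print(f"memory read peak : {mem_read:.0f} GB/s")
    print(f"memory write peak: {mem_write:.0f} GB/s")
    print(f"memory copy peak : {mem_copy:.0f} GB/s")
    print(f"sum of all peaks : {total} GB/s")
```

The sum mixes cache, memory, IO and inter-chip links, which is exactly why it says nothing about what a single application can sustain.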
There are lots of problems with quoting peak bandwidth. Since STREAM is so easy to run, why hide the results? Are the true numbers so bad?
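To show how little it takes to measure rather than quote a peak, here is a rough copy test in the spirit of the STREAM Copy kernel. It is not the STREAM benchmark itself (that is a C/Fortran program with its own run rules); the function name, buffer size and repeat count are assumptions chosen for illustration, and the Python overhead means the result is only a ballpark.

```python
import time


def copy_bandwidth_gbs(n_bytes=64 * 1024 * 1024, repeats=5):
    """Best-of-N bandwidth of a plain memory copy, in GB/s.

    Counts bytes read plus bytes written, as the STREAM Copy kernel does.
    """
    src = bytearray(n_bytes)
    dst = bytearray(n_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # one read stream and one write stream
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return 2 * n_bytes / best / 1e9


if __name__ == "__main__":
    print(f"measured copy bandwidth: {copy_bandwidth_gbs():.1f} GB/s")
```

The buffers should be much larger than the last-level cache, otherwise the test measures cache rather than memory.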
An analogy:
One can't just add peak rates and call the result the bandwidth. It is like saying I can run at 10 miles per hour, my car has gone 100 miles per hour, and the plane travels at 650 miles per hour. Does that mean I traveled at 760 miles per hour (650+100+10=760)?
No, it just doesn't work that way.
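The analogy can be put into numbers. In the sketch below the speeds are the ones from the analogy, but the leg distances are invented purely for illustration; the point is only that the real speed of a trip is total distance over total time, which can never exceed the fastest leg, let alone the sum.

```python
# Legs of one hypothetical trip: (mode, speed in mph, distance in miles).
# Speeds come from the analogy above; the distances are made up.
LEGS = [
    ("running", 10, 1),
    ("car", 100, 50),
    ("plane", 650, 650),
]


def effective_speed_mph(legs):
    """Total distance divided by total travel time across all legs."""
    distance = sum(miles for _, _, miles in legs)
    hours = sum(miles / mph for _, mph, miles in legs)
    return distance / hours


summed = sum(mph for _, mph, _ in LEGS)   # the "marketing" number
actual = effective_speed_mph(LEGS)        # what the traveler really achieved

if __name__ == "__main__":
    print(f"sum of peak speeds: {summed} mph")
    print(f"effective speed   : {actual:.1f} mph")
```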
The slides also indicate that global snoop limits memory bandwidth above 32 cores on IBM POWER6 for cache-intensive workloads. But again, there is not enough detail to tell what this really means yet. Currently IBM is only shipping the 16-core POWER6 p570?
http://blogs.sun.com/bmseer/entry/power6_err_try_10gb_s
Here is the problem with marketing peaks; see the "Beware the Ides..." posting: http://blogs.sun.com/bmseer/entry/beware_the_ides_of_may










