BM Seer: Unofficial thoughts from an anonymous Sun employee

Linpack Benchmark: Sun SPARC Enterprise M8000 Beats IBM POWER6

Friday Jul 13, 2007

The Sun SPARC Enterprise M8000 has topped the performance of the brand-new 4.7GHz POWER6-based p570. The Sun Studio 12 compilers, Solaris 10, and the Sun Performance Library played a key role in obtaining this performance.

The Sun SPARC Enterprise M8000 outperforms IBM's best published POWER6-based system, the p570, by over 12% on the Linpack HPC (Highly Parallel Computing) benchmark. As a reminder, IBM cores cost lots more than any other vendor's, so you can't just look at perf/core. Compare systems of similar pricing and configuration.

The Sun SPARC Enterprise M8000 tops the HP Itanium 2 rx8640 system by about 40% on the Linpack HPC benchmark.

The Sun SPARC Enterprise M8000, using Sun Studio 12, delivered a score of 268.6 GFLOPS on the Linpack HPC benchmark.

    Funny, I read an IBM blog that said all was quiet for them in benchmarks. Sun decided to keep working during the summer :). I almost can't keep up with my regular job, because so many of my friends in the benchmarking group are producing so many great results on Sun systems that this blogging hobby is keeping me busy!

LINPACK HPC Performance Chart - GFLOPS (bigger is better)

System                      Total GFLOPS  Peak GFLOPS  Parallelism  Chips  Cores  Processor type  GHz
Sun SPARC Enterprise M9000  1032.0        1228.8       128          64     128    SPARC64 VI      2.4
Sun SPARC Enterprise M8000  268.6         307.2        32           16     32     SPARC64 VI      2.4
Sun SPARC Enterprise M8000  255.3         291.84       32           16     32     SPARC64 VI      2.28
IBM p570                    239.4         300.8        16           8      16     POWER6          4.7
HP rx8640                   192.4         204.8        32           16     32     Itanium 2       1.6
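
The peak figures in the chart are consistent with a simple formula: cores x clock (GHz) x 4 floating-point operations per cycle. The factor of 4 is an inference from the chart's own numbers, not something the published results state, and the function names below are purely illustrative. A minimal Python sketch that recomputes peak GFLOPS, efficiency (measured / peak), and the percentage leads quoted above:

```python
# Rows taken from the chart: (system, measured GFLOPS, cores, clock in GHz).
RESULTS = [
    ("Sun SPARC Enterprise M9000", 1032.0, 128, 2.4),
    ("Sun SPARC Enterprise M8000", 268.6, 32, 2.4),
    ("Sun SPARC Enterprise M8000 (2.28GHz)", 255.3, 32, 2.28),
    ("IBM p570", 239.4, 16, 4.7),
    ("HP rx8640", 192.4, 32, 1.6),
]

# Assumption: 4 flops per core per cycle. This value reproduces every
# peak figure in the chart, but it is inferred, not quoted from a vendor.
FLOPS_PER_CYCLE = 4


def peak_gflops(cores, ghz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in GFLOPS for a system."""
    return cores * ghz * flops_per_cycle


def efficiency(measured, cores, ghz):
    """Fraction of theoretical peak actually achieved."""
    return measured / peak_gflops(cores, ghz)


def advantage_percent(score_a, score_b):
    """How much higher score_a is than score_b, in percent."""
    return (score_a / score_b - 1.0) * 100.0


if __name__ == "__main__":
    for name, measured, cores, ghz in RESULTS:
        print(
            f"{name:38s} peak={peak_gflops(cores, ghz):8.2f} "
            f"efficiency={efficiency(measured, cores, ghz):6.1%} "
            f"GFLOPS/core={measured / cores:6.2f}"
        )
    print(f"M8000 vs p570:   {advantage_percent(268.6, 239.4):.1f}%")
    print(f"M8000 vs rx8640: {advantage_percent(268.6, 192.4):.1f}%")
```

The per-core column is included only because the comparison is often made that way; as noted above, per-core numbers say nothing about price.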

Benchmark Description

The Linpack benchmark suite measures the performance for factoring and solving a dense set of linear equations in double-precision floating-point.

The Linpack HPC benchmark allows the solution of any size matrix with a single right-hand side. It was developed to allow vendors to show off their hardware. Because big problems allow a machine to approach its peak performance, the benchmark is seen as an upper bound on the potential performance of a machine. The run rules are much more flexible than those of the Linpack 1000 or Linpack 100 benchmarks. The solution technique must use a pivoting scheme and the driver must follow the spirit of the Linpack 1000 or Linpack 100 benchmarks.
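
To make the description concrete, here is a minimal pure-Python sketch of the kind of computation being timed: Gaussian elimination with partial pivoting on a dense system with one right-hand side. It is not the benchmark code and not what Sun ran (a real run uses a tuned library such as the Sun Performance Library). The operation count 2/3 n^3 + 2 n^2 is the conventional Linpack figure; the function names and the test problem are assumptions made for illustration.

```python
import random
import time


def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    a = [row[:] for row in a]  # work on copies, leave inputs untouched
    x = b[:]
    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            a[k], a[p] = a[p], a[k]
            x[k], x[p] = x[p], x[k]
        row_k = a[k]
        for i in range(k + 1, n):
            row_i = a[i]
            m = row_i[k] / row_k[k]
            if m != 0.0:
                for j in range(k + 1, n):
                    row_i[j] -= m * row_k[j]
                x[i] -= m * x[k]
    # Back substitution on the upper-triangular part.
    for i in range(n - 1, -1, -1):
        s = x[i]
        for j in range(i + 1, n):
            s -= a[i][j] * x[j]
        x[i] = s / a[i][i]
    return x


def linpack_flops(n):
    """Conventional Linpack operation count for an n x n solve."""
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2


def run(n=200, seed=0):
    """Time one solve; return (max error, GFLOPS) for a known solution."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [sum(row) for row in a]  # chosen so the exact solution is all ones
    start = time.perf_counter()
    x = lu_solve(a, b)
    elapsed = time.perf_counter() - start
    max_error = max(abs(xi - 1.0) for xi in x)
    return max_error, linpack_flops(n) / elapsed / 1e9


if __name__ == "__main__":
    err, gflops = run()
    print(f"max error {err:.2e}, {gflops:.4f} GFLOPS")
```

Interpreted Python will report a tiny fraction of a GFLOPS here; the point is only to show what is being counted, and why larger n (where the n^3 term dominates) lets a machine get closer to its peak.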

Disclosure Statement:

Linpack HPC, results from http://www.netlib.org/benchmark/index.html as of 07/13/07. Sun SPARC Enterprise M8000 (SPARC64 VI 2.4GHz, 16 chips, 32 cores), 268.6 GFLOPS. IBM p570 (POWER6 4.7GHz, 8 chips, 16 cores), 239.4 GFLOPS. HP rx8640 (Itanium 2 1.6GHz/24MB, 16 chips, 32 cores), 192.4 GFLOPS.

Linpack Benchmark Performance Report

Results Summary

Published Results
Performance: 268.6 GFLOPS
System: Sun SPARC Enterprise M8000, 256GB
Total Number Processors: 16
Processor/GHz of Server: SPARC64 VI, 2.4 GHz
Operating System: Solaris 10
Compiler: Sun Studio 12

[4] Comments
Comments:

Hi, me again....

Ok, so let me get this right... you have a box with 2x as many CPUs/cores, starts at only $256k, takes up an entire rack.... and beats an IBM system a third its physical size (four 2-proc modules x 4U each = 16U) by a whopping 2%?!?

I'm sorry, maybe you should pick on somebody your own size.

Maybe if you focused on the TCO of the two boxes (assuming the IBM costs more), or the total power (the Sun box takes 3x220V@30A, 6 if you want redundant power). But from just looking at the numbers above... WTF? The M9000 looks nice but has 64 CPUs. What's next? How the M4000 beats an HP DL385?

But right now, that IBM box looks rather nice compared to the M8000.

Posted by John on July 13, 2007 at 03:35 PM PDT #

OK, why do you quote *starting* prices for a tiny-memory 4-core IBM system? Or is that a 2-core?

IBM 16-core costs lots more, you'll have to price it out since you seem to have pricing info. Please price it out on same memory size as others. You pay factors more per IBM core than any other vendor.

IBM likes to hide prices, check out the cost of their largest 64-core system with full memory - tech web quotes $4M in 2004: http://www.techweb.com/wire/software/50500180

IBM likes to quote this silly performance result, if it only had clues... http://www.tpc.org/results/individual_results/IBM/IBM_570_20070522_ES.pdf

Posted by BM Seer on July 13, 2007 at 05:08 PM PDT #

Uh, the 256k I quoted is the starting price of the Sun system... not the IBM system. At first glance, I have no idea what a fully configured M8000 costs since it is noticeably absent from Sun's website. (And I was a bit off, it's $290,690 and not $256k.)

Based on the TPC report, an IBM p570 16 proc/32 core system with 256G of RAM is about $1.1M before any discounts (assume a 40% discount, bringing the cost down to a more "reasonable" $600K). Bumping the RAM up to 768G will double the list price to $2.1M.

When Sun says 'starting at', you can assume that is for a bootable config with only a single CPU board and maybe 32G of RAM. Based on the prices for similar components from Sun in the past (200k for a 3800 CPU board anybody?)... I would not be surprised to see a fully loaded M8000 retail for close to $1M. After discount about 600k. Again, there are no published pricing guides on the M8000, so this is a total guess.

But wait... according to this link: http://www.serverwatch.com/hreviews/article.php/3688771 The M8000 with 16 of those nice 2.4GHz CPUs is actually closer to $2.5M. (The article contains a misprint: the 2.5M price appears to be for 16 CPUs, not 4, since an M9000 w/ 64 is _only_ $10M.)

As indicated before, if your post provided some useful information about _why_ the sun system is so much better, I may be able to see what you are so excited about. Is it 1/3rd the cost? Does it help solve world hunger? Does it combat terrorism?

Instead, all I see is that a system having 2x as many procs/cores got about the same score as the IBM system. Simple tech spec reviews also show that the IBM system is 1/3rd the size.

From what I dug up, it appears that the IBM system listing for 1.1M performs about the same as the Sun box that is 2x as expensive, 3x as large and has 2x as many CPUs. Maybe there is a reason you didn't specify what the tested configurations cost.

Finally, quoting the cost of a 64 proc IBM system at $4M and using it as an indicator of the cost of an 8 proc IBM system is like saying that all Sun boxes are expensive because a 72 proc SF25k is $4M (ignoring the fact that I can pick up an 8 proc x4600 for under $45k and a few x4200s for $3k). Remember that a 64 proc M9000 is $10M. Pot... meet kettle.

As an aside, I like the fact that most of Sun's system prices are easily available on their website (but not the M series.... hmmmm). So are HP's. So are Dell's. I've always had issues finding IBM prices. Guess what is in my datacenter? Yup: Sun, HP, Dell. There are 0 IBM servers of any type.

Posted by John on July 13, 2007 at 09:05 PM PDT #

When it comes to price/performance for Linpack, AMD's Barcelona (when it eventually ships) will offer the best FLOPs/dollar for HPC. The follow-ons to Barcelona will increase clock rate and cache, improving Linpack performance and efficiency. If Intel puts four FLOPs per clock into a Xeon processor, it will be game, set, and match for x86 in HPC.

Sun's big SPARC64 boxes are nice, but probably more suited for enterprise computing than HPC. IBM's POWER6 are powerful but costly. SGI is setting itself up to exit Itanium (just look at their roadmaps about a future Intel CSI-based shared-memory Altix which will have a Xeon option).

With 64-bit addressing, and now four FLOPs/clock on x86, and more scalable architectures (HyperTransport, Intel CSI), and FPGA offload solutions, x86 nodes are rapidly displacing RISC and Itanium in HPC. InfiniBand is getting better and better as an HPC cluster interconnect, and IBM has abandoned Federation. Cluster filesystems are maturing, with pNFS on the horizon as the ultimate, standards-based shared filesystem, causing SGI to abandon CXFS and leaving IBM's GPFS as a niche.

I think IBM's POWER/PPC based HPC solutions have peaked, and the golden age of x86 in HPC is really only starting.

Posted by Mark on July 16, 2007 at 10:52 AM PDT #
