BM Seer Unofficial thoughts from an anonymous Sun employee

Linpack HPC benchmark Sun SPARC Enterprise M9000 @ 2.52GHz

Monday Jul 14, 2008

The Sun SPARC Enterprise M9000 server running 2.52GHz SPARC64 VII processors delivered 2.023 TFLOPS on the Linpack HPC benchmark.

For single servers, the Sun SPARC Enterprise M9000 server delivers roughly twice the performance of the best published IBM Power 595 (5GHz POWER6) result on the Linpack HPC benchmark (2023.0 vs. 1028.0 GFLOPS). The Power 595 is the largest 5GHz POWER6-based server that IBM makes.

A single Sun SPARC Enterprise M9000 server shows 2.7 times the performance on the Linpack HPC benchmark when compared to the HP Integrity Superdome Itanium 2 system.

The Sun Performance Library was enhanced to take advantage of the SPARC64 VII architecture.

Benchmark Description

The Linpack benchmark suite measures the performance of factoring and solving a dense system of linear equations in double-precision floating-point arithmetic.

The Linpack HPC benchmark allows the solution of a matrix of any size with a single right-hand side. It was developed to allow vendors to show off their hardware. Because large problems let a machine run close to its peak rate, the result is seen as an upper bound on the performance a machine can deliver. The run rules are much more flexible than those of the Linpack 100 and Linpack 1000 benchmarks: the solution technique must use a pivoting scheme, and the driver must follow the spirit of those benchmarks.
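As a rough illustration of what the benchmark times, here is a minimal pure-Python sketch of a dense solve with partial pivoting and a single right-hand side. It is not the Sun Performance Library code or the official driver. The function names are made up for this example, and the nominal operation count of 2/3 n^3 + 2 n^2 is an assumption taken from the classic Linpack driver convention.

```python
import random
import time


def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    a is a list of n rows of n floats, b a list of n floats. Both are
    copied, so the caller's data is left untouched.
    """
    n = len(a)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= m * a[k][j]
            a[i][k] = 0.0
            b[i] -= m * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def nominal_flops(n):
    """Nominal operation count assumed for this sketch: 2/3 n^3 + 2 n^2."""
    return 2.0 * n ** 3 / 3.0 + 2.0 * n ** 2


def run(n=100, seed=0):
    """Time one solve of a random n x n system; return (gflops, max_error)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [sum(row) for row in a]  # chosen so the exact solution is all ones
    start = time.perf_counter()
    x = lu_solve(a, b)
    elapsed = max(time.perf_counter() - start, 1e-12)
    err = max(abs(xi - 1.0) for xi in x)
    return nominal_flops(n) / elapsed / 1e9, err
```

Calling run(200) returns a GFLOPS figure and the largest error in the computed solution. An interpreted triple loop like this lands many orders of magnitude below the numbers in this post, which is why tuned libraries matter for this benchmark.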

LINPACK HPC Performance Chart - GFLOPS (bigger is better)

The table below does not include clustered solutions.

                              GFLOPS              Processors
System                        Total    Peak      Threads  CPUs  Type         GHz
Sun SPARC Enterprise M9000    2023.0   2580.5    256      64    SPARC64 VII  2.52
Sun SPARC Enterprise M9000    1032.0   1228.8    128      64    SPARC64 VI   2.4
IBM Power 595                 1028.0   1280.0    64       32    POWER6       5.0
HP Superdome                  745.5    819.2     128      64    Itanium 2    1.6
Sun SPARC Enterprise M8000    548.2    645.1     64       16    SPARC64 VII  2.52
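
The ratios quoted in this post can be checked directly from the Total and Peak columns of the table above. The short sketch below copies those figures as they appear in the table and computes each system's efficiency (Total divided by Peak) and the M9000's advantage over the IBM and HP results. The helper names are made up for this example.

```python
# (system, total GFLOPS, peak GFLOPS), copied from the table above.
RESULTS = [
    ("Sun SPARC Enterprise M9000 (SPARC64 VII)", 2023.0, 2580.5),
    ("Sun SPARC Enterprise M9000 (SPARC64 VI)", 1032.0, 1228.8),
    ("IBM Power 595 (POWER6)", 1028.0, 1280.0),
    ("HP Superdome (Itanium 2)", 745.5, 819.2),
    ("Sun SPARC Enterprise M8000 (SPARC64 VII)", 548.2, 645.1),
]


def efficiency(total, peak):
    """Fraction of theoretical peak actually achieved on the benchmark."""
    return total / peak


def speedup(total_a, total_b):
    """How many times faster result A is than result B."""
    return total_a / total_b


def report():
    """Return one formatted line per system: name, total GFLOPS, efficiency."""
    lines = []
    for name, total, peak in RESULTS:
        pct = 100.0 * efficiency(total, peak)
        lines.append(f"{name:<42} {total:8.1f} GFLOPS  {pct:5.1f}% of peak")
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
    m9000 = RESULTS[0][1]
    print(f"M9000 vs IBM Power 595: {speedup(m9000, RESULTS[2][1]):.2f}x")
    print(f"M9000 vs HP Superdome:  {speedup(m9000, RESULTS[3][1]):.2f}x")
```

On these figures the M9000 reaches about 78% of its peak, and its advantage works out to just under 2x over the Power 595 and about 2.7x over the Superdome.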

Disclosure Statement:

Linpack HPC, results from http://www.netlib.org/benchmark/index.html as of 07/01/08. Sun SPARC Enterprise M9000 (SPARC64 VII @2.52GHz, 64 chips, 256 cores), 2.023 TFLOPS. IBM Power 595 (POWER6 5.0GHz, 32 chips, 64 cores) 1028.0 GFLOPS. HP Superdome (Itanium 2 1.6GHz/24MB, 64 chips, 128 cores) 745.5 GFLOPS.

Linpack HPC, results from http://www.netlib.org/benchmark/index.html as of 04/13/07. Sun SPARC Enterprise M9000 (SPARC64 VI @2.4GHz, 64 chips, 128 cores), 1.032 TFLOPS. IBM p5 595 (POWER5 1.9GHz, 32 chips, 64 cores) 418.0 GFLOPS. HP Superdome (Itanium 2 1.6GHz/24MB, 64 chips, 128 cores) 745.5 GFLOPS.

Results Summary

SAE (Strategic Applications Engineering) has submitted results for the LINPACK HPC benchmark.

Published Results
Performance: 2.023 TFLOPS
System: Sun SPARC Enterprise M9000
Total Number of Processors: 64
Processor/GHz of Server: SPARC64 VII, 2.52 GHz
Operating System: Solaris 10
Compiler: Sun Studio 12

Comments:

May as well post here, seeing as you closed the comments in the previous thread.

You're the one making these bold and baseless statements. It's not up to us to prove you wrong. Why don't you check your facts before posting your nonsense?

Why does IBM not publish data on I/O rates and virtualization overheads? It's really quite simple. Overheads are directly linked to configuration, specification, design, and workload types and patterns. It's impossible to cover every scenario.

My own testing shows that overheads are dynamic. If a system has 25% overhead, then I would argue that's probably a poorly designed system, or one inappropriate for the workloads.

IBM do actually state the following:

"The hypervisor functions running on a system in LPAR mode typically adds less than 5% overhead to normal memory and I/O operations. Running multiple partitions simultaneously generally has little performance impact on the other partitions, but there are circumstances that can affect performance. There is some extra overhead associated with the hypervisor for the virtual memory management. This should be minor for most workloads, but the impact increases with extensive amounts of page-mapping activity. Partitioning may actually help performance in some cases for applications that do not scale well on large SMP systems by enforcing strong separation between workloads running in the separate partitions."

Here's the source for that info. I would agree with this statement, based on my experience in support and benchmarking.

http://publib.boulder.ibm.com/infocenter/pseries/v5r3/topic/com.ibm.aix.prftungd/doc/prftungd/lpar_perf_impacts.htm

Seeing as you're too lazy to do the research, here's the link where you can read about the virtual processor folding feature too.

http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/virtual_proc_mngmnt_part.htm

Posted by Alex on July 17, 2008 at 11:05 AM PDT #
