Sun Blade X6250 Cluster EXA PowerFLOW World Record
Thursday Jan 24, 2008
The EXA PowerFLOW benchmark test suite was run on a mini cluster of Sun Blade X6250 blades with the recently announced 3.33 GHz dual-core Intel 5260. The Sun Blade X6250 mini cluster beats all posted results at the PowerFLOW Performance website up to the eight cores that were considered.
- In runs of the benchmark test suite, the Sun X6250 cluster was nominally 20% faster than the best result from the top IBM, HP, or SGI clusters. The advantage ranged from 15% to 24% across the four core levels considered.
- The scaling efficiency of the Sun X6250 cluster ranged from 100% (at 1 core) to on average 83% (at 8 cores).
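The scaling-efficiency figures can be checked with a few lines of arithmetic. The sketch below assumes efficiency is defined as the 1-core elapsed time divided by (core count × elapsed time at that core count); the post does not state the formula, so treat this definition as an assumption. The times are the Sun Blade X6250 (3.33GHz X5260) rows from the two results tables in this post.

```python
# Elapsed times in seconds for the Sun Blade X6250 (3.33GHz X5260),
# copied from the two results tables in this post.
times = {
    "first results table": {1: 720.49, 2: 389.36, 4: 195.43, 8: 110.36},
    "second results table": {1: 1714.22, 2: 925.77, 4: 463.11, 8: 252.64},
}


def scaling_efficiency(elapsed, cores):
    """Speedup over the 1-core run divided by the core count.

    Assumed definition: T(1) / (cores * T(cores)).
    """
    return elapsed[1] / (cores * elapsed[cores])


# Efficiency at 8 cores for each table, and the average of the two.
efficiency_at_8 = {name: scaling_efficiency(t, 8) for name, t in times.items()}
average_at_8 = sum(efficiency_at_8.values()) / len(efficiency_at_8)

for name, eff in efficiency_at_8.items():
    print(f"{name}: {eff:.1%} at 8 cores")
print(f"average: {average_at_8:.1%}")
```

Under this definition, the 1-core efficiency is 100% by construction, and the 8-core average of the two cases comes out at about 83%, matching the figure quoted above.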
Four 2-socket Sun X6250 blades with Infiniband interconnects were used, and runs were made at four core levels: 1, 2, 4, and 8. Comparisons are presented against the current leading competitors' results, also obtained with high performance interconnects and posted at the EXA PowerFLOW Performance website. These include results from IBM, HP, and SGI platforms.
EXA PowerFLOW V 3.6c Benchmark Case 1 (Smaller Model)
Results are total elapsed time in seconds
| System | Processor | 1 Core | 2 Cores | 4 Cores | 8 Cores |
|---|---|---|---|---|---|
| Sun Blade X6250 | 3.33GHz DC Intel X5260 | 720.49 | 389.36 | 195.43 | 110.36 |
| Sun Blade X6250 | 3.0GHz DC Intel 5160 | 822.71 | 418.47 | 214.48 | 118.63 |
| Sun Blade X6250 | 3.0GHz QC Intel X5365 | 844.30 | 430.72 | 214.41 | 121.25 |
| Sun Fire X2200 M2 | 3.0GHz DC Opteron | 943.41 | 461.11 | 232.93 | 123.12 |
| Sun Blade X8440 | 3.0GHz DC Opteron | 937.24 | 472.58 | 238.58 | 127.44 |
| HP BL460 | 3.0GHz DC Xeon | -- | -- | -- | 137.22 |
| Sun Fire X4450 | 2.93GHz QC Intel 7350 | 874.12 | 462.12 | 241.08 | 137.74 |
| IBM e1350 | 3.0GHz DC Xeon 5160 | -- | -- | 240.31 | 141.46 |
| HP BL465 | 2.6GHz DC Opteron | -- | -- | -- | 146.31 |
| SGI Altix | 3.0GHz DC Xeon | 866.09 | 448.80 | 264.82 | 147.88 |
| HP rx2660 | 1.6GHz DC Itanium2 | -- | -- | -- | 214.25 |
| SGI Altix | 1.6GHz DC Itanium2 | 1631.4 | 832.68 | 438.43 | 227.24 |
EXA PowerFLOW V 3.6c Benchmark Case 2 (Larger Model)
Results are total elapsed time in seconds
| System | Processor | 1 Core | 2 Cores | 4 Cores | 8 Cores |
|---|---|---|---|---|---|
| Sun Blade X6250 | 3.33GHz DC Intel X5260 | 1714.22 | 925.77 | 463.11 | 252.64 |
| Sun Blade X6250 | 3.0GHz DC Intel 5160 | 1966.38 | 987.47 | 500.54 | 258.42 |
| Sun Blade X6250 | 3.0GHz QC Intel X5365 | 1991.78 | 1010.63 | 507.21 | 278.64 |
| Sun Fire X2200 M2 | 3.0GHz DC Opteron | 2273.02 | 1086.65 | 550.34 | 282.53 |
| Sun Blade X8440 | 3.0GHz DC Opteron | 2210.13 | 1130.27 | 562.68 | 289.81 |
| HP BL460 | 3.0GHz DC Xeon | -- | -- | -- | 310.03 |
| IBM e1350 | 3.0GHz DC Xeon 5160 | -- | -- | 557.04 | 314.23 |
| SGI Altix | 3.0GHz DC Xeon | 2043.59 | 1062.38 | 620.67 | 315.96 |
| Sun Fire X4450 | 2.93GHz QC Intel 7350 | 2062.00 | 1066.34 | 598.81 | 319.92 |
| HP BL465 | 2.6GHz DC Opteron | -- | -- | -- | 331.82 |
| HP rx2660 | 1.6GHz DC Itanium2 | -- | -- | -- | 490.74 |
| SGI Altix | 1.6GHz DC Itanium2 | 3883.97 | 2000.44 | 1054.45 | 526.74 |
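The "15% to 24% faster" range can be reproduced from the two tables. The sketch below assumes "percent faster" means the best non-Sun elapsed time divided by the Sun Blade X6250 (3.33GHz X5260) elapsed time, minus one; the post does not spell out the formula, so this is one plausible reading. Entries shown as "--" in the tables are represented as `None`, and the Xeon/Itanium2 suffixes on the SGI Altix keys are added here only to keep the dictionary keys unique.

```python
CORES = (1, 2, 4, 8)

# Sun Blade X6250 (3.33GHz X5260) elapsed seconds at 1, 2, 4, 8 cores.
sun = {
    "first table": (720.49, 389.36, 195.43, 110.36),
    "second table": (1714.22, 925.77, 463.11, 252.64),
}

# IBM, HP, and SGI rows from the same tables; None marks a "--" entry.
competitors = {
    "first table": {
        "HP BL460": (None, None, None, 137.22),
        "IBM e1350": (None, None, 240.31, 141.46),
        "HP BL465": (None, None, None, 146.31),
        "SGI Altix (Xeon)": (866.09, 448.80, 264.82, 147.88),
        "HP rx2660": (None, None, None, 214.25),
        "SGI Altix (Itanium2)": (1631.4, 832.68, 438.43, 227.24),
    },
    "second table": {
        "HP BL460": (None, None, None, 310.03),
        "IBM e1350": (None, None, 557.04, 314.23),
        "SGI Altix (Xeon)": (2043.59, 1062.38, 620.67, 315.96),
        "HP BL465": (None, None, None, 331.82),
        "HP rx2660": (None, None, None, 490.74),
        "SGI Altix (Itanium2)": (3883.97, 2000.44, 1054.45, 526.74),
    },
}


def advantage(sun_times, rivals):
    """Map core count -> (fastest rival, rival_time / sun_time - 1)."""
    result = {}
    for i, n in enumerate(CORES):
        # Keep only rivals that posted a result at this core count.
        posted = {name: t[i] for name, t in rivals.items() if t[i] is not None}
        best = min(posted, key=posted.get)
        result[n] = (best, posted[best] / sun_times[i] - 1.0)
    return result


results = {name: advantage(sun[name], competitors[name]) for name in sun}
all_margins = [m for table in results.values() for _, m in table.values()]

for name, table in results.items():
    for n, (rival, margin) in table.items():
        print(f"{name}, {n} core(s): {margin:.0%} faster than {rival}")
```

With this reading, the smallest and largest margins across both tables round to 15% and 24%.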
Key Technical Points
- Real-world CFD engineering models are typically very large and are best analyzed with many cores in order to achieve reasonable turnaround on run times. Scalability when running these large models with PowerFLOW is very good, often linear or near-perfect up to 64 cores or more.
- Performance when running PowerFLOW in a multi-node configuration is significantly enhanced by high performance interconnects such as Infiniband.
- PowerFLOW supports a variety of interconnects from various hardware vendors (starting with gigabit Ethernet, then Infiniband, e.g. Voltaire, Cisco/Topspin, QLogic, then Myrinet), MPIs (HP-MPI, MVAPICH 2, LAM), and communication protocols (e.g. ssh and rsh).
- There is still not an officially certified version of a Solaris build of PowerFLOW for X86-64 platform architectures.
- The PowerFLOW benchmark test suite consists of two test cases. They are two models of the same analysis, flow over a car body, but of different sizes (different mesh refinement). Both models are rather large and scale very well up to and even beyond 64 cores.
- The two test cases in the suite require from 6 to 8 GB of memory when running with only one core on a single node. This per-node memory requirement is reduced when running in DMP (distributed-memory parallel) cluster mode on multiple nodes.
- PowerFLOW runs are CPU and memory intensive but do not require any special high performance I/O file systems.
- The test suite provides a run script ("exabench") that automatically runs one or both test cases over a range of core levels on the cluster nodes specified in an "mpi_file", along with the number of cores to be used per node.
Exa Corporation copyright: all information on the EXA website is copyrighted © 2007, 1996-2006 by Exa Corporation. PowerFLOW is a registered trademark of Exa Corporation. Results from http://www.exa.com/user_center/index.html as of January 17, 2008.
Benchmark Description
The EXA PowerFLOW Benchmark Test Suite
The PowerFLOW performance benchmark test suite consists of two standard
cases, each a simulation of external airflow around an automobile.
- Case #1: This smaller case has 18.2 million voxels (8.4 million fine-equivalent) and 1.2 million surfels (690 K fine-equivalent).
- Case #2: This larger case has 23.6 million voxels (18.9 million fine-equivalent) and 1.7 million surfels (1.5 million fine-equivalent).
When describing the size of the cases, it is important to note that voxels and surfels within different VR regions have different computational costs. To account for this, fine-equivalent voxels and surfels are a measure of computational load that takes into account the lower cost of processing coarser scales of resolution. For example, a voxel at the second-finest scale is processed only half as often (every other timestep) as a voxel at the finest scale, and thus has half the computational cost.
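The fine-equivalent idea can be illustrated with a small calculation. The sketch below assumes that the halving described for the second-finest scale continues at every coarser scale, so an element k levels coarser than the finest counts as 1/2^k of a fine element; the text only states the factor for the second-finest scale, so the general rule is an assumption. The per-level counts are hypothetical and are not taken from either benchmark case.

```python
def fine_equivalent(counts_by_level):
    """Weighted element count, with counts_by_level[0] the finest scale.

    Assumption: an element k levels coarser than the finest is updated
    every 2**k timesteps, so it contributes 1 / 2**k of a fine element.
    """
    return sum(count / 2 ** level for level, count in enumerate(counts_by_level))


# Hypothetical voxel counts per resolution level (finest first).
# These numbers are illustrative only, NOT the benchmark's actual breakdown.
example_voxels = [4_000_000, 6_000_000, 8_000_000]

total = sum(example_voxels)
equivalent = fine_equivalent(example_voxels)

print(f"total voxels:           {total:,}")
print(f"fine-equivalent voxels: {equivalent:,.0f}")
```

In this made-up example the total is twice the fine-equivalent count, which shows why a model with many voxels in coarse regions can cost much less to run than its raw voxel count suggests.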
System Configuration
SUSE Linux Enterprise Server SLES 10
Voltaire OFED GridStack-4.1.5_7-sles-k2.6.16.21-0.8-smp-x86_64
HP-MPI
EXA PowerFLOW 3.6c
PowerFLOW 3.6 Benchmark Test Suite
See Also
Current EXA PowerFLOW V3.6c & V4.6c results at (EXA password required):
http://www.exa.com/user_center/index.html










