Sun Fire X2270 Cluster Fluent Benchmark Results
Significance of Results
A Sun Fire X2270 cluster equipped with 2.93 GHz QC Intel X5570 processors and DDR Infiniband interconnects delivered outstanding performance running the FLUENT benchmark test suite.
- The Sun Fire X2270 cluster running on 64 cores delivered the best performance for the 3 largest test cases. On the "truck" workload, Sun was 14% faster than the SGI Altix ICE 8200.
- The Sun Fire X2270 cluster running on 32 cores delivered the best performance for 5 of the 6 test cases.
- The Sun Fire X2270 cluster running on 16 cores beat all comers in all 6 test cases.
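The percentage comparison quoted above can be reproduced from the published ratings. The following is a minimal sketch, not part of the benchmark itself: the helper name `percent_faster` is hypothetical, and the two ratings are the 64-core "truck" values from the Performance Landscape table.

```python
def percent_faster(rating_a: float, rating_b: float) -> float:
    """Return how much faster system A is than system B, in percent.

    Ratings are jobs per day, so a higher rating means a shorter run time.
    (Hypothetical helper for illustration only.)
    """
    if rating_b <= 0:
        raise ValueError("rating_b must be positive")
    return (rating_a / rating_b - 1.0) * 100.0


# 64-core "truck" ratings taken from the Performance Landscape table.
sun_x2270_truck = 566.9
sgi_ice_truck = 496.8

gain = percent_faster(sun_x2270_truck, sgi_ice_truck)
print(f"Sun Fire X2270 vs SGI Altix ICE 8200 on truck, 64 cores: {gain:.1f}% faster")
```

Because a rating is inversely proportional to elapsed time, the ratio of two ratings is also the ratio of the corresponding run times.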
Performance Landscape
New FLUENT Benchmark Test Suite

Results are "Ratings" (bigger is better) for each benchmark test case.
Rating = number of sequential runs of the test case possible in 1 day = 86,400 / (total elapsed run time in seconds).
Results are ordered by the truck_poly column within each core count.

| System (1) | Cores | eddy 417k | turbo 500k | aircraft 2m | sedan 4m | truck 14m | truck_poly 14m |
|---|---|---|---|---|---|---|---|
| Sun Fire X2270, 8-node | 64 | 4645.2 | 23671.2 | 3445.7 | 4909.1 | 566.9 | 494.8 |
| Intel Endeavor, 8-node | 64 | 5016.0 | 25226.3 | 5220.5 | 4614.2 | 513.4 | 490.9 |
| SGI Altix ICE 8200 IP95, 8-node | 64 | 5142.9 | 23834.5 | 4614.2 | 4352.6 | 496.8 | 479.2 |
| Sun Fire X2270, 4-node | 32 | 2971.6 | 13824.0 | 3074.7 | 2644.2 | 291.8 | 271.8 |
| Intel Endeavor, 4-node | 32 | 2856.2 | 13041.5 | 2837.4 | 2465.0 | 266.4 | 251.2 |
| SGI Altix ICE 8200 IP95, 4-node | 32 | 3083.0 | 13190.8 | 2563.8 | 2405.0 | 266.6 | 246.5 |
| Sun Fire X2250, 8-node | 32 | 2095.8 | 9600.0 | 1844.2 | 1394.1 | 203.2 | 196.8 |
| Sun Fire X2270, 2-node | 16 | 1726.3 | 7595.6 | 1520.5 | 1363.3 | 145.5 | 141.8 |
| SGI Altix ICE 8200 IP95, 2-node | 16 | 1708.4 | 7384.6 | 1507.9 | 1211.8 | 128.8 | 133.5 |
| Intel Endeavor, 2-node | 16 | 1585.3 | 7125.8 | 1428.1 | 1278.6 | 134.7 | 132.5 |
| Sun Fire X2250, 4-node | 16 | 1404.9 | 6249.5 | 1324.6 | 996.3 | 127.7 | 129.2 |
| Sun Fire X2270, 1-node | 8 | 945.8 | 4129.0 | 883.0 | 682.5 | 73.5 | 72.4 |
| SGI Altix ICE 8200 IP95, 1-node | 8 | 953.1 | 4032.7 | 843.3 | 651.0 | 71.4 | 72.0 |
| Sun Fire X2250, 2-node | 8 | 824.2 | 3248.1 | 711.4 | 517.9 | 66.1 | 67.9 |
| SGI Altix ICE 8200 IP95, 1-node | 4 | 561.6 | 2416.8 | 526.9 | 412.6 | 40.9 | 40.8 |
| Sun Fire X2270, 1-node | 4 | 541.5 | 2346.2 | 515.7 | 409.3 | 40.8 | 40.2 |
| Sun Fire X2250, 1-node | 4 | 449.2 | 1691.6 | 389.0 | 271.8 | 33.6 | 34.9 |
| Sun Fire X2270, 1-node | 2 | 292.8 | 1282.4 | 283.4 | 223.1 | 20.9 | 21.2 |
| SGI Altix ICE 8200 IP95, 1-node | 2 | 294.2 | 1302.7 | 289.0 | 226.4 | 20.5 | 21.2 |
| Sun Fire X2250, 1-node | 2 | 224.4 | 881.0 | 197.9 | 134.4 | 16.3 | 17.6 |
| Sun Fire X2270, 1-node | 1 | 150.7 | 658.3 | 143.2 | 110.1 | 10.2 | 10.6 |
| SGI Altix ICE 8200 IP95, 1-node | 1 | 153.3 | 677.5 | 147.3 | 111.2 | 10.3 | 9.5 |
| Sun Fire X2250, 1-node | 1 | 115.4 | 458.2 | 100.1 | 66.6 | 8.0 | 9.0 |
| Sun Fire X2270, 1-node | serial | 151.4 | 656.7 | 151.3 | 107.1 | 9.3 | 10.1 |
| Intel Endeavor, 1-node | serial | 146.6 | 650.0 | 150.2 | 105.6 | 8.8 | 9.7 |
| Sun Fire X2250, 1-node | serial | 115.2 | 461.7 | 101.0 | 65.0 | 7.2 | 9.0 |

(1) SGI Altix ICE 8200, X5570 QC 2.93GHz, DDR
Intel Endeavor, X5560 QC 2.8GHz, DDR
Sun Fire X2250, X5272 DC 3.4GHz, DDR IB
Sun Fire X2270, X5570 QC 2.93GHz, DDR
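The rating definition given with the table (86,400 seconds per day divided by the total elapsed run time) can be inverted to recover wall-clock time from any published rating. A minimal sketch follows; the function names are hypothetical and chosen only for illustration.

```python
SECONDS_PER_DAY = 86_400  # constant used in the published rating definition


def rating_from_elapsed(elapsed_seconds: float) -> float:
    """Rating = number of sequential runs of a test case possible in one day."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return SECONDS_PER_DAY / elapsed_seconds


def elapsed_from_rating(rating: float) -> float:
    """Invert the rating to get the total elapsed run time in seconds."""
    if rating <= 0:
        raise ValueError("rating must be positive")
    return SECONDS_PER_DAY / rating


# truck_poly ratings for the Sun Fire X2270, taken from the table.
for label, rating in [("serial", 10.1), ("64 cores", 494.8)]:
    seconds = elapsed_from_rating(rating)
    print(f"truck_poly, {label}: {seconds:.1f} s per run")
```

Read this way, the 64-core truck_poly rating of 494.8 corresponds to a run of roughly 175 seconds.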
Results and Configuration Summary
Hardware Configuration
8 x Sun Fire X2270, each with:
- 2 x 2.93GHz Intel X5570 QC processors (Nehalem)
- 1333 MHz DDR3 DIMMs
- Infiniband (Voltaire) DDR interconnects and DDR IB switch
Software Configuration
- OS: 64-bit SUSE Linux Enterprise Server SLES 10 SP 2
- Interconnect software: Voltaire OFED GridStack-5.1.3.1_5
- Application: FLUENT Beta V12.0.15
- Benchmark: FLUENT "6.3" Benchmark Test Suite
Benchmark Description
The benchmark tests are representative of typical large user CFD models intended for execution in distributed memory processor (DMP) mode over a cluster of multi-processor platforms.
A more complete description of the tests is available on the FLUENT benchmark page listed under See Also.
Key Points and Best Practices
Observations About the Results
The Sun Fire X2270 cluster delivered excellent performance, especially on the larger problems (truck and truck_poly).
These processors include a Turbo Boost feature, coupled with a SpeedStep option in the CPU section of the Advanced BIOS settings. Under specific circumstances, this can upclock the CPU, temporarily increasing the processor frequency from 2.93GHz to 3.2GHz.
Memory placement is a very significant factor with Nehalem processors. Current Nehalem platforms have two sockets. Each socket has three memory channels, and each channel has 3 bays for DIMMs. For example, if one DIMM is placed in the 1st bay of each of the 3 channels, the DIMM speed will be 1333 MHz with the X5570s. Altering the DIMM arrangement to an unbalanced configuration, say by adding just one more DIMM into the 2nd bay of one channel, will cause the DIMM frequency to drop from 1333 MHz to 1067 MHz.
About the FLUENT "6.3" Benchmark Test Suite
The FLUENT application performs computational fluid dynamics (CFD) analysis on a variety of different types of flow and allows for chemically reacting species. transient dynamic and can be linear or nonlinear as far
- CFD models tend to be very large where grid refinement is required to accurately capture conditions in the boundary layer region adjacent to the body over which flow is occurring. Fine grids are also required to determine accurate turbulence conditions. As a result, these models can run for many hours or even days, even when using a large number of processors.
- CFD models typically scale very well and are well suited for execution on clusters. The FLUENT "6.3" benchmark test cases scale well, particularly up to 64 cores.
- The memory requirements for the test cases in the new FLUENT "6.3" benchmark test suite range from a few hundred megabytes to about 25 GB. As the job is distributed over multiple nodes, the memory requirements per node are correspondingly reduced.
- The benchmark test cases for the FLUENT module do not have a substantial I/O component. However, performance will be enhanced very substantially by using high performance interconnects such as Infiniband for inter-node cluster message passing. This nodal message passing data can be stored locally on each node or on a shared file system.
- Because of the large amount of inter-node message passing, performance can be further enhanced by more than a factor of 3 by implementing the Lustre-based shared file I/O system.
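The scaling claim in the list above can be checked against the published ratings. The sketch below computes speedup and parallel efficiency for the Sun Fire X2270 on truck_poly, using the ratings from the Performance Landscape table. Taking the 1-core rating as the baseline is an assumption made here for illustration (the serial rating could be used instead), and the function names are hypothetical.

```python
def speedup(rating_n: float, rating_base: float) -> float:
    """Speedup relative to the baseline; ratings scale inversely with run time."""
    return rating_n / rating_base


def parallel_efficiency(rating_n: float, rating_base: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved on the given number of cores."""
    return speedup(rating_n, rating_base) / cores


# Sun Fire X2270 truck_poly ratings by core count, taken from the table.
X2270_TRUCK_POLY = {1: 10.6, 2: 21.2, 4: 40.2, 8: 72.4, 16: 141.8, 32: 271.8, 64: 494.8}

baseline = X2270_TRUCK_POLY[1]  # assumption: 1-core run as the baseline
for cores, rating in X2270_TRUCK_POLY.items():
    s = speedup(rating, baseline)
    e = parallel_efficiency(rating, baseline, cores)
    print(f"{cores:>2} cores: speedup {s:5.1f}x, efficiency {e:6.1%}")
```

Under these assumptions the 64-core truck_poly run comes out at roughly 47x the 1-core rating, or about 73% parallel efficiency.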
See Also
Current FLUENT "12.0 Beta" Benchmark:
http://www.fluent.com/software/fluent/fl6bench/fl6bench_6.4.x/
Disclosure Statement
All information on the Fluent website is Copyrighted 1995-2009 by Fluent Inc. Results from http://www.fluent.com/software/fluent/fl6bench/ as of June 9, 2009 and this presentation.
