Sun Fire X2270 MSC/Nastran Vendor_2008 Benchmarks
Significance of Results
The I/O intensive MSC/Nastran Vendor_2008 benchmark test suite was used to compare the performance on a Sun Fire X2270 server when using SSDs internally instead of HDDs.
The effect on performance of increasing memory to augment I/O caching was also examined. The Sun Fire X2270 server was equipped with Intel QC Xeon X5570 processors (Nehalem). On these Nehalem processors, the benefit of adding memory to increase I/O caching is partly offset by the lower memory frequency that results from populating additional DIMM bays on each memory channel of each CPU socket.
- SSDs can significantly improve NASTRAN performance, especially on runs with larger core counts.
- Additional memory in the server can also increase performance. However, on some systems additional memory lowers the memory frequency, which may offset the benefit of the increased capacity.
- If SSDs are not used, striped disks will often improve the performance of I/O-bound MCAE applications.
- To obtain the highest performance, use SSDs and configure servers with the largest memory possible without decreasing the memory frequency. Always look at the workload characteristics and compare them against this benchmark to set expectations correctly.
SSD vs. HDD Performance
The performance of two striped 30GB SSDs was compared to two striped 7200 rpm 500GB SATA drives on a Sun Fire X2270 server.
- At the 8-core level (the maximum for a single node), SSDs were 2.2x faster for both the larger xxocmd2 and the smaller xlotdf1 cases.
- For 1-core results, SSDs were up to 3% faster.
- On the smaller mdomdf1 test case there was no increase in performance on the 1-, 2-, and 4-core configurations.
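The SSD-to-HDD ratios can be recomputed from the elapsed times in the Performance Landscape table. The following is a minimal sketch using the 24GB xxocmd2 results; the helper name `speedup` and the dictionary layout are illustrative choices, not part of the benchmark suite.

```python
# Elapsed times (sec) for xxocmd2 at 24 GB, copied from the results table.
# Keys are core counts.
sata_2x = {1: 895, 2: 614, 4: 631, 8: 1554}   # two striped SATA HDDs
ssd_2x = {1: 884, 2: 583, 4: 404, 8: 711}     # two striped SSDs


def speedup(baseline_sec, candidate_sec):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_sec / candidate_sec


# Ratio of HDD time to SSD time for each core count, rounded to 2 decimals.
ratios = {cores: round(speedup(sata_2x[cores], ssd_2x[cores]), 2)
          for cores in sata_2x}

for cores, ratio in ratios.items():
    print(f"{cores} core(s): HDD time / SSD time = {ratio:.2f}")
```

The last digit can differ from the table by one unit because of rounding, but the pattern is the same: the ratio stays close to 1 on 1 and 2 cores and exceeds 2 on 8 cores.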
Performance Enhancement with I/O Memory Caching
Performance for Nastran can often be increased by additional memory to provide additional in-core space to cache I/O and thereby reduce the IO demands.
The main memory was doubled from 24GB to 48GB. At the 24GB level one 4GB DIMM was placed in the first bay of each of the 3 CPU memory channels on each of the two CPU sockets on the Sun Fire X2270 platform. This configuration allows a memory frequency of 1333MHz.
At the 48GB level a second 4GB DIMM was placed in the second bay of each of the 3 CPU memory channels on each socket. This reduces the memory frequency to 1066MHz.
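The capacity and frequency figures for the two memory configurations follow from simple arithmetic. The sketch below only restates the values given in the text; the constant and function names are illustrative, and the frequency lookup covers just the two populations tested here.

```python
SOCKETS = 2              # CPU sockets in the Sun Fire X2270
CHANNELS_PER_SOCKET = 3  # memory channels per CPU socket, as described above
DIMM_GB = 4              # capacity of each DIMM used in the tests

# Memory frequency reported in this study for each DIMMs-per-channel count.
# Only these two populations were tested; other platforms may differ.
FREQ_MHZ_BY_DIMMS_PER_CHANNEL = {1: 1333, 2: 1066}


def memory_config(dimms_per_channel):
    """Return (total capacity in GB, memory frequency in MHz)."""
    total_gb = SOCKETS * CHANNELS_PER_SOCKET * dimms_per_channel * DIMM_GB
    return total_gb, FREQ_MHZ_BY_DIMMS_PER_CHANNEL[dimms_per_channel]


for dpc in (1, 2):
    gb, mhz = memory_config(dpc)
    print(f"{dpc} DIMM(s) per channel: {gb} GB at {mhz} MHz")

# Relative drop in memory frequency when the second bay is populated.
freq_drop = 1 - 1066 / 1333
print(f"Frequency drop: {freq_drop:.0%}")
```

Doubling the capacity therefore costs roughly a fifth of the memory frequency, which is the trade-off examined in the results that follow.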
Adding Memory With HDDs (SATA)
- The additional server memory increased performance when running with the slower SATA drives at the higher core counts (e.g. 4 and 8 cores on a single node).
- At the maximum 8-core level on a single system, the larger xxocmd2 case was 32% faster and the smaller xlotdf1 case was 42% faster (ratios of 0.68 and 0.58 in the results table).
- The special I/O intensive getrag case was 8% faster at the 1-core level.
Adding Memory With SSDs
- At the maximum 8-core level (for a single node) the larger xxocmd2 case was 47% faster in overall run time.
- The effects were much smaller at lower core counts. At the 1-core level most test cases ran 5% to 14% slower, because the lower memory frequency outweighed the benefit of the added in-core space for I/O caching compared with direct transfer to SSD.
- Only the special I/O intensive getrag case was an exception, running 6% faster at the 1-core level.
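The percentages quoted for the SSD configuration can be checked against the elapsed times in the Performance Landscape table. This is a small sketch over a few of the rows; `percent_change` is an illustrative helper, and a negative value means the run got shorter with 48GB.

```python
# Elapsed times (sec) with two striped SSDs, copied from the results table.
# Each entry maps (test, cores) -> (time at 24 GB, time at 48 GB).
ssd_times = {
    ("xxocmd2", 1): (884, 947),
    ("xxocmd2", 8): (711, 381),
    ("xlotdf1", 1): (1939, 2214),
    ("getrag", 1): (817, 771),
}


def percent_change(before_sec, after_sec):
    """Percent change in elapsed time; negative means the run got shorter."""
    return 100.0 * (after_sec - before_sec) / before_sec


changes = {key: round(percent_change(*times), 1)
           for key, times in ssd_times.items()}

for (test, cores), change in changes.items():
    trend = "shorter" if change < 0 else "longer"
    print(f"{test} on {cores} core(s): {abs(change):.1f}% {trend} with 48 GB")
```

The sign flips between 1 core and 8 cores for xxocmd2: the extra cache space only pays off once the I/O demand is high enough to outweigh the slower memory.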
Increasing Performance with Two Striped (SATA) Drives
The performance of multiple striped drives was also compared to that of a single drive. The study compared two striped internal 7200 rpm 500GB SATA drives to a single internal SATA drive.
- On a single node with 8 cores, the largest test xx0cmd2 was 40% faster and a smaller test case xl0tdf1 was 33% faster. Even the smallest test case mdomdf1 was 12% faster at 4 cores, the highest core count run for that case.
- On 1 core, the added boost in performance with striped disks ranged from 4% to 13% across the various test cases.
- On 1 core, the special I/O-intensive test case getrag was 29% faster.
Performance Landscape
Times in table are elapsed time (sec).
MSC/Nastran Vendor_2008 Benchmark Test Suite

All columns: Sun Fire X2270, 2 x X5570 QC 2.93 GHz. HDD columns use 7200 RPM SATA HDDs (2 drives unless marked 1 SATA); SSD columns use 2 x SSDs.

| Test | Cores | HDD 48 GB 1067MHz | HDD 24 GB 2 SATA 1333MHz | HDD 24 GB 1 SATA 1333MHz | HDD Ratio (2xSATA): 48GB/24GB | HDD Ratio: 2xSATA/1xSATA | SSD 48 GB 1067MHz | SSD 24 GB 1333MHz | SSD Ratio: 48GB/24GB | Ratio (24GB): 2xSATA/2xSSD |
|---|---|---|---|---|---|---|---|---|---|---|
| vlosst1 | 1 | 133 | 127 | 134 | 1.05 | 0.95 | 133 | 126 | 1.05 | 1.01 |
| xxocmd2 | 1 | 946 | 895 | 978 | 1.06 | 0.87 | 947 | 884 | 1.07 | 1.01 |
| xxocmd2 | 2 | 622 | 614 | 703 | 1.01 | 0.87 | 600 | 583 | 1.03 | 1.05 |
| xxocmd2 | 4 | 466 | 631 | 991 | 0.74 | 0.64 | 426 | 404 | 1.05 | 1.56 |
| xxocmd2 | 8 | 1049 | 1554 | 2590 | 0.68 | 0.60 | 381 | 711 | 0.53 | 2.18 |
| xlotdf1 | 1 | 2226 | 2000 | 2081 | 1.11 | 0.96 | 2214 | 1939 | 1.14 | 1.03 |
| xlotdf1 | 2 | 1307 | 1240 | 1308 | 1.05 | 0.95 | 1315 | 1189 | 1.10 | 1.04 |
| xlotdf1 | 4 | 858 | 833 | 1030 | 1.03 | 0.81 | 744 | 751 | 0.99 | 1.11 |
| xlotdf1 | 8 | 912 | 1562 | 2336 | 0.58 | 0.67 | 674 | 712 | 0.95 | 2.19 |
| xloimf1 | 1 | 1216 | 1151 | 1236 | 1.06 | 0.93 | 1228 | 1290 | 0.95 | 0.89 |
| mdomdf1 | 1 | 987 | 913 | 983 | 1.08 | 0.93 | 987 | 911 | 1.08 | 1.00 |
| mdomdf1 | 2 | 524 | 485 | 520 | 1.08 | 0.93 | 524 | 484 | 1.08 | 1.00 |
| mdomdf1 | 4 | 270 | 237 | 269 | 1.14 | 0.88 | 270 | 250 | 1.08 | 0.95 |
| Sol400_1 (xl1fn40_1) | 1 | 2555 | 2479 | 2674 | 1.03 | 0.93 | 2549 | 2402 | 1.06 | 1.03 |
| Sol400_S (xl1fn40_S) | 1 | 2450 | 2302 | 2481 | 1.06 | 0.93 | 2449 | 2262 | 1.08 | 1.02 |
| getrag (xx0xst0) | 1 | 778 | 843 | 1178 | 0.92 | 0.71 | 771 | 817 | 0.94 | 1.03 |
Results and Configuration Summary
Hardware Configuration:
Sun Fire X2270
- 1 2-socket rack mounted server
- 2 x 2.93 GHz QC Intel Xeon X5570 processors
- 2 x internal striped SSDs
- 2 x internal striped 7200 rpm 500GB SATA drives
Software Configuration:
- O/S: Linux 64-bit SUSE SLES 10 SP 2
- Application: MSC/NASTRAN MD 2008
- Benchmark: MSC/NASTRAN Vendor_2008 Benchmark Test Suite
- HP MPI: 02.03.00.00 [7585] Linux x86-64
- Voltaire OFED-5.1.3.1_5 GridStack for SLES 10
Benchmark Description
The benchmark tests are representative of typical MSC/Nastran applications including both SMP and DMP runs involving linear statics, nonlinear statics, and natural frequency extraction.
The MD (Multi Discipline) Nastran 2008 application performs both structural (stress) analysis and thermal analysis. These analyses may be either static or transient dynamic and can be linear or nonlinear as far as material behavior and/or deformations are concerned. The new release includes the MARC module for general purpose nonlinear analyses and the Dytran module that employs an explicit solver to analyze crash and high velocity impact conditions.
- As of the Summer '08 there is now an official Solaris X64 version of the MD Nastran 2008 system that is certified and maintained.
- The memory requirements for the test cases in the new MSC/Nastran Vendor 2008 benchmark test suite range from a few hundred megabytes to no more than 5 GB.
Please go here for a more complete description of the tests.
Key Points and Best Practices
For more on Best Practices of SSD on HPC applications also see the Sun Blueprint:
http://wikis.sun.com/display/BluePrints/Solid+State+Drives+in+HPC+-+Reducing+the+IO+Bottleneck
Additional information on the MSC/Nastran Vendor 2008 benchmark test suite:
- Based on the maximum physical memory on a platform, the user can stipulate the maximum portion of this memory that can be allocated to the Nastran job. This is done on the command line with the mem= option. On Linux based systems where the platform has a large amount of memory and the model does not have large scratch I/O requirements, the memory can be allocated to a tmpfs scratch space file system. On Solaris X64 systems, advantage can be taken of ZFS for higher I/O performance.
- The MSC/Nastran Vendor 2008 test cases do not scale very well: a few do not scale at all, and the rest scale up to 8 cores at best.
- The test cases for the MSC/Nastran module all have a substantial I/O component, where 15% to 25% of the total run times are associated with I/O activity (primarily scratch files). The required scratch file size ranges from less than 1 GB up to about 140 GB. Performance will be enhanced by using the fastest available drives and striping together more than one of them, or by using a high performance disk storage system. It can be enhanced further by implementing a Lustre based I/O system. High performance interconnects such as InfiniBand, used for inter-node cluster message passing as well as for I/O transfer from the storage system, can also enhance performance substantially.
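The limited scaling noted above can be quantified from the Performance Landscape table. The sketch below uses the xxocmd2 elapsed times with 48GB and two striped SSDs; the `scaling` helper is illustrative, and parallel efficiency is taken here as speedup divided by core count.

```python
# Elapsed times (sec) for xxocmd2 with 48 GB and two striped SSDs,
# copied from the results table. Keys are core counts.
elapsed = {1: 947, 2: 600, 4: 426, 8: 381}


def scaling(elapsed_by_cores):
    """Return {cores: (speedup vs. 1 core, parallel efficiency)}."""
    base = elapsed_by_cores[1]
    result = {}
    for cores, seconds in sorted(elapsed_by_cores.items()):
        speedup = base / seconds
        result[cores] = (speedup, speedup / cores)
    return result


table = scaling(elapsed)
for cores, (speedup, efficiency) in table.items():
    print(f"{cores} core(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Even in this best-performing configuration, 8 cores deliver about 2.5 times the 1-core throughput, so the gain from 4 to 8 cores is small.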
Disclosure Statement
MSC.Software is a registered trademark of MSC. All information on the MSC.Software website is copyrighted. MSC/Nastran Vendor 2008 results from http://www.mscsoftware.com and this report as of June 9, 2009.

You can take the advantage of SSD only when the test scales up to at least 4 cores. Is it because otherwise the memory requirements are no more than 10GB and most of the I/O time is absorbed by I/O cache?
Posted by Joshua on June 17, 2009 at 10:57 AM PDT #
Below 4 cores the IO demands are not very great for this workload, so one shouldn't expect a big boost in performance. Other workloads or applications will benefit from SSDs even on 1-4 core counts if they have more IO.
Posted by 1234core on June 17, 2009 at 02:35 PM PDT #