BM Seer Unofficial thoughts from an anonymous Sun employee

World Record ANSYS on Sun Blade X6250 (Xeon 3GHz DC 5160)

Tuesday Jul 10, 2007

The Sun Blade X6250 outperforms all ANSYS V11.0 (MCAE) results posted on the www.ansys.com website. A single Sun Blade X6250 beats a single Intel S5000 XAL (same 3GHz Xeon 5160) by as much as 32% to 39%, depending on which of the three "cpu" levels is tested (1, 2, or all 4 cores; both are 2-socket platforms equipped with dual-core processors). Sun wins at these processor configurations in 6 of the 7 cases in the benchmark test suite. Overall, on the geometric mean, Sun was about 10% faster.

The only case Sun loses, "bm-2", has an exceptionally high I/O component, and even so the Sun X6250 was only 3-4% slower. The Sun X6250 had 10K rpm internal disk drives, whereas the Intel S5000 XAL had 15K rpm drives.

The Sun Blade X6250 with 3.0GHz Xeon EM64T 5160 (Woodcrest) processors, running 64-bit Linux SuSE SLES 10, beats all of the following platforms that have results posted at the ANSYS website for all 7 test cases in the ANSYS "Standard" benchmark test suite (1-, 2- and 4-cpu).

Yes, this result was run with Linux; Sun wants to show that we can win with every OS. There is now an officially certified, supported, and maintained Solaris build of ANSYS V11.0 for x86-64 platforms, compiled with the recent Sun Studio 11 compilers. This is the first SX64 version to become available.

Competitive Landscape

ANSYS V11.0 "Standard" Benchmark Test Suite on X2200 M2 & Constellation Blades (run times in seconds, smaller is better; for %, bigger is better)

System               Cores  bm-1  bm-2  bm-3  bm-4  bm-5  bm-6  bm-7

Sun X6250/5160           4   100  1362   343   164   181   131   752
Intel S5000XAL/5160      4   109  1312   369   169   187   161  1048
Sun % better                  9%   -4%    8%    3%    3%   23%   39%

Sun X6250/5160           2   118  1398   385   183   223   169  1064
Intel S5000XAL/5160      2   128  1356   417   186   244   211  1437
Sun % better                  9%   -3%    8%    2%    9%   25%   35%

Sun X6250/5160           1   150  1455   456   211   339   253  1770
Intel S5000XAL/5160      1   164  1416   489   215   340   314  2330
Sun % better                  9%   -3%    7%    2%    1%   24%   32%

    (Please note: per-core performance isn't the right metric for comparing different CPUs, as system costs vary greatly; core counts are used here only to identify the configuration. It is "SYSTEM" performance, not "core" performance, that matters!)
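
The "Sun % better" rows and the geometric-mean summary can be re-derived from the run times in the table. The sketch below is not from ANSYS or Sun. It assumes that "% better" means the Intel run time divided by the Sun run time, minus one, which matches the printed percentages to within a point of rounding. The function names are made up for illustration.

```python
import math

# Run times in seconds, copied from the table above (bm-1 .. bm-7),
# keyed by the number of cores used.
SUN = {
    4: [100, 1362, 343, 164, 181, 131, 752],
    2: [118, 1398, 385, 183, 223, 169, 1064],
    1: [150, 1455, 456, 211, 339, 253, 1770],
}
INTEL = {
    4: [109, 1312, 369, 169, 187, 161, 1048],
    2: [128, 1356, 417, 186, 244, 211, 1437],
    1: [164, 1416, 489, 215, 340, 314, 2330],
}


def pct_better(sun_time, intel_time):
    """Assumed definition: how much longer the Intel run took, in percent."""
    return (intel_time / sun_time - 1.0) * 100.0


def geometric_mean(values):
    """Geometric mean, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    for cores in (4, 2, 1):
        pairs = list(zip(SUN[cores], INTEL[cores]))
        row = " ".join(f"{pct_better(s, i):6.1f}%" for s, i in pairs)
        ratios = [i / s for s, i in pairs]
        print(f"{cores} cores: {row}   geomean ratio {geometric_mean(ratios):.3f}")
```

A geometric mean of the Intel/Sun time ratios is used rather than an arithmetic mean so that one very long case such as bm-7 does not dominate the summary.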

Key Technical Points

  • The test cases from the ANSYS standard benchmark test suite all have a substantial I/O component: 15% to 20% of the total run time is associated with I/O activity (primarily scratch files). Performance will be enhanced by using the fastest available drives and striping more than one of them together, or by using a high-performance disk storage system with high-performance interconnects. When running the SX64 build, employing ZFS might be a good idea.
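
To get a feel for how much faster scratch storage can help, the I/O share quoted above can be plugged into an Amdahl's-law style estimate. This is a back-of-the-envelope sketch, not a measurement: it assumes the I/O time shrinks by a hypothetical factor `io_speedup` while compute time is unchanged, and the function name is invented here.

```python
def overall_speedup(io_fraction, io_speedup):
    """Amdahl's-law estimate: only the I/O share of the run gets faster."""
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    if io_speedup <= 0:
        raise ValueError("io_speedup must be positive")
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)


if __name__ == "__main__":
    # 15% and 20% are the I/O shares quoted in the text; the 2x storage
    # speedup is a made-up value chosen only for illustration.
    for frac in (0.15, 0.20):
        print(f"I/O share {frac:.0%}: {overall_speedup(frac, 2.0):.3f}x overall")
```

Under this model, even infinitely fast storage caps the gain at 1 / (1 - io_fraction), roughly 1.18x to 1.25x for the shares quoted.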

ANSYS 11.0 Standard Test Cases

    bm-1
    Name:Exhaust Elbow Manifold
    Description:Static structural analysis. Solved for equivalent stresses.
    Statistics:~850,000 DOF Model

    bm-2
    Name:Floor Panel
    Description:Surface body geometry. Harmonic analysis with mode superposition.
    Statistics:~765,000 DOF Model

    bm-3
    Name:Engine Assembly - Piston and Crank
    Description:Assembly with contact. Nonlinear structural DOF solution.
    Statistics:~250,000 DOF Model

    bm-4
    Name:Electric Motor
    Description:Electromagnetic analysis. Solved for magnetic field intensities.
    Statistics:~250,000 DOF Model

    bm-5
    Name:Brake Rotor
    Description:Thermal transient analysis. Solved for temperature DOFs.
    Statistics:~230,000 DOF Model

    bm-6
    Name:Wing Section
    Description:Static structural analysis.
    Statistics:~250,000 DOF Model
    Notes:Comparing bm-6 and bm-7 is a good indication of performance characteristics for systems as larger problems are attempted. These problems will differentiate hardware performance most accurately for users expecting to solve problems approaching 1 million degrees of freedom or more.

    bm-7
    Name:Wing Section
    Description:Static structural analysis.
    Statistics:~800,000 DOF Model
    Notes:bm-6 and bm-7 are designed to demonstrate the ability of systems to handle larger memory demands and increased I/O. bm-6 should run well on any system. bm-7 will be substantially impaired in performance on a 32-bit machine limited to 2 or 3 Gbytes of memory. The model used for these runs selects Solid95 20-node brick elements. The cost of matrix factorization for these elements is much higher than for the shell-dominated model in bm-1. bm-7 generates a large 12.8 GB file containing the factored matrix. It requires over 1 Gbyte of solver memory to run in optimal out-of-core mode. On PC workstations the solver will run using less than optimal out-of-core memory, requiring excessive I/O during factorization.
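
One way to act on the bm-6 versus bm-7 comparison is to look at how much the run time grows between the two wing-section models on each system. The sketch below uses the 4-core run times from the results table. The ratio is just an illustrative metric chosen here, not one defined by ANSYS, and the names are hypothetical.

```python
# 4-core run times in seconds for the two wing-section cases,
# copied from the results table.
TIMES = {
    "Sun X6250/5160": {"bm-6": 131, "bm-7": 752},
    "Intel S5000XAL/5160": {"bm-6": 161, "bm-7": 1048},
}


def growth_ratio(times):
    """How many times longer bm-7 runs than bm-6 on one system."""
    return times["bm-7"] / times["bm-6"]


if __name__ == "__main__":
    for system, times in TIMES.items():
        print(f"{system}: bm-7 takes {growth_ratio(times):.2f}x as long as bm-6")
```

A smaller ratio suggests that a system slows down less as the model grows from roughly 250,000 to 800,000 DOF, though two data points per system are only a rough indication.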

Disclosure Statement:

The following are trademarks or registered trademarks of ANSYS, Inc.: ANSYS Multiphysics(TM). All information on the ANSYS website is copyrighted 2007 by ANSYS, Inc. Results at http://www.ansys.com/services/hardware-support-db.htm, July 2, 2007.

Hardware Configuration:

Sun Blade X6250

    4 2-socket Sun Blade X6250s
    2x3.0 GHz DC Intel Xeon EM64T 5160 (Woodcrest) processors
    32 GB memory

Software Configuration:

    64-bit Linux SuSE SLES 10
    (note: Sun works great with Linux; that is why we show all kinds of benchmarks!)
    ANSYS V11.0
    ANSYS 11 "Standard" Benchmark Test Suite
