BM Seer Facts & Questions from an Anonymous Sun Source

ABAQUS V6.7 Benchmarks Sun Blade X6250 Cluster World Record

Friday Jan 25, 2008

The ABAQUS "Explicit" benchmark test suite was run on a mini cluster of Sun Blade X6250 blades with the recently announced 3.33 GHz dual-core Intel 5260 processors. The Sun Blade X6250 mini cluster beats all results posted at the ABAQUS V6.7 website at up to eight cores.

  • The closest posted results from competing platforms came primarily from an HP XC with dual-core 3 GHz 5160 processors and, to a limited degree (at the 4 "cpu" level), from an Intel Supermicro with 3 GHz quad-core E5472s.
  • Across the six cases in the benchmark test suite and the four core levels considered, the X6250 cluster was nominally 17% faster than the best result from either the top HP or the top Intel cluster.
  • The scalability efficiency of the X6250 cluster ranged from 100% (at 1 core) to 81% (geometric mean at 8 cores), considering all 6 test cases at each of the four core levels.
  • Four 2-socket Sun X6250 blades with Infiniband interconnects were used, and runs were made at four core levels: 1, 2, 4, and 8. Comparisons are presented against the current leading competitors' results, also obtained with high performance interconnects and posted at the ABAQUS V6.7 website. These include results from IBM, HP, and Intel platforms and clusters with current dual-core and quad-core Intel processors.

    ABAQUS V6.7 "Explicit" Benchmark Test Suite, time in elapsed seconds

    Please note: this table has been modified since the original posting to correct it and to make sure only V6.7 results are shown. Sorry for the confusion; the Sun internal information sites changed after my posting.
                                        Benchmark Test
    System            CPU                  e1     e2     e3     e4     e5     e6

    One core results
    Sun Blade X6250   3.33GHz DC 5260   23565  12399  11037   4884   4648  11975
    Sun Blade X6250   3.0GHz QC 5365    26401  14236  12302   5456   5349  13266
    Intel Supermicro  3.0GHz QC E5472   24815  13738  12504   5273   5299  13456
    HP XC             3.0GHz DC 5160    23957  13659  11289   5157   5122  12601
    Bull R440         3.0GHz DC 5160    25132  14086  12237   5352   5231  13213

    Two core results
    Sun Blade X6250   3.33GHz DC 5260   12008   6465   5218   2647   2447   6739
    Sun Blade X6250   3.0GHz QC 5365    14262   7501   6379   2959   2742   7486
    Intel Supermicro  3.0GHz QC E5472   14060   7151   6341   2900   2693   7880
    HP XC             3.0GHz DC 5160    13229   6998   6201   2838   2657   7336
    Bull R440         3.0GHz DC 5160    13859   7283   6575   2997   2756   7752

    Four core results
    Sun Blade X6250   3.33GHz DC 5260    7868   3888   3064   1482   1328   4025
    Sun Blade X6250   3.0GHz QC 5365     8595   4195   3372   1577   1440   4375
    Intel Supermicro  3.0GHz QC E5472    8264   3857   3438   1616   1440   4534
    HP XC             3.0GHz DC 5160     9843   4434   4413   1856   1619   5235
    Bull R440         3.0GHz DC 5160    10067   4559   4485   1964   1651   5378

    Eight core results
    Sun Blade X6250   3.33GHz DC 5260    5209   2439   1922    979    736   2510
    Sun Blade X6250   3.0GHz QC 5365     5650   2556   2158   1090    824   2774
    Intel Supermicro  3.0GHz QC E5472    6077   2473   2529   1205    910   3339
    HP XC             3.0GHz DC 5160     5140   2311   2280   1074    823   2948
    Bull R440         3.0GHz DC 5160     5366   2406   2303   1127    860   3092
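
    For readers who want to derive scaling figures from the table above, one common definition of parallel efficiency is T1 / (N x TN), where T1 is the one-core time and TN the time on N cores. The post does not state which formula was used for its quoted efficiency, so that definition is an assumption here. Because the table was revised after the original posting, numbers computed this way need not match the percentages quoted earlier. The function and variable names are illustrative only.

```python
from math import prod

# Elapsed seconds for e1..e6, copied from the Sun Blade X6250 3.33GHz rows above.
ONE_CORE = [23565, 12399, 11037, 4884, 4648, 11975]
EIGHT_CORE = [5209, 2439, 1922, 979, 736, 2510]


def parallel_efficiency(t1, tn, n):
    """Assumed definition: speedup (t1 / tn) divided by the core count n."""
    return t1 / (n * tn)


def geomean(values):
    """Geometric mean of a sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


# Per-test efficiency at 8 cores, then one summary figure across e1..e6.
effs = [parallel_efficiency(t1, t8, 8) for t1, t8 in zip(ONE_CORE, EIGHT_CORE)]
summary = geomean(effs)

if __name__ == "__main__":
    for name, eff in zip(["e1", "e2", "e3", "e4", "e5", "e6"], effs):
        print(f"{name}: {eff:.1%}")
    print(f"geometric mean: {summary:.1%}")
```

    The geometric mean is used for the summary because the post reports geometric means across the six test cases.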

    About The ABAQUS Explicit Module

    This module, designed for crash and high-velocity impact analyses, is very scalable, and its analysis models tend to be very large, similar to CFD models. For typically large jobs, timely results are best obtained using multiple processing units, either on a single multi-core server in smp mode or on a multi-node cluster of multi-core platforms interconnected in dmp mode.

    • The test cases in the ABAQUS "Explicit" benchmark test suite do not require much memory (all around a few hundred megabytes).
    • The ABAQUS test cases scale very well up to 16 cores. All of the solvers in the Explicit module work in dmp mode on clusters. The default MPI implementation for ABAQUS is HP-MPI.
    • Based on the maximum physical memory on a platform, the user can stipulate the maximum portion of this memory that can be allocated to the ABAQUS job. This is done in the "abaqus_v6.env" file, which resides either in the subdirectory from which the job was launched or in the abaqus "site" subdirectory under the home installation directory.
    • The test cases for the ABAQUS benchmark test suites all have a substantial I/O component. This I/O activity is primarily associated with temporary scratch files. Performance will be enhanced by using the fastest available drives and striping together more than one of them or using a high performance disk storage system with high performance interconnects.

    System Configuration

  • 4 Sun Blade X6250
  • 3.33 GHz dual-core Intel 5260
  • 2 internal striped 15K SAS drives (cluster shared file system)
  • Infiniband (Voltaire) interconnects
  • 64-bit SUSE Linux Enterprise Server SLES 10
  • Voltaire OFED GridStack-4.1.5_7-sles-k2.6.16.21-0.8-smp-x86_64
  • HP-MPI
  • ABAQUS V6.7 Explicit Module
  • ABAQUS 6.7 Explicit Benchmark Test Suite

    Disclosure Statement:

    The following are trademarks or registered trademarks of Dassault Systèmes or its subsidiaries in the United States and/or other countries: Abaqus, Abaqus/Standard, Abaqus/Explicit. All information on the ABAQUS website is copyrighted 2004-2007 by Dassault Systèmes. Results from http://www.simulia.com/support/v67/v67_performance.html as of Jan. 18, 2008.


    World Record ABAQUS V6.6 on the Sun Blade X6250 Cluster

    Wednesday Jul 11, 2007

    The Sun Blade X6250 posted a world record on the ABAQUS Explicit benchmark test suite for the MCAE application ABAQUS V6.6. The Sun Blade X6250 used 3 GHz dual-core Xeon 5160 processors. On the individual test cases, Sun beats the Intel Supermicro by 1% to 39%. Even averaged over all of the test cases, the Sun Blade X6250 beats the Intel Supermicro by 4% to 9% (geometric mean of all 6 test cases at each of the cpu levels listed).

    Both machines have 2 sockets and dual-core processors. Runs were made at 1, 2, and 4 cores, and a geometric mean was computed at each of these "cpu" levels from the 6 test cases in the benchmark test suite.

    The Sun Blade X6250 with 3.0GHz Xeon EM64T 5160 (Woodcrest) processors, running 64-bit SuSE Linux SLES 10, beats all of the following platforms with results posted at the ABAQUS website, for all 6 test cases in the ABAQUS "Explicit" benchmark test suite and at the 3 "cpu" levels (1, 2, and 4 "cpus"):

    About The ABAQUS Explicit Module

    This module, designed for crash and high-velocity impact analyses (including wave propagation and inertia effects), is very scalable, and its analysis models tend to be very large, similar to CFD models. For typically large jobs, timely results are best obtained using multiple processing units, either on a single multi-core server in smp mode or on a multi-node cluster of multi-core platforms interconnected in dmp mode.

    Consequently, this module is meant to run primarily on multiple CPUs, either in smp mode on a single large multi-core machine or in dmp mode over a cluster of machines.

    ABAQUS V6.6-1 Explicit Benchmark Test Suite (time in seconds, where smaller is better; "Sun % Faster", where bigger is better)

    Platform                Cores     e1     e2     e3     e4     e5     e6  Geometric Mean

    Sun Blade X6250/5160        4  10451   4509   3853   1887   1990   5202
    Intel Super/5160's/RH4      4  10696   4646   3881   1997   2126   5460
    Sun % Faster                      2%     3%     1%     6%     7%     5%              4%

    Sun Blade X6250/5160        2  14232   7401   5477   2935   3327   7582
    Intel Super/5160's/RH4      2  14878   8044   6316   3310   3483   8048
    Sun % Faster                      5%     9%    15%    13%     5%     6%              9%

    Sun Blade X6250/5160        1  24800  14198  10174   5147   6112   9553
    Intel Super/5160            1  25076  14616  10563   5225   6272  13242
    Sun % Faster                      1%     3%     4%     1%     3%    39%              8%
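
    The "Sun % Faster" rows appear consistent with taking the competitor's time divided by the Sun time, minus one, and the last column with a geometric mean of those ratios across e1 to e6. The post does not spell out the formula, so the sketch below is an assumption; the names SUN, INTEL, pct_faster, and geomean_pct_faster are illustrative only.

```python
from math import prod

# Elapsed seconds for e1..e6 at each core count, copied from the table above.
SUN = {
    4: [10451, 4509, 3853, 1887, 1990, 5202],
    2: [14232, 7401, 5477, 2935, 3327, 7582],
    1: [24800, 14198, 10174, 5147, 6112, 9553],
}
INTEL = {
    4: [10696, 4646, 3881, 1997, 2126, 5460],
    2: [14878, 8044, 6316, 3310, 3483, 8048],
    1: [25076, 14616, 10563, 5225, 6272, 13242],
}


def pct_faster(sun_s, other_s):
    """Assumed definition: percent by which the Sun time beats the other time."""
    return (other_s / sun_s - 1.0) * 100.0


def geomean_pct_faster(sun_times, other_times):
    """Geometric mean of the per-test time ratios, expressed as a percentage."""
    ratios = [o / s for s, o in zip(sun_times, other_times)]
    return (prod(ratios) ** (1.0 / len(ratios)) - 1.0) * 100.0


if __name__ == "__main__":
    for cores in (4, 2, 1):
        per_test = [round(pct_faster(s, o)) for s, o in zip(SUN[cores], INTEL[cores])]
        overall = round(geomean_pct_faster(SUN[cores], INTEL[cores]))
        print(cores, per_test, overall)
```

    A geometric mean keeps one unusually large ratio (such as e6 at one core) from dominating the summary the way it would in an arithmetic mean of times.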

    Abaqus/Explicit Benchmark Problems

    The problems described below provide an estimate of the performance that can be expected when running Abaqus/Explicit on different computers. The jobs are representative of typical Abaqus/Explicit applications, including high-speed dynamic impact events and quasi-static events with complicated contact conditions. The numbers of increments listed in the tables below are approximate and can vary somewhat depending on the hardware platform and the number of parallel domains.

      E1: Car crash
      This benchmark consists of a passenger car impacting a rigid wall. The car is meshed primarily with shell elements of type S3RS and S4RS with isotropic hardening Mises plasticity material behavior. The various components of the car are connected using multi-point constraints and connector elements. Many of the suspension and drivetrain components are modeled as rigid bodies. The car, road surface, and wall are placed into a single general contact domain, and the car is given an initial velocity of 25 mph.

      E1
      Increments: 62,934
      Number of elements: 274,632

      E2: Cell phone drop
      This benchmark consists of a simplified model of a cell phone impacting a fixed rigid floor. The cell phone components are meshed using a variety of element types including C3D8R, C3D10M, and S4R. The material behavior is modeled using linear elasticity, isotropic hardening Mises plasticity, and hyperelasticity. The components are assembled using surface-based mesh ties and placed into a general contact domain that also includes the floor. The initial velocity and orientation of the cell phone are defined such that a severe oblique impact occurs.

      E2
      Increments: 87,369
      Number of elements: 45,785
      Memory requirement: 300 MB

      E3: Sheet forming
      This benchmark consists of forming a sheet metal part by the deep drawing process. The deformable sheet metal blank is meshed with shell elements of type S4R and uses an isotropic hardening Mises plasticity material model. The tools are meshed using surface elements of type SFM3D4R which are declared rigid. General contact is defined between the blank and tools. The analysis sequence consists of two steps. During the first step the blank is clamped between the binder and die and then during the second step the punch is displaced to form the part. Since the process is essentially quasi-static the computations are performed over a sufficiently long time period to render inertial effects negligible. The performance of this analysis is a direct measure of the performance of the three-dimensional general contact algorithm.

      E3
      Increments: 31,177
      Number of elements: 34,540 (deformable only)
      Memory requirement: 550 MB

      E4: Projectile penetration
      This benchmark consists of a projectile penetrating a steel plate at an oblique angle. Both the projectile and plate are meshed using hexahedral elements of type C3D8R and use a rate-dependent isotropic hardening Mises plasticity material model with failure. The projectile and plate are placed into a general contact domain with surface erosion. The edges of the plate are held fixed and the initial velocity of the projectile is specified so that the projectile passes completely through the plate.

      E4
      Increments: 12,433
      Number of elements: 237,100
      Memory requirement: 1400 MB

      E5: Blast loaded plate
      This benchmark consists of a stiffened steel plate subjected to a high intensity blast load. The plate is meshed using shell elements of type S4R and uses an isotropic hardening Mises plasticity material model. There is no contact.

      E5
      Increments: 81,716
      Number of elements: 50,000
      Memory requirement: 150 MB

      E6: Concentric spheres
      This benchmark consists of a large number of concentric spheres with clearance between each sphere. The spheres are meshed using hexahedral elements of type C3D8R and use an isotropic hardening Mises plasticity material model. All of the spheres are placed into a single general contact domain and the outer sphere is violently shaken which results in complex contact interactions between the contained spheres.

      E6
      Increments: 23,291
      Number of elements: 244,124
      Memory requirement: 1000 MB

      ABAQUS "Standard" & "Explicit" Benchmark Test Suites
      Voltaire GridStack 4.1.5-7 for SLES 10

    Disclosure Statement:

    The following are trademarks or registered trademarks of Abaqus, Inc. or its subsidiaries in the United States and/or other countries: Abaqus, Abaqus/Standard, Abaqus/Explicit. All information on the ABAQUS website is copyrighted 2004-2007 by Dassault Systèmes. Results from http://www.simulia.com/support/v66/v66_performance.html as of 7/2/07.

    System Configuration

    Hardware Configuration:

    Sun Blade X6250

      Four 2-socket Sun Blade X6250s
      2 x 3.0 GHz DC Intel Xeon EM64T 5160 (Woodcrest) processors
      Infiniband (Voltaire) interconnects (PCI-Express HCAs)

    Software Configuration:

      Linux: 64-bit SUSE SLES 10
      ABAQUS V6.6-3
