BM Seer Unofficial thoughts from an anonymous Sun employee

IBM too tricky for good of others?

Thursday Feb 22, 2007

IBM's TPC-C results not worthy of belief? Lots of unrealistic optimisations? You never know what you will find when you start searching the web. After yesterday's posting I started looking. Here is some information from a June 2005 interview with an IBM fellow (who knows what they've done since that doesn't benefit users?):
http://www.sigmod.org/sigmod/record/issues/0506/p71-column-winslet.pdf

    "And the good news is that about 40-70% of the stuff we do in performance tuning actually ends up helping end users. " Bruce Lindsay, IBM fellow

Ouch! Sun aims for benchmark tuning that end users actually use! Does this explain IBM's over-inflated TPC-C results?

    Q: "Is there any particularly sneaky but still totally legal aspect of TPC-C tuning that you would like to mention?"

    A: "Well, we do things that are very, what should I say? Intense. We get down to the level of worrying about the physical column order in the table so the reference columns are near each other, minimizing cache misses during fetching. This is feasible in the TPC-C benchmark because there are only five tables and only ten to fifteen columns in each table. In a more realistic application, where there are many more queries to be considered, the tables are typically much, much wider, in the 80 to 100 column range; and there are dozens if not thousands of tables. Then this kind of analysis is no longer practical." Bruce Lindsay, IBM fellow

Good reason to make benchmarks messy and change them often. Is this why IBM hasn't published SPECint_rate2006 results: because they can't do the above?

We were right with these past postings:
http://blogs.sun.com/bmseer/entry/ibm_continues_to_abuse_and
http://blogs.sun.com/bmseer/entry/selective_vision
...interesting...


Sun Ultra 40 M2 Workstation New World Records in Performance

Wednesday Jan 10, 2007

Engineering Visualisation World Record: The Sun Ultra 40 M2, equipped with the new dual-core 2.8 GHz Opteron 2220 SE processors, nVidia Quadro FX 5500 framebuffer(s) and 667 MHz DDR2 DIMMs, has established new world records running the graphics-intensive engineering visualization Ensight benchmark, with a top overall Composite Score of 9.55.

The world record was obtained on an Ultra 40 M2 with dual nVidia Quadro FX 5500 framebuffers operating in "split frame" SLI mode. The No. 2 result of 9.39 was obtained on the same platform, also with dual FX 5500 framebuffers but in "alternate frame" SLI mode.

The Ultra 40 M2 also set the world record for desktop platforms with a single framebuffer. Equipped with one nVidia Quadro FX 5500, the Ultra 40 M2 obtained an overall composite score of 9.29, ahead of the previous world record mark of 9.26 held by an HP XW 8400 equipped with four 2.66 GHz Xeon processors and an nVidia Quadro FX 3500.

The next best overall composite score of 9.21 (5th spot) was recorded on an Ultra 40 with single-core 3.0 GHz Opteron 256 processors and an nVidia Quadro FX 4500. Sun Ultra 40 M2, Ultra 40 and Ultra 20 desktop configurations dominate the top 10.

Currently the best Ensight benchmark results for all hardware vendor platforms are obtained under a 64-bit Linux OS. The closest result under a Windows operating system, an overall composite score of 8.86, was obtained on an Ultra 40 with dual single-core 3.0 GHz Opteron 256 processors and dual nVidia Quadro FX 4500 framebuffers in SLI mode.

Benchmark Information

The Sun Ultra 40 M2 workstation gives the best engineering visualization performance, as demonstrated with the Ensight MCAE benchmark. The five-task benchmark consists of graphics-intensive operations that are representative of rendering activity. The benchmark also stresses the CPU and memory systems of the platform.

Ensight V8.0 Engineering MCAE Visualization Benchmark
Composite score: weighted sum of task frame rates

System           Processor             Graphics             OS           Composite Score
Sun Ultra 40 M2  2x2.8 GHz DC 2220 SE  2xFX 5500 (SLI-SFR)  SLED 10      9.55
Sun Ultra 40 M2  2x2.8 GHz DC 2220 SE  2xFX 5500 (SLI-AFR)  SLED 10      9.39
Sun Ultra 40 M2  2x2.8 GHz DC 2220 SE  FX 5500              SLED 10      9.29
HP XW 8400       4x2.66 GHz Xeon       FX 3500              Linux 2.6    9.26
Sun Ultra 40     2x3.0 GHz 256         FX 4500              SLES 9 SP 3  9.21
Sun Ultra 40     2x2.8 GHz 254         FX 4500              SLES 9 SP 3  9.18
Sun Ultra 40     2x3.0 GHz 256         2xFX 4500 (SLI)      Win64 XP     8.86
HP XW 9300       2x2.8 GHz 254         FX 3450              Win64 XP     8.67

For more results see the Ensight website.

Benchmark Description

The Ensight Benchmark consists of five tests that stress the graphics system as well as the CPU and memory systems of a platform. All five tests operate on a geometry containing three parts with a total of 6,324 quads and 128,534 triangles. These parts are then duplicated for a total of six instances. Thus the total number of polygons in the test is 37,944 quads and 771,204 triangles. All tests are run using a display area measuring 600x500 pixels. All polygons are randomly oriented (i.e., no stripping is done).

A description of the tasks follows:

  • The first test is a line drawing test which rotates the scene 360 degrees in 12 degree increments for 30 refreshes of the screen. The total number of lines drawn during the test is (37,944*4 + 771,204*3)*30 = 73,961,640 lines. Each part has a single color.
  • The second test is a shaded test which rotates the scene 360 degrees in 12 degree increments for 30 refreshes of the screen. The total number of polygons drawn during the test is 24,274,440. Each part has a single color.
  • The third test is a repeat of the second test, but here the parts are colored on a per vertex basis.
  • The fourth test is a repeat of the third test, except it is run in immediate mode (as opposed to display list mode for the previous tests). Immediate mode is used by Ensight for flipbook animations, hidden line display, and all detached (VR) displays.
  • The fifth test is a repeat of the fourth test, but here the two large isosurface parts are transparent and the rotation uses 72 degree increments for a total of 5 refreshes. This test stresses not only the graphics subsystem but the CPU and memory as well, since the polygons have to be sorted for each refresh of the screen.
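The primitive counts quoted in the task descriptions can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant and function names are made up for this example and are not part of the Ensight benchmark itself.

```python
# Geometry figures quoted in the benchmark description above.
QUADS_PER_SET = 6324          # quads in the three original parts
TRIANGLES_PER_SET = 128_534   # triangles in the three original parts
INSTANCES = 6                 # the parts are duplicated to six instances
REFRESHES = 30                # 360 degrees in 12 degree increments


def scene_totals():
    """Return (quads, triangles) for the full six-instance scene."""
    return QUADS_PER_SET * INSTANCES, TRIANGLES_PER_SET * INSTANCES


def lines_drawn(quads, triangles, refreshes):
    """Line test: a quad is drawn as 4 lines and a triangle as 3."""
    return (quads * 4 + triangles * 3) * refreshes


def polygons_drawn(quads, triangles, refreshes):
    """Shaded tests: every polygon is drawn once per refresh."""
    return (quads + triangles) * refreshes


quads, triangles = scene_totals()
print(quads, triangles)                             # 37944 771204
print(lines_drawn(quads, triangles, REFRESHES))     # 73961640
print(polygons_drawn(quads, triangles, REFRESHES))  # 24274440
```

The printed values match the 37,944 quads, 771,204 triangles, 73,961,640 lines and 24,274,440 polygons stated in the text.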

The Ensight Benchmark composite score is a weighted average of the frame rates of the five individual tests (30 refreshes for each of the first four tests, 5 for the fifth). Given test times T1 through T5 in seconds, the composite score C is computed as:

    C = 0.25*(30.0/T1) + 0.2*(30/T2) + 0.2*(30/T3) + 0.2*(30/T4) + 0.15*(5/T5)
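As a rough cross-check, the formula can be applied to the task times listed in the disclosure statement further down. This is a minimal sketch, assuming the published times are rounded to two decimals. The names `WEIGHTS`, `FRAMES` and `composite_score` are illustrative only and do not come from the Ensight benchmark.

```python
# Weights and refresh counts taken from the composite score formula above.
WEIGHTS = (0.25, 0.2, 0.2, 0.2, 0.15)
FRAMES = (30, 30, 30, 30, 5)


def frame_rates(times):
    """Frames per second for each task, given the task times in seconds."""
    return [frames / t for frames, t in zip(FRAMES, times)]


def composite_score(times):
    """Weighted sum of the five task frame rates."""
    return sum(w * rate for w, rate in zip(WEIGHTS, frame_rates(times)))


# Task times (seconds) from the two disclosed Ultra 40 M2 runs.
dual_sli = (2.27, 2.31, 2.33, 5.83, 15.90)
single_fx5500 = (2.34, 2.36, 2.45, 5.75, 15.57)

print(round(composite_score(dual_sli), 2))       # 9.55
print(round(composite_score(single_fx5500), 2))  # 9.29
```

Both values agree with the composite scores reported for the dual and single FX 5500 configurations. Because the weights sum to 1.0, the score is dominated by the frame rates of the first three tests, and the slow transparency test contributes less than 0.05 to either result.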

Disclosure Statement:

Ensight V8.0 is a registered trademark of CEI Corporation.

    Approved Results: http://www.ensight.com/rendering-performance-tests.html
    Reference Date: 12 December 2006

    Platform: Sun Ultra 40 M2 Workstation
    Total Number Processors: 2
    Processor/MHz of Workstation: Opteron 2220 SE /DC 2.8 GHz
    Memory: 8x1 GB DDR2 667 MHz DIMMs
    Operating System: 64-bit SLED 10
    Graphics: 2xnVidia Quadro FX 5500 framebuffers (SLI)
    Disks: 2x250 GB 7200 rpm SATA striped
    Software: Ensight V8.0
    Rendering Performance Benchmark
    Composite Score: 9.55
    Total Elapsed Time: 28.65 seconds
    Task Times (seconds): 2.27 2.31 2.33 5.83 15.90
    Task Frame Rates: 13.19 13.00 12.89 5.14 0.31

    Platform: Sun Ultra 40 M2 Workstation
    Total Number Processors: 2
    Processor/MHz of Workstation: Opteron 2220 SE /DC 2.8 GHz
    Memory: 8x1 GB DDR2 667 MHz DIMMs
    Operating System: 64-bit SLED 10
    Graphics: nVidia Quadro FX 5500 framebuffer
    Disks: 2x250 GB 7200 rpm SATA striped
    Software: Ensight V8.0
    Rendering Performance Benchmark
    Composite Score: 9.29
    Total Elapsed Time: 28.47 seconds
    Task Times (seconds): 2.34 2.36 2.45 5.75 15.57
    Task Frame Rates: 12.84 12.72 12.22 5.21 0.32
