Friday Mar 30, 2007
The Sun Fire X2200 M2 server beats Woodcrest on
large CFD models. The X2200 M2 Cluster beats all currently posted
Opteron cluster results (dual core HP XC4000 2.2GHz, HP DL145 G2 2.2GHz,
HP XW9300 2.4GHz, and HP DL585 2.6GHz) for all "cpu" levels and for all
test cases. All clusters had the high performance Infiniband interconnects.
The X2200 M2 beats the IBM X3650 2.66GHz quad core Clovertown across the board at
all cpu levels and for all test cases.
Tests were run on the official version of Fluent (lnxamd64 V6.3.26 build).
The Sun Opteron server numbers were generated under 64-bit SUSE SLES 9 SP 3.
Sun has many customers that use Solaris, Linux, and Windows, so we show
benchmarks on all of these.
The X2200 M2 cluster has the best performance on the largest
and most complex test, "FL5L3", which is most closely representative of
actual customer benchmarks (it requires over 9GB of memory and is best run using
several CPUs). FL5L3 simulates turbulent flow through a transition duct.
Note that the X2200 M2 cluster results shown in the following table are consistently
better than those obtained on the Woodcrest cluster systems at all
indicated "cpu" levels (4 to 32).
The efficiency of the Sun X2200 M2 cluster is superb at well above 90% up to 32 cores. This essentially perfect scalability is contrasted with the Woodcrest
clusters where scalability has dropped off and efficiency is below 70% at
and above 4 cores.
Scaling Performance : Results in "Ratings" (# runs/day, bigger is better)
| System | 4 Cores | 8 Cores | 16 Cores | 32 Cores |
| Sun X2200 M2 2.8GHz Opteron | 89.9 | 174.4 | 341.5 | 664.4 |
| HP BL460C 3.0GHz Woodcrest | 80.3 | 155.4 | 299.0 | 576.0 |
| HP DL140 3.0GHz Woodcrest | N/A | 160.7 | 320.5 | 620.1 |
| Bull NovaScale 3.0GHz Woodcrest | 78.9 | 157.8 | 313.2 | 619.0 |
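To make the "consistently better" claim easy to check, here is a minimal Python sketch that takes the ratings from the scaling table above and computes the Sun lead over the best posted Woodcrest result at each core count. The variable and function names are mine, purely for illustration; only the numbers come from the table.

```python
# Ratings (runs/day) copied from the scaling table; None marks the N/A entry.
sun = {4: 89.9, 8: 174.4, 16: 341.5, 32: 664.4}
woodcrest = {
    "HP BL460C": {4: 80.3, 8: 155.4, 16: 299.0, 32: 576.0},
    "HP DL140": {4: None, 8: 160.7, 16: 320.5, 32: 620.1},
    "Bull NovaScale": {4: 78.9, 8: 157.8, 16: 313.2, 32: 619.0},
}

def lead_over_best_woodcrest(cores):
    """Ratio of the Sun rating to the best posted Woodcrest rating at this core count."""
    best = max(r[cores] for r in woodcrest.values() if r[cores] is not None)
    return sun[cores] / best

leads = {cores: lead_over_best_woodcrest(cores) for cores in sun}
for cores, ratio in leads.items():
    print(f"{cores:>2} cores: Sun leads the best Woodcrest result by {(ratio - 1) * 100:.1f}%")
```

A ratio above 1.0 at every core count is what "consistently better" means here; the sketch says nothing about single-core efficiency, which the table does not report.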
Fluent Performance : Results in "Ratings" (# runs/day, bigger is better)
| System | Interconnect/MPI | cores | FL5L1 | FL5L2 | FL5L3 |
| X2200 2.8GHz DC 2220 SLES 9 SP 3 | IB(V)/HP-MPI | 8 | 1219.5 | 952.1 | 174.4 |
| X2100 3.0GHz SC 156 SLES 9 SP3 | IB(V)/MVAPICH | 8 | 1148.2 | 1063.4 | 184.6 |
| HPDL140 3.0GHz DC WC EM64T Linux | IB/HP-MPI | 8 | 1378.0 | 915.0 | 160.7 |
| Bull Nova 3.0 GHz DC WC EM64T RHEL4 | IB | 8 | 1323.6 | 884.1 | 157.8 |
| HP BL460C 3.0GHz WC EM64T WinCCS | IB(V) | 8 | 1289.6 | 881.6 | 155.4 |
| Intel White 3.0GHz WC EM64T DC RHAS4 | IB(Mellanox) | 8 | --- | 828.0 | 137.8 |
| Tyan Typh. 630 2.3GHz WC SLES 10 | GbE | 8 | 1011.7 | 692.4 | 122.7 |
| Tyan Typh. 630 2.3GHz WC WinCCS | GbE | 8 | 981.8 | 635.3 | --- |
| HPDL140 3.6GHz EM64T WINCCS | IB | 8 | 970.8 | 675.0 | 120.0 |
| HPDL585 2.6GHz DC 152 RHEL4 | IB(V)/HP-MPI | 8 | 966.2 | 723.2 | 119.2 |
| HPXC4000 2.2GHz DC 148 Linux | IB(V)/HP-MPI | 8 | 951.0 | 680.4 | 102.7 |
| HPDL145 G2 Opteron 2.2GHz DC WinCCS | IB(V) | 8 | 847.1 | 654.5 | 119.2 |
| IBMX3650 2.66GHz 4C Clovert. EM64T RHEL4 | ? | 8 | 953.6 | 551.2 | 93.3 |
Benchmark Description
Nine industrial CFD applications ranging in size from 32,000 to
10,000,000 cells have been selected to demonstrate the performance of
FLUENT on a variety of hardware platforms. The performance of a CFD
code will depend on several factors including size and topology of the
mesh, physical models, numerics and parallelization, compilers and
optimization, in addition to performance characteristics of the
hardware where the simulation is performed. The problems selected
represent a range of simulations typical of those which might be found
in industry. The principal objective of this benchmark suite is to
provide comprehensive and fair comparative information of the
performance of FLUENT on available hardware platforms.
System Configuration
Hardware Configuration:
Sun Fire X2200 M2
2-socket 2x2.8 GHz dual core Opteron 2220 processors
4x1GB + 4x2GB (12GB) DDR2 667 MHz dimms
IB(Voltaire)/PCI-Express (interconnect)
Software Configuration:
64-bit SuSE SLES 9 SP 3
Fluent V6.3.26
Voltaire Infiniband Software Stack: 3.5.5_16-S2sles9.k2.6.5_7.244_smp.x86_64
Message Passing Interface: HP-MPI V hpmpi-2.02.05.00-20061003r.x86_64
See Also
Current V6.3(.26) results at:
http://www.fluent.com/software/fluent/fl5bench/flbench_6.3/fullres.htm
Thursday Mar 22, 2007
Intel is really playing with information, ZDNet writes:
"...instead of comparing Intel's latest greatest chips to AMD's latest greatest chips (as Intel should be doing to legimately convince Wall St., the press, and customers of leadership and/or breakaway performance), more than half of the data points that show Intel leading or breaking away show it doing so against older AMD chips (in some cases, single-core chips or chips from an older generation of Opterons) and in some cases, with retired benchmarks"
SOURCE: "AMD’s no angel, but Intel’s public usage of benchmark data is feloniously misleading"
Posted by David Berlind @ 3:33pm
http://blogs.zdnet.com/Berlind/?p=366
David missed something we've blogged about here: some Intel systems with normal-size memory
use more watts than some other systems. Watt and configuration data really need to be
shown somewhere in the Intel presentation mentioned above if a chart is to have validity.
http://blogs.sun.com/bmseer/entry/woodcrest_memory_lacks_some_important
One other thing: in this presentation Intel used perf/$ and perf/watt. As blogged yesterday,
everyone needs to use $/perf and watt/perf to really help customers.
Thursday Jan 11, 2007
Sun Blade X8420 is 1.9x faster than the
best Intel Woodcrest system on SPECint_rate2006 and is also 2.1x faster than the best Intel
Woodcrest on SPECfp_rate2006. The Sun Blade X8420 is also 22% faster than 4-way Itanium2 dual-core on
SPECfp_rate2006.
Sun Blade X8420 delivered the best result with a SPECint_rate2006 score of 93.1, using the Solaris 10 and Studio 11 combo. The Sun Blade X8420 also
delivered the best result of 87.3 on the SPECfp_rate2006
benchmark for all x86 systems.
SPEC CPU2006 Performance Charts (bigger is better, selected recent results)
SPECint_rate2006
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
| Sun Blade X8420 | AMD Opteron 8220 | 2.8 | 4 | 8 | 8 | 93.1 | 80.4 |
| Fujitsu CELSIUS R640 | Xeon 5160 (Woodcrest) | 3.0 | 2 | 4 | 4 | 50.3 | 48.8 |
| Sun Ultra 40 M2 | AMD Opteron 2220SE | 2.8 | 2 | 4 | 4 | 48.8 | 41.9 |
| HP DL585 | Opteron 854 | 2.8 | 4 | 4 | 4 | 46.9 | 41.4 |
| Supermicro X7DBE | Xeon 5160 (Woodcrest) | 3.0 | 2 | 4 | 4 | --- | 45.2 |
| Sun Fire X4200 | Opteron 285 | 2.6 | 2 | 4 | 4 | 42.8 | 37.8 |
| Fujitsu RX220 | Opteron 280 | 2.4 | 2 | 4 | 4 | 40.0 | 35.7 |
| Sun Fire X4200 | Opteron 256 | 3.0 | 2 | 2 | 2 | 26.4 | 23.1 |
| HP DL585 | Opteron 854 | 2.8 | 2 | 2 | 2 | 25.2 | 22.3 |
| Dell PrecWork 380 | Pentium EE | 3.73 | 1 | 2 | 2 | -- | 23.1 |
| HP DL380 G4 | Pentium 4 | 3.8 | 2 | 2 | 2 | -- | 20.9 |
SPECfp_rate2006
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
| Sun Blade X8420 | AMD Opteron 8220 | 2.8 | 4 | 8 | 8 | 87.3 | 82.5 |
| HP rx6600 | Itanium2 dual-core | 1.6 | 4 | 8 | 8 | 71.4 | 69.1 |
| HP DL585 | Opteron 854 | 2.8 | 4 | 4 | 4 | 49.3 | 45.6 |
| FSC CELSIUS R640 | Intel Xeon 5160 (Woodcrest), WinXP Pro | 3.0 | 2 | 4 | 4 | 42.5 | 41.4 |
| Sun Fire X4200 | Opteron 285 | 2.6 | 2 | 4 | 4 | 38.1 | 36.0 |
Results as of 09 Jan 2007 from www.spec.org.
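The headline ratios can be reproduced from the Peak columns above. This is a small Python sketch, with dictionary keys and the helper name chosen by me for illustration; the scores themselves are the ones listed in the two tables.

```python
# Peak scores copied from the SPECint_rate2006 and SPECfp_rate2006 tables.
specint_peak = {"Sun Blade X8420": 93.1, "Fujitsu CELSIUS R640 (Woodcrest)": 50.3}
specfp_peak = {
    "Sun Blade X8420": 87.3,
    "FSC CELSIUS R640 (Woodcrest)": 42.5,
    "HP rx6600 (Itanium2)": 71.4,
}

def speedup(scores, system, reference):
    """How many times faster `system` is than `reference` on the same benchmark."""
    return scores[system] / scores[reference]

int_vs_woodcrest = speedup(specint_peak, "Sun Blade X8420", "Fujitsu CELSIUS R640 (Woodcrest)")
fp_vs_woodcrest = speedup(specfp_peak, "Sun Blade X8420", "FSC CELSIUS R640 (Woodcrest)")
fp_vs_itanium = speedup(specfp_peak, "Sun Blade X8420", "HP rx6600 (Itanium2)")

print(f"SPECint_rate2006 vs Woodcrest: {int_vs_woodcrest:.2f}x")
print(f"SPECfp_rate2006 vs Woodcrest:  {fp_vs_woodcrest:.2f}x")
print(f"SPECfp_rate2006 vs Itanium2:   {(fp_vs_itanium - 1) * 100:.0f}% faster")
```

Rounded, these come out to the 1.9x, 2.1x, and 22% figures quoted at the top of the post.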
Benchmark Description
SPEC CPU2006 is made up of two suites of benchmarks, CFP2006 and
CINT2006. CFP2006 targets floating-point performance, while CINT2006
targets integer performance.
Each suite has two different measures. First is the CPU measure, which
is the performance on the suite as a single stream. This can be either
a single thread or automatic compiled parallel run. This measure is
further defined by base and optimized runs. Base uses the same compiler
flags for all kernels, where optimized is allowed to use different
compiler flags for each kernel. Results are compared against a baseline
system run that was standardized by SPEC.
The second measure is Rate. It is a measure of how many CPU measures
can be run at a time. Typically, it is run as n processes on n
processors. It shows how well the same job mix can run on a system
under some load. It also is run as a base and optimized set of
results.
Disclosure Statement:
SPEC and SPECint are registered trademarks of the Standard Performance Evaluation Corporation.
Results from www.spec.org as of 1/9/07.
Sun Blade X8420 (AMD Opteron 8220, 4chips/8cores, Solaris 10) 93.1 SPECint_rate2006.
Sun Blade X8420 (AMD Opteron 8220, 4chips/8cores, Solaris 10) 87.3 SPECfp_rate2006.
Results Summary
| Results | X8420 | 93.1 SPECint_rate2006 |
| | X8420 | 87.3 SPECfp_rate2006 |
| Reference Date: | Jan 09, 2007 |
| System: | Sun Blade X8420, 64GB memory |
| Processors: | four 2.8 GHz Opteron 8220 |
| Software: | Solaris 10, Sun Studio 11 |
Thursday Dec 14, 2006
In case you missed it: "Java 6 Leads Out-of-the-Box Server Performance"
Dave Dagastine's blog this week goes into the advances of Java 6.
Java6 is Sun's fastest most-reliable release and specifically targets out-of-the-box performance. link:
http://blogs.sun.com/dagastine/entry/java_6_leads_out_of
Bottom line: It means no tuning options are needed for the JVM to achieve optimal performance. YEAH! Lots of great details in the blog.
Friday Dec 08, 2006
Here is a processor wattage chart, but notice that Intel has the memory controller off chip, so you need to add 30-35 watts to these figures
(Opteron includes this on chip):
http://www.intel.com/products/processor_number/chart/xeon.htm
For more details on power budget breakdown I got pointed to these
two pages for an AMD comparison (looks like it is part of a bigger preso?, no confidential statement):
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/opt_vs_wc_8_dimms.pps
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/4_opt_vs_4_pax.pps
Thursday Dec 07, 2006
More details on power budget differences that give Opteron at least a 34% lead
over Woodcrest.
I gave some basics of this in this posting:
http://blogs.sun.com/bmseer/entry/design_strategies%3A_wattage_advantage_of
Woodcrest power budget: Dual-core Xeons: 160 watts for two sockets (80w each) PLUS 44.8 watts for chipset (incl. memory controllers) PLUS
166.4 watts for FB-DIMMs (16 DIMMs).
{{typo corrected: yes FB-DIMMs suck an amazing 170 watts for 16 DIMMs -- that's nearly 100watts more than DDR2. That is why Intel-based systems only report wattage on small memory configs, but still use the same large memory configs for various benchmarks.}}
Opteron power budget: Dual-core Opterons: 190 watts for two sockets (95w max each) PLUS 16 watts for chipset PLUS 70.4 watts for DDR2 (16 DIMMs).
...and this is looking at just the chips -- not adding the typical
controllers you'd have for a functioning system like disk, network, etc...
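The 34% figure falls out of simply adding up the two budgets above. A minimal Python sketch, using only the wattages quoted in this post (the dictionary layout is my own):

```python
# Component wattages as quoted in the two power budgets above (2 sockets, 16 DIMMs).
woodcrest_budget = {"cpus": 160.0, "chipset": 44.8, "fb_dimms": 166.4}
opteron_budget = {"cpus": 190.0, "chipset": 16.0, "ddr2": 70.4}

woodcrest_total = sum(woodcrest_budget.values())
opteron_total = sum(opteron_budget.values())

# How much more power the Woodcrest budget needs, relative to the Opteron budget.
extra_fraction = woodcrest_total / opteron_total - 1

print(f"Woodcrest: {woodcrest_total:.1f} W, Opteron: {opteron_total:.1f} W")
print(f"Woodcrest budget is {extra_fraction:.0%} higher")
```

Note that the CPUs alone favour Woodcrest (160 vs 190 watts); it is the chipset and FB-DIMM lines that swing the total.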
Thursday Dec 07, 2006
There are technical reasons why a 32GB 2-processor Woodcrest server draws a hefty
510 watts. Intel decided not to implement the energy saving "page open mode"
for the power-hungry FB-DIMMs. So CPU power throttling may have limited benefit
on Woodcrest systems.
Intel has shown that a 10GB 1 socket Woodcrest draws 400 watts, but you have
to dig past some marketing spin to find it, see page 3 of
www.intel.com/it/pdf/energy-efficient-perf-for-the-data-center.pdf.
Sun publishes performance, watts, and configuration for the Sun Fire T2000 (~330 watts) and the Sun Fire T1000 (~185 watts)
on all of its benchmarks:
http://www.sun.com/servers/coolthreads/t1000/benchmarks.jsp.
- 330 watts 32GB Sun Fire T2000
- 32GB; 4 x 73GB 10K rpm SAS disks, 3 Northstar NICs, Crystal FCAL
- 32GB T2000 draws 100 fewer watts and has twice the memory of the 16GB Woodcrest config
- measured by Sun, CPUs busy, network busy, disks idle
- 185 watts 16GB Sun Fire T1000
- Measured on every T2000/T1000 benchmark
Woodcrest 16GB 430 watt measured config details:
Dell 2950
2 x 3GHz Woodcrest Xeon 5160 (4MB L2 cache)
16GB = 8 x 2GB DIMM;
one 73GB 15K rpm SAS (disk idle)
1333MHz FSB
PERC 5/i, x6 Backplane Integrated Controller Card
QLogic 2462 Dual Channel 4Gb Optical FC HBA PCI-E
OS: SuSE - SLES
all bios settings correct
Tuesday Dec 05, 2006
Some things to look at when you see marketing around wattage. You
can avoid errors by really looking at total measured wattage when systems
are running and doing real work. I've seen a lot of Intel marketing about
wattage of Woodcrest being 65 watts. But that really doesn't show the
whole picture. I'll break it down a bit...
What GHz at what wattage?: First recognize that Woodcrest at
2.66 GHz & 2.33 GHz is 65 watts for the chip only, but Woodcrest at 3.0 GHz
is 80 watts. ...and all the benchmarks I've seen are on the 80 watt 3.0 GHz
systems.
What about the memory controller?: The CPU isn't everything.
Woodcrest designs have an external memory controller. Opteron designs have
an integrated memory controller. So you need to add another 30 watts (or more)
for the pair of Woodcrest CPUs.
What about the memory technology differences?: The CPU+Memory_controller
isn't everything. Woodcrest designs use FB-DIMMs. Opteron designs use the
more power efficient DDR2. FB-DIMMS draw a lot more power. In fact, as
I've blogged about before, a 32GB 2-socket Woodcrest system draws 500 watts,
measured when the CPU is busy! Sun's Opteron systems draw well over 100 watts less.
Every IT department I talk to really wants to cut costs -- power consumption
is a growing major factor in IT costs.
...this just in...
Sun is now shipping a wattage meter with the "Try-and-buy" program for
Sun Fire T2000. More details at:
http://blogs.sun.com/cohen/entry/kill_a_watt_--_power
Friday Dec 01, 2006
Hallway discussion: "What diff does 200 watts really make?" I was
having a discussion about a 32GB T2000 (330 watts) vs a 32GB Woodcrest (500 watts) and was told this is only about a 200 watt difference, really a very small change in the end.
But the problem is that when you aggregate and consider all
of the losses from utility to computing it can add up rather
quickly. ...maybe up to 600kW or more. To give everyone an
example, I'll base the power losses on a real Sun datacenter
of mixed systems, but use the T2000 as an example and a Woodcrest
system as a comparable.
For example, let's say we had 700 kW of T2000 32GB servers (700,000 watts / 330 watts = 2,121 servers or 100 racks). We lose
about 40% due to air conditioning and power distribution in
the datacenter and 3% in utility power distribution.
All in all this is 1200kW of power out of the utility.
OK, with Woodcrest this is 2,121 servers x 500 watts/server or
1.06MW for servers. Assuming the same percentage loss at every
stage, this means the utility has to provide 1.83MW (for Woodcrest)
vs. 1.2MW (for Sun Fire T2000).
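For anyone who wants to rerun the arithmetic, here is a rough Python sketch of the calculation above. It assumes the two losses compound (each stage passes on 60% and 97% of what it receives, respectively); that reading is my interpretation of the numbers, and the constant and function names are illustrative only.

```python
T2000_WATTS = 330          # 32GB Sun Fire T2000
WOODCREST_WATTS = 500      # 32GB 2-socket Woodcrest
IT_LOAD_WATTS = 700_000    # server load assumed in the example
DATACENTER_LOSS = 0.40     # air conditioning + in-building power distribution
UTILITY_LOSS = 0.03        # utility power distribution

def utility_power(server_watts):
    """Power the utility must supply so that `server_watts` reaches the servers.

    Assumption: the losses compound, so only (1 - 0.40) * (1 - 0.03) of the
    supplied power ends up doing computing.
    """
    delivered_fraction = (1 - DATACENTER_LOSS) * (1 - UTILITY_LOSS)
    return server_watts / delivered_fraction

servers = IT_LOAD_WATTS // T2000_WATTS                 # whole servers
t2000_utility_kw = utility_power(IT_LOAD_WATTS) / 1000
woodcrest_utility_kw = utility_power(servers * WOODCREST_WATTS) / 1000

print(f"{servers} servers")
print(f"T2000:     {t2000_utility_kw:.0f} kW from the utility")
print(f"Woodcrest: {woodcrest_utility_kw:.0f} kW from the utility")
print(f"Difference: {woodcrest_utility_kw - t2000_utility_kw:.0f} kW")
```

Under that assumption the sketch gives roughly 1.2MW vs 1.82MW, in line with the rounded figures above, and a gap of about 620kW.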
Friday Nov 17, 2006
Surprised that a 32GB 2-processor Woodcrest server draws a hefty 510 watts. Woodcrest vendors need to be transparent... it will get out. A recent internet search found that even Intel knows
a 10GB 1 socket Woodcrest draws 400 watts, see page 3 of
www.intel.com/it/pdf/energy-efficient-perf-for-the-data-center.pdf.
Woodcrest vendors need to
publish configuration, performance and watts all together whenever they show performance! No more games.
Sun publishes performance, watts, and configuration for the Sun Fire T2000 (~330 watts) and the Sun Fire T1000 (~185 watts)
on all of its benchmarks:
http://www.sun.com/servers/coolthreads/t1000/benchmarks.jsp.
- 330 watts 32GB Sun Fire T2000
- 32GB; 4 x 73GB 10K rpm SAS disks, 3 Northstar NICs, Crystal FCAL
- 32GB T2000 draws 100 fewer watts and has twice the memory of the 16GB Woodcrest config
- measured by Sun, CPUs busy, network busy, disks idle
- 185 watts 16GB Sun Fire T1000
- Measured on every T2000/T1000 benchmark
Woodcrest 16GB 430 watt measured config details:
Dell 2950
2 x 3GHz Woodcrest Xeon 5160 (4MB L2 cache)
16GB = 8 x 2GB DIMM;
one 73GB 15K rpm SAS (disk idle)
1333MHz FSB
PERC 5/i, x6 Backplane Integrated Controller Card
QLogic 2462 Dual Channel 4Gb Optical FC HBA PCI-E
OS: SuSE - SLES
all bios settings correct
If you have a Woodcrest, measure the watts and post them; clearly wastecrest vendors don't want you to know.
Seems that Intel likes to use marketing spin and avoid the facts:
http://www.intel.com/business/bss/infrastructure/enterprise/power_thermal.pdf
and http://www.intel.com/performance/server/xeon/ppw.htm
Also
http://www.principledtechnologies.com/clients/reports/Intel/WSPECint_rate_0506.pdf
Friday Nov 17, 2006
The Total Tyranny of low utilization datacenters
In this blog and other blogs I've commented on, Woodcrest supporters always
want to say their servers are better at low utilisation. This is
totally the wrong way to go! They first claim typical datacenters are
running at low utilisations, for example: Xen claims typical datacenters are at 15%.
Horrible, HORRIBLE.
So why shouldn't we just add all kinds of techniques to save power at lower
utilisations? Clearly that is the best way to save money, right? Wrong.
Let's take a simple example of a 400 watt server (@ 100%) that saves 20 watts for
each 10% reduction in utilisation. I'll show this in a table below and
compare equivalent work done relative to 100% so you can see the hyperbolic nature of the curve. Of course I'm only looking at one server so there
is some discretisation, but when you have a datacenter it will quickly
approach these numbers.
| %Utilisation | 100% | 90% | 80% | 70% | 60% | 50% | 40% | 30% | 20% | 10% | 0% |
| Watts-at-Util | 400 | 380 | 360 | 340 | 320 | 300 | 280 | 260 | 240 | 220 | 220 |
| watts/work | 400 | 422 | 450 | 486 | 533 | 600 | 700 | 867 | 1200 | 2200 | inf. |
Now that I've got you shocked, let's look at a more typical example.
Let's compare 5 servers running at 10% utilisation (that is 220 watts
each or 1100 watts for the 5 of them). A single server running at
50% utilisation only uses 300 watts! The 10% case
requires almost 3.7 times more power! OUCH!
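The table and the 5-servers-vs-1 comparison can be regenerated with a few lines of Python. This is only a sketch of the simple model described above (400 watts at 100%, 20 watts saved per 10% drop); the function names are mine, and the 0% column is skipped because watts per unit of work is undefined there.

```python
FULL_LOAD_WATTS = 400        # server draw at 100% utilisation
SAVING_PER_10_PERCENT = 20   # watts saved for each 10% drop in utilisation

def watts_at(utilisation_pct):
    """Power draw of one server at the given utilisation (simple linear model)."""
    steps_below_full = (100 - utilisation_pct) / 10
    return FULL_LOAD_WATTS - SAVING_PER_10_PERCENT * steps_below_full

def watts_per_work(utilisation_pct):
    """Watts needed per 'full server' of work actually delivered."""
    return watts_at(utilisation_pct) / (utilisation_pct / 100)

for pct in range(100, 0, -10):
    print(f"{pct:>3}%  {watts_at(pct):5.0f} W  {watts_per_work(pct):6.0f} W/work")

# Same total work: five servers at 10% vs one server at 50%.
five_at_10 = 5 * watts_at(10)
one_at_50 = watts_at(50)
consolidation_ratio = five_at_10 / one_at_50
print(f"{five_at_10:.0f} W vs {one_at_50:.0f} W -> {consolidation_ratio:.2f}x")
```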
Bottom line: It is far too easy to be fooled into thinking you are saving
money if power-saving features at low utilisation are your answer.
By the by, a significant number of Sun's large servers run at over
80% utilisation using Solaris, of course.
Here is an example from 2004 of someone on different products who likely understands this math.
As reported in
Computerworld:
"Dennis Callahan, CIO at The Guardian Life Insurance Company of America in New York, server utilization has shot up to nearly 50% in the past 18 months, with a goal in coming years of nearly 70%.
Thursday Nov 16, 2006
The Sun Ultra 40 M2 Workstation demonstrates world record integer throughput
performance for all x86 systems on the
new and improved SPEC CPU benchmark called "SPECint_rate2006." It fixes
things like SPECint_rate2000 having floating-point applications in the integer suite. Whaaaat? Yes,
strange but true.
The Sun Ultra 40 M2 delivered a SPECint_rate2006 score of 48.8, using the
Solaris 10 and Studio 11 combination. Sun's Opteron beats Woodcrest by 7%.
As you can see below, 'Peak' means
you add a few more compiler flags. I guess the Woodcrest vendors didn't have any
others to try, or maybe they saw no improvement so they avoided
publishing? Anyone know?
Competitive Landscape
Selected SPEC CPU2006 (SPECint_rate2006) Performance Results -
bigger is better, see
www.spec.org for complete results.
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
| Sun Ultra 40 M2 | AMD Opteron 2220SE | 2.8 | 2 | 4 | 4 | 48.8 | 41.9 |
| HP DL585 | Opteron 854 | 2.8 | 4 | 4 | 4 | 46.9 | 41.4 |
| Supermicro X7DBE | Woodcrest, Xeon 5160 | 3.0 | 2 | 4 | 4 | --- | 45.2 |
| Sun Fire X4200 | Opteron 285 | 2.6 | 2 | 4 | 4 | 42.8 | 37.8 |
| Fujitsu RX220 | Opteron 280 | 2.4 | 2 | 4 | 4 | 40.0 | 35.7 |
| Sun Fire X4200 | Opteron 256 | 3.0 | 2 | 2 | 2 | 26.4 | 23.1 |
| HP DL585 | Opteron 854 | 2.8 | 2 | 2 | 2 | 25.2 | 22.3 |
| Dell PrecWork 380 | Pentium EE | 3.73 | 1 | 2 | 2 | -- | 23.1 |
| HP DL380 G4 | Pentium 4 | 3.8 | 2 | 2 | 2 | -- | 20.9 |
Benchmark Description
SPEC CPU2006 is made up of two suites of benchmarks, CFP2006 and
CINT2006. CFP2006 targets floating-point performance, while CINT2006
targets integer performance.
Each suite has two different measures. First is the CPU measure, which
is the performance on the suite as a single stream. This can be either
a single thread or automatic compiled parallel run. This measure is
further defined by base and optimized runs. Base uses the same compiler
flags for all kernels, where optimized is allowed to use different
compiler flags for each kernel. Results are compared against a baseline
system run that was standardized by www.spec.org.
The second measure is Rate. I think this one is a LOT more important. It
is a measure of how many CPU measures
can be run at a time. Typically, it is run as n processes on n
processors or threads. It shows how well the same job mix can run on a system
under some load. It also is run as a base and optimized set of
results. "Rate" is what you use for any multi-threaded workstation and
all servers.
Disclosure Statement:
SPEC and SPECint are registered trademarks of the Standard Performance Evaluation Corporation.
Results from www.spec.org as of 11/14/06. Sun Ultra 40 M2, 48.8 SPECint_rate2006.
System Configuration
- Sun Ultra 40 M2
- 2 x 2.8 GHz Opteron 2220SE
- 16GB memory
- Solaris 10
- Sun Studio 11
- 48.8 SPECint_rate2006
Wednesday Nov 15, 2006
Woodcrest scaling issues? Yes, remember scaling is critical for
system performance, so don't look too much at single core performance or single job performance as it can lead to the wrong conclusions. In fact, Sun's Opteron scaling means that the Sun systems can outperform Woodcrest by 18% to 22% as shown below.
On 4 core/2 chip
Intel Woodcrest systems they are only seeing 2.8x to 2.9x scaling on 4 cores -- this doesn't bode well for quad-core or larger systems made out of these. Sun sees 3.6x to 4.1x scaling in the table below. Couple this with the high wattage of these Woodcrest systems (31-Oct posting) and Woodcrest may have issues.
Opteron leads poor Woodcrest scaling & performance on Fluent 6 Benchmark (Both systems 2 sockets and using dual-core)
| System | GHz/Chip | #cores | FL5M3 (scaling) | FL5L2 (scaling) |
| INTEL S5000XAL | 3.0GHz Xeon Woodcrest 5160 | 4-core | 827.0 (2.8x) | 400.0 (2.9x) |
| INTEL S5000XAL | 3.0GHz Xeon Woodcrest 5160 | 2-core | 553.7 (1.9x) | 226.0 (1.6x) |
| INTEL S5000XAL | 3.0GHz Xeon Woodcrest 5160 | 1-core | 297.3 (1.0x) | 138.0 (1.0x) |
| Sun |
| Sun X4100 M2 | 2.8GHz Opteron DC 2200 | 4-core | 979.9 (3.6x) | 486.6 (4.1x) |
| Sun X4100 M2 | 2.8GHz Opteron DC 2200 | 2-core | 516.1 (1.9x) | 241.8 (2.1x) |
| Sun X4100 M2 | 2.8GHz Opteron DC 2200 | 1-core | 273.5 (1.0x) | 117.6 (1.0x) |
Rating = No. of sequential runs of test case possible in 1 day,
86,400/(Total Elapsed Run Time in Seconds)
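As a quick illustration of how the rating and the scaling factors in parentheses relate, here is a short Python sketch. The `rating` function is the formula just given; the scaling factor is taken to be the n-core rating divided by the 1-core rating, which is my reading of the table and matches its numbers. The names are illustrative only.

```python
SECONDS_PER_DAY = 86_400

def rating(elapsed_seconds):
    """Fluent rating: number of sequential runs of the test case possible in one day."""
    return SECONDS_PER_DAY / elapsed_seconds

# Ratings copied from the table above, keyed by core count.
fl5m3 = {
    "Woodcrest": {1: 297.3, 2: 553.7, 4: 827.0},
    "Opteron": {1: 273.5, 2: 516.1, 4: 979.9},
}
fl5l2 = {
    "Woodcrest": {1: 138.0, 2: 226.0, 4: 400.0},
    "Opteron": {1: 117.6, 2: 241.8, 4: 486.6},
}

def scaling(ratings, cores):
    """Speedup of the n-core run relative to the 1-core run on the same system."""
    return ratings[cores] / ratings[1]

for name, table in (("FL5M3", fl5m3), ("FL5L2", fl5l2)):
    for system, ratings in table.items():
        print(f"{name} {system}: {scaling(ratings, 4):.1f}x on 4 cores")
    lead = table["Opteron"][4] / table["Woodcrest"][4] - 1
    print(f"{name}: Opteron leads by {lead:.0%} at 4 cores")
```

For example, a run that takes 864 seconds has a rating of 100. Note that Woodcrest is actually ahead on a single core in both test cases; the Opteron lead only appears as cores are added, which is the point about scaling.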
Fluent results at:
http://www.fluent.com/software/fluent/fl5bench/flbench_6.2/fullres.htm
...I suspect even better performance and scaling on Sun Fire X4100 M2
with Solaris...
Tuesday Nov 14, 2006
Sun Ultra 40 M2 w/2xFX 5500 nVidia Framebuffers (SLI)
World Record Performance SPECapc Unigraphics UGS-NX3
The Sun Ultra 40 M2 with dual nVidia Quadro FX 5500s
in SLI mode sets a world record running the SPEC APC
UGS-NX3 graphics oriented MCAD benchmark beating all desktop platforms, including the Woodcrest and Intel Core2 "Extreme Processor" X6800 cpu's.
The SPEC APC MCAD benchmarks consist of tasks
representative of what a designer would do in a typical
session. This consists of "Graphics", "CPU", and "I/O" activities.
- In dual framebuffer SLI mode the Ultra 40 M2 with 2.8GHz
2220SE dual core Opteron processors outperforms a
Dell 690 (3.0 GHz Woodcrest) by 14% overall and by 37% in
the graphics test components.
- In addition, in dual framebuffer SLI mode the existing
Ultra 40 outperforms the Dell 690 (Woodcrest 3.0 GHz)
by 16% overall and by 61% in the graphics component.
The Ultra 40 uses 3.0 GHz single core
Opteron 256 processors (400 MHz DDR1 dimms) versus
the 2.8 GHz dual core Opteron 2220SE processors
(667 MHz DDR2 dimms) of the Ultra 40 M2, and edges the
Ultra 40 M2 by about 1%.
The Sun Ultra 40 with a single nVidia Quadro FX 5500
outperforms most other high end desktops equipped with
a single framebuffer with currently posted results
obtained running the SPEC APC UGS-NX3 benchmark.
- The Sun Ultra 40 with FX 5500 framebuffer
outperforms (is faster than) Woodcrest desktops.
H-P XW 6400 (4% overall, 39% on graphics);
Dell Precision 690 (9% overall, 52% on graphics);
IBM Intellistation Z Pro 9228 (14% overall, 62% on graphics)
- The Sun Ultra 40 with FX 5500 framebuffer also outperforms
all desktops equipped with the Intel 2.93 GHz X6800
"Extreme Processors".
H-P XW 4400 (6% overall, 47% on graphics);
Dell Precision 390 (10% overall, 47% on graphics)
Sun Opteron desktops have dominated with leading
MCAD benchmark results dating back to the introduction
of the Sun W1100 and W2100.
Sun desktops continue to exhibit excellent MCAD performance
as demonstrated by the world record results here for this
SPEC APC UGS-NX3 benchmark.
SPECapc Unigraphics NX 3 Benchmark Competitive Landscape (larger is faster):
| System | Overall Composite | CPU Composite | File I/O Composite | Graphics Composite |
| Sun Ultra 40, 3.0GHz Opteron 256, 2x FX 5500 (SLI) | 7.28 | 2.94 | 2.85 | 19.81 |
| Sun Ultra 40 M2, 2.8GHz Opteron 2220SE, 2x FX 5500 (SLI) | 7.19 | 3.08 | 3.00 | 16.85 |
| Fujitsu Siemens CELSIUS, 3.0GHz Intel 5150, FX 5500 | 6.42 | 3.67 | 2.28 | 10.17 |
| Dell Precision 690, 3.0GHz Woodcrest, 2x FX 4500 (SLI) | 6.30 | 3.25 | 1.64 | 12.29 |
| Sun Ultra 40, 3.0GHz Opteron 256, FX 5500 | 5.66 | 2.94 | 1.96 | 10.11 |
| HP xw6400 WS, 3.0GHz Woodcrest, FX 4500 | 5.42 | 3.39 | 3.51 | 7.26 |
| HP xw4400, 2.93GHz X6800, FX 3500 | 5.33 | 3.40 | 4.52 | 6.87 |
| Dell Precision 690, 3.00 GHz Woodcrest, FX 3500 | 5.17 | 3.38 | 3.69 | 6.64 |
| Dell Precision 390, 2.93 GHz X6800, FX 3500 | 5.16 | 3.46 | 2.18 | 6.87 |
| IBM Intellistation Z Pro 9228, 3.0GHz Woodcrest, FX 3500 | 4.96 | 3.43 | 2.84 | 6.23 |
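The percentage leads quoted in the bullets earlier in this post follow directly from the composites in this table. A minimal Python sketch, with a helper name of my own choosing and the scores copied from the table:

```python
def pct_lead(score, reference):
    """Percentage by which `score` exceeds `reference` (larger composite is faster)."""
    return (score / reference - 1) * 100

# (overall, graphics) composites copied from the table above.
ultra40_m2_sli = (7.19, 16.85)
ultra40_sli = (7.28, 19.81)
dell690_sli = (6.30, 12.29)
ultra40_single = (5.66, 10.11)
hp_xw6400 = (5.42, 7.26)

comparisons = {
    "Ultra 40 M2 SLI vs Dell 690 SLI": (ultra40_m2_sli, dell690_sli),
    "Ultra 40 SLI vs Dell 690 SLI": (ultra40_sli, dell690_sli),
    "Ultra 40 vs HP xw6400": (ultra40_single, hp_xw6400),
}

for label, (sun, other) in comparisons.items():
    overall = pct_lead(sun[0], other[0])
    graphics = pct_lead(sun[1], other[1])
    print(f"{label}: {overall:.0f}% overall, {graphics:.0f}% on graphics")
```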
Results Summary for the SPECapc Unigraphics NX 3 benchmark:
| Results | Dual FX 5500 | Dual FX 5500 |
| Overall Composite: | 7.19 | 7.28 |
| CPU Composite: | 3.08 | 2.94 |
| File I/O Composite: | 3.00 | 2.85 |
| Graphics Composite: | 16.85 | 19.81 |
| Reference Date: | 11/10/06 | 10/12/06 |
| System: | Sun Ultra U40 M2 | Sun Ultra U40 |
| Processor/GHz: | Opteron 2220SE/2.8 | Opteron 256/3.0 |
Disclosure Statement:
SPEC is a registered trademark and SPECapc is a service mark of the Standard Performance
Evaluation Corporation.
Results from www.spec.org as of Oct 12, 2006:
Sun Ultra 40, 2xFX 5500, overall composite 7.28;
Dell Precision 690, 2xFX 4500, overall composite 6.30.
Results from www.spec.org as of Oct 12, 2006:
Sun Ultra 40, FX 5500, overall composite 5.66;
HP xw6400, FX 4500, overall composite 5.42;
Dell Precision 690, FX 3500, overall composite 5.17;
IBM Intellistation Z Pro 9228, FX 3500, overall composite 4.96.
Results from www.spec.org as of Nov. 8, 2006:
Fujitsu Siemens CELSIUS, FX 5500, overall composite 6.42.
Results from www.spec.org as of Nov 10, 2006:
Sun Ultra 40 M2, 2xFX 5500, overall composite 7.19.
Results from www.spec.org as of Oct 12, 2006:
Sun Ultra 40, FX 5500, overall composite 5.66;
HP xw4400, FX 3500, overall composite 5.33;
Dell Precision 390, FX 3500, overall composite 5.16.
Friday Nov 03, 2006
Does the Woodcrest have scaling issues now? It may be caused by the rush
to increase core count without really considering design. On 4 core/2 chip
Intel Woodcrest systems we are only seeing 3.0x to 3.3x scaling on 4 cores -- this doesn't bode well for quad-core or larger systems made out of these. Couple this with the high wattage of these chips (Tuesday's posting) and this chip may have issues.
Poor Woodcrest scaling & Performance on Fluent 6 Benchmark
| System | GHz/Chip | #cores | FL5L1 (scaling) | FL5L2 (scaling) |
| INTEL S5000XAL | 3.0GHz Xeon Woodcrest 5160 | 4-core 2-Socket | 631.8 (3.3x) | 400.0 (3.0x) |
| INTEL S5000XAL | 3.0GHz Xeon Woodcrest 5160 | 2-core 1-Socket | 372.8 (1.9x) | 226.0 (1.7x) |
| INTEL S5000XAL | 3.0GHz Xeon Woodcrest 5160 | 1-core | 194.0 (1.0x) | 133.0 (1.0x) |
Rating = No. of sequential runs of test case possible in 1 day,
86,400/(Total Elapsed Run Time in Seconds)
Fluent results at:
http://www.fluent.com/software/fluent/fl5bench/flbench_6.2/fullres.htm
Do you notice on the Xeon 7000 series (NetBurst ba...