Friday Apr 11, 2008
What most people forget is that datacenters are really throughput
engines. I don't know of any datacenter (besides home ones) that uses
only one thread or one core. When you look at racks of servers in a datacenter, you are
looking at thousands of threads, which means 10,000 to 100,000 or more in a complete datacenter. Lots of work to be done, lots of threads doing it!
Sun has announced blade system world record results for
SPECint_rate2006 and SPECfp_rate2006.
These results were
run on the Sun Blade 6000 system with 10 Sun Blade T6320 server modules which
use the 1.4 GHz UltraSPARC T2 processor.
The Sun Blade 6000 system fully populated with 10 T6320 server modules
delivered a SPECint_rate2006 score of 838, a world record result for
blade systems.
The Sun Blade 6000 system (10 RUs) powered by 10 Sun UltraSPARC T2 1.4 GHz
processors provides 73% more integer throughput than the IBM p 570 (16 RUs)
equipped with 8 POWER6 4.7 GHz processors, as measured by
SPECint_rate2006.
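For anyone who wants to check the arithmetic behind that percentage, here is a minimal sketch. The helper name `percent_more` is my own, not anything from SPEC; the two scores are the peak SPECint_rate2006 results quoted in this post.

```python
def percent_more(ours: float, theirs: float) -> float:
    """Return how much larger `ours` is than `theirs`, in percent."""
    return (ours / theirs - 1.0) * 100.0


# Peak SPECint_rate2006 scores quoted in this post.
sun_blade_6000 = 838
ibm_p570 = 484

advantage = percent_more(sun_blade_6000, ibm_p570)
print(f"Sun Blade 6000 vs. IBM p 570: {advantage:.0f}% more integer throughput")
```

The same helper works for any pair of scores in the charts below, as long as both are the same metric (peak against peak, or base against base).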
The Sun Blade 6000 system fully populated with 10 T6320 server modules
delivered a blade system world record SPECfp_rate2006 score of 571.
Sun has chosen to submit a single run as both
SPECfp_rate_base2006 and SPECfp_rate2006
(which is allowed under the run rules) in order
to emphasize that, even without aggressive tuning, the
score of 571 is a record for both base and peak.
The Sun Blade 6000 system powered by 10 Sun UltraSPARC T2 1.4 GHz
processors provides 55% more floating-point throughput than the IBM p 570
equipped with 8 POWER6 4.7 GHz processors, as measured by
SPECfp_rate_base2006 (571 vs. 369).
The IBM p 570 system (16 RU) uses 1.6 times as many rack units as the Sun Blade 6000 system (10 RU).
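Since rack space is part of the argument, it may help to look at throughput per rack unit. This is only a back-of-the-envelope sketch: "score per RU" is not a SPEC metric, the function name `per_rack_unit` is made up for illustration, and the numbers are the scores and rack-unit counts quoted in this post.

```python
def per_rack_unit(score: float, rack_units: int) -> float:
    """Benchmark score divided by the rack units the system occupies."""
    return score / rack_units


# (system name, rack units, peak SPECint_rate2006, SPECfp_rate_base2006)
systems = [
    ("Sun Blade 6000 w/10 x T6320", 10, 838, 571),
    ("IBM p 570", 16, 484, 369),
]

for name, rack_units, int_peak, fp_base in systems:
    print(
        f"{name}: "
        f"{per_rack_unit(int_peak, rack_units):.1f} int/RU, "
        f"{per_rack_unit(fp_base, rack_units):.1f} fp/RU"
    )
```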
SPEC CPU2006 Performance Charts -
bigger is better, selected recent results
SPECint_rate2006
Please see
www.spec.org for complete results
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
|---|---|---|---|---|---|---|---|
| Sun B6000 w/10 x T6320 | UltraSPARC T2 | 1.4 | 10 | 80 | 640 | 838 | 752 |
| HP Superdome | Itanium 2 | 1.6 | 32 | 64 | 64 | 824 | 770 |
| Sun M9000 | SPARC VI | 2.4 | 32 | 64 | 64 | 650 | 553 |
| IBM p 570 | POWER6 | 4.7 | 8 | 16 | 32 | 484 | 420 |
Results as of 7 Apr 2008 from www.spec.org.
SPECfp_rate2006
Please see
www.spec.org for complete results
or for just
SPECfp_rate2006 results ordered by peak score.
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
|---|---|---|---|---|---|---|---|
| Sun M9000 | SPARC VI | 2.4 | 32 | 64 | 64 | 600 | 556 |
| Sun B6000 w/10 x T6320 | UltraSPARC T2 | 1.4 | 10 | 80 | 640 | 571 | 571 |
| IBM p 570 | POWER6 | 4.7 | 8 | 16 | 32 | 430 | 369 |
| HP rx8640 | Itanium 2 | 1.6 | 16 | 32 | 32 | 371 | 357 |
Results as of 7 Apr 2008 from www.spec.org.
Benchmark Description
SPEC CPU2006 is made up of two suites of benchmarks, CFP2006 and
CINT2006. CFP2006 targets floating-point performance, while CINT2006
targets integer performance.
Each suite has two different measures. First is the CPU measure, which
is the performance on the suite as a single stream. This can be either
a single thread or automatic compiled parallel run. This measure is
further defined by base and optimized runs. Base uses the same compiler
flags for all kernels, where optimized is allowed to use different
compiler flags for each kernel. Results are compared against a baseline
system run that was standardized by SPEC.
The second measure is Rate. It is a measure of how many CPU measures
can be run at a time. Typically, it is run as n processes on n
processors. It shows how well the same job mix can run on a system
under some load. It also is run as a base and optimized set of
results.
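To make the Rate measure a little more concrete, here is a rough sketch of how a rate score is put together. It assumes the definition as I understand it from the SPEC CPU2006 documentation: each benchmark's ratio is the number of copies times the reference time divided by the elapsed time, and the overall score is the geometric mean of those ratios. The benchmark names and timings below are invented for illustration and are NOT real SPEC data.

```python
import math


def rate_ratio(copies: int, reference_seconds: float, elapsed_seconds: float) -> float:
    """Per-benchmark rate ratio: copies * (reference time / elapsed time)."""
    return copies * reference_seconds / elapsed_seconds


def rate_score(ratios: list[float]) -> float:
    """Overall score: geometric mean of the per-benchmark ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical runs: (benchmark, copies, reference seconds, elapsed seconds).
runs = [
    ("bench_a", 64, 9000.0, 8000.0),
    ("bench_b", 64, 7000.0, 5000.0),
    ("bench_c", 64, 12000.0, 15000.0),
]

ratios = [rate_ratio(copies, ref, elapsed) for _, copies, ref, elapsed in runs]
print(f"overall rate score: {rate_score(ratios):.1f}")
```

Because the overall number is a geometric mean, one very fast benchmark cannot hide a very slow one the way an arithmetic mean would allow.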
Disclosure Statement:
SPEC, SPECint reg tm of Standard Performance Evaluation Corporation.
Sun result submitted to SPEC,
other results from www.spec.org as of 4/7/08.
Sun Blade T6320 (UltraSPARC T2, 10 chips, 80 cores),
838 SPECint_rate2006, 752 SPECint_rate_base2006.
IBM p 570 (POWER6, 8 chips, 16 cores), 484 SPECint_rate2006,
420 SPECint_rate_base2006.
SPEC, SPECfp reg tm of Standard Performance Evaluation Corporation.
Sun result submitted to SPEC,
other results from www.spec.org as of 4/7/08.
Sun Blade T6320 (UltraSPARC T2, 10 chips, 80 cores),
571 SPECfp_rate2006, 571 SPECfp_rate_base2006.
SPEC, SPECfp reg tm of Standard Performance Evaluation Corporation.
Sun result submitted to SPEC,
other results from www.spec.org as of 4/7/08.
Sun Blade T6320 (UltraSPARC T2, 10 chips, 80 cores),
571 SPECfp_rate_base2006.
IBM p 570 (POWER6, 8 chips, 16 cores),
369 SPECfp_rate_base2006.
Results Summary
| Results | |
|---|---|
| Reference Date: | Apr 7, 2008 |
| System: | Sun Blade 6000 with 10 T6320 Modules |
| Processor: | 10 Sun UltraSPARC T2, 1.4 GHz |
| | 838 SPECint_rate2006 |
| | 752 SPECint_rate_base2006 |
| | 571 SPECfp_rate2006 |
| | 571 SPECfp_rate_base2006 |
| Software: | Solaris 10, Sun Studio 12 Compiler gccfss |
Thursday Feb 14, 2008
World's fastest chip. The Sun SPARC Enterprise T5120 server, running
at 1.4 GHz, delivered a world record single chip result of
83.9 SPECint_rate2006. Please remember it is about system and chip performance,
not about things inside a chip (like perf/transistor, perf/NAND-gate, perf/metal-layers, perf/thread, perf/bore, oops, perf/core, perf/silicon grain).
The Sun SPARC Enterprise T5120 using the GCC for SPARC Systems (gccfss)
compiler topped all competitors' single-chip results, including beating
the IBM p570 single-chip 4.7 GHz POWER6 result by 38%.
IBM used its proprietary compiler, XL C/C++.
The Sun SPARC Enterprise T5120 using the GCC for SPARC Systems (gccfss)
compiler beat the performance of the HP DL360 G5 with a single chip
quad-core 3.16GHz Xeon X5460 by 15%.
The gccfss compiler allows one to use the optimal Sun SPARC optimization tools
along with the popular gcc coding conventions and deliver performance
that has not been possible before without time consuming code
changes.
For more information on gccfss and how to get it, go to
http://cooltools.sunsource.net/gcc/.
Sun also submitted results on the SPECfp_rate2006 benchmark
suite using just a single disk. The Sun SPARC Enterprise
T5120 server, running at 1.4 GHz, delivered a
result of 62.1 SPECfp_rate2006.
This result was run on a single disk. The previously reported result
used the electrical equivalence rule of SPEC, but the configuration
used more disks than fit in a T5120. This result shows that the
performance is comparable, regardless of the disk configuration.
SPEC CPU2006 Performance Charts -
bigger is better, selected recent results, see
www.spec.org for complete results
SPECint_rate2006
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
|---|---|---|---|---|---|---|---|
| T5120 (gccfss 4.2) | UltraSPARC T2 | 1.4 | 1 | 8 | 64 | 83.9 | 76.2 |
| T5220 (gccfss 4.2) | UltraSPARC T2 | 1.4 | 1 | 8 | 64 | 83.2 | 75.6 |
| T5120/T5220 | UltraSPARC T2 | 1.4 | 1 | 8 | 64 | 78.5 | 73.0 |
| T5220 (gccfss) | UltraSPARC T2 | 1.4 | 1 | 8 | 64 | 78.0 | 71.6 |
| Asus P5E3 | Intel QX9650 | 3.0 | 1 | 4 | 4 | 76.7 | 69.0 |
| HP DL360 G5 | Intel X5460 | 3.16 | 1 | 4 | 4 | 73.0 | 62.1 |
| Asus P5E3 | Intel QX6850 | 3.0 | 1 | 4 | 4 | 69.1 | 64.9 |
| Dell T3400 | Intel QX9650 | 3.0 | 1 | 4 | 4 | 68.8 | 61.4 |
| IBM p 570 | Power6 | 4.7 | 1 | 2 | 4 | 60.9 | 53.2 |
| Fujitsu RX100 | Intel X3210 | 2.13 | 1 | 4 | 4 | 54.4 | 48.0 |
SPECfp_rate2006
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
|---|---|---|---|---|---|---|---|
| T6320 | UltraSPARC T2 | 1.4 | 1 | 8 | 64 | 62.3 | 58.1 |
| T5120/T5220 | UltraSPARC T2 | 1.4 | 1 | 8 | 64 | 62.3 | 57.9 |
| T5120 (one disk) | UltraSPARC T2 | 1.4 | 1 | 8 | 64 | 62.1 | 57.9 |
| IBM p 570 | Power6 | 4.7 | 1 | 2 | 4 | 58.0 | 51.5 |
| Intel Asus P5E3 | Intel QX9650 | 3.0 | 1 | 4 | 4 | 52.0 | 49.9 |
| Dell T3400 | Intel QX9650 | 3.0 | 1 | 4 | 4 | 47.2 | 44.9 |
| HP DL360 G5 | Intel X5460 | 3.16 | 1 | 4 | 4 | 44.5 | 41.3 |
Results as of 12 Feb 2008 from www.spec.org.
Benchmark Description
SPEC CPU2006 is SPEC's most popular benchmark. It measures:
"Rate" - system performance of CPUs, memory, compiler
"Speed" - single thread performance of chip, memory, compiler;
not intended to stress multi-core designs
The strategic metrics include:
SPECint_rate2006: throughput for 12 integer benchmarks
derived from real applications such as
perl, gcc, XML processing, and pathfinding
SPECfp_rate2006: throughput for 17 floating point benchmarks
derived from real applications, including
chemistry, physics, genetics, and weather.
There are "base" variants of both the above metrics that require
more conservative compilation, such as using the same flags for
all benchmarks.
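One quick way to see what the "base" restriction costs is to compute how much each vendor's peak result gains over its own base result. The sketch below uses three peak/base pairs from the SPECint_rate2006 chart above; the function name `peak_uplift` is just an illustrative label, not a SPEC term.

```python
def peak_uplift(peak: float, base: float) -> float:
    """Percentage by which the peak score exceeds the base score."""
    return (peak / base - 1.0) * 100.0


# (system, peak, base) taken from the SPECint_rate2006 chart in this post.
results = [
    ("T5120 (gccfss 4.2)", 83.9, 76.2),
    ("HP DL360 G5", 73.0, 62.1),
    ("IBM p 570", 60.9, 53.2),
]

for system, peak, base in results:
    print(f"{system}: peak is {peak_uplift(peak, base):.1f}% above base")
```

A small uplift suggests a system performs well without per-benchmark flag tuning; a large one suggests the peak number leans more heavily on that tuning.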
Disclosure Statement:
SPEC, SPECint, SPECfp reg tm of Standard Performance Evaluation Corporation.
Results from www.spec.org as of 2/12/08. Sun SPARC Enterprise T5120 gccfss (UltraSPARC T2, 1 chip, 8 cores),
83.9 SPECint_rate2006. IBM p570 (POWER6, 1 chip, 2 cores), 60.9 SPECint_rate2006. HP DL360 G5 (Xeon X5460, 1 chip, 4 cores), 73.0 SPECint_rate2006. Sun SPARC Enterprise T5120 (UltraSPARC T2, 1 chip, 8 cores),
62.1 SPECfp_rate2006.
Results Summary
| Results | |
|---|---|
| Reference Date: | Feb 12, 2008 |
| System: | Sun SPARC Enterprise T5120 |
| Processor: | Sun UltraSPARC T2, 1.4 GHz |
| | 83.9 SPECint_rate2006 |
| | 62.1 SPECfp_rate2006 |
| Software: | Solaris 10, Sun Studio 12 Compiler gccfss |
Tuesday Aug 07, 2007
More about floating-point on the Sun UltraSPARC T2 in this posting. The
previous posting discussed SPECfp_rate2006 scores and the open-sourcing of the UltraSPARC T2 design.
The UltraSPARC T2 has eight floating-point units that are well suited for scientific applications. Based upon preliminary runs, the
Sun UltraSPARC T2 processor at 1.4 GHz beats all single chip scores,
showing 14230(est)/15081(est) SPECompMbase2001/SPECompMpeak2001.
How do these preliminary runs (we must use the term "estimated" by SPEC rules) compare to SPECompMbase2001/SPECompMpeak2001 scores?
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
IBM p520 POWER5+ 1.9GHz processor published result by 85%.
- ...Sun is waiting for POWER6 4.7GHz results, maybe UltraSPARC T2 results will scare IBM from ever publishing a single-chip result?
Benchmark description:
The SpecOMP benchmark is a test of the performance of 9 High
Performance computing applications. It is used to compare the
performance of shared memory servers. All C/C++ and FORTRAN
applications in this suite use the OpenMP programming model that
provides a portable, scalable model for developing parallel
applications for platforms ranging from the desktop to the
supercomputer.
The OpenMP Application Program Interface (API) supports
multi-platform shared-memory parallel programming in C/C++ and Fortran
on all architectures, from the largest Unix servers to the small
Windows NT platforms.
Disclosure statement:
All UltraSPARC T2 SPEC CPU metrics quoted are from full “reportable” runs,
but are nevertheless designated as “estimates” because they use preproduction
systems. SPEC, and SPEComp registered trademarks of Standard Performance
Evaluation Corporation.
Sun UltraSPARC T2 1.4GHz (1 chip, 8 cores, 64 threads) 14230 (est)/ 15081 (est) SPECompMbase2001/SPECompMpeak2001.
Competitive results from www.spec.org as of
August 6, 2007. IBM p520 1.9GHz (1 chip, 2 cores, 4 threads) published 8141/8174 SPECompMbase2001/SPECompMpeak2001.
Tuesday Aug 07, 2007
Sun UltraSPARC T2 is an amazing chip and very fast! The UltraSPARC T2 features several industry firsts:
- Eight cores and 64 threads
- Integrated 10 GbE networking and I/O
- Dedicated cryptographic and floating-point units per core
- 10 cryptographic functions supported in hardware
- Open-source design: www.opensparc.net
Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz
beat all single chip scores, showing 78.3 est. SPECint_rate2006.
How do these preliminary runs (we must use the term "estimated" by
SPEC rules) compare to other SPECint_rate2006 results?
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
IBM POWER6 4.7GHz processor published result by 29%.
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
estimated scores of the AMD Barcelona by 23%.
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
published scores of the 2.66GHz Intel X5355 (Clovertown) by 48%.
Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz
beat all single chip scores, showing 62.3 est. SPECfp_rate2006.
How do these preliminary runs (we must use the term "estimated" by
SPEC rules) compare to other SPECfp_rate2006 results?
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best
published single-chip IBM POWER6 4.7GHz processor result by 7%.
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip estimated scores of the AMD Barcelona by 11%.
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
published scores of the 2.66GHz Intel X5355 (Clovertown) by 66%.
Performance per core doesn't matter and GHz doesn't matter; what matters
is the number of cores, the efficiency, and the design of the chip! Competitors
are saying that UltraSPARC T2 is proprietary... this makes no sense.
Both UltraSPARC T1 and UltraSPARC T2 are open source designs (www.opensparc.net). You do not find the
latest designs of Intel, AMD, or IBM as open source.
Disclosure Statement:
All Sun UltraSPARC T2 SPEC CPU metrics quoted are from full “reportable”
runs, but are nevertheless designated as “estimates” because they use
preproduction systems. SPEC, SPECint, SPECfp registered trademarks of
Standard Performance Evaluation Corporation. Sun UltraSPARC T2
1.4GHz (1 chip, 8 cores, 64 threads) 78.3 est. SPECint_rate2006,
62.3 est. SPECfp_rate2006.
Competitive results from www.spec.org as of August 6, 2007.
IBM POWER6 4.7GHz (1 chip, 2 cores, 4 threads) 60.9 SPECint_rate2006,
58.0 SPECfp_rate2006.
AMD Barcelona 2.6 GHz (1 chip, 4 cores, 4 threads) 63.9 est SPECint_rate2006,
56.3 est. SPECfp_rate2006. Barcelona estimates based upon "The Register"
article stating 2.6GHz quad is 21% and 50% faster than Intel 2.66 system.
Fujitsu RX300 Intel X5355 2.66 GHz (1 chip, 4 cores, 4 threads) 52.8 SPECint_rate2006, 37.5 SPECfp_rate2006.
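For readers wondering how the Barcelona integer estimate falls out of the numbers in this disclosure, here is a small sketch. It assumes the 21% figure from "The Register" article is applied directly to the published Intel X5355 integer score; the variable names are mine and purely illustrative.

```python
# Published single-chip Intel X5355 SPECint_rate2006 score (from the disclosure).
intel_x5355_int = 52.8

# "The Register" claim: the 2.6 GHz Barcelona quad is 21% faster on integer.
claimed_int_speedup = 1.21

# Estimated Barcelona score, rounded to one decimal like the quoted figure.
barcelona_int_est = round(intel_x5355_int * claimed_int_speedup, 1)

# UltraSPARC T2 estimated score and its lead over the Barcelona estimate.
t2_int_est = 78.3
t2_lead_percent = (t2_int_est / barcelona_int_est - 1.0) * 100.0

print(f"Barcelona est. SPECint_rate2006: {barcelona_int_est}")
print(f"UltraSPARC T2 lead: {t2_lead_percent:.0f}%")
```

Keep in mind that this stacks one estimate on top of another, so it is only as good as the article's percentage.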
Reminder: The Niagara 2 score was obtained from a full "reportable" SPEC
run, but is designated as an "estimate" because a pre-production system
was used.
...more information on the UltraSPARC T2 later today.
Tuesday Jul 31, 2007
AMD made a new two-processor (2-chip) quad-core estimated SPECfp_rate result public.
The two-chip quad-core result for the Barcelona is an estimated 69.5 SPECfp_rate2006.
Note this has not been submitted; it is therefore marked "estimated".
Added note: all SPEC members are allowed to post preliminary numbers and mark
them with the term "estimated". Given the velocity that AMD & Intel incorporate
chips into full systems, it won't be too long before we see submitted results.
I imagine they will use newer software when they submit systems as well.
IBM has a
submitted result for a 1-chip IBM POWER6 p 570 (4.7 GHz): 58.0 SPECfp_rate2006.
Clearly performance per core doesn't matter as everyone puts different numbers of cores
of different processing strengths and much different costs. So one has to look at
chips & more importantly system performance and know system price. Does anyone know
the system cost differential of 2-chip AMD quad-cores vs. 1-chip POWER6 IBM dual-cores?
Added note: We've all learned that the price per core can vary by orders of magnitude, with
IBM leading the industry in $/core by a huge amount. When IBM cites best performance per core, every customer should ask what the $/core is when configured with the memory they want.
You will be surprised by the comparison.
AMD's data can be found on slide 20:
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/July_2007_AMD_Analyst_Day_Randy_Allen_FINAL.pdf
Required Disclosure:
SPEC and the benchmark name SPECfp_rate2006 are registered trademarks of the Standard Performance
Evaluation Corporation. AMD Barcelona (two-chip, quad-core, 8 cores total) estimated
69.5 SPECfp_rate2006. IBM System p 570 POWER6 (1 chip, 2 cores/chip, 4 threads total,
4.7 GHz) of 58.0 SPECfp_rate2006 result. Results as of July 26, 2007. For latest
scores visit www.spec.org.
Thursday Jan 11, 2007
Sun Blade X8420 is 1.9x faster than the
best Intel Woodcrest system on SPECint_rate2006 and is also 2.1x faster than the best Intel
Woodcrest on SPECfp_rate2006. The Sun Blade X8420 is also 22% faster than 4-way Itanium2 dual-core on
SPECfp_rate.
Sun Blade X8420 delivered the best result with a SPECint_rate2006 score of 93.1, using the Solaris 10 and Studio 11 combo. The Sun Blade X8420 also
delivered the best result of 87.3 on the SPECfp_rate2006
benchmark for all x86 systems.
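The "times faster" figures can be reproduced from the peak scores in the charts below. This is a minimal sketch; `speedup` is just a convenience name I picked, and the competitor scores are the Woodcrest and Itanium2 results listed in this post.

```python
def speedup(ours: float, theirs: float) -> float:
    """How many times faster `ours` is than `theirs`."""
    return ours / theirs


# Peak scores from the charts in this post.
x8420_int, woodcrest_int = 93.1, 50.3   # SPECint_rate2006
x8420_fp, woodcrest_fp = 87.3, 42.5     # SPECfp_rate2006
itanium2_fp = 71.4                      # HP rx6600, SPECfp_rate2006

print(f"vs. Woodcrest (int): {speedup(x8420_int, woodcrest_int):.1f}x")
print(f"vs. Woodcrest (fp):  {speedup(x8420_fp, woodcrest_fp):.1f}x")
print(f"vs. Itanium2 (fp):   {(speedup(x8420_fp, itanium2_fp) - 1) * 100:.0f}% faster")
```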
SPEC CPU2006 Performance Charts (bigger is better, selected recent results)
SPECint_rate2006
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
|---|---|---|---|---|---|---|---|
| Sun Blade X8420 | AMD Opteron 8220 | 2.8 | 4 | 8 | 8 | 93.1 | 80.4 |
| Fujitsu CELSIUS R640 | Xeon 5160 (Woodcrest) | 3.0 | 2 | 4 | 4 | 50.3 | 48.8 |
| Sun Ultra 40 M2 | AMD Opteron 2220SE | 2.8 | 2 | 4 | 4 | 48.8 | 41.9 |
| HP DL585 | Opteron 854 | 2.8 | 4 | 4 | 4 | 46.9 | 41.4 |
| Supermicro X7DBE | Xeon 5160 (Woodcrest) | 3.0 | 2 | 4 | 4 | -- | 45.2 |
| Sun Fire X4200 | Opteron 285 | 2.6 | 2 | 4 | 4 | 42.8 | 37.8 |
| Fujitsu RX220 | Opteron 280 | 2.4 | 2 | 4 | 4 | 40.0 | 35.7 |
| Sun Fire X4200 | Opteron 256 | 3.0 | 2 | 2 | 2 | 26.4 | 23.1 |
| HP DL585 | Opteron 854 | 2.8 | 2 | 2 | 2 | 25.2 | 22.3 |
| Dell PrecWork 380 | Pentium EE | 3.73 | 1 | 2 | 2 | -- | 23.1 |
| HP DL380 G4 | Pentium 4 | 3.8 | 2 | 2 | 2 | -- | 20.9 |
SPECfp_rate2006
| System | Processor Type | GHz | Chips | Cores | Threads | Peak | Base |
|---|---|---|---|---|---|---|---|
| Sun Blade X8420 | AMD Opteron 8220 | 2.8 | 4 | 8 | 8 | 87.3 | 82.5 |
| HP rx6600 | Itanium2 dual-core | 1.6 | 4 | 8 | 8 | 71.4 | 69.1 |
| HP DL585 | Opteron 854 | 2.8 | 4 | 4 | 4 | 49.3 | 45.6 |
| FSC CELSIUS R640 | Intel Xeon 5160 (Woodcrest), WinXP Pro | 3.0 | 2 | 4 | 4 | 42.5 | 41.4 |
| Sun Fire X4200 | Opteron 285 | 2.6 | 2 | 4 | 4 | 38.1 | 36.0 |
Results as of 09 Jan 2007 from www.spec.org.
Benchmark Description
SPEC CPU2006 is made up of two suites of benchmarks, CFP2006 and
CINT2006. CFP2006 targets floating-point performance, while CINT2006
targets integer performance.
Each suite has two different measures. First is the CPU measure, which
is the performance on the suite as a single stream. This can be either
a single thread or automatic compiled parallel run. This measure is
further defined by base and optimized runs. Base uses the same compiler
flags for all kernels, where optimized is allowed to use different
compiler flags for each kernel. Results are compared against a baseline
system run that was standardized by SPEC.
The second measure is Rate. It is a measure of how many CPU measures
can be run at a time. Typically, it is run as n processes on n
processors. It shows how well the same job mix can run on a system
under some load. It also is run as a base and optimized set of
results.
Disclosure Statement:
SPEC, SPECint reg tm of Standard Performance Evaluation Corporation.
Results from www.spec.org as of 1/9/07.
Sun Blade X8420 (AMD Opteron 8220, 4chips/8cores, Solaris 10) 93.1 SPECint_rate2006.
Sun Blade X8420 (AMD Opteron 8220, 4chips/8cores, Solaris 10) 87.3 SPECfp_rate2006.
Results Summary
| Results | |
|---|---|
| X8420 | 93.1 SPECint_rate2006 |
| X8420 | 87.3 SPECfp_rate2006 |
| Reference Date: | Jan 09, 2007 |
| System: | Sun Blade X8420, 64GB memory |
| Processors: | four 2.8 GHz Opteron 8220 |
| Software: | Solaris 10, Sun Studio 11 |