Tuesday Aug 19, 2008
The Sun Fire X4600 M2 (8 Opteron 2.5 GHz QC)
running Sun Java SE 6 Update 6-p achieved a result of 683542 SPECjbb2005 bops,
85443 SPECjbb2005 bops/JVM for the best score for all x86 based
servers on the SPECjbb2005 benchmark.
The Sun Fire X4600 M2 demonstrated 53% better performance over the
Dell PowerEdge R900 result of 446209 SPECjbb2005 bops, 55776
SPECjbb2005 bops/JVM which used 4 Intel Xeon quad-core processors at
2.93 GHz and the BEA JRockit JDK 1.6.0_02.
The Sun Fire X4600 M2 (8-chip) demonstrated 3% better performance over the
IBM p570 result of 664167 SPECjbb2005 bops, 83021
SPECjbb2005 bops/JVM which used 8 Power6 dual-core processors.
Note: An IBM blogger made a snarky comment about the fact that Sun "should know better". Sun clearly pointed out that this result is for 8-chip systems. No other vendor posted results on x86 systems with this many chips on this benchmark, so Sun compared to 4-chip x86 results. Maybe IBM would like to publish the price of their (16 RU) POWER6 8-chip system compared to the Sun X4600 M2 as configured for this benchmark? ...or is it easier to ignore system price and to confuse people by pointing to core count, thereby dodging the cost-per-core issue?
The Sun Fire X4600 M2 used Solaris 10 5/08 and Sun JDK 1.6.0_06 Performance
Release to obtain this leading result.
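The percentages quoted above can be rechecked from the published scores. The snippet below is only a quick sanity check; the helper name `percent_better` is made up for illustration and is not part of any SPEC tooling.

```python
# Scores are the SPECjbb2005 bops figures quoted in this post.
# percent_better is a hypothetical helper name used only for this check.
def percent_better(score: float, baseline: float) -> float:
    """Return how much higher `score` is than `baseline`, in percent."""
    return (score / baseline - 1.0) * 100.0

sun_x4600_m2 = 683542
dell_pe_r900 = 446209
ibm_p570 = 664167

vs_dell = percent_better(sun_x4600_m2, dell_pe_r900)
vs_ibm = percent_better(sun_x4600_m2, ibm_p570)

print(f"Sun Fire X4600 M2 vs Dell PE R900: {vs_dell:.0f}% better")
print(f"Sun Fire X4600 M2 vs IBM p570: {vs_ibm:.0f}% better")
```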
SPECjbb2005 Performance Chart (ordered by performance)
bops: SPECjbb2005 Business Operations per Second (bigger is better)
| System | Chips, Cores, Threads | GHz Type | SPECjbb2005 bops | SPECjbb2005 bops/JVM |
| Sun Fire X4600 M2 | 8,32,32 | 2.5 8360SE | 683542 | 85443 |
| IBM p570 | 8,16,32 | 4.7 POWER6 | 664167 | 83021 |
| Dell PE R900 | 4,16,16 | 2.93 X7350 | 446209 | 55776 |
Complete benchmark results may be found at the SPEC benchmark website http://www.spec.org.
Benchmark Description
SPECjbb2005 (Java Business Benchmark) measures the performance of a Java implemented application tier (server-side Java). The benchmark is based on the order processing in a wholesale supplier application. The performance of the user tier and the
database tier are not measured in this test. The metrics given are number of SPECjbb2005 bops (Business Operations per Second) and SPECjbb2005 bops/JVM (bops per JVM instance).
Disclosure Statement:
SPEC and SPECjbb are registered trademarks of the Standard Performance Evaluation
Corporation. Results as of 8/7/2008 on www.spec.org.
Sun Fire X4600 M2(8 chips, 32 cores) 683542 SPECjbb2005 bops, 85443 SPECjbb2005 bops/JVM submitted for review. Dell PE R900(4 chips, 16 cores) 446209 SPECjbb2005 bops, 55776 SPECjbb2005 bops/JVM. IBM p570 (8 chips, 16 cores) 664167 SPECjbb2005 bops, 83021 SPECjbb2005 bops/JVM.
Results Summary
| Reference Date: | Aug 8, 2008 |
| Results | 683542 SPECjbb2005 bops, 85443 SPECjbb2005 bops/JVM |
| System: | Sun Fire X4600 M2 |
| Processor: | 8 x AMD Opteron 8360SE 2.5 GHz |
| Operating System: | Solaris 10 5/08 |
| JVM: | Java HotSpot(TM) 32-Bit Server, Version 1.6.0_06-p |
Monday Aug 18, 2008
At lunch today one of my co-workers let me know about the Environmental Working Group's list of 44 fruits and vegetables ranked by the amount of pesticide residue that each contains. You may be better off buying those on the dirty dozen from a place that grows them organically.
The EWG's "dirty dozen":
- peaches,
- apples,
- sweet bell peppers,
- celery,
- nectarines,
- strawberries,
- cherries,
- lettuce,
- grapes (imported),
- pears,
- spinach, and
- potatoes.
The "cleanest 12" are:
- onions,
- avocados,
- sweet corn (frozen),
- pineapples,
- mangos,
- sweet peas (frozen),
- asparagus,
- kiwis,
- bananas,
- cabbage,
- broccoli, and
- eggplants.
A wallet guide can be found here:
http://www.foodnews.org/walletguide.php
Friday Aug 08, 2008
Sun SPARC Enterprise T5220 / T5240 beats IBM Cell Broadband Engine with
significantly easier application code development!
Pattern matching and string searching are important to a variety of
commercial, government and HPC applications. One of the core
functions needed for text identification algorithms in data
repositories is real-time string searching. For this benchmark, both
IBM and Sun used the Aho-Corasick algorithm for string searching.
Note: Got this from an internal website on info that is going public.
The 2-chip Sun SPARC Enterprise T5240 performed string searching at a
rate of 6.12 GB/s (49.0 Gbit/s), whereas the 2-chip IBM Cell Broadband Engine
DD3 Blade performed string searching at a rate of 0.48 GB/s (3.8 Gbit/s).
The 1-chip Sun SPARC Enterprise T5220 performed string searching at a
rate of 3.08 GB/s (24.6 Gbit/s).
The Sun SPARC Enterprise T5240 demonstrated a 2x speedup over the Sun
SPARC Enterprise T5220.
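For anyone who wants to check the unit conversion and the speedup, here is a small sketch. It assumes an 8-bit byte, which is what the benchmark description in this post also assumes; the function and variable names are illustrative.

```python
BITS_PER_BYTE = 8  # assumption stated in the benchmark description: 8-bit bytes

def gbytes_to_gbits(gb_per_s: float) -> float:
    """Convert a throughput in GB/s to Gbit/s."""
    return gb_per_s * BITS_PER_BYTE

t5240_gbits = gbytes_to_gbits(6.12)  # 2-chip Sun SPARC Enterprise T5240
t5220_gbits = gbytes_to_gbits(3.08)  # 1-chip Sun SPARC Enterprise T5220
cell_gbits = gbytes_to_gbits(0.48)   # 2-chip IBM Cell Broadband Engine DD3 Blade

scaling = t5240_gbits / t5220_gbits  # T5240 relative to T5220
vs_cell = t5240_gbits / cell_gbits   # T5240 relative to the Cell blade

print(f"T5240: {t5240_gbits:.1f} Gbit/s")
print(f"T5220: {t5220_gbits:.1f} Gbit/s")
print(f"Cell:  {cell_gbits:.1f} Gbit/s")
print(f"T5240 / T5220: {scaling:.2f}x")
print(f"T5240 / Cell:  {vs_cell:.2f}x")
```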
The Aho-Corasick algorithm as deployed on the IBM Cell Broadband
Engine DD3 Blade required substantial optimization and tuning to achieve the
reported performance, whereas on the Sun SPARC Enterprise T5220 or T5240
only a basic implementation of the algorithm and a simple compilation were
needed.
Performance Summary
| System | Throughput (GBits/sec) | Chips | Cores | GHz |
| Sun SPARC Enterprise T5240 | 49.0 | 2 | 16 | 1.4 |
| Sun SPARC Enterprise T5220 | 24.6 | 1 | 8 | 1.4 |
| IBM Cell Broadband Engine DD3 Blade | 3.8 | 2 | 16 | 3.2 |
IBM results are obtained from Figure 7(d) of
IEEE Computer, Volume 41, Number
4, pp. 42-50, April 2008. Sun benchmark results as of 08/05/2008.
Benchmark Description
One of the core functions needed for text identification algorithms in data
repositories is real-time string searching. This string searching benchmark
demonstrates the usefulness of Sun's UltraSPARC T2 and T2 Plus processors for
both ease of code creation and speed of code execution.
In IEEE Computer, Volume 41, Number 4, pp. 42-50, April 2008, IBM describes a
variant of the Aho-Corasick string searching algorithm that uses deterministic
finite automata. The algorithm first constructs a graph that represents a
dictionary, then walks that graph using successive input characters from a text
file. Each "state" in the graph includes a state transition table (STT)
that is accessed using the next input character from the text file to
determine the address of the next state in the graph. IBM defines an automaton
as a two-step loop that: (1) obtains the address of the next state from the
STT, and (2) fetches the next state in the graph.
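To make the description above concrete, here is a minimal textbook sketch of an Aho-Corasick automaton with a dense state transition table. It is not IBM's or Sun's code (Sun's implementation was written in ANSI C with OpenMP); the function names, the data layout, and the tiny dictionary are assumptions chosen for illustration.

```python
from collections import deque

def build_automaton(words):
    """Build a dense Aho-Corasick DFA: one STT row (dict) per state."""
    alphabet = sorted({ch for word in words for ch in word})

    # Phase 1: build a trie of the dictionary.
    goto = [{}]     # goto[state][ch] -> child state (trie edges only)
    out = [set()]   # out[state] -> dictionary words recognised at this state
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)

    # Phase 2: breadth-first pass that fills failure links and completes the STT.
    fail = [0] * len(goto)
    stt = [dict() for _ in goto]
    queue = deque()
    for ch in alphabet:
        nxt = goto[0].get(ch, 0)
        stt[0][ch] = nxt
        if nxt:
            queue.append(nxt)  # depth-1 states fail back to the root
    while queue:
        state = queue.popleft()
        out[state] |= out[fail[state]]  # inherit matches that end here via suffixes
        for ch in alphabet:
            if ch in goto[state]:
                nxt = goto[state][ch]
                fail[nxt] = stt[fail[state]][ch]
                stt[state][ch] = nxt
                queue.append(nxt)
            else:
                stt[state][ch] = stt[fail[state]][ch]
    return stt, out

def search(text, stt, out):
    """Walk the graph one input character at a time and collect (start, word) hits."""
    state = 0
    hits = []
    for pos, ch in enumerate(text):
        # The two-step loop: look up the next state in the STT, then move to it.
        state = stt[state].get(ch, 0)  # characters outside the alphabet reset to the root
        for word in out[state]:
            hits.append((pos - len(word) + 1, word))
    return hits

stt, out = build_automaton(["he", "she", "his", "hers"])
hits = search("ushers", stt, out)
print(sorted(hits))
```

Each text character costs exactly one table lookup, which is why the performance of this kind of search is dominated by how quickly the next state can be fetched from memory.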
IBM reports the performance of its Cell Broadband Engine (CBE) to execute this
algorithm to search a 4.4 MB version of the King James Bible using a dictionary
of the 20,000 most used words in the English language (average word length of
7.59 characters). Each of the 8 synergistic processing elements (SPEs) of each
of the two CBEs executes 16 automata, for a total of 256 automata.
All automata and hence all SPEs access a single, shared dictionary.
IBM describes elaborate optimizations of the Aho-Corasick algorithm, including
state shuffling, state replication, alphabet shuffling and state caching.
These optimizations were required to: (1) overcome "memory congestion", i.e.,
contention amongst the SPEs for access to the shared dictionary, and (2)
compensate for the limited local storage that is associated with each SPE.
These optimizations were necessary to achieve the performance reported for
the CBE DD3 Blade.
IBM does not provide references that indicate where to obtain the dictionary
and Bible. IBM reports the algorithmic performance in Gbits/s but does not
indicate whether an 8-bit byte is extended to 10 bits as required for network
transmission.
In order to closely approximate the dictionary and Bible that were used by IBM,
Sun used a dictionary of 25,144 English words (the OpenSolaris
file cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/spell/list)
for which the average word length is 8.22 characters, and a 4.6 MB version
of the King James Bible (www.patriot.net/users/bmcgin/kjv12.zip). For
reporting of results in Gbits/s, the length of a byte is assumed to be 8 bits.
In order to demonstrate the usefulness of Sun's UltraSPARC T2 and T2 Plus
processors for both ease of code generation and speed of code execution, Sun
implemented the Aho-Corasick algorithm using ANSI C. No optimizations of the
algorithm were required to achieve the performance reported for the T5220 and T5240.
The source code was compiled using the -m64 -xO3 and -xopenmp options. The
dictionary is represented using a graph that comprises 187 MB. Each core of
the T5220 or T5240 executes 8 automata using one OpenMP thread per automaton.
Thus, the T5220 executes 64 total automata and the T5240 executes 128 total
automata. All automata and hence all cores access a single, shared dictionary.
Access to this dictionary is accelerated by the large, shared L2 caches of the
Sun SPARC Enterprise T5220 and T5240.
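The automata counts above follow directly from the per-core (or per-SPE) figures. The sketch below recomputes them and, as a derived illustration only, divides the reported throughput by the number of automata. The dictionary layout and names are hypothetical.

```python
# All figures come from this post; "units" means cores (Sun) or SPEs (IBM).
systems = {
    "T5220": {"units": 8, "automata_per_unit": 8, "gbits": 24.6},
    "T5240": {"units": 16, "automata_per_unit": 8, "gbits": 49.0},
    "Cell DD3 Blade": {"units": 2 * 8, "automata_per_unit": 16, "gbits": 3.8},
}

# Total automata = execution units x automata per unit.
totals = {name: s["units"] * s["automata_per_unit"] for name, s in systems.items()}

# Derived figure (not reported in the post): throughput per automaton in Mbit/s.
per_automaton_mbits = {
    name: 1000.0 * s["gbits"] / totals[name] for name, s in systems.items()
}

for name in systems:
    print(f"{name}: {totals[name]} automata, "
          f"{per_automaton_mbits[name]:.1f} Mbit/s per automaton")
```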
Disclosure Statement:
Pattern Matching: Sun SPARC Enterprise T5240 (2 x 1.4 GHz UltraSPARC T2 Plus, 2 chips, 16 cores),
Solaris 10, Sun C 5.9, 49.0 GBits/sec;
Sun SPARC Enterprise T5220 (1 x 1.4 GHz UltraSPARC T2, 1 chip, 8 cores),
Solaris 10, Sun C 5.9, 24.6 GBits/sec;
IBM Cell Broadband Engine DD3 Blade (2 x 3.2 GHz Cell Broadband Engine,
2 chips, 16 cores), Linux kernel v2.6.16, IBM CBE Software Development Kit
v2.1, 3.8 GBits/sec.
System Configuration
| Throughput (GBits/sec) | 24.6 T5220 |
| | 49.0 T5240 |
| Reference Date: | August 5, 2008 |
| Systems: | Sun SPARC Enterprise T5220, T5240 |
| Total Number Processors: | 1, 2 |
| Processor/GHz of Server: | 1.4 GHz UltraSPARC T2, T2 Plus |
| Operating System: | Solaris 10 |
Thursday Jul 17, 2008
The Sun Fire X4450 server demonstrates Sun's position of leadership
in Java-based computing by publishing the first result ever for
the new SPECjvm2008 benchmark. The Sun Fire X4450 server delivered a result of 260.08 SPECjvm2008 Base ops/m.
Now we just need the other vendors (SPEC members who must have approved the benchmark) to step up and start publishing...
SPECjvm2008 Performance Chart (ordered by performance)
base: SPECjvm2008 Base ops/m (bigger is better)
peak: SPECjvm2008 Peak ops/m (bigger is better)
Ch/Co/Lc: Chips, Cores, Logical CPUs
| System | Ch | Co | Lc | GHz | Type | base | peak |
| Sun Fire X4450 | 4 | 16 | 16 | 2.933 | X7350 QC | 260.08 | - |
Benchmark Description
SPECjvm2008 (Java Virtual Machine Benchmark) is a benchmark suite for
measuring the performance of a Java Runtime Environment (JRE),
containing several real life applications and benchmarks focusing on
core Java functionality. The suite focuses on the performance of the
JRE executing a single application; it reflects the performance of the
hardware processor and memory subsystem, but has low dependence on file
I/O and includes no network I/O across machines. The SPECjvm2008
workload mimics a variety of common general purpose application
computations. These characteristics reflect the intent that this
benchmark will be applicable to measuring basic Java performance on a
wide variety of both client and server systems.
SPEC also finds user experience of Java important, and the suite
therefore includes startup benchmarks and has a required run category
called base, which must be run without any tuning of the JVM to improve
the out of the box performance.
SPECjvm2008 benchmark highlights:
- Leverages real life applications (like derby, sunflow, and javac) and area-focused benchmarks (like xml, serialization, crypto, and scimark).
- Also measures the performance of the operating system and hardware in the context of executing the JRE.
Disclosure Statement:
SPEC and SPECjvm are registered trademarks of the Standard Performance Evaluation Corporation.
Results as of 07/16/08 on www.spec.org.
Sun Fire X4450 260.08 SPECjvm2008 Base ops/m
Results Summary
| Certified Results |
| Performance: | 260.08 SPECjvm2008 Base ops/m |
| Reference Date: | July 8, 2008 |
| Systems: | Sun Fire X4450 |
| Total Number Processors: | 4 |
| Processor/ GHz of Server: | Intel Xeon X7350 QC 2.933 GHz |
| Operating System: | Solaris 10 5/08 |
| JVM: | Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.6.0_06 Performance Release |
Wednesday Jul 16, 2008
The Sun SPARC Enterprise M8000 server using the new
SPARC64 VII 2.52 GHz processor delivered a SPECompM2001 result of 104,714.
The Sun SPARC Enterprise M8000 server (2.52GHz SPARC64 VII processors)
beat the best posted IBM p570 result (4.7GHz POWER6) by 11% on the
SPECompM2001 benchmark.
The Sun SPARC Enterprise M8000 server (2.52GHz SPARC64 VII processors)
delivered a SPECompL2001 result of 581,807, the fastest result
using 16 chips or fewer.
The new 2.52GHz SPARC64 VII processors delivered 58% more performance for
the Sun SPARC Enterprise M8000 when compared to the SPARC64 VI 2.28GHz
processors as measured by the SPECompM2001 benchmark.
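The 11% and 58% figures can be rechecked against the peak SPECompM2001 results listed in the result landscape for this post. This is just a sanity check; the helper name is illustrative.

```python
# Peak SPECompM2001 results as listed in this post.
m8000_sparc64_vii = 104714  # Sun SE M8000, SPARC64 VII 2.52GHz
ibm_p570_power6 = 94350     # IBM p570, POWER6 4.7GHz
m8000_sparc64_vi = 66283    # Sun SE M8000, SPARC64 VI 2.28GHz

def percent_gain(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old` (hypothetical helper)."""
    return 100.0 * (new - old) / old

gain_vs_p570 = percent_gain(m8000_sparc64_vii, ibm_p570_power6)
gain_vs_vi = percent_gain(m8000_sparc64_vii, m8000_sparc64_vi)

print(f"M8000 (SPARC64 VII) vs IBM p570: {gain_vs_p570:.0f}%")
print(f"SPARC64 VII vs SPARC64 VI on the M8000: {gain_vs_vi:.0f}%")
```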
Benchmark Description
The SPEC OMPM2001 Benchmark Suite was released in June 2001 and
tests HPC performance using OpenMP for parallelism.
- 11 programs (3 in C and 8 in Fortran) parallelized using OpenMP API
Goals of suite:
- Targeted to mid-range (4-32 processor) parallel systems
- Run rules, tools and reporting similar to SPEC CPU2000
- Programs representative of HPC and Scientific Applications
Result Landscape SPECompM2001 (bigger is better, ordered by peak metric, representative results)
| Peak | Base | Cores | Chips | OpenMP Threads | System |
| 157880 | 148510 | 64 | 32 | 64 | IBM p5 p595, POWER5 2.3GHz |
| 104714 | 75418 | 64 | 16 | 127 | Sun SE M8000, SPARC64 VII 2.52GHz |
| 94350 | 84017 | 16 | 8 | 32 | IBM p570, POWER6 4.7GHz |
| 66283 | 59179 | 32 | 16 | 32 | Sun SE M8000, SPARC64 VI 2.28GHz |
| 56211 | 45275 | 16 | 8 | 32 | IBM p5-575, POWER5 1.9GHz |
| 46444 | 44164 | 32 | 16 | 32 | SGI Altix 4700, Itanium2 1.6GHz |
| 45895 | 35534 | 16 | 8 | 32 | IBM p5-560Q, POWER5+ 1.8GHz |
Results from www.spec.org as of 14 July 2008
SPECompL2001 (bigger is better,Results ordered by peak metric)
| Peak | Base | Cores | Chips | OpenMP Threads | System |
| 1456653 | 1250890 | 256 | 64 | 192 | Sun SE M9000, SPARC64 VII 2.52GHz |
| 1230446 | 1148235 | 128 | 64 | 128 | Sun SE M9000, SPARC64 VI 2.4GHz |
| 1056459 | 1005583 | 64 | 32 | 128 | IBM p5 595, POWER5 2.3GHz |
| 1005076 | 987139 | 256 | 128 | 256 | SGI Altix 4700, Itanium 2 1.6GHz |
| 672757 | 620741 | 64 | 32 | 128 | IBM p5 595, POWER5 1.9GHz |
| 581807 | 532576 | 64 | 16 | 64 | Sun SE M8000, SPARC64 VII 2.52GHz |
Results from www.spec.org as of 14 July 2008
Disclosure Statement:
SPEC and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation.
Results from www.spec.org as of 07/14/08. Sun results submitted to SPEC.
Sun SPARC Enterprise M8000 (64 cores, 16 chips, 64/127 OMP threads, 2.52GHz)
104714 SPECompM2001, 75418 SPECompMbase2001.
Sun SPARC Enterprise M8000 (32 cores, 16 chips, 32 OMP threads, 2.28GHz)
59179 SPECompMbase2001. IBM p 570 (16 cores, 8 chips, 32 OMP threads, 4.7GHz Power6) 94350 SPECompM2001.
SPEC and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation.
Results from www.spec.org as of 07/14/08. Sun results submitted to SPEC.
Sun SPARC Enterprise M8000 (64 cores, 16 chips, 64 OMP threads, 2.52GHz)
581807 SPECompL2001, 532576 SPECompLbase2001.
Results Summary
| Result | M8000: 581807 SPECompL2001 |
| | M8000: 104714 SPECompM2001 |
| Reference Date: | Jul 14, 2008 |
| System: | Sun SPARC Enterprise M8000 |
| Total Number Processors: | 16 |
| Total Memory: | 256 GB (128x2GB DIMMs) |
| Processor/GHz of Server: | SPARC64 VII, 2.52 GHz |
| Operating System: | Solaris 10 |
| Compiler: | Sun Studio 12 |
Wednesday Jul 16, 2008
The Sun SPARC Enterprise M9000 server using the new
SPARC64 VII 2.52 GHz processor delivered results
on the SPEC OMPL2001 benchmarks.
The Sun SPARC Enterprise M9000 server, powered by 2.52GHz SPARC64 VII
processors reset the World Record for SPECompL2001 with a result of
1,456,653 and a world record SPECompLbase2001 result of
1,250,890.
The Sun SPARC Enterprise M9000 server beats the 128-socket SGI Altix 4700 (Itanium 2 DC 1.6GHz) by 45% on SPECompL2001. There are no POWER6 results on this
benchmark at this scale. (Post a comment if I missed one on the SPEC website
and I will post a correction.)
Benchmark Description
The SPEC OMPM2001 Benchmark Suite was released in June 2001 and
tests HPC performance using OpenMP for parallelism.
- 11 programs (3 in C and 8 in Fortran) parallelized using OpenMP API
Goals of suite:
- Targeted to mid-range (4-32 processor) parallel systems
- Run rules, tools and reporting similar to SPEC CPU2000
- Programs representative of HPC and Scientific Applications
Result Landscape SPECompL2001 (bigger is better, Results ordered by peak metric)
| Peak | Base | Cores | Chips | OpenMP Threads | System |
| 1456653 | 1250890 | 256 | 64 | 192 | Sun SE M9000, SPARC64 VII 2.52GHz |
| 1230446 | 1148235 | 128 | 64 | 128 | Sun SE M9000, SPARC64 VI 2.4GHz |
| 1056459 | 1005583 | 64 | 32 | 128 | IBM p5 595, POWER5 2.3GHz |
| 1005076 | 987139 | 256 | 128 | 256 | SGI Altix 4700, Itanium 2 1.6GHz |
| 672757 | 620741 | 64 | 32 | 128 | IBM p5 595, POWER5 1.9GHz |
| 581807 | 532576 | 64 | 16 | 64 | Sun SE M8000, SPARC64 VII 2.52GHz |
Results from www.spec.org as of 14 July 2008
Disclosure Statement:
SPEC and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation.
Results from www.spec.org as of 07/14/08. Sun results submitted to SPEC.
Sun SPARC Enterprise M9000 (256 cores, 64 chips, 192/256 OMP threads, 2.52GHz)
1456653 SPECompL2001, 1250890 SPECompLbase2001.
SGI Altix 4700 (256 cores, 128 chips, 256 OMP threads, Itanium 2 1.6GHz)
1005076 SPECompL2001, 987139 SPECompLbase2001.
Results Summary
| Result | M9000: 1456653 SPECompL2001 |
| | 1250890 SPECompLbase2001 |
| Reference Date: | Jul 14, 2008 |
| System: | Sun SPARC Enterprise M9000 |
| Total Number Processors: | 64 |
| Total Memory: | 1 TB (512x2GB DIMMs) |
| Processor/GHz of Server: | SPARC64 VII, 2.52 GHz |
| Operating System: | Solaris 10 |
| Compiler: | Sun Studio 12 |
Tuesday Jul 15, 2008
ecogeek covers why MPG (miles/gallon) is a silly measurement and why gallons/mile
is a much better metric. They didn't draw the conclusion that watt/performance is also
the better measurement, but the same reasoning applies. For more, read:
http://www.ecogeek.org/content/view/1875/69/
http://blogs.sun.com/bmseer/entry/mpg_and_perf_watts_are
http://blogs.sun.com/bmseer/entry/miles_gal_perf_watt_use
Also read this interesting fact:
"At the moment, the world's data centres are estimated to consume about 14 gigawatts of power, and to be responsible for 2% of global carbon-dioxide emissions—roughly the same as air traffic." The Economist article.
So why do other companies only measure watts on slow low-GHz CPUs and tiny 8GB-16GB memory instead of measuring on a wide variety of benchmarks like Sun does? My guess (and some unofficial measurements) is that HP and IBM would lose. I'm still waiting for HP's 2.93GHz X64 and IBM Power6 (5GHz or 4.7GHz) system power measurements for instance...
Monday Jul 14, 2008
The Sun SPARC Enterprise M9000 server running 2.52GHz SPARC64 VII processors
delivered 2.023 TFLOPS on the Linpack HPC benchmark.
For single servers, the Sun SPARC Enterprise M9000 server outperforms the best IBM Power 595 5GHz POWER6 published result by two times on the Linpack HPC benchmark. This system is the largest that IBM makes for its 5GHz Power6-based servers.
A single Sun SPARC Enterprise M9000 server shows 2.7 times the performance
on the Linpack HPC benchmark when compared to the HP Integrity Superdome
Itanium 2 system.
The Sun Performance Library was enhanced to take advantage of the
SPARC64 VII architecture.
Benchmark Description
The Linpack benchmark suite measures the performance for factoring
and solving a dense set of linear equations in double-precision
floating-point.
The Linpack HPC benchmark allows the solution of any size
matrix with a single right hand side. It was developed to allow vendors
to show off their hardware. Because big problems allow for peak
performance potentials, the benchmark is seen as an upper bound of
potential performance of a machine. The run rules are much more
flexible. The solution technique must use a pivoting scheme and the
driver must follow the spirit of the Linpack 1000 or Linpack 100
benchmarks.
LINPACK HPC Performance Chart - GFLOPS (bigger is better)
Table below does not include clustered solutions.
| System | GFLOPS Total | GFLOPS Peak | Threads | CPUs | Type | GHz |
| Sun SPARC Enterprise M9000 | 2023.0 | 2580.5 | 256 | 64 | SPARC64 VII | 2.52 |
| Sun SPARC Enterprise M9000 | 1032.0 | 1228.8 | 128 | 64 | SPARC64 VI | 2.4 |
| IBM Power 595 | 1028.0 | 1280.0 | 64 | 32 | POWER6 | 5.0 |
| HP Superdome | 745.5 | 819.2 | 128 | 64 | Itanium 2 | 1.6 |
| Sun SPARC Enterprise M8000 | 548.2 | 645.1 | 64 | 16 | SPARC64 VII | 2.52 |
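The "two times" and "2.7 times" comparisons, and how close each system comes to its listed peak, can be recomputed from the chart. The sketch below assumes the "Total" column is the measured Linpack HPC result and the "Peak" column is the theoretical peak; that reading of the column labels is an assumption, and the variable names are illustrative.

```python
# (measured GFLOPS, peak GFLOPS) as listed in the chart in this post.
results = {
    "Sun M9000 SPARC64 VII": (2023.0, 2580.5),
    "Sun M9000 SPARC64 VI": (1032.0, 1228.8),
    "IBM Power 595": (1028.0, 1280.0),
    "HP Superdome": (745.5, 819.2),
    "Sun M8000 SPARC64 VII": (548.2, 645.1),
}

# Ratio of the new M9000 result to the IBM and HP results.
m9000 = results["Sun M9000 SPARC64 VII"][0]
vs_ibm = m9000 / results["IBM Power 595"][0]
vs_hp = m9000 / results["HP Superdome"][0]

# Fraction of the listed peak that each system actually delivered.
efficiency = {name: total / peak for name, (total, peak) in results.items()}

print(f"M9000 vs IBM Power 595: {vs_ibm:.2f}x")
print(f"M9000 vs HP Superdome: {vs_hp:.2f}x")
for name, eff in efficiency.items():
    print(f"{name}: {eff:.1%} of peak")
```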
Disclosure Statement:
Linpack HPC, results from http://www.netlib.org/benchmark/index.html
as of 07/01/08. Sun SPARC Enterprise M9000 (SPARC64 VII @2.52, 64 chips,
256 cores), 2.023 TFLOPS. IBM Power 595 (POWER6 5.0GHz, 32 chips, 64 cores)
1028.0 GFLOPS. HP Superdome (Itanium 2 1.6GHz/24MB, 64 chips,
128 cores) 745.5 GFLOPS.
Linpack HPC, results from http://www.netlib.org/benchmark/index.html
as of 04/13/07. Sun SPARC Enterprise M9000 (SPARC64 VI @2.4, 64 chips,
128 cores), 1.032 TFLOPS. IBM p5 595 (POWER5 1.9GHz, 32 chips, 64 cores)
418.0 GFLOPS. HP Superdome (Itanium 2 1.6GHz/24MB, 64 chips, 128 cores)
745.5 GFLOPS.
Results Summary
SAE (Strategic Applications Engineering) has submitted results
for the LINPACK HPC benchmark
| Published Results |
| Performance: | 2.023 TFLOPS |
| System: | Sun SPARC Enterprise M9000 |
| Total Number Processors: | 64 |
| Processor/GHz of Server: | SPARC64 VII, 2.52 GHz |
| Operating System: | Solaris 10 |
| Compiler: | Sun Studio 12 |
Monday Jul 14, 2008
The Sun SPARC Enterprise M9000 (64 processors, 256 cores, 512 threads) set a World Record for the SAP-SD 2-Tier Standard Application benchmark. World Record SAP-SD 2-Tier: Sun SPARC Enterprise M9000 SPARC64 VII SAP-SD 2-Tier ERP 6.0 (2005) outperforms largest IBM Power 595 / 5GHz POWER6.
The 64-way Sun SPARC Enterprise M9000 with 2.52 GHz SPARC64 VII processors achieved 39,100 users on the two-tier SAP Sales and Distribution (SD) standard SAP ERP 2005 application benchmark.
The 64-way Sun SPARC Enterprise M9000 beat the 32-way IBM Power 595 (5GHz 64-core Power6) by 10%. This is the largest configuration that IBM makes. IBM has a very different and very complicated core. Users should compare hardware system costs for these two systems.
The IBM p595 achieved 35,400 users on SAP-SD 2005 6.0 (177,950 SAPS, 5,561 SAPS/proc, 08-Apr-08).
The 64-way Sun SPARC Enterprise M9000 beat the 64-way HP Integrity Superdome by 30%.
The HP Integrity Superdome achieved 30,000 users on SAP-SD 2005 6.0 (152,530 SAPS, 2,383 SAPS/proc, 18-Dec-06).
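The user-count comparisons and the SAPS/proc figures can be rechecked from the numbers in this post. A small sketch, with illustrative names; "procs" here is the processor (chip) count used in the post's performance table.

```python
# Users, SAPS and processor counts as quoted in this post.
systems = {
    "Sun M9000 (SPARC64 VII)": {"users": 39100, "saps": 196564, "procs": 64},
    "IBM Power 595 (POWER6)": {"users": 35400, "saps": 177950, "procs": 32},
    "HP Integrity Superdome": {"users": 30000, "saps": 152530, "procs": 64},
}

sun = systems["Sun M9000 (SPARC64 VII)"]

# Percentage by which the M9000 user count exceeds each competitor.
lead_vs_ibm = 100.0 * (sun["users"] / systems["IBM Power 595 (POWER6)"]["users"] - 1)
lead_vs_hp = 100.0 * (sun["users"] / systems["HP Integrity Superdome"]["users"] - 1)

# SAPS divided by the number of processors, rounded to whole SAPS.
saps_per_proc = {name: round(s["saps"] / s["procs"]) for name, s in systems.items()}

print(f"M9000 lead over IBM Power 595: {lead_vs_ibm:.0f}%")
print(f"M9000 lead over HP Superdome: {lead_vs_hp:.0f}%")
for name, value in saps_per_proc.items():
    print(f"{name}: {value} SAPS/proc")
```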
SAP-SD 2-Tier ERP 6.0 (2005) Benchmark Description
The SAP Standard Application SD (Sales and Distribution) Benchmark is a
two-tier ERP business test that is indicative of full business workloads
of complete order processing and invoice processing, and demonstrates the
ability to run both the application and database software on a single
system. The SAP Standard Application SD Benchmark represents the critical
tasks performed in real-world ERP business environments.
SAP is one of the premier world-wide ERP application providers, and maintains
systems on the various SAP products.
SAP-SD 2-Tier Performance Table (in decreasing performance order).
| System | OS, Database | Users | SAP ERP/ECC Release | SAPS | SAPS/Proc | Date |
| Sun SPARC Enterprise M9000, 64xSPARC64 VII @2.52GHz, 1024 GB | Solaris 10, Oracle 10g | 39,100 | 2005 6.0 | 196,564 | 3,071 | 14-Jul-08 |
| IBM Power 595, 32xPOWER6 @5.0GHz, 64 cores, 512 GB | AIX 6.1, DB2 9.5 | 35,400 | 2005 6.0 | 177,950 | 5,561 | 08-Apr-08 |
| HP Integrity SD64B, 64xItanium2 @1.6GHz, 128 cores, 512 GB | HP-UX 11iV3, Oracle 10g | 30,000 | 2005 6.0 | 152,530 | 2,383 | 18-Dec-06 |
| Sun SPARC Enterprise M9000, 64xSPARC64 VI @2.4GHz, 1024 GB | Solaris 10, Oracle 10g | 25,130 | 2005 6.0 | 129,420 | 2,022 | 11-Jul-08 |
| IBM p5 595, 64xPOWER5+ @2.3GHz, 64 cores, 512 GB | AIX 5.3, DB2 9 | 23,456 | 2004 5.0 | 117,520 | 1,836 | 25-Jul-06 |
| Sun SPARC Enterprise M8000, 16xSPARC64 VI @2.4GHz, 256 GB | Solaris 10, Oracle 10g | 7,300 | 2005 6.0 | 36,570 | 2,285 | 17-Apr-07 |
Complete benchmark results may be found at the SAP benchmark website
http://www.sap.com/benchmark.
Disclosure Statement:
Two-tier SAP Sales and Distribution (SD) standard SAP ERP 2004/2005 application benchmark
as of 07/14/08:
Sun SPARC Enterprise M9000 (64 processors, 256 cores, 512 threads) 64 x 2.52 GHz SPARC64 VII,
1024GB memory, 39,100 SD benchmark users, 1.93 sec. avg. response time,
Cert#2008042, Oracle 10g, Solaris 10, SAP ECC Release 6.0;
Sun SPARC Enterprise M9000 (64 processors, 128 cores, 256 threads) 64 x 2.4 GHz SPARC64
VI, 1024GB memory, 25,130 SD benchmark users, 1.65 sec. avg. response time,
Cert#2008040, Oracle 10g, Solaris 10, SAP ECC Release 6.0;
Sun SPARC Enterprise M8000 (16 processors, 32 cores, 64 threads) 16 x 2.4 GHz SPARC64 VI,
256GB memory, 7,300 SD benchmark users, 1.98 sec. avg. response time, Cert#2007026,
Oracle 10g, Solaris 10, SAP ECC Release 6.0;
IBM Power 595 (32 processors, 64 cores, 128 threads),
35,400 SD benchmark users, 32 x 5.0 GHz POWER6, 512 GB,
DB2 9.5, AIX 6.1, Cert. 2008019, SAP ECC Release 6.0;
IBM System p5 595 (64 processors, 64 cores, 128 threads),
23,456 SD benchmark users, 64 x 2.3 GHz POWER5+, 512 GB,
DB2 9, AIX 5.3, Cert. 2006045, SAP ECC Release 5.0;
HP Integrity SD64B (64 processors, 128 cores, 256 threads),
30,000 SD benchmark users, 64 x 1.6 GHz Dual-Core Intel Itanium 2, 512 GB,
Oracle 10g, HP-UX 11iV3, Cert#2006089, SAP ECC Release 6.0;
SAP, R/3 and mySAP are registered trademarks of SAP AG in Germany and other countries.
More info www.sap.com/benchmark.
Sun's submitted results for the SAP-SD 2-Tier benchmark
| Certified Results |
| Performance: | 39,100 benchmark users |
| Server: | Sun SPARC Enterprise M9000 |
| Processors: | 64 x 2.52 GHz SPARC64 VII |
| Memory: | 1024 GB |
| Operating system: | Solaris 10 |
| Database S/W: | Oracle 10g |
| SAP S/W: | SAP ECC 6.0 |
| SAP Certification: | #2008042 |
| Storage: | 1 x Internal System Disk, 8 x Sun StorageTek(tm) 6140 Arrays |
Monday Jun 23, 2008
Last Friday I blogged about an article on Duke University's Larrick & Soll's research:
Posting a vehicle’s fuel efficiency in “gallons per mile” (GPM) rather than “miles per gallon” (MPG) would help consumers make better decisions about car purchases and environmental impact, researchers from Duke University’s Fuqua School of Business report in the June 20 issue of Science magazine.
The main issue is that people usually make comparisons by linear improvement in miles/gallon, but this leads most to errors. Switching to gallons/mile (and as
I said for servers watt/performance) avoids these problems.
If one does the calculations correctly, of course, it doesn't matter, but on a quick look one can be misled. For example (do this quickly!), if one climbs a hill for
10 miles getting 10 mpg and then coasts down the hill for 10 miles getting 100 mpg, how many mpg does one average? If you didn't come up with an answer of
18 mpg (or nearly double the uphill rate), then you should consider looking at the reciprocal calculation.
If on that same hill that same car uses 1 gal/10 miles (10 mpg) uphill and 0.1 gal/10 miles (100 mpg) downhill, then it is easy to see that coasting downhill can only come close to doubling your fuel efficiency. Even if you doubled your fuel efficiency on the downhill section to 200 mpg (0.05 gal/10 miles), you can see that your average fuel efficiency doesn't change much.
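The hill example works out as follows when you add up gallons rather than averaging mpg. This is a small sketch of the arithmetic only; the function name is illustrative.

```python
def average_mpg(legs):
    """Average fuel economy over several (miles, mpg) legs.

    Gallons add up across legs, so divide total miles by total gallons
    instead of averaging the mpg figures.
    """
    total_miles = sum(miles for miles, _ in legs)
    total_gallons = sum(miles / mpg for miles, mpg in legs)
    return total_miles / total_gallons

hill = [(10, 10), (10, 100)]          # 10 miles uphill at 10 mpg, 10 miles down at 100 mpg
better_coast = [(10, 10), (10, 200)]  # same hill, downhill economy doubled to 200 mpg

naive = (10 + 100) / 2                # the tempting but wrong average of the mpg figures
actual = average_mpg(hill)
actual_200 = average_mpg(better_coast)

print(f"naive average: {naive:.0f} mpg")
print(f"actual average: {actual:.2f} mpg")
print(f"with 200 mpg downhill: {actual_200:.2f} mpg")
```

The uphill leg burns 1 gallon no matter what happens on the way down, so the downhill leg can only shave a fraction off the 1.1 gallons total.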
As I've said before, on servers it is also critical to understand watt/performance on a wide variety of benchmarks, and Sun understands this. This way you avoid benchmarks where vendors only highlight small-memory and low-GHz configurations.
Finally increase your server utilisation (even a small amount) and closely look at power-performance (watt/perf).
Friday Jun 20, 2008
Miles/gallon is just as misleading to consumers! Remember when I said perf/watt is misleading? How do we all avoid these 'math illusions'? Duke University researchers tell us this is simple: just "flip 'em".
Posting a vehicle’s fuel efficiency in “gallons per mile” (GPM) rather than “miles per gallon” (MPG) would help consumers make better decisions about car purchases and environmental impact, researchers from Duke University’s Fuqua School of Business report in the June 20 issue of Science magazine.
Video of Larrick & Soll discussing their research:
click here
Article on Larrick & Soll’s research, which was funded by Duke University.
Check out the above video. You can see that people try to judge by linear improvement in miles/gallon, but this is very misleading. They recommend that we switch to gallons/mile!
Remember back in March 2007, when I said the metric is watt/performance and not perf/watt? http://blogs.sun.com/bmseer/entry/power_efficiency_metrics_clearing_up. Time for SPEC to reconsider their metrics, and only allow default settings to be measured in benchmarks (if power management is not on by factory default, it should NOT be measured in a test). That way customers are best served.
Improving inefficient cars saves a lot of gas, the same valid reasoning shows improving %utilisation IS the big win especially when coupled with efficient servers.
Nothing like a little vindication to start the weekend, OK it's getting late, cya next week
For a table on savings at different miles/gallon see:
http://www.fuqua.duke.edu/news/mpg/table.pdf
Friday Jun 20, 2008
Sun is talking more about its IB switches. The large-scale switch is called
the Sun Datacenter Switch 3456. You may have seen its internal Sun code name "Magnum."
Sun uses some of these highlights about the Sun Datacenter Switch 3456:
- Ideally suited for the Sun Blade 6048 modular system to deliver an open PetaScale architecture
- Highly scalable supporting up to four Sun Datacenter Switches and up to 13,824 server nodes
- Replaces 300 discrete InfiniBand switches and thousands of cables with a single core switch
- A 3:1 reduction of physical ports and cables for server connectivity
The smallest IB switch is called
the Sun Datacenter Switch 3x24. You may have seen its internal Sun code name "NanoMagnum."
Sun uses some of these highlights about the Sun Datacenter Switch 3x24:
- When combined with the Sun Blade 6048 modular system, this 1RU 19" InfiniBand switch delivers a high performance switching solution for two up to 288 blade servers
- Extremely low latency using industry-standard IB transport, and commodity processors from AMD, Intel, and Sun
- Substitutes competitive 12RU solutions and hundreds of cables with 1/3 number of cables and only occupies up to four rack units for clusters of up to 288 blade servers
If you want to see how you can use this switch to connect six Sun Blade 6048 racks, see this blog:
http://blogs.sun.com/simons/entry/inside_nano_magnum_the_sun.
Wednesday Jun 18, 2008
Today Sun announced its powerful new Sun Blade X6450 server module at the International Supercomputing Conference (ISC) in Dresden, Germany (blog pics here).
sun.com Sun Blade X6450 features
Sun writes: "The Sun Blade X6450 is the newest in the 6000 series server modules, and brings the Sun Constellation System to the next level of performance through enhanced features that include up to four Intel Xeon dual- or quad-core processors, an optional 16GB Compact Flash storage subsystem, 24 DIMM slots and 110Gbps I/O throughput. This potent technology combination can deliver up to seven teraflops of performance per fully populated Sun Blade 6048 chassis and up to 71% more compute cores."
I'm sure http://blogs.sun.com/HPC/ will write more about ISC in Dresden.
Thursday Jun 12, 2008
The IBM Power 595 reached over 6 million tpmC on the TPC-C benchmark,
but IBM avoids single-system TPC-H like the plague, why? Why didn't IBM measure and publish server watts actually used on this benchmark? Did that 4TByte of memory flame their power meters?
{postscript: an IBM blogger says it is Sun speaking and then points to this blog. No, these are the BM Seer's opinions (yes, I am a Sun employee), and they don't necessarily represent Sun or Sun's management. I'm glad Sun doesn't post on the above-mentioned benchmark. It is worthless. Sun publishes on most benchmarks, I'd say more than IBM. To see a huge list of very reasonable benchmarks avoided by IBM on the POWER6 servers, see:
blogs.sun.com/bmseer/entry/they_tried_to_make_ibm}
It is no mystery that my opinion is that the 16-year old TPC-C benchmark has been worthless for at least a decade. It isn't the fact that TPC-C is old but that it does not represent databases today (did it even then?).
Has IBM just optimized solely for TPC-C on hyper-expensive cores? Their engineers basically admit extreme benchmark optimization:
http://blogs.sun.com/bmseer/entry/careful_reading_shows_a_lot
http://blogs.sun.com/bmseer/tags/tpc-c
It is simplistic, small, encourages silly configs, even honest people in IBM admitted a year ago that it is losing relevance:
ftp://ftp.software.ibm.com/eserver/benchmarks/wp_TPC-E_Benchmark_022307.pdf
Even IBM admits in the paper above, "TPC-C configurations do not reflect typical client configurations." They go on to call "Ease of partitioning: Unrealistically easy". Also all referential integrity for every table is turned OFF!
"The TPC-C benchmark is comprised of 5 stored procedure calls: New-Order, Payment, Delivery, Order-Status and Stock-Level." see this Microsoft blog from over a year ago. FIVE, Five, really only five - a huge server doing only 5 very-very simple things on 9 tables. No one in the world has a database that looks like this - it is really useless.
IBM and other vendors keep pushing TPC-C for bragging rights. They spend a huge effort telling customers that they need it.
What's next? IBM re-hyping other ancient benchmarks like Dhrystones as the most relevant benchmark for POWER6?
Disclosure Information:
IBM Power 595 (5 GHz, 32 chips, 64 cores, 128 threads) with IBM DB2 9.5 TPC-C result of 6,085,166 tpmC ($2.81/tpmC, configuration available 12/10/08)
Results as of 6/10/08, see www.tpc.org. TPC-C, TPC-H, TPC-E are trademarks of the Transaction Processing Performance Council (TPC).
Wednesday Jun 11, 2008
Thick versus Thin Clients: Today online at 4PM(east)/1PM(pac) there will be a debate to discuss the energy use and TCO of a thin client model versus their thicker alternatives. See: http://blogs.intel.com/technology/2008/06/ecotechnology_great_debates_at.php
I'm sure the "Think Thin" blog will have follow-up thoughts on this afterwards at: blogs.sun.com/ThinkThin
Desktop systems have a very different usage model (in most people's work environments they are mostly idle or at very low utilisation), so thin usually wins. Servers are a different beast, and you really want to run fewer servers at high utilisation.