BM Seer Unofficial thoughts from an anonymous Sun employee

Sun's New World Record on SPECcpu

Tuesday Apr 14, 2009

Today Sun announced world records for SPECfp2006: 50.4 on a 2-chip Nehalem (Intel Xeon X5570) Sun Blade X6270 as well as SPECint2006: 36.9 on a 2-chip Nehalem (Intel Xeon X5570) Sun Blade X6270.

Read more at: http://blogs.sun.com/jhenning/entry/sun_studio_trounces_intel_compiler.

Yes, even on servers based on the same CPUs as others, Sun can make a difference. Congrats to those on the Sun Studio Compiler team. They beat Intel's own compiler on this Intel chip by 20%, thanks to the optimization technologies found in the Sun Studio 12 Update 1 compiler.

See John's posting above for more info. On a different note, notice how much information Sun puts out about our benchmarks: lots! It is fun to look at IBM bloggers, some of whom spend 90% of their blog on "cute" and only 10% talking about benchmark results. Information is not one's enemy.

Disclosure Statement:

SPEC, SPECint, SPECfp reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of 4/14/2009. Sun Blade X6270 (Intel Xeon X5570 / 2 chips / 8 cores) 50.4 SPECfp2006, IBM System p 570 (POWER6, 1 chip / 1 core) 24.9 SPECfp2006.

Sun Fire X4275 Sybase IQ TPC-H 1000GB World Record Price Performance, Non-Clustered

Tuesday Apr 14, 2009

This TPC-H result demonstrates that the Sun Fire X4275 server, powered by 2 Quad-core 2.93 GHz Intel Nehalem X5570 processors, using only 12 internal disks (SAS 300GB 15K RPM), achieved a QphH@1000GB of 23,365 with a price performance of $2.41. This is the best price performance among all non-clustered server results at 1000GB.

Best price/performance among all TPC-H results at 1000GB, 70% better than the previous best (Sun Fire X4500) and 75% better than the previous second best, i.e., the HP DL585.

It is the best 2-chip (2-socket) server result, even better than many 4-socket servers.

To put this result in perspective, the best non-Sun single-server submission at 1000GB was the HP Superdome. The Superdome achieved a QphH of 69,999 (about 3 times the Sun Fire X4275 performance), BUT it required almost 100 times the number of disks, more than 35 times the price and 8 times the number of cores when compared to the Sun Fire X4275 configuration!
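The ratios in this comparison can be re-derived from the figures published in this post. The following is only a minimal sketch: the dictionary layout and the helper name `ratio` are illustrative assumptions, while the numbers are the ones reported here for the two systems.

```python
# Figures as reported in this post for the two TPC-H@1000GB results.
x4275 = {"qphh": 23_365, "price_usd": 56_263.91, "disks": 12, "cores": 8}
superdome = {"qphh": 69_999, "price_usd": 2_008_168, "disks": 1198, "cores": 64}

def ratio(metric):
    """How many times larger the Superdome figure is than the X4275 figure."""
    return superdome[metric] / x4275[metric]

for metric in ("qphh", "disks", "price_usd", "cores"):
    print(f"{metric:>9}: {ratio(metric):.1f}x")
```

Run as written, this should show roughly 3x the performance for roughly 100x the disks, over 35x the price and 8x the cores.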

Once again, the Sun/SybaseIQ combination has produced a self-contained (i.e. a server without any external storage or external processing engines) data warehousing solution. Only Sun has the hardware and expertise to produce such TPC-H results. To date, Sun has published self-contained TPC-H results for each of the 100GB, 300GB, 1000GB and 3000GB scale-factors.

This is an extremely compact solution requiring only 2 rack units in total. Again, contrast the Sun result with the HP Superdome, which used 97 storage arrays at 3 RU each plus a 48-inch cabinet for the server.

Extremely efficient power consumption; peak power consumption throughout the entire benchmark run was 825 Watts with an average consumption of 750 Watts.

{humor: Any comments from HP or Dell or IBM why they never publish watts on any standard benchmarks with real size memory (i.e. anything above 16GB) ? } I'll take comments from incognito HP, IBM, or Dell employees below, as always. :)

Performance Results

In order to put the Sun Fire X4275 TPC-H result in perspective, the table below shows the top non-clustered TPC-H@1000GB results from Sun, Bull and HP in ascending order of $/QphH as of April 14, 2009.

System                     CPU                   so/co/th  DB            QphH  $/QphH  Price ($USD)  # Disks  Available  Data Ratio
Sun Fire X4275, 72GB       Intel X5570, 2.93GHz  2/8/16    Sybase IQ   23,365    2.41     56,263.91       12  4/14/09           3.5
Sun Fire X4500, 64GB       AMD Opteron 2.8GHz    2/4/4     Sybase IQ    5,604    8.11        45,439       48  10/15/07         11.2
HP DL585 G2, 32GB          AMD Opteron 2.8GHz    4/8/8     SQL Server  14,773    9.73       143,736      206  4/25/07           7.8
Bull Novascale 3045, 64GB  Itanium 1.6GHz        4/8/16    SQL Server  12,087   12.56       151,870      160  3/6/07            5.7
HP DL585 G1, 64GB          AMD Opteron 2.4GHz    4/4/4     SQL Server  10,493   13.83       145,264      164  3/2/06            6.4
HP Superdome               -                     32/64/64  SQL Server  69,999   28.69     2,008,168     1198  6/18/07         40.63

Legend:

so/co/th = sockets, cores, threads
QphH  = Overall TPC-H Composite Metric (bigger is better).
$/QphH  = TPC-H Price/Performance metric (smaller is better)
Data Ratio = Total disk to actual data ratio

Complete benchmark results may be found at http://www.tpc.org.

Benchmark Description

The results reported here were obtained on a Sun Fire X4275 system using Sybase IQ as the database manager. Sybase IQ is a special product designed specifically for data warehousing applications. Sybase IQ was developed as a totally separate product from the more widely known Sybase database management system (Sybase Adaptive Server).

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to demonstrate Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. The benchmark's queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF is the number of GB of raw data, referred to as the scale factor). QphH@SF is intended to summarize the ability of the system to process queries in both single-user and multi-user modes. The benchmark requires reporting of price/performance ($/QphH), which is the ratio of total HW/SW cost plus 3 years of maintenance to QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor.

The QphH composite metric is the geometric mean of 2 components: (1) a single-user component, called Power, and (2) a multi-user component, called Throughput. Power is a performance measurement of a single user stream of 22 queries, one batch insert and one batch delete, all run serially. The Throughput metric, instead, consists of essentially N concurrent Power streams (or “users” submitting queries), where N is a minimum number of required streams dependent upon the database size. For example, at 100GB, N must be at least 5 and at 300GB, N must be at least 6. Both Power and Throughput are calculated metrics and each is inversely proportional to the queries' elapsed time: thus the faster the queries finish, the larger the metric becomes and the better the result.
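As a rough check of how the composite and the price/performance figure fit together, the sketch below recomputes both from the Power, Throughput and 3-year cost values published in the results summary of this post. The function names are illustrative assumptions; only the geometric-mean relationship and the input numbers come from the text.

```python
import math

def qphh(power, throughput):
    """Composite metric: geometric mean of the Power and Throughput components."""
    return math.sqrt(power * throughput)

def price_performance(total_cost_usd, composite):
    """TPC-H price/performance in $/QphH (smaller is better)."""
    return total_cost_usd / composite

# Values as published for the Sun Fire X4275 result in this post.
composite = qphh(29_824.6, 18_304.9)
print(round(composite, 1))                                # 23365.3
print(round(price_performance(56_263.91, composite), 2))  # 2.41
```

Both values agree with the audited 23,365.3 QphH@1000GB and $2.41/QphH quoted for this result.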

Disclosure Statement:

TPC-H, QphH, $/QphH are registered trademarks of the Transaction Processing Performance Council (TPC). More info at http://www.tpc.org/. Sun Fire X4275 23,365@1000GB, $2.41/QphH@1000GB, available 4/14/09.

Results Summary

Audited Results
  Database Size: 1000 GB (Scale Factor 1000)
  TPC-H Composite: 23,365.3
  Price/performance: $2.41
  Available: 4/14/09
  Number of Systems: 1
  Total Number Processors: 2
  Total Number of Cores: 8
  Total Number of Threads: 16
  Processor/GHz of Server: Intel Nehalem 2.93 GHz X5570 Quad Core
  Storage: 12 x 15K SAS drives (all internal)
  Database: Sybase IQ 15
  Operating System: Solaris 10
  Total 3 year Cost: $56,263.91
Other Performance Metrics
  TPC-H Power: 29,824.6
  TPC-H Throughput: 18,304.9
  Database Load Time: 5 Hr 39 Min
  Storage Ratio: 3.35

SPECjvm2008 on Sun Blade X6270 World record result

Tuesday Apr 14, 2009

The Sun Blade X6270 server demonstrates Sun's position of leadership in Java-based computing by publishing world record results for the SPECjvm2008 benchmark. The Sun Blade X6270 server delivered a result of 317.13 SPECjvm2008 Base ops/m using the Sun Java JDK 1.6.0_14 Performance Release with the OpenSolaris 2008.11 Operating System.

SPECjvm2008 Performance Chart (ordered by performance)

base: SPECjvm2008 Base ops/m (bigger is better)
peak: SPECjvm2008 Peak ops/m (bigger is better)
Ch/Co/Lc: Chips, Cores, Logical CPUs

System           Ch  Co  Lc  GHz   Type      base    peak
Sun Blade X6270   2   8  16  2.93  X5570 QC  317.13  -
Sun Fire X4450    4  24  24  2.66  X7450 6C  283.79  -
Sun Fire X4450    4  16  16  2.93  X7350 QC  260.08  -

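To see why the 2-chip blade stands out, one can divide each base score by its core count. The sketch below does that for the three entries in the chart above; note that "ops/m per core" is an informal derived figure used here for illustration, not a SPEC metric, and the tuple layout and helper name are assumptions.

```python
# (system, cores, SPECjvm2008 Base ops/m) as listed in the chart above.
results = [
    ("Sun Blade X6270 (X5570)", 8, 317.13),
    ("Sun Fire X4450 (X7450)", 24, 283.79),
    ("Sun Fire X4450 (X7350)", 16, 260.08),
]

def per_core(entry):
    """Base ops/m divided by core count; informal, not an official SPEC metric."""
    _, cores, base = entry
    return base / cores

for entry in results:
    name, _, base = entry
    print(f"{name:<24} {base:>7.2f} ops/m  {per_core(entry):6.2f} ops/m per core")
```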
Benchmark Description

SPECjvm2008 (Java Virtual Machine Benchmark) is a benchmark suite for measuring the performance of a Java Runtime Environment (JRE), containing several real-life applications and benchmarks focusing on core Java functionality. The suite focuses on the performance of the JRE executing a single application; it reflects the performance of the hardware processor and memory subsystem, but has low dependence on file I/O and includes no network I/O across machines. The SPECjvm2008 workload mimics a variety of common general-purpose application computations. These characteristics reflect the intent that this benchmark will be applicable to measuring basic Java performance on a wide variety of both client and server systems.

SPEC also finds the user experience of Java important, and the suite therefore includes startup benchmarks and has a required run category called base, which must be run without any tuning of the JVM to improve the out-of-the-box performance.

SPECjvm2008 benchmark highlights:

  • Leverages real life applications (like derby, sunflow, and javac) and area-focused benchmarks (like xml, serialization, crypto, and scimark).
  • Also measures the performance of the operating system and hardware in the context of executing the JRE.

Disclosure Statement:

SPEC, SPECjvm reg tm of Standard Performance Evaluation Corporation. Results as of 4/14/09 on http://www.spec.org. Sun Blade X6270 (2 chips, 8 cores) 317.13 SPECjvm2008 Base ops/m, submitted to SPEC for review; Sun Fire X4450 (4 chips, 24 cores) 283.79 SPECjvm2008 Base ops/m; Sun Fire X4450 (4 chips, 16 cores) 260.08 SPECjvm2008 Base ops/m.

System Configuration
Results
Performance: 317.13 SPECjvm2008 Base ops/m
Reference Date: Apr 14, 2009
Systems: Sun Blade X6270
Total Number Processors: 2
Processor/ GHz of Server: Intel Xeon X5570 QC 2.93 GHz
Operating System: OpenSolaris 2008.11
JVM: Java HotSpot(TM) 64-Bit Server VM on Solaris, version 1.6.0_14 Performance Release

Web2.0 Consolidation Sun SPARC Enterprise T5120

Wednesday Nov 05, 2008

This is another interesting consolidation test. This one used zones/containers. The previous posting was "native consolidation" that just used Solaris for consolidation without any additional features. How you consolidate and what your requirements are will of course change what consolidation software you use.

Before the commenters who love competitive technologies post, clearly this test was done by Sun to show the value of upgrading.

Web2.0 data centers are filled with racks of x86 servers. Data center architects simply put a single app on a single box, but this can be difficult to manage and inefficient in terms of utilization, power, and space. There is, however, a very easy way to consolidate many web servers onto a single CMT server.

With the introduction of the UltraSPARC T2+ processor, compute density has once again increased massively, making CMT servers a natural platform for consolidation. UltraSPARC CMT servers can use LDOMs, zones and resource groups to improve the manageability of compute resources and to provide fantastic benefits in power, space, and performance. Sun can consolidate ten 2-socket x64 systems into a single 1RU CMT server. Sun also achieves 7.8x better power performance with this consolidation benchmark.

  • Reduce the overall cost and footprint using Sun UltraSPARC CMT servers with an optimized web2.0 software stack.
  • Proves that the CMT architecture can scale up and out better than traditional x86-based machines.
  • Sun's optimized Coolstack 1.3.1 scales to meet the needs of web2.0 workloads.
  • Virtualization with Solaris Zones allows for easy replication of the web2.0 stack.
  • Consolidation of previous-generation gear can easily be mapped to Solaris Zones: IP addresses, server names, etc. stay the same.
  • Ten times or greater reduction in footprint when upgrading the data center from traditional x86 architectures.
  • 3,200 users per UltraSPARC T2 socket, and 400 users supported per Zone.

Results Summary

For each UltraSPARC T2 socket, ten older x86 servers can be eliminated. Additionally, increasing average utilization and expending fewer watts per user vastly improves compute density and creates huge savings in floor space and power.

System                            Ch, Cr, Thr  GHz  Type           users  Util%  RU  watts/user  users/RU
Sun Fire T5120                    1, 8, 64     1.4  UltraSPARC T2  3,200     95   1        0.15     3,200
Sun Fire T5120                    1, 8, 64     1.4  UltraSPARC T2  2,400     60   1        0.20     2,400
Sun Fire v20z                     2, 2, 2      2.2  AMD 248          300     40   1       1.163       300
4 x Sun Fire x4200 (Distributed)  16, 16, 16   2.2  AMD 248        1,900     xx   8        0.73       238
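The power and consolidation claims can be sanity-checked from the table's users and watts/user columns. This is a minimal sketch with illustrative names; the total-watts figure is simply users multiplied by watts per user and is an inference, not a measured value reported in this post.

```python
# (users, watts per user) as listed in the results table of this post.
T5120_3200 = (3200, 0.15)
V20Z = (300, 1.163)

def total_watts(users, watts_per_user):
    """Approximate system power implied by the per-user figure."""
    return users * watts_per_user

power_ratio = V20Z[1] / T5120_3200[1]        # watts/user: smaller is better
servers_replaced = T5120_3200[0] / V20Z[0]   # v20z-equivalents by user count

print(f"T5120: {total_watts(*T5120_3200):.0f} W, v20z: {total_watts(*V20Z):.0f} W")
print(f"watts/user advantage: {power_ratio:.1f}x")
print(f"v20z servers replaced: {servers_replaced:.1f}")
```

The watts/user advantage comes out close to the 7.8x quoted earlier, and the user count corresponds to a little over ten v20z servers.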

Consolidation process

For easy mapping from the old environment to the new, we created one zone for each "core" on the UltraSPARC T2. For the 2,400 user run, we had 300 users per zone simulating the consolidation of 8 x v20z servers. We chose 8 zones so that each zone could be mapped to a core. For UltraSPARC T2+ based servers, more zones could easily be added to take advantage of the throughput of this server.

This flexible environment can be scaled up or down based on the needs of the applications running in the zone. For this test, we created one full "LAMP" stack on each of the 8 local Solaris zones. Next, each zone was scaled from 100 to 400 users running the Olio web2.0 benchmark. The T5120 server was able to support a total of 3,200 users.
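The zone sizing above is simple multiplication, sketched here for the two load points quoted in this post. The constant and function names are illustrative assumptions; an equal load on every zone is also assumed.

```python
ZONES = 8  # one local zone per UltraSPARC T2 core, as described in the text

def total_users(users_per_zone: int, zones: int = ZONES) -> int:
    """Total simulated users when every zone carries the same load."""
    return users_per_zone * zones

# 300 and 400 users per zone are the two load points quoted in this post.
for users_per_zone in (300, 400):
    print(f"{users_per_zone} users/zone x {ZONES} zones = {total_users(users_per_zone):,} users")
```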

Benchmark Description

The application in the web2.0 kit implements a social events calendar with features such as AJAX, tagging, tag cloud, comments, ratings, feeds, mashups, extensive use of data caching, use of both structured and unstructured data and a high data read:write ratio that is typical of applications in this space. The web2.0 benchmark kit has multiple different flavors. For purposes of this evaluation, we decided to use the following components all running on UltraSPARC CMT servers:

  • Solaris
  • Apache
  • memcached
  • MySQL
  • PHP

System Configuration

Sun Fire T5120 with:

  • 1 x UltraSPARC T2 processor, 1.4 GHz
  • 64 GB of memory
  • global zone
  • 8 x local zones

Software:

  • Operating System: Solaris 10 5/08
  • Coolstack 1.3.1 software: PHP, MySQL, Apache, Memcached, Tomcat
  • Faban benchmark driver v0.9
  • Web2.0 benchmark kit - 082108

Intel defaults and judging performance

Tuesday Sep 18, 2007

Can a non-default Intel BIOS setting change results by 25%? Sure, turning off prefetch is a technique, but if you don't know a priori whether you should, then should you use it to judge performance?

Always interesting when you have more information. I guess our friends at AMD wanted everyone to see what our friends at Intel were doing so they submitted two SPEC results for them.

Case in point: on Clovertown, there are two AMD-submitted results on the same hardware that differ by 25%.

Point: Normal mode = prefetch on
Gives 163,080 SPECjbb2005 bops
www.spec.org/jbb2005/results/res2007q2/jbb2005-20070326-00276.txt

Counter-point: Disable HW prefetcher in BIOS for benchmark improvement
Gives 203,754 SPECjbb2005 bops
www.spec.org/jbb2005/results/res2007q2/jbb2005-20070326-00275.txt

...both on the same hardware:
same: 2-socket SuperMicro X7DBE (Intel 2.66GHz Xeon quad-core X5355), 16 GB
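The 25% figure follows directly from the two published scores. A minimal sketch follows; the variable names are illustrative, and the count of 2 JVMs is inferred from the bops/JVM values in the disclosure statement of this post rather than stated outright.

```python
prefetch_on = 163_080    # SPECjbb2005 bops, normal mode (hardware prefetch on)
prefetch_off = 203_754   # SPECjbb2005 bops, hardware prefetcher disabled in BIOS

# Relative gain from the non-default BIOS setting.
gain = (prefetch_off - prefetch_on) / prefetch_on
print(f"{gain:.1%}")  # 24.9%

# Assumption: 2 JVM instances, inferred from the disclosed bops/JVM figures.
JVMS = 2
print(prefetch_on // JVMS, prefetch_off // JVMS)  # 81540 101877
```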

Disclosure statement

SPECjbb2005 SuperMicro X7DBE (2 chips, 8 cores, 2.66 GHz) SPECjbb2005 bops=163080, SPECjbb2005 bops/JVM=81540 submitted by AMD; SuperMicro X7DBE (2 chips, 8 cores, 2.66 GHz) SPECjbb2005 bops=203754, SPECjbb2005 bops/JVM=101877 submitted by AMD; SPEC, SPECjbb are registered trademarks of Standard Performance Evaluation Corporation. Results 3/7/07 on www.spec.org.

EDA vendors seeing Solaris benefits

Monday Jan 29, 2007

As we've shown in previous blog entries, Solaris offers lots of benefits in terms of robustness, performance, etc.

We're seeing more and more vendors aligning around Solaris. For example, in the EDA market there was last year's announcement by Synopsys to support VCS on Solaris 10 (on both X64 and SPARC). Press release: http://www.synopsys.com/news/announce/press2005/sun_snps_vcs_pr.html

Also Cadence is showing broad support across its product lines for Solaris 10 for both SPARC and Opteron. Press release: http://www.cadence.com/company/newsroom/press_releases/pr.aspx?xml=010306_sun

more coming...

A variety of ways Solaris is leading Linux

Wednesday Jan 17, 2007

On this blog, I've been showing a variety of Solaris vs. Linux performance comparisons (ex: http://blogs.sun.com/bmseer/entry/update_solaris_beating_linux_performance).

As you know, there are more than performance reasons to like Solaris. Some of those are given in the online article called "Sun Goes After Linux" by Andy Patrizio on internetnews.com. That link is: http://www.internetnews.com/ent-news/article.php/3653871
