BM Seer Facts & Questions from an Anonymous Sun Source

Sun World Record for Largest Data Warehouse - One Petabyte!

Thursday May 15, 2008

A Sun SPARC Enterprise M9000 server running Sybase IQ and BMMsoft Server managed one petabyte of raw data: over 6 trillion rows of transactional data and more than 185 million content-searchable documents, emails, reports, spreadsheets and other multimedia objects. The result set a new Guinness World Record™.

This is twice the size of the largest commercial data warehouse known to date. The largest known database is Walmart's, which is said to hold half a petabyte of data using the Teradata DB.


Single system world record TPC-H Sun SPARC Enterprise M9000 Sun and StorageTek 2540

Thursday May 08, 2008

The Sun SPARC Enterprise M9000, configured with SPARC64 VI processors and Sun StorageTek 2540 arrays and running Solaris 10 with Oracle 11g, achieved a world-record TPC-H result of 118,573.3 QphH@1000GB for non-clustered systems.

The TPC-H result demonstrates that the Sun SPARC Enterprise M9000 can handle the increasingly large databases required of DSS systems. Oracle delivered 13 GB/sec during the benchmark. Editorial note: has IBM ever proven its delivered I/O rates? Why does IBM resort to quoting only unobtainable peaks?

  • Why is there no single-system 4.7GHz or 5.0GHz Power6 result on TPC-H? IBM has cluster results, which lets IBM avoid comparison with a single system.
  • The Sun SPARC Enterprise M9000 outperformed the next best competitor non-clustered system, the HP Integrity Superdome by 69%.
  • The Sun SPARC Enterprise M9000 outperformed the next best competitor non-clustered system, the HP Integrity Superdome, by 18% on price/performance.
  • The Sun SPARC Enterprise M9000 outperformed the clustered IBM xSeries 346 by 122%.
  • The Sun SPARC Enterprise M9000 outperformed the clustered IBM xSeries 346 by 29% on price/performance.
  • Sun StorageTek 2540 Array disk configuration - the 20x ST2540 configuration in this benchmark delivered sustained rates of 13.7 GB/sec and showed linear scaling from 1 to 20 arrays.
  • This result demonstrates the effectiveness of Solaris 10 running Oracle 11g.
TPC-H @1000GB Single-system (non-cluster) Performance Chart

QphH = the Composite Metric (bigger is better), $/QphH = the Price/Performance metric (smaller is better)

System                      QphH       3-Year Total Sys $  $/QphH  QppH       QthH       CPUs  Storage Amount
Sun SPARC Enterprise M9000  118,573.3  $2,772,675          $23.38  114,725.4  122,550.2  32    34.8 TB
HP Superdome                69,999.0   $2,008,168          $28.69  90,909.1   53,898.5   32    39.7 TB
HP Superdome                68,100.6   $4,008,065          $59.00  83,041.7   55,847.7   64    40.6 TB
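
The percentage comparisons against the 32-CPU HP Superdome can be reproduced from the chart above. The following is a minimal sketch, not part of the original post; the function names `faster_by_pct` and `cheaper_by_pct` are illustrative, and it assumes that "outperformed by X%" means a higher QphH ratio for performance and a lower $/QphH for price/performance.

```python
# Figures copied from the TPC-H @1000GB chart above.
results = {
    "Sun SPARC Enterprise M9000": {"qphh": 118573.3, "usd_per_qphh": 23.38},
    "HP Superdome (32 CPUs)": {"qphh": 69999.0, "usd_per_qphh": 28.69},
}


def faster_by_pct(a_qphh, b_qphh):
    """Percent by which result a exceeds result b (bigger QphH is better)."""
    return (a_qphh / b_qphh - 1.0) * 100.0


def cheaper_by_pct(a_price, b_price):
    """Percent by which a's $/QphH is below b's (smaller is better)."""
    return (1.0 - a_price / b_price) * 100.0


sun = results["Sun SPARC Enterprise M9000"]
hp = results["HP Superdome (32 CPUs)"]

perf_gain = faster_by_pct(sun["qphh"], hp["qphh"])
price_gain = cheaper_by_pct(sun["usd_per_qphh"], hp["usd_per_qphh"])

print(f"performance: {perf_gain:.1f}% higher QphH")
print(f"price/performance: {price_gain:.1f}% lower $/QphH")
```

Under these assumptions the performance gap comes out a little above 69% and the price/performance gap between 18% and 19%, in line with the figures quoted in the bullets.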


System                      CPUs  Cluster  CPU MHz  CPU         Operating System     Database         RDBMS+HW Avail date
Sun SPARC Enterprise M9000  32    N        2400     SPARC64 VI  Solaris 10           Oracle 11g       09/10/2008
HP Superdome                32    N        1600     Itanium2    Windows Server 2003  SQL Server 2005  06/18/2007
HP Superdome                64    N        1600     Itanium2    HP-UX 11.i V2        Oracle 10gR2     01/18/2006

Benchmark Description

The TPC-H benchmark is a performance benchmark established by the Transaction Processing Performance Council (TPC) to measure Data Warehousing/Decision Support Systems (DSS). TPC-H measurements are produced for customers to evaluate the performance of various DSS systems. The benchmark's queries and updates are executed against a standard database under controlled conditions. Performance projections and comparisons between different TPC-H database sizes (100GB, 300GB, 1000GB, 3000GB and 10000GB) are not allowed by the TPC.

TPC-H is a data warehousing-oriented, non-industry-specific benchmark that consists of a large number of complex queries typical of decision support applications. It also includes some insert and delete activity that is intended to simulate loading and purging data from a warehouse. TPC-H measures the combined performance of a particular database manager on a specific computer system.

The main performance metric reported by TPC-H is the TPC-H Composite Query-per-Hour Performance Metric (QphH@SF, where SF, the scale factor, is the number of GB of raw data). QphH@SF is intended to summarize the ability of the system to process queries in both single-user and multi-user modes. The benchmark requires reporting of price/performance, which is the ratio of the total HW/SW cost plus 3 years of maintenance to QphH. A secondary metric is the storage efficiency, which is the ratio of total configured disk space in GB to the scale factor.
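
As a rough check, the reported metrics can be recomputed from the M9000 figures in this post. The sketch below is not from the original post. It relies on the TPC-H definition of the composite as the geometric mean of the Power (QppH) and Throughput (QthH) metrics, and it assumes 1 TB = 1000 GB for the storage ratio; the function names are illustrative.

```python
import math


def composite_qphh(power, throughput):
    """QphH@Size: geometric mean of the Power and Throughput metrics."""
    return math.sqrt(power * throughput)


def price_performance(total_cost_usd, qphh):
    """$/QphH: 3-year total system cost divided by the composite metric."""
    return total_cost_usd / qphh


def storage_ratio(disk_gb, scale_factor):
    """Configured disk space in GB divided by the scale factor."""
    return disk_gb / scale_factor


# Sun SPARC Enterprise M9000 values from the results in this post.
qphh = composite_qphh(114725.4, 122550.2)
price = price_performance(2772675, qphh)
ratio = storage_ratio(34.8 * 1000, 1000)  # assumes 1 TB = 1000 GB

print(f"QphH@1000GB : {qphh:,.1f}")
print(f"$/QphH      : {price:.2f}")
print(f"storage/SF  : {ratio:.1f}")
```

The recomputed composite and price/performance agree with the published 118,573.3 QphH@1000GB and $23.38/QphH@1000GB to the precision shown.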

Disclosure Statement:

Sun SPARC Enterprise M9000 118,573.3 QphH@1000GB, $23.38/QphH@1000GB, avail 09/10/08, HP Integrity Superdome 69,999.0 QphH@1000GB, $28.69/QphH@1000GB avail 06/18/07, HP Integrity Superdome 68,100.6 QphH@1000GB, $59.00/QphH@1000GB avail 01/18/06, IBM xSeries 346 QphH@1000GB, $32.80/QphH@1000GB, avail 02/14/05, TPC-H, QphH, $/QphH tm of Transaction Processing Performance Council (TPC). More info www.tpc.org.

A 128-core (32-node 4-core) IBM Power 570 cluster (4.7 GHz, 64 chips, 256 threads) with DB2 is the best overall system at 10TB (343,551 QphH@10000GB, $32.89/QphH, configuration available 04/15/08, results as of 5/07/08). Note: do not divide this result by 32 to guess at single-node performance, and do not compare $/perf between different GB tests; neither is permitted by TPC rules!

Results Summary

Audited Results

  • Database Size: 1000 GB (Scale Factor 1000)
  • TPC-H Composite: 118,573.3 QphH@1000GB
  • Price/performance: $23.38/QphH@1000GB
  • Available: 09/10/2008 for Oracle (Sun HW/SW available 05/02/2008)
  • Number of Systems: One Sun SPARC Enterprise M9000
  • Total Number Processors: 32
  • Processor/MHz of Server: SPARC64 VI 2400 MHz / 6MB L2 Cache
  • Storage: 34.8 Terabytes of disk
  • Database: Oracle 11g
  • Operating System: Solaris 10 Update 4
  • Total 3 year Cost: $2,772,675

Other Performance Metrics

  • TPC-H Power: 114,725.4
  • TPC-H Throughput: 122,550.2
  • Database Load Time: 1:35:27


Sun's Delivered Memory Bandwidth Leadership: Stream Benchmark

Wednesday Apr 18, 2007

Sun delivers higher memory bandwidth than the best that IBM or HP can do. The Sun SPARC Enterprise M9000 beat the IBM p5 595 by 10% on the Stream TRIAD benchmark and beat the HP Integrity Superdome by 33% on the same benchmark. The Sun SPARC Enterprise M9000, running with 2.4GHz SPARC64 VI processors, delivered a Stream TRIAD benchmark result of 227.1GB/s.

Don't let the core count confuse you: IBM cores cost more than twice as much as Sun's cores. Look at the other benchmark results posted to see that IBM costs more, is slower, and has fewer cores - but it is the best that IBM offers.

Be careful to compare measured/delivered bandwidth; other vendors sometimes try to confuse the issue by quoting peaks.

Stream Performance Chart - GB/s (1 GB=10^9 B, *not* 2^x B, bigger is better)

System GHz cores COPY SCALE ADD TRIAD
Sun SE M9000 2.4 128 224.4 223.1 224.2 227.1
IBM p5 595 2.3 64 186.1 179.6 200.4 206.2
HP Integrity SuperDome 1.6 128 154.5 153.0 169.5 170.8
HP Integrity SuperDome 1.6 64 116.1 114.6 127.9 128.7
Sun SE M9000 2.4 64 114.9 114.6 130.0 134.4
IBM p5-575 2.2 8 77.9 81.2 96.7 100.5
Sun SE M8000 2.4 32 60.3 60.2 69.3 69.6
Sun SE M5000 2.15 16 24.8 24.8 25.2 25.3
Sun SE M4000 2.15 8 12.6 12.5 12.7 12.7

Benchmark Description

The STREAM benchmark is a simple synthetic benchmark program that measures sustainable memory bandwidth (in MB/s) for simple vector kernels. All memory accesses are sequential, so it portrays how fast regular data can be moved through the system. Properly run, the benchmark displays the characteristics of the machine's memory system and not the advantages of running from the system's memory caches.

STREAM counts the bytes read plus the bytes written. For the simple "Copy" kernel, this is exactly twice the number obtained from the "bcopy" convention. STREAM does this because three of the four kernels do arithmetic, so it makes sense to count both the data read into the CPU and the data written back from the CPU. The "Copy" kernel does no arithmetic, but for consistency, counts bytes the same way as the other three.
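
The counting rule can be sketched as follows. This is an illustration rather than the STREAM source: the kernel definitions in the comments are the standard STREAM ones (they are not spelled out in this post), the 8-byte element size is an assumption for double-precision data, and the function names are hypothetical.

```python
BYTES_PER_WORD = 8  # assumption: 8-byte double-precision elements

# (arrays read, arrays written) per loop iteration for each kernel
KERNELS = {
    "COPY": (1, 1),   # c[i] = a[i]
    "SCALE": (1, 1),  # b[i] = q * c[i]
    "ADD": (2, 1),    # c[i] = a[i] + b[i]
    "TRIAD": (2, 1),  # a[i] = b[i] + q * c[i]
}


def stream_bytes(kernel, n):
    """Bytes STREAM counts for one pass over arrays of n elements."""
    reads, writes = KERNELS[kernel]
    return (reads + writes) * BYTES_PER_WORD * n


def bcopy_bytes(n):
    """Bytes under the 'bcopy' convention: only the data moved is counted."""
    return BYTES_PER_WORD * n


def bandwidth_gb_per_s(kernel, n, seconds):
    """Bandwidth in GB/s, with 1 GB = 10^9 bytes as in the chart."""
    return stream_bytes(kernel, n) / seconds / 1e9


n = 1_000_000
print(stream_bytes("COPY", n) // bcopy_bytes(n))  # ratio of the two conventions
```

With this convention a Copy pass is credited with twice the bytes of the bcopy convention, while Add and Triad are credited with three array transfers per element.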

The sequential nature of the memory references is the benchmark's biggest weakness. The benchmark does not expose limitations in a system's interconnect when moving data between arbitrary points in the system.

Disclosure Statement:

Stream is a publicly available benchmark and can be found at http://www.cs.virginia.edu/stream. Results as of 4/13/07.

System Configuration

Systems under test:

  • Sun SPARC Enterprise M9000
  • 64 x 2.4GHz SPARC64 VI processors
  • 1TB memory
  • Solaris 10
  • Sun Studio 12


World Record SPECompL2001 on Sun SPARC Enterprise M9000

Wednesday Apr 18, 2007

The Sun SPARC Enterprise M9000 (2.4GHz SPARC64 VI) set a World Record on the SPECompL2001 benchmark with a score of 1230446. The Sun SPARC Enterprise M9000 (2.4GHz SPARC64 VI) set a World Record on the SPECompLbase2001 benchmark with a score of 1148235.

  • The Sun SPARC Enterprise M9000 (2.4GHz SPARC64 VI) beat the IBM POWER5+ on SPECompL2001 by 16%
  • The Sun SPARC Enterprise M9000 (2.4GHz SPARC64 VI) beat the Itanium 2 dual-core SGI server on SPECompL2001 by 22%
  • Don't let the core count confuse you: IBM cores cost more than twice as much as Sun's cores. Look at the other benchmark results posted to see that IBM costs more, is slower, and has fewer cores - but it is the best that IBM offers.

    SPEComp2001 Performance Chart - SPECompL2001 (bigger is better, ordered by peak metric)

    Peak     Base     Cores  Chips  OpenMP Threads  System
    1230446  1148235  128    64     128             Sun SE M9000, SPARC64 VI 2.4GHz
    1056459  1005583  64     32     128             IBM p5-595, POWER5+ 2.3GHz
    1005076  987139   256    128    256             SGI Altix 4700, Itanium2 1.6GHz

    Benchmark Description

    The SPEC OMPL2001 Benchmark Suite was released in June 2001 and tests HPC performance using OpenMP for parallelism. SPEC OMPL2001 consists of 9 programs (2 in C and 7 in Fortran) parallelized using OpenMP API.

    Goals of suite:

  • Targeted to Large-range (8-128 processor) parallel systems
  • Run rules, tools and reporting similar to SPEC CPU2000
  • Programs representative of HPC and Scientific Applications

    Disclosure Statement:

    SPEC, SPEComp reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of 04/17/07. Sun results submitted to SPEC. Sun SPARC Enterprise M9000 (128 OMP threads, 128 cores, 64 chips, 2.4GHz) 1230446 SPECompL2001. SGI Altix 4700, Itanium 2 (256 OMP threads, 256 cores, 128 chips) 1005076 SPECompL2001. IBM p5-595 (128 OMP threads, 64 cores, 32 chips) 1056459 SPECompL2001. Sun SPARC Enterprise M9000 (128 cores, 64 chips, 128 OMP threads, 2.4GHz) 1148235 SPECompLbase2001.
    Result
    M9000: 1230446 SPECompL2001
    Reference Date: Apr 17, 2007
    System: Sun SPARC Enterprise M9000
    Total Number Processors: 64
    Total Memory : 1 TB (512x2GB DIMMs)
    Processor/GHz of Server: SPARC64 VI, 2.4 GHz
    Operating System: Solaris 10
    Compiler: Sun Studio 12


    Sun SPARC Enterprise M9000 tops 1 TFLOP/s - twice as fast as IBM p595

    Tuesday Apr 17, 2007

    The Sun SPARC Enterprise M9000 outperforms the best published IBM single system, the p5 595 (1.9GHz POWER5), by over 2X on the Linpack HPC benchmark (Highly Parallel Computing). The Sun SPARC Enterprise M9000 also tops the high-end single-system Itanium 2 based system from HP (Superdome, 1.6GHz/24MB) by 38% on Linpack.

    Of the three vendors Sun, IBM and HP, only Sun can deliver over a TFLOP/s of performance in a single system on the Linpack HPC benchmark (for IBM, this refers to POWER5-based systems).

    This benchmark also used the Sun Performance Library, which has many routines important to scientific users. This library has been enhanced to take advantage of the SPARC64 VI architecture.

    LINPACK HPC Performance - GFLOPS (bigger is better)

    System                      GFLOPS Total  GFLOPS Peak  Threads  CPUs  Type        GHz
    Sun SPARC Enterprise M9000  1032.0        1228.8       128      64    SPARC64 VI  2.4
    HP Superdome                745.5         819.2        128      64    Itanium 2   1.6
    IBM p5 595                  418.0         486.4        64       32    POWER5      1.9

    Benchmark Description

    The Linpack benchmark suite measures the performance for factoring and solving a dense set of linear equations in double-precision floating-point.

    The Linpack HPC benchmark allows the solution of any size matrix with a single right hand side. It was developed to allow vendors to show off their hardware. Because big problems allow a machine to approach its peak performance, the benchmark is seen as an upper bound on the potential performance of a machine. The run rules are much more flexible than those of the Linpack 1000 or Linpack 100 benchmarks, but the solution technique must use a pivoting scheme and the driver must follow the spirit of those benchmarks.
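
To see how close each system comes to that upper bound, the measured and peak GFLOPS from the LINPACK chart above can be compared. This is a sketch under stated assumptions, not part of the original post: the figure of 4 floating-point operations per core per cycle is inferred from the chart's Peak column (it is not stated in the post), core counts are taken from the disclosure statement, and the function names are illustrative.

```python
FLOPS_PER_CYCLE = 4  # assumption inferred from the chart's Peak column

# (system, cores, GHz, measured GFLOPS) from the chart and disclosure statement
systems = [
    ("Sun SPARC Enterprise M9000", 128, 2.4, 1032.0),
    ("HP Superdome", 128, 1.6, 745.5),
    ("IBM p5 595", 64, 1.9, 418.0),
]


def peak_gflops(cores, ghz):
    """Theoretical peak: cores x clock (GHz) x flops per cycle."""
    return cores * ghz * FLOPS_PER_CYCLE


def efficiency(measured, peak):
    """Fraction of the theoretical peak actually delivered."""
    return measured / peak


for name, cores, ghz, measured in systems:
    peak = peak_gflops(cores, ghz)
    print(f"{name:28s} peak {peak:7.1f}  efficiency {efficiency(measured, peak):.1%}")

# Relative comparisons quoted in the text (measured GFLOPS only)
print(f"M9000 vs p5 595:   {1032.0 / 418.0:.2f}x")
print(f"M9000 vs Superdome: {(1032.0 / 745.5 - 1) * 100:.0f}% faster")
```

Under this assumption the computed peaks match the chart's Peak column, and each system delivers roughly 84% to 91% of its theoretical peak.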

    Disclosure Statement:

    Linpack HPC, results from http://www.netlib.org/benchmark/index.html as of 04/13/07. Sun SPARC Enterprise M9000 (SPARC64 VI @2.4, 64 chips, 128 cores), 1.032 TFLOPS. IBM p5 595 (POWER5 1.9GHz, 32 chips, 64 cores) 418.0 GFLOPS. HP Superdome (Itanium 2 1.6GHz/24MB, 64 chips, 128 cores) 745.5 GFLOPS.

    System Configuration

  • Sun SPARC Enterprise M9000
  • 64 x 2.4 GHz SPARC64 VI processors
  • 1 TB memory
  • Solaris 10
  • Sun Studio 12