Thursday Nov 05, 2009

TPC-C Sun SPARC Enterprise T5440 with Oracle RAC World Record Database Result

Sun and Oracle demonstrate the World's fastest database performance. Sun Microsystems using 12 Sun SPARC Enterprise T5440 servers, 60 Sun Storage F5100 Flash arrays and Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning delivered a world-record TPC-C benchmark result.

  • The 12-node Sun SPARC Enterprise T5440 server cluster result delivered a world record TPC-C benchmark result of 7,646,486.7 tpmC and $2.36 $/tpmC (USD) using Oracle 11g R1 on a configuration available 12/14/09.

  • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the IBM Power 595 (5GHz) with IBM DB2 9.5 database by 26% and has 16% better price/performance on the TPC-C benchmark.

  • The complete Oracle/Sun solution delivered 10.7x better computational density than the IBM configuration (computational density = performance/rack).

  • The complete Oracle/Sun solution used 8 times fewer racks than the IBM configuration.

  • The complete Oracle/Sun solution has 5.9x better power/performance than the IBM configuration.

  • The 12-node Sun SPARC Enterprise T5440 server cluster beats the performance of the HP Superdome (1.6GHz Itanium2) by 87% and has 19% better price/performance on the TPC-C benchmark.

  • The Oracle/Sun solution utilized Sun FlashFire technology to deliver this result. The Sun Storage F5100 flash array was used for database storage.

  • Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning scales and effectively uses all of the nodes in this configuration to produce the world record performance.

  • This result showed Sun and Oracle's integrated hardware and software stacks provide industry-leading performance.

More information on this benchmark will be posted in the next several days.

Performance Landscape

TPC-C results (sorted by tpmC, bigger is better)


System                           tpmC       Price/tpmC  Avail     Database        Cluster  Racks  w/KtpmC
12 x Sun SPARC Enterprise T5440  7,646,487  2.36 USD    12/14/09  Oracle 11g RAC  Y        9      9.6
IBM Power 595                    6,085,166  2.81 USD    12/10/08  IBM DB2 9.5     N        76     56.4
HP Integrity Superdome           4,092,799  2.93 USD    08/06/07  Oracle 10g R2   N        46     to be added

Avail - Availability date
w/KtpmC - Watts per 1000 tpmC
Racks - clients, servers, storage, infrastructure
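
The headline comparisons can be rechecked from the table with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, and reading "better price/performance" as the relative reduction in $/tpmC is an assumption (it happens to reproduce the 16% and 19% figures). Computational density is tpmC per rack, as defined above; from the rounded table values the Sun/IBM density ratio comes out at roughly 10.6x.

```python
# Figures copied from the TPC-C results table above.
RESULTS = {
    "sun": {"tpmc": 7_646_487, "price_per_tpmc": 2.36, "racks": 9},
    "ibm": {"tpmc": 6_085_166, "price_per_tpmc": 2.81, "racks": 76},
    "hp": {"tpmc": 4_092_799, "price_per_tpmc": 2.93, "racks": 46},
}


def percent_faster(a, b):
    """Throughput advantage of result a over result b, in percent."""
    return (a["tpmc"] / b["tpmc"] - 1.0) * 100.0


def percent_cheaper(a, b):
    """Relative reduction in $/tpmC of a versus b, in percent (assumed definition)."""
    return (1.0 - a["price_per_tpmc"] / b["price_per_tpmc"]) * 100.0


def density(result):
    """Computational density as defined in the post: tpmC per rack."""
    return result["tpmc"] / result["racks"]


sun = RESULTS["sun"]
for name in ("ibm", "hp"):
    other = RESULTS[name]
    print(
        f"Sun vs {name.upper()}: "
        f"{percent_faster(sun, other):.0f}% more tpmC, "
        f"{percent_cheaper(sun, other):.0f}% lower $/tpmC, "
        f"{density(sun) / density(other):.1f}x the tpmC per rack"
    )
```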

Sun and IBM TPC-C Response times


System                            tpmC       New Order 90th%      New Order Average
                                             Response Time (sec)  Response Time (sec)
12 x Sun SPARC Enterprise T5440   7,646,487  0.170                0.168
IBM Power 595                     6,085,166  1.69                 1.22
Response Time Ratio - Sun Better             9.9x                 7.3x

Sun uses the 7x comparison, the ratio of average New-Order response times, to highlight the difference in response times between Sun's solution and IBM's. Note, however, that Sun is nearly 10x faster at the 90th percentile of New-Order response times.

It is also interesting to note that none of Sun's response times, average or 90th percentile, for any transaction is over 0.25 seconds, whereas IBM does not have a single interactive transaction, not even the menu, below 0.50 seconds. Graphs of Sun's and IBM's response times for New-Order can be found in the full disclosure reports on the TPC's website, on the TPC-C Official Result Page.

Results and Configuration Summary

Hardware Configuration:

    9 racks used to hold

    Servers:
      12 x Sun SPARC Enterprise T5440, each with
      4 x 1.6 GHz UltraSPARC T2 Plus
      512 GB memory
      10 GbE network for cluster
    Storage:
      60 x Sun Storage F5100 Flash Array
      61 x Sun Fire X4275, Comstar SAS target emulation
      24 x Sun StorageTek 6140 (16 x 300 GB SAS 15K RPM)
      6 x Sun Storage J4400
      3 x 80-port Brocade FC switches
    Clients:
      24 x Sun Fire X4170, each with
      2 x 2.53 GHz X5540
      48 GB memory

Software Configuration:

    Solaris 10 10/09
    OpenSolaris 6/09 (COMSTAR) for Sun Fire X4275
    Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning
    Tuxedo CFS-R Tier 1
    Sun Web Server 7.0 Update 5

Benchmark Description

TPC-C is an OLTP system benchmark. It simulates a complete environment where a population of terminal operators executes transactions against a database. The benchmark is centered around the principal activities (transactions) of an order-entry environment. These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.

Disclosure Statement

TPC Benchmark C, tpmC, and TPC-C are trademarks of the Transaction Processing Performance Council (TPC). 12-node Sun SPARC Enterprise T5440 Cluster (1.6GHz UltraSPARC T2 Plus, 4 processor) with Oracle 11g Enterprise Edition with Real Application Clusters and Partitioning, 7,646,486.7 tpmC, $2.36/tpmC. Available 12/14/09. IBM Power 595 (5GHz Power6, 32 chips, 64 cores, 128 threads) with IBM DB2 9.5, 6,085,166 tpmC, $2.81/tpmC, available 12/10/08. HP Integrity Superdome (1.6GHz Itanium2, 64 processors, 128 cores, 256 threads) with Oracle 10g Enterprise Edition, 4,092,799 tpmC, $2.93/tpmC. Available 8/06/07. Source: www.tpc.org, results as of 11/5/09.

Monday Nov 02, 2009

A Sun Blade 6048 Modular System with 8 Sun Blade X6275 Server Modules configured with QDR InfiniBand cluster interconnect delivered outstanding performance running the FLUENT 12 benchmark test suite. Sun consistently delivered the best or near-best results per node across the 6 benchmark test cases, up to the number of nodes available for these runs.

  • The Sun Blade X6275 cluster delivered the best results for the truck_poly_14m tests for all Rank counts tested.
  • For this large truck_poly_14m test case, the Sun Blade X6275 cluster beat the best results by SGI by as much as 19%.

  • Of the 54 test cases presented here, the Sun Blade X6275 cluster delivered the best results in 87% of the tests, 47 of the 54 cases.

Performance Landscape


FLUENT 12 Benchmark Test Suite
  Results are "Ratings" (bigger is better)
  Rating = number of sequential runs of the test case possible in 1 day = 86,400 / (total elapsed run time in seconds)

System           Nodes  Ranks   eddy_417k    turbo_500k    aircraft_2m  sedan_4m     truck_14m    truck_poly_14m

Sun Blade X6275  16     128     6496.2       19307.3       8408.8       6341.3       1060.1       984.1
Best Intel       16     128     5236.4 (3)   15638.0 (7)   7981.5 (1)   6582.9 (1)   1005.8 (1)   933.0 (1)
Best SGI         16     128     7578.9 (5)   14706.4 (6)   6789.8 (4)   6249.5 (5)   1044.7 (4)   926.0 (4)

Sun Blade X6275  8      64      5308.8       26790.7       5574.2       5074.9       547.2        525.2
Best Intel       8      64      5016.0 (1)   25226.3 (1)   5220.5 (1)   4614.2 (1)   513.4 (1)    490.9 (1)
Best SGI         8      64      5142.9 (4)   23834.5 (4)   4614.2 (4)   4352.6 (4)   529.4 (4)    479.2 (4)

Sun Blade X6275  4      32      3066.5       13768.9       3066.5       2602.4       289.0        270.3
Best Intel       4      32      2856.2 (1)   13041.5 (1)   2837.4 (1)   2465.0 (1)   266.4 (1)    251.2 (1)
Best SGI         4      32      3083.0 (4)   13190.8 (4)   2588.8 (5)   2445.9 (5)   266.6 (4)    246.5 (4)

Sun Blade X6275  2      16      1714.3       7545.9        1519.1       1345.8       144.4        141.8
Best Intel       2      16      1585.3 (1)   7125.8 (1)    1428.1 (1)   1278.6 (1)   134.7 (1)    132.5 (1)
Best SGI         2      16      1708.4 (4)   7384.6 (4)    1507.9 (4)   1264.1 (5)   128.8 (4)    133.5 (4)

Sun Blade X6275  1      8       931.8        4061.1        827.2        681.5        73.0         73.8
Best Intel       1      8       920.1 (2)    3900.7 (2)    784.9 (2)    644.9 (1)    70.2 (2)     70.9 (2)
Best SGI         1      8       953.1 (4)    4032.7 (4)    843.3 (4)    651.0 (4)    71.4 (4)     72.0 (4)

Sun Blade X6275  1      4       550.4        2425.3        533.6        423.0        41.6         41.6
Best Intel       1      4       515.7 (1)    2244.2 (1)    490.8 (1)    392.2 (1)    37.8 (1)     38.4 (1)
Best SGI         1      4       561.6 (4)    2416.8 (4)    526.9 (4)    412.6 (4)    40.9 (4)     40.8 (4)

Sun Blade X6275  1      2       299.6        1328.2        293.9        232.1        21.3         21.6
Best Intel       1      2       274.3 (1)    1201.7 (1)    266.1 (1)    214.2 (1)    18.9 (1)     19.6 (1)
Best SGI         1      2       294.2 (4)    1302.7 (4)    289.0 (4)    226.4 (4)    20.5 (4)     21.2 (4)

Sun Blade X6275  1      1       154.7        682.6         149.1        114.8        9.7          10.1
Best Intel       1      1       143.5 (1)    631.1 (1)     137.4 (1)    106.2 (1)    8.8 (1)      9.0 (1)
Best SGI         1      1       153.3 (4)    677.5 (4)     147.3 (4)    111.2 (4)    10.3 (4)     9.5 (4)

Sun Blade X6275  1      serial  155.6        676.6         156.9        110.0        9.4          10.3
Best Intel       1      serial  146.6 (2)    650.0 (2)     150.2 (2)    105.6 (2)    8.8 (2)      9.7 (2)

    Sun Blade X6275, X5570 QC 2.93 GHz, QDR SMT on / Turbo mode on

    (1) Intel Whitebox (X5560 QC 2.80 GHz, RHEL5, IB)
    (2) Intel Whitebox (X5570 QC 2.93 GHz, RHEL5)
    (3) Intel Whitebox (X5482 QC 3.20 GHz, RHEL5, IB)
    (4) SGI Altix ICE_8200IP95 (X5570 2.93 GHz +turbo, SLES10, IB)
    (5) SGI Altix_ICE_8200IP95 (X5570 2.93 GHz, SLES10, IB)
    (6) SGI Altix_ICE_8200EX (Intel64 QC 3.00 GHz, Linux, IB)
    (7) Qlogic Cluster (X5472 QC 3.00 GHz, RHEL5.2, IB Truescale)
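
Because a Rating is just 86,400 divided by the elapsed time, it is easy to convert a Rating back into seconds and to derive a speedup between two runs. The following is a minimal sketch; the function names are invented for this illustration, and the example values are the Sun Blade X6275 truck_poly_14m Ratings for 1 node (8 ranks) and 16 nodes (128 ranks) from the table.

```python
SECONDS_PER_DAY = 86_400


def rating(elapsed_seconds):
    """FLUENT Rating: how many back-to-back runs fit into one day."""
    return SECONDS_PER_DAY / elapsed_seconds


def elapsed_from_rating(rating_value):
    """Invert the Rating to recover the elapsed run time in seconds."""
    return SECONDS_PER_DAY / rating_value


def speedup(rating_many, rating_one):
    """Ratings are inversely proportional to time, so their ratio is the speedup."""
    return rating_many / rating_one


one_node = 73.8       # truck_poly_14m, 1 node, 8 ranks
sixteen_nodes = 984.1  # truck_poly_14m, 16 nodes, 128 ranks

s = speedup(sixteen_nodes, one_node)
print(f"1 node:   {elapsed_from_rating(one_node):.0f} s per run")
print(f"16 nodes: {elapsed_from_rating(sixteen_nodes):.0f} s per run")
print(f"speedup {s:.1f}x on 16 nodes, parallel efficiency {s / 16:.0%}")
```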

Results and Configuration Summary

Hardware Configuration:

    8 x Sun Blade X6275 Server Module ( Dual-Node Blade, 16 nodes ) each node with
      2 x 2.93GHz Intel X5570 QC processors
      24 GB (6 x 4GB, 1333 MHz DDR3 DIMMs)
      On-board QDR InfiniBand Host Channel Adapters, QNEM

Software Configuration:

    OS: 64-bit SUSE Linux Enterprise Server SLES 10 SP 2
    Interconnect Software: OFED ver 1.4.1
    Shared File System: Lustre ver 1.8.0.1
    Application: FLUENT V12.0.16
    Benchmark: FLUENT 12 Benchmark Test Suite

Benchmark Description

The benchmark tests are representative of the large CFD models that users typically run, intended for execution in distributed memory processor (DMP) mode over a cluster of multi-processor platforms.

Key Points and Best Practices

Observations About the Results

The Sun Blade X6275 cluster delivered excellent performance, especially shining with the larger models.

These processors include a turbo boost feature coupled with a speedstep option in the CPU section of the Advanced BIOS settings. Under specific circumstances, this can raise the CPU clock, temporarily increasing the processor frequency from 2.93GHz to 3.2GHz.

Memory placement is a very significant factor with Nehalem processors. Current Nehalem platforms have two sockets. Each socket has three memory channels, and each channel has 3 bays for DIMMs. For example, if one DIMM is placed in the 1st bay of each of the 3 channels, the DIMM speed will be 1333 MHz with the X5570. Altering the DIMM arrangement to an unbalanced configuration, say by adding just one more DIMM in the 2nd bay of one channel, will cause the DIMM frequency to drop from 1333 MHz to 1067 MHz.

About the FLUENT 12 Benchmark Test Suite

The FLUENT application performs computational fluid dynamics (CFD) analysis on a variety of different types of flow and allows for chemically reacting species. Analyses can be transient dynamic and can be linear or nonlinear.

  • CFD models tend to be very large where grid refinement is required to capture with accuracy conditions in the boundary layer region adjacent to the body over which flow is occurring. Fine grids are also required to determine accurate turbulence conditions. As such, these models can run for many hours or even days, even when using a large number of processors.
  • CFD models typically scale very well and are well suited for execution on clusters. The FLUENT 12 benchmark test cases scale well.
  • The memory requirements for the test cases in the FLUENT 12 benchmark test suite range from a few hundred megabytes to about 25 GB. As the job is distributed over multiple nodes, the memory requirements per node are correspondingly reduced.
  • The benchmark test cases for the FLUENT module do not have a substantial I/O component. However, performance will be enhanced very substantially by using high performance interconnects such as InfiniBand for inter-node cluster message passing. This nodal message passing data can be stored locally on each node or on a shared file system.
  • As a result of the large amount of inter-node message passing, performance can be further enhanced by more than a 3x factor, as indicated here, by implementing the Lustre based shared file I/O system.

See Also

FLUENT 12.0 Benchmark:
http://www.fluent.com/software/fluent/fl6bench/fl6bench_6.4.x/

Disclosure Statement

All information on the Fluent website is Copyrighted 1995-2009 by Fluent Inc. Results from http://www.fluent.com/software/fluent/fl6bench/ as of October 20, 2009 and this presentation.

Monday Nov 02, 2009

This is an occasionally-generated index of previous entries in the BestPerf blog.

Nov 02, 2009 Sun Ultra 27 Delivers Leading Single Frame Buffer SPECviewperf 10 Results
Oct 28, 2009 SPC-2 Sun Storage 6780 Array RAID 5 & RAID 6 51% better $/performance than IBM DS5300
Oct 25, 2009 Sun C48 & Lustre fast for Seismic Reverse Time Migration using Sun X6275
Oct 25, 2009 Sun F5100 and Seismic Reverse Time Migration with faster Optimal Checkpointing
Oct 23, 2009 Wiki on performance best practices
Oct 20, 2009 Exadata V2 Information
Oct 15, 2009 Oracle Flash Cache - SGA Caching on Sun Storage F5100
Oct 13, 2009 Oracle Hyperion Sun M5000 and Sun Storage 7410
Oct 13, 2009 Sun T5440 Oracle BI EE Sun SPARC Enterprise T5440 World Record
Oct 13, 2009 SPECweb2005 on Sun SPARC Enterprise T5440 World Record using Solaris Containers and Sun Storage F5100 Flash
Oct 13, 2009 Oracle PeopleSoft Payroll (NA) Sun SPARC Enterprise M4000 and Sun Storage F5100 World Record Performance
Oct 13, 2009 SAP 2-tier SD Benchmark on Sun SPARC Enterprise M9000/32 SPARC64 VII
Oct 13, 2009 CP2K Life Sciences, Ab-initio Dynamics - Sun Blade 6048 Chassis with Sun Blade X6275 - Scalability and Throughput with Quad Data Rate InfiniBand
Oct 13, 2009 SAP 2-tier SD-Parallel on Sun Blade X6270 1-node, 2-node and 4-node
Oct 13, 2009 Halliburton ProMAX Oil & Gas Application Fast on Sun 6048/X6275 Cluster
Oct 13, 2009 SPECcpu2006 Results On MSeries Servers With Updated SPARC64 VII Processors
Oct 13, 2009 MCAE ABAQUS faster on Sun F5100 and Sun X4270 - Single Node World Record
Oct 12, 2009 MCAE ANSYS faster on Sun F5100 and Sun X4270
Oct 12, 2009 MCAE MSC/NASTRAN faster on Sun F5100 and Fire X4270
Oct 12, 2009 SPC-2 Sun Storage 6180 Array RAID 5 & RAID 6 Over 70% Better Price Performance than IBM
Oct 12, 2009 SPC-1 Sun Storage 6180 Array Over 70% Better Price Performance than IBM
Oct 12, 2009 Why Sun Storage F5100 is a good option for Peoplesoft NA Payroll Application
Oct 12, 2009 1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array
Oct 11, 2009 TPC-C World Record Sun - Oracle
Oct 09, 2009 X6275 Cluster Demonstrates Performance and Scalability on WRF 2.5km CONUS Dataset
Oct 02, 2009 Sun X4270 VMware VMmark benchmark achieves excellent result
Sep 22, 2009 Sun X4270 Virtualized for Two-tier SAP ERP 6.0 Enhancement Pack 4 (Unicode) Standard Sales and Distribution (SD) Benchmark
Sep 01, 2009 String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine
Aug 28, 2009 Sun X4270 World Record SAP-SD 2-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Aug 27, 2009 Sun SPARC Enterprise T5240 with 1.6GHz UltraSPARC T2 Plus Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005
Aug 26, 2009 Sun SPARC Enterprise T5220 with 1.6GHz UltraSPARC T2 Sets Single Chip World Record on SPECjbb2005
Aug 12, 2009 SPECmail2009 on Sun SPARC Enterprise T5240 and Sun Java System Messaging Server 6.3
Jul 23, 2009 World Record Performance of Sun CMT Servers
Jul 22, 2009 Why does 1.6 beat 4.7?
Jul 21, 2009 Zeus ZXTM Traffic Manager World Record on Sun T5240
Jul 21, 2009 Sun T5440 Oracle BI EE World Record Performance
Jul 21, 2009 Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Jul 21, 2009 1.6 GHz SPEC CPU2006 - Rate Benchmarks
Jul 21, 2009 Sun Blade T6320 World Record SPECjbb2005 performance
Jul 21, 2009 New SPECjAppServer2004 Performance on the Sun SPARC Enterprise T5440
Jul 21, 2009 Sun T5440 SPECjbb2005 Beats IBM POWER6 Chip-to-Chip
Jul 21, 2009 New CMT results coming soon....
Jul 14, 2009 Vdbench: Sun StorageTek Vdbench, a storage I/O workload generator.
Jul 14, 2009 Storage performance and workload analysis using Swat.
Jul 10, 2009 World Record TPC-H@300GB Price-Performance for Windows on Sun Fire X4600 M2
Jul 06, 2009 Sun Blade 6048 Chassis with Sun Blade X6275: RADIOSS Benchmark Results
Jul 03, 2009 SPECmail2009 on Sun Fire X4275+Sun Storage 7110: Mail Server System Solution
Jun 30, 2009 Sun Blade 6048 and Sun Blade X6275 NAMD Molecular Dynamics Benchmark beats IBM BlueGene/L
Jun 26, 2009 Sun Fire X2270 Cluster Fluent Benchmark Results
Jun 25, 2009 Sun SSD Server Platform Bandwidth and IOPS (Speeds & Feeds)
Jun 24, 2009 I/O analysis using DTrace
Jun 23, 2009 New CPU2006 Records: 3x better integer throughput, 9x better fp throughput
Jun 23, 2009 Sun Blade X6275 results capture Top Places in CPU2006 SPEED Metrics
Jun 23, 2009 One Million Queries per Hour TPC-H at 30 Terabytes by Sun and ParAccel with OpenSolaris
Jun 19, 2009 Pointers to Java Performance Tuning resources
Jun 19, 2009 SSDs in HPC: Reducing the I/O Bottleneck BluePrint Best Practices
Jun 17, 2009 The Performance Technology group wiki is alive!
Jun 17, 2009 Performance of Sun 7410 and 7310 Unified Storage Array Line
Jun 16, 2009 Sun Fire X2270 MSC/Nastran Vendor_2008 Benchmarks
Jun 15, 2009 Sun Fire X4600 M2 Server Two-tier SAP ERP 6.0 (Unicode) Standard Sales and Distribution (SD) Benchmark
Jun 12, 2009 Correctly comparing SAP-SD Benchmark results
Jun 12, 2009 OpenSolaris Beats Linux on memcached Sun Fire X2270
Jun 11, 2009 SAS Grid Computing 9.2 utilizing the Sun Storage 7410 Unified Storage System
Jun 10, 2009 Using Solaris Resource Management Utilities to Improve Application Performance
Jun 09, 2009 Free Compiler Wins Nehalem Race by 2x
Jun 08, 2009 Variety of benchmark results to be posted on BestPerf
Jun 05, 2009 Interpreting Sun's SPECpower_ssj2008 Publications
Jun 03, 2009 Wide Variety of Topics to be discussed on BestPerf
Jun 03, 2009 Welcome to BestPerf group blog!

Monday Nov 02, 2009

A Sun Ultra 27 workstation configured with an nVidia FX5800 graphics card delivered outstanding performance running the SPECviewperf® 10 benchmark.

  • When compared with other workstations running a single graphics card (i.e. not running two or more cards in SLI mode), the Sun Ultra 27 workstation places first in 6 of 8 subtests and second in the remaining two subtests.

  • The calculated geometric mean shows that the Sun Ultra 27 workstation is 11% faster than competitors' workstations.

  • The optimum point for price/performance is the nVidia FX1800 graphics card.

Results have been published on the SPEC web site at http://www.spec.org/gwpg/gpc.data/vp10/summary.html.

Performance Landscape

Performance of the Sun Ultra 27 versus the competition. Bigger is better for each of the eight tests. The comparison is based upon the performance of the Sun Ultra 27 workstation. Performance is measured in frames per second.


                             3DSMAX       CATIA        ENSIGHT      MAYA
                             Perf    %    Perf    %    Perf    %    Perf    %
Sun Ultra 27 FX5800          59.34        68.81        58.07        246.09
HP xw4600 ATI FireGL V7700   49.71   19   48.05   43   57.11   2    268.62  -8
HP xw4600 FX4800             52.26   14   63.26   12   53.79   8    226.82  7
Fujitsu Celsius M470 FX3800  53.67   11   65.25   7    52.19   10   227.37  7

                             PROENGINEER  SOLIDWORKS   TEAMCENTER   UGS
                             Perf    %    Perf    %    Perf    %    Perf    %
Sun Ultra 27 FX5800          68.96        152.01       42.02        36.04
HP xw4600 ATI FireGL V7700   47.25   32   109.71  28   40.18   4    56.65   -57
HP xw4600 FX4800             61.15   11   131.31  14   28.42   32   33.43   7
Fujitsu Celsius M470 FX3800  64.39   7    139.2   8    29.02   31   33.27   8
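
A geometric mean is the usual way to summarize per-test ratios like these, since it treats a 2x win and a 2x loss symmetrically. The post does not say exactly which systems or which formula produced its 11% figure, so the sketch below is only an illustration: it takes the frames-per-second values for the Sun Ultra 27 FX5800 and one competitor from the table (the Fujitsu Celsius M470 FX3800, chosen arbitrarily) and computes the geometric mean of the eight Sun/competitor ratios. The helper name `geometric_mean` is made up for this example.

```python
import math

# Frames per second, copied from the comparison table above.
sun_fx5800 = {
    "3DSMAX": 59.34, "CATIA": 68.81, "ENSIGHT": 58.07, "MAYA": 246.09,
    "PROENGINEER": 68.96, "SOLIDWORKS": 152.01, "TEAMCENTER": 42.02, "UGS": 36.04,
}
fujitsu_fx3800 = {
    "3DSMAX": 53.67, "CATIA": 65.25, "ENSIGHT": 52.19, "MAYA": 227.37,
    "PROENGINEER": 64.39, "SOLIDWORKS": 139.2, "TEAMCENTER": 29.02, "UGS": 33.27,
}


def geometric_mean(values):
    """n-th root of the product, computed via logs for numerical stability."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = [sun_fx5800[test] / fujitsu_fx3800[test] for test in sun_fx5800]
advantage = geometric_mean(ratios)
print(f"geometric-mean advantage over this competitor: {(advantage - 1) * 100:.1f}%")
```

The result for a single competitor will not necessarily match the 11% quoted above, which presumably summarizes more than one system.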

Comparison of various frame buffers on the Sun Ultra 27 running SPECviewperf 10. Performance is reported for each test along with the difference in performance as compared to the FX5800 frame buffer. The runs in the table below were made with 3.2GHz W3570 processors.


         3DSMAX      CATIA       ENSIGHT     MAYA        PROENGR     SOLIDWRKS   TEAMCNTR    UGS
         Perf   %    Perf   %    Perf   %    Perf   %    Perf   %    Perf   %    Perf   %    Perf   %
FX5800   57.07       67.84       58.63       219.4       68.05       152.3       40.85       34.73
FX3800   57.17  0    66.57  2    54.91  7    206.4  6    66.48  2    146.3  4    38.48  6    33.12  5
FX1800   56.73  1    64.33  6    52.05  13   189.3  16   64.67  5    135.2  13   34.18  20   30.46  14
FX380    45.90  24   55.81  22   34.93  68   120.3  82   46.09  48   64.11  138  17.00  140  13.88  150

Results and Configuration Summary

Hardware Configuration:

    Sun Ultra 27 Workstation
    1 x 3.33 GHz Intel Xeon (tm) W3580
    2GB (1 x 2GB PC10600 1333MHz)
    1 x 500GB SATA
    nVidia Quadro FX380, FX1800, FX3800 & FX5800
    $7,529.00 (includes Microsoft Windows and monitor)

Software Configuration:

    OS: Microsoft Windows Vista Ultimate, 32-bit
    Benchmark: SPECviewperf 10

Benchmark Description

SPECviewperf measures the 3D graphics rendering performance of systems running under OpenGL. It is a synthetic benchmark designed to be a predictor of application performance and a measure of graphics subsystem performance (primarily graphics bus, driver and graphics hardware) and its impact on the system without the full overhead of an application. SPECviewperf reports performance in frames per second.

Key Points and Best Practices

SPECviewperf measures the 3D rendering performance of systems running under OpenGL.

The SPECopc(SM) project group's SPECviewperf 10 is totally new performance evaluation software. In addition to features found in previous versions, it now provides the ability to compare performance of systems running in higher-quality graphics modes that use full-scene anti-aliasing, and measures how effectively graphics subsystems scale when running multithreaded graphics content. Since the SPECviewperf source and binaries have been upgraded to support changes, no comparisons should be made between past results and current results for viewsets running under SPECviewperf 10.

SPECviewperf 10 requires OpenGL 1.5 and a minimum of 1GB of system memory. It currently supports Windows 32/64.

Disclosure Statement

SPEC® and the benchmark name SPECviewperf® are registered trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect results published on www.spec.org as of Oct 18, 2009. For the latest SPECviewperf benchmark results, visit www.spec.org/gwpg.

Wednesday Oct 28, 2009

Significance of Results

Results on the Sun Storage 6780 Array with 8Gb connectivity are presented for the SPC-2 benchmark using RAID 5 and RAID 6.
  • The Sun Storage 6780 array outperforms the IBM DS5300 by 51% in price performance for SPC-2 benchmark using RAID 5 data protection.

  • The Sun Storage 6780 array outperforms the IBM DS5300 by 51% in price performance for SPC-2 benchmark using RAID 6 data protection.

  • The Sun Storage 6780 Array has 62% better performance than the Fujitsu 800/1100 and delivers a price performance advantage of 5.6x as measured by the SPC-2 benchmark.

  • The Sun Storage 6780 array with 8Gb connectivity improved performance by 36% over the 4Gb connected solution as measured by the SPC-2 benchmark.

Performance Landscape

SPC-2 Performance Chart (in increasing price-performance order)

Sponsor  System         SPC-2     $/SPC-2  ASU Capacity  TSC Price  Data Protection  Date      Results
                        MBPS      MBPS     (GB)                     Level                      Identifier
Sun      SS6780 (8Gb)   5,634.17  $44.88   16,383.186    $252,873   RAID 5           10/27/09  B00047
IBM      DS5300 (8Gb)   5,634.17  $67.75   16,383.186    $381,720   RAID 5           10/21/09  B00045
Sun      SS6780 (8Gb)   5,543.88  $45.61   14,042.731    $252,873   RAID 6           10/27/09  B00048
IBM      DS5300 (8Gb)   5,543.88  $68.85   14,042.731    $381,720   RAID 6           10/21/09  B00046
Sun      SS6780 (4Gb)   4,818.43  $53.61   16,383.186    $258,329   RAID 5           02/03/09  B00039
IBM      DS5300 (4Gb)   4,818.43  $93.80   16,383.186    $451,986   RAID 5           09/25/08  B00037
Sun      SS6780 (4Gb)   4,675.50  $55.25   14,042.731    $258,329   RAID 6           02/03/09  B00040
IBM      DS5300 (4Gb)   4,675.50  $96.67   14,042.731    $451,986   RAID 6           09/25/08  B00038
Fujitsu  800/1100       3,480.68  $238.93  4,569.845     $831,649   Mirroring        03/08/07  B00019

SPC-2 MBPS = the Performance Metric
$/SPC-2 MBPS = the Price/Performance Metric
ASU Capacity = the Capacity Metric
Data Protection = Data Protection Metric
TSC Price = Total Cost of Ownership Metric
Results Identifier = A unique identifier for the result
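
The price/performance column follows directly from two other columns: it is the TSC price divided by SPC-2 MBPS. The short sketch below recomputes it for the 8Gb RAID 5 results and derives the percentage advantage; the function name is invented for this example, and expressing the advantage as the ratio of the two $/MBPS values is an assumption that happens to reproduce the 51% quoted above.

```python
def price_performance(tsc_price_usd, spc2_mbps):
    """Price/performance metric: dollars per SPC-2 MBPS (lower is better)."""
    return tsc_price_usd / spc2_mbps


# 8Gb RAID 5 rows from the table above.
sun_raid5 = price_performance(252_873, 5_634.17)
ibm_raid5 = price_performance(381_720, 5_634.17)

# Assumed definition of "better price performance": how much more IBM costs
# per MBPS than Sun, expressed as a percentage.
advantage_pct = (ibm_raid5 / sun_raid5 - 1.0) * 100.0

print(f"Sun SS6780: ${sun_raid5:.2f} per SPC-2 MBPS")
print(f"IBM DS5300: ${ibm_raid5:.2f} per SPC-2 MBPS")
print(f"Sun advantage: {advantage_pct:.0f}%")
```

Since both arrays report the same SPC-2 MBPS, the advantage here comes entirely from the difference in TSC price.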

Complete SPC-2 benchmark results may be found at http://www.storageperformance.org.

Results and Configuration Summary

Storage Configuration:

    8 x CM200 trays, each with 16 x 146GB 15K RPM drives
    8 x Qlogic 8Gb HBA

Server Configuration:

    4 x IBM x3650
      2 x 2.93 GHz Intel X5570
      5 GB memory

Software Configuration:

    Microsoft Windows Server 2003 Enterprise Edition (32-bit) with SP2
    SPC-2 benchmark kit

Benchmark Description

The SPC Benchmark-2™ (SPC-2) is a series of related benchmark performance tests that simulate the sequential component of demands placed upon on-line, non-volatile storage in server class computer systems. SPC-2 provides measurements in support of real world environments characterized by:
  • Large numbers of concurrent sequential transfers.
  • Demanding data rate requirements, including requirements for real time processing.
  • Diverse application techniques for sequential processing.
  • Substantial storage capacity requirements.
  • Data persistence requirements to ensure preservation of data without corruption or loss.

Key Points and Best Practices

  • This benchmark was performed using RAID 5 and RAID 6 protection.
  • The controller stripe size was set to 512k.
  • No volume manager was used.

Benchmark Tags

$/Perf, performance, bandwidth, OpenStorage, Storage

Disclosure Statement

SPC-2, SPC-2 MBPS, $/SPC-2 MBPS are regular trademarks of Storage Performance Council (SPC). More info www.storageperformance.org. Sun Storage 6780 Array 5,634.17 SPC-2 MBPS, $/SPC-2 MBPS $44.88, ASU Capacity 16,838.186GB, Protect RAID 5, Cost $252,873.00, Ident. B00047. Sun Storage 6780 Array 5,543.88 SPC-2 MBPS, $/SPC-2 MBPS $45.61, ASU Capacity 14,042.731 GB, Protect RAID 6, Cost $252,873.00, Ident. B00048.

Sunday Oct 25, 2009

Significance of Results

A Sun Blade 6048 Modular System with 12 Sun Blade X6275 server modules was clustered with QDR InfiniBand and used a Lustre file system over QDR InfiniBand to show performance improvements over an NFS file system for reading in Velocity, Epsilon, and Delta Slices and imaging 800 samples of various grid sizes using the Reverse Time Migration.

  • The Initialization Time for populating the processing grids demonstrates significant advantages of Lustre over NFS:
    • 2486x1151x1231 : 20x improvement
    • 1243x1151x1231 : 20x improvement
    • 125x1151x1231 : 11x improvement
  • The Total Application Performance shows the Interconnect and I/O advantages of using QDR InfiniBand Lustre for the large grid sizes:
    • 2486x1151x1231 : 2x improvement - processed in less than 19 minutes
    • 1243x1151x1231 : 2x improvement - processed in less than 10 minutes

  • The Computational Kernel Scalability Efficiency for the 3 grid sizes:
    • 125x1151x1231 : 97% (1-8 nodes)
    • 1243x1151x1231 : 102% (8-24 nodes)
    • 2486x1151x1231 : 100% (12-24 nodes)

  • The Total Application Scalability Efficiency for the large grid sizes:
    • 1243x1151x1231 : 72% (8-24 nodes)
    • 2486x1151x1231 : 71% (12-24 nodes)

  • On the X5570 Intel processor with HyperThreading enabled, running 16 OpenMP threads per node gives approximately a 10% performance improvement over running 8 threads per node.

Performance Landscape

This first table presents the initialization time, comparing different numbers of processors along with different problem sizes. The results are presented in seconds and show the advantage that the Lustre file system running over QDR InfiniBand provided when compared to a simple NFS file system.


Initialization Time Performance Comparison
Reverse Time Migration - SMP Threads and MPI Mode

              125 x 1151 x 1231     1243 x 1151 x 1231    2486 x 1151 x 1231
              800 Samples           800 Samples           800 Samples
Nodes  Procs  Lustre     NFS        Lustre     NFS        Lustre     NFS
              Time (sec) Time (sec) Time (sec) Time (sec) Time (sec) Time (sec)
24     48     1.59       18.90      8.90       181.78     15.63      362.48
20     40     1.60       18.90      8.93       181.49     16.91      358.81
16     32     1.58       18.59      8.97       181.58     17.39      353.72
12     24     1.54       18.61      9.35       182.31     22.50      364.25
8      16     1.40       18.60      10.02      183.79
4      8      1.57       18.80
2      4      2.54       19.31
1      2      4.54       20.34

This next table presents the total application run time, comparing different numbers of processors along with different problem sizes. It shows that for larger problems, using the Lustre file system running over QDR InfiniBand provided a big performance advantage when compared to a simple NFS file system.


Total Application Performance Comparison
Reverse Time Migration - SMP Threads and MPI Mode

              125 x 1151 x 1231     1243 x 1151 x 1231    2486 x 1151 x 1231
              800 Samples           800 Samples           800 Samples
Nodes  Procs  Lustre     NFS        Lustre     NFS        Lustre     NFS
              Time (sec) Time (sec) Time (sec) Time (sec) Time (sec) Time (sec)
24     48     251.48     273.79     553.75     1125.02    1107.66    2310.25
20     40     232.00     253.63     658.54     971.65     1143.47    2062.80
16     32     227.91     209.66     826.37     1003.81    1309.32    2348.60
12     24     217.77     234.61     884.27     1027.23    1579.95    3877.88
8      16     223.38     203.14     1200.71    1362.42
4      8      341.14     272.68
2      4      605.62     625.25
1      2      892.40     841.94

The following table presents the run time and speedup of just the computational kernel for different processor counts for the three different problem sizes considered. The scaling results are based upon the smallest number of nodes run and that number is used as the baseline reference point.


Computational Kernel Performance & Scalability
Reverse Time Migration - SMP Threads and MPI Mode

              125 x 1151 x 1231     1243 x 1151 x 1231    2486 x 1151 x 1231
              800 Samples           800 Samples           800 Samples
Nodes  Procs  X6275      Speedup:   X6275      Speedup:   X6275      Speedup:
              Time (sec) 1-node     Time (sec) 1-node     Time (sec) 1-node
24     48     35.38      13.7       210.82     24.5       427.40     24.0
20     40     35.02      13.8       255.27     20.2       517.03     19.8
16     32     41.76      11.6       317.96     16.2       646.22     15.8
12     24     49.53      9.8        422.17     12.2       853.37     12.0*
8      16     62.34      7.8        645.27     8.0*
4      8      124.66     3.9
2      4      238.80     2.0
1      2      484.89     1.0

The last table presents the speedup of the total application for different processor counts for the three different problem sizes presented. The scaling results are based upon the smallest number of nodes run and that number is used as the baseline reference point.


Total Application Scalability Comparison
Reverse Time Migration - SMP Threads and MPI Mode

              125 x 1151 x 1231  1243 x 1151 x 1231  2486 x 1151 x 1231
              800 Samples        800 Samples         800 Samples
Nodes  Procs  Lustre Speedup:    Lustre Speedup:     Lustre Speedup:
              1-node             1-node              1-node
24     48     3.6                17.3                17.1
20     40     3.8                14.6                16.6
16     32     4.0                11.6                14.5
12     24     4.1                10.9                12.0*
8      16     4.0                8.0*
4      8      2.6
2      4      1.5
1      2      1.0

Note: HyperThreading is enabled and running 16 threads per Node.
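
The scalability efficiencies quoted under Significance of Results can be recomputed from the run times in the tables above: efficiency is the measured speedup between the smallest and largest node counts divided by the ideal speedup (the ratio of node counts). This is a minimal sketch with an invented function name; the times are the X6275 kernel times and the Lustre total application times from the tables.

```python
def scaling_efficiency(base_nodes, base_time, nodes, time):
    """Measured speedup divided by ideal speedup when growing from base_nodes to nodes."""
    measured_speedup = base_time / time
    ideal_speedup = nodes / base_nodes
    return measured_speedup / ideal_speedup


# (grid, base_nodes, base_time_sec, nodes, time_sec)
kernel_runs = [
    ("125x1151x1231", 1, 484.89, 8, 62.34),
    ("1243x1151x1231", 8, 645.27, 24, 210.82),
    ("2486x1151x1231", 12, 853.37, 24, 427.40),
]
total_runs = [
    ("1243x1151x1231", 8, 1200.71, 24, 553.75),
    ("2486x1151x1231", 12, 1579.95, 24, 1107.66),
]

for label, runs in (("kernel", kernel_runs), ("total application", total_runs)):
    for grid, n0, t0, n1, t1 in runs:
        eff = scaling_efficiency(n0, t0, n1, t1)
        print(f"{label:18s} {grid:15s} {n0}-{n1} nodes: {eff:.0%}")
```

An efficiency slightly above 100%, as for the 1243x1151x1231 kernel, means the per-node work ran a little faster at the larger node count than the baseline predicts.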

Results and Configuration Summary

Hardware Configuration:
    Sun Blade 6048 Modular System with
      12 x Sun Blade X6275 Server Modules, each with
        4 x 2.93 GHz Intel Xeon QC X5570 processors
        12 x 4 GB memory at 1333 MHz
        2 x 24 GB Internal Flash
    QDR InfiniBand Lustre 1.8.0.1 File System
    GBit NFS file system

Software Configuration:

    OS: 64-bit SUSE Linux Enterprise Server SLES 10 SP 2
    MPI: Scali MPI Connect 5.6.6-59413
    Compiler: Sun Studio 12 C++, Fortran, OpenMP

Benchmark Description

The Reverse Time Migration (RTM) is currently the most popular seismic processing algorithm because of its ability to produce quality images of complex substructures. It can accurately image steep dips that can not be imaged correctly with traditional Kirchhoff 3D or frequency domain algorithms. The Wave Equation Migration (WEM) can image steep dips but does not produce the image quality that can be achieved by the RTM. However, the increased computational complexity of the RTM over the WEM introduces new areas for performance optimization. The current trend in seismic processing is to perform iterative migrations on wide azimuth marine data surveys using the Reverse Time Migration.

This Reverse Time Migration code reads in processing parameters that define the grid dimensions, number of threads, number of processors, imaging condition, and various other parameters. The master node calculates the memory requirements to determine if there is sufficient memory to process the migration "in-core". The domain decomposition across all the nodes is determined by dividing the first grid dimension by the number of nodes. Each node then reads in its section of the Velocity Slices, Delta Slices, and Epsilon Slices using MPI IO reads. The three source and receiver wavefield state vectors are created: previous, current, and next state. The processing steps through the input trace data reading both the receiver and source data for each of the 800 time steps. It uses forward propagation for the source wave field and backward propagation in time to cross correlate the receiver wavefield. The computational kernel consists of a 13 point stencil to process a subgrid within the memory of each node using OpenMP parallelism. Afterwards, conditioning and absorption are applied and boundary data is communicated to neighboring nodes as each time step is processed. The final image is written out using MPI IO.
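
Two of the steps described above lend themselves to a short illustration: splitting the first grid dimension across nodes, and applying a 13 point stencil at one grid point. The sketch below is not the benchmark code. The function names are invented, the way leftover planes are spread over the first few nodes is an assumption (the description only says the first dimension is divided by the number of nodes), and the stencil weights are the standard fourth-order central-difference weights, which is also an assumption since the post does not list its coefficients.

```python
def decompose(first_dim, nodes):
    """Split range(first_dim) into contiguous slabs, one per node.

    Assumption: any remainder is handed out one extra plane at a time to
    the lowest-numbered nodes, so slab sizes differ by at most one.
    """
    base, extra = divmod(first_dim, nodes)
    slabs, start = [], 0
    for rank in range(nodes):
        size = base + (1 if rank < extra else 0)
        slabs.append((start, start + size))
        start += size
    return slabs


# Assumed fourth-order weights for a second derivative with unit grid spacing.
C0, C1, C2 = -5.0 / 2.0, 4.0 / 3.0, -1.0 / 12.0


def laplacian_13pt(u, i, j, k):
    """13 point Laplacian at interior point (i, j, k) of a nested-list grid u.

    One centre point plus two neighbours on each side along each of the
    three axes gives 1 + 3 * 4 = 13 points.
    """
    centre = 3.0 * C0 * u[i][j][k]
    near = C1 * (u[i - 1][j][k] + u[i + 1][j][k] + u[i][j - 1][k]
                 + u[i][j + 1][k] + u[i][j][k - 1] + u[i][j][k + 1])
    far = C2 * (u[i - 2][j][k] + u[i + 2][j][k] + u[i][j - 2][k]
                + u[i][j + 2][k] + u[i][j][k - 2] + u[i][j][k + 2])
    return centre + near + far


slabs = decompose(2486, 24)
print("first slab:", slabs[0], "last slab:", slabs[-1])

# Sanity check on f = x^2 + y^2 + z^2, whose Laplacian is 6 everywhere.
grid = [[[float(x * x + y * y + z * z) for z in range(5)]
         for y in range(5)] for x in range(5)]
print("Laplacian at the centre:", laplacian_13pt(grid, 2, 2, 2))
```

In the real code each node would apply the stencil over its whole slab inside an OpenMP loop and exchange the two boundary planes on each side with its neighbours; none of that is shown here.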

Total memory requirements for each grid size:

    125x1151x1231: 7.5GB
    1243x1151x1231: 78GB
    2486x1151x1231: 156GB

For this phase of benchmarking, the focus was to optimize the data initialization. In the next phase of benchmarking, the trace data reading will be optimized so that each node reads in only its section of interest. In this benchmark the trace data reading skews the Total Application Performance as the number of nodes increases. The next phase will also include further node optimization with OpenMP. The IO description for this benchmark phase on each grid size:

    125x1151x1231:
      Initialization MPI Read: 3 x 709MB = 2.1GB / number of nodes
      Trace Data Read per Node: 2 x 800 x 576KB = 920MB * number of nodes
      Final Output Image MPI Write: 709MB / number of nodes
    1243x1151x1231: 78GB
      Initialization MPI Read: 3 x 7.1GB = 21.3GB / number of nodes
      Trace Data Read per Node: 2 x 800 x 5.7MB = 9.2GB * number of nodes
      Final Output Image MPI Write: 7.1GB / number of nodes
    2486x1151x1231: 156GB
      Initialization MPI Read: 3 x 14.2GB = 42.6GB / number of nodes
      Trace Data Read per Node: 2 x 800 x 11.4MB = 18.4GB * number of nodes
      Final Output Image MPI Write: 42.6GB / number of nodes
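The arithmetic behind these I/O figures can be sketched as below for the smallest grid. Two things here are assumptions inferred from the numbers rather than stated in the text: that samples are 4-byte single-precision values, and that one trace record per time step is a plane spanning the first two grid dimensions. Together they reproduce the quoted 576KB, 920MB and 709MB values to within rounding. The function name `io_volumes` is hypothetical.

```python
BYTES_PER_SAMPLE = 4   # assumption: single-precision samples
NY, NZ = 1151, 1231    # second and third grid dimensions used throughout this post
TIME_STEPS = 800


def io_volumes(nx, n_nodes):
    """Return (init_read_per_node, trace_read_all_nodes, image_write_per_node) in bytes."""
    volume = nx * NY * NZ * BYTES_PER_SAMPLE      # one full 3D volume
    trace_plane = nx * NY * BYTES_PER_SAMPLE      # assumed size of one trace record
    init_read = 3 * volume / n_nodes              # velocity, delta and epsilon volumes
    # Source and receiver traces for every time step, read in full by every node.
    trace_read = 2 * TIME_STEPS * trace_plane * n_nodes
    image_write = volume / n_nodes
    return init_read, trace_read, image_write


if __name__ == "__main__":
    init_read, trace_read, image_write = io_volumes(125, 1)
    print(f"init read   : {init_read / 1e9:.2f} GB")
    print(f"trace read  : {trace_read / 1e6:.0f} MB")
    print(f"image write : {image_write / 1e6:.0f} MB")
```

The sketch makes the scaling problem visible: the initialization read and image write shrink per node as nodes are added, while the trace read grows with the node count because every node reads the full trace data.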

Key Points and Best Practices

  • Additional evaluations were performed to compare GBit NFS, Infiniband NFS, and Infiniband Lustre for the Reverse Time Migration Initialization. Infiniband NFS was 6x faster than GBit NFS and Infiniband Lustre was 3x faster than Infiniband NFS using the same disk configurations. On 12 nodes for grid size 2486x1151x1231 the initialization time was 22.50 seconds for IB Lustre, 61.03 seconds for IB NFS, and 364.25 seconds for GBit NFS.
  • The Reverse Time Migration computational performance scales nicely as a function of the grid size being processed. This is consistent with the IBM published results for this application.
  • The Total Application performance results are not typically reported in benchmark studies for this application. The IBM report specifically states that the execution times do not include I/O times and non-recurring allocation or initialization delays. Examining the total application performance reveals that the workload is no longer dominated by the partial differential equation (PDE) solver, as IBM suggests, but is constrained by the I/O for grid initialization, reading in the traces, saving/restoring wave state data, and writing out the final image. Aggressive optimization of the PDE solver has little effect on the overall throughput of this application. It is clearly more important to optimize the I/O. The trend in seismic processing, as stated at the 2008 Society of Exploration Geophysicists (SEG) conference, is to run the reverse time migration iteratively on wide azimuth data. Thus, optimizing the I/O and application throughput is imperative to meet this trend. SSD and Flash technologies in conjunction with Sun's Lustre file system can reduce this I/O bottleneck and pave the path for the future in seismic processing.
  • Minimal tuning effort was applied to achieve the results presented. Sun's HPC software stack, which includes the Sun Studio compiler, was used to build the 70000 lines of C++ and Fortran source into the application executable. The only compiler option used was "-fast". No assembly level optimizations, like those performed by IBM to use SIMD registers (SSE registers), were performed in this benchmark. Similarly, no explicit cache blocking, loop unrolling, or memory bandwidth optimizations were conducted. The idea was to demonstrate the performance that a customer can expect from their existing applications without extensive, platform specific optimizations.

See Also

Disclosure Statement

Reverse Time Migration, Results as of 10/23/2009. For more info http://www.caam.rice.edu/tech_reports/2006/TR06-18.pdf

Sunday Oct 25, 2009

A prominent Seismic Processing algorithm, Reverse Time Migration with Optimal Checkpointing, in SMP "THREADS" Mode, was tested using a Sun Fire X4270 server configured with four high performance 15K SAS hard disk drives (HDDs) and a Sun Storage F5100 Flash Array. This benchmark compares I/O devices for checkpointing wave state information while processing a production seismic migration.

  • Sun Storage F5100 Flash Array is 2.2x faster than high-performance 15K RPM disks.

  • Multithreading the checkpointing using the Sun Studio C++ Compiler OpenMP implementation gives a 12.8x performance improvement over the original single threaded version.

These results show that the new trend in seismic processing, running iterative Reverse Time Migrations with migration playback, is a reality. This is made possible through the use of Sun FlashFire technology to provide good checkpointing speeds without additional disk cache memory. The application can take advantage of all the memory within a node without reserving the checkpoint cache buffers that HDDs require for performance. Similarly, larger problem sizes can be solved without increasing the memory footprint of each computational node.

Performance Landscape


Reverse Time Migration Optimal Checkpointing - SMP Threads Mode
Grid Size - 800 x 1151 x 1231 with 800 Samples - 60GB of memory

Number     HDD Time (secs)            F5100 Time (secs)          F5100
Checkpts   Put      Get     Total     Put      Get     Total     Speedup
80          660.8    25.8    686.6    277.4     40.2    317.6    2.2x
400        1615.6   382.3   1997.9    989.5    269.7   1259.2    1.6x


Reverse Time Migration Optimal Checkpointing - SMP Threads Mode
Grid Size - 125 x 1151 x 1231 with 800 Samples - 9GB of memory

Number     HDD Time (secs)            F5100 Time (secs)          F5100
Checkpts   Put      Get     Total     Put      Get     Total     Speedup
80           10.2     0.2     10.4      8.0      0.2      8.2    1.3x
400          52.3     0.4     52.7     45.2      0.3     45.5    1.2x
800         102.6     0.7    103.3     91.8      0.6     92.4    1.1x


Reverse Time Migration Optimal Checkpointing
Single Thread vs Multithreaded I/O Performance
Grid Size - 125 x 1151 x 1231 with 800 Samples - 9GB of memory

Number     Single Thread F5100   Multithreaded F5100   Multithread
Checkpts   Total Time (secs)     Total Time (secs)     Speedup
80         105.3                  8.2                  12.8x
400        482.9                 45.5                  10.6x
800        963.5                 92.4                  10.4x

Note: Hyperthreading and Turbo Mode enabled while running 16 threads per node.

Results and Configuration Summary

Hardware Configuration:

    Sun Fire X4270 Server
      2 x 2.93 GHz Quad-core Intel Xeon X5570 processors
      72 GB memory
      4 x 73 GB 15K SAS drives
        File system striped across 4 15K RPM high-performance SAS HD RAID0
      Sun Storage F5100 Flash Array with local/internal r/w buff 4096
        20 x 24 GB flash modules

Software Configuration:

    OS: 64-bit SUSE Linux Enterprise Server SLES 10 SP 2
    Compiler: Sun Studio 12 C++, Fortran, OpenMP

Benchmark Description

The Reverse Time Migration (RTM) is currently the most popular seismic processing algorithm because of its ability to produce quality images of complex substructures. It can accurately image steep dips that cannot be imaged correctly with traditional Kirchhoff 3D or frequency domain algorithms. The Wave Equation Migration (WEM) can image steep dips but does not produce the image quality that can be achieved by the RTM. However, the increased computational complexity of the RTM over the WEM introduces new areas for performance optimization. The current trend in seismic processing is to perform iterative migrations on wide azimuth marine data surveys using the Reverse Time Migration.

The Reverse Time Migration with Optimal Checkpointing was introduced so large migrations could be performed within minimal memory configurations of x86 cluster nodes. The idea is to only have three wavestate vectors in memory for each of the source and receiver wavefields instead of holding the entire wavefields in memory for the duration of processing. With the Sun Flash F5100, this can be done with little performance penalty to the full migration time. Another advantage of checkpointing is to provide the ability to playback migrations and facilitate iterative migrations.

  • The stored snapshot data can be reprocessed with different filtering, image conditioning, or a variety of other parameters.
  • Fine grain snapshotting can help the processing of more complex subsurface data.
  • A Geoscientist can "playback" a migration from the saved snapshots to visually validate migration accuracy or pick areas of interest for additional processing.

The Reverse Time Migration with Optimal Checkpointing is an algorithm designed by Griewank (Griewank, 1992; Blanch et al., 1998; Griewank, 2000; Griewank and Walther, 2000; Akcelik et al., 2003).

  • The application takes snapshots of wavefield state data for some interval of the total number of samples.
  • This adjoint state method performs crosscorrelation of the source and receiver wavefields at each level.
  • Forward recursion is used for the source wavefield and backward recursion for the receiver wavefield.
  • For relatively small seismic migrations, all of the forward processed state information can be saved and restored with minimal impact on the total processing time.
  • Effectively, the computational complexity increases while the memory requirements decrease by a logarithmic factor of the number of snapshots.
  • Griewank's algorithm helps define the optimal tradeoff between computational performance and the number of memory buffers (memory requirements) to support this cross correlation.

For the purposes of this benchmark, this implementation of the Reverse Time Migration with Optimal Checkpointing does not fully implement the optimal memory buffer scheme proposed by Griewank. The intent is to compare various I/O alternatives for saving wave state data for each node in a compute cluster.

This benchmark measures the time to perform the wave state saves and restores while simultaneously processing the wave state data.
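The save-and-restore pattern described above can be illustrated with a toy example. This is a sketch under stated assumptions, not the benchmark implementation: it uses evenly spaced checkpoints rather than Griewank's optimal schedule, the `step` function is a hypothetical stand-in for one wavefield time step, and all names are invented for illustration. The forward pass keeps only periodic snapshots; the reverse pass restores a snapshot, recomputes one segment forward, and hands the states back newest first, which is the order needed to cross correlate with a backward-propagated receiver wavefield.

```python
def step(state):
    """Toy forward propagator standing in for one wavefield time step (hypothetical)."""
    return (state * 3 + 1) % 1000003


def forward_with_checkpoints(initial, n_steps, interval):
    """Run forward, saving the state only every `interval` steps (evenly spaced)."""
    checkpoints = {0: initial}
    state = initial
    for t in range(1, n_steps + 1):
        state = step(state)
        if t % interval == 0:
            checkpoints[t] = state
    return checkpoints


def states_in_reverse(checkpoints, n_steps, interval):
    """Yield (t, state) for t = n_steps down to 0, one recomputed segment at a time."""
    seg_start = (n_steps // interval) * interval
    seg_end = n_steps
    while seg_start >= 0:
        # Restore the checkpoint and recompute this segment forward.
        segment = [checkpoints[seg_start]]
        for _ in range(seg_end - seg_start):
            segment.append(step(segment[-1]))
        # Hand the states back newest first.
        for offset in range(len(segment) - 1, -1, -1):
            yield seg_start + offset, segment[offset]
        seg_end = seg_start - 1
        seg_start -= interval


if __name__ == "__main__":
    cps = forward_with_checkpoints(7, n_steps=10, interval=4)
    for t, state in states_in_reverse(cps, n_steps=10, interval=4):
        print(t, state)
```

At most one segment of states plus the checkpoints is held at a time, trading extra forward computation for a smaller memory footprint. In the benchmark the checkpoints are written to and read from storage, which is why the put and get times are what is measured.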

Key Points and Best Practices

  • Multithreading the checkpointing using Sun Studio OpenMP and running 16 I/O threads with hyperthreading enabled gives a performance advantage over single threaded I/O to the Sun Storage F5100 flash array. The Sun Storage F5100 flash array can process concurrent I/O requests from multiple threads very efficiently.
  • Allocating the majority of a node's available memory to the Reverse Time Migration algorithm and leaving little memory for I/O caching favors the Sun Storage F5100 flash array over direct attached high performance disk drives. This performance advantage decreases as the number of snapshots increases. The reason for this is that increasing the number of snapshots decreases the memory requirement for the application.
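The idea of issuing a snapshot write from several threads at once can be sketched as follows. The benchmark itself used OpenMP in C++ and Fortran; this Python version only illustrates the pattern of splitting one buffer into chunks and writing them concurrently at their own file offsets. The function name `write_snapshot` is hypothetical, the 16-thread default mirrors the thread count quoted above, and `os.pwrite` is available on Unix-like systems only.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_snapshot(path, data, n_threads=16):
    """Write `data` to `path` using up to `n_threads` concurrent positional writes."""
    chunk = -(-len(data) // n_threads)  # ceiling division
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        def write_chunk(i):
            offset = i * chunk
            view = memoryview(data)[offset:offset + chunk]
            written = 0
            # pwrite may write fewer bytes than requested, so loop until done.
            while written < len(view):
                written += os.pwrite(fd, view[written:], offset + written)

        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            # list() forces completion and re-raises any worker exception.
            list(pool.map(write_chunk, range(n_threads)))
    finally:
        os.close(fd)


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096  # 1 MiB of sample data
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "snapshot.bin")
        write_snapshot(target, payload)
        print(os.path.getsize(target))
```

Positional writes let each thread work on its own region of the file without sharing a file position, so the storage device sees many outstanding requests at once.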

See Also

Disclosure Statement

Reverse Time Migration with Optimal Checkpointing, Results as of 10/23/2009. For more info http://www.caam.rice.edu/tech_reports/2006/TR06-18.pdf

Friday Oct 23, 2009

A fantastic source of technical Best Practices is at
http://wikis.sun.com/display/Performance/Home

This wiki hosts the combined wisdom of many performance engineers from across Sun. It has information about Hardware, Software, ZFS, Oracle and various other performance topics. This wiki attempts to categorize and present information so it is easy to find and use. It is just getting started, so please let us know if there are any topics which would be useful.

Tuesday Oct 20, 2009

An engineer in our group wrote this blog posting:
"Exadata V2... Oracle grid consolidation in a box"

Link:
http://blogs.sun.com/glennf/entry/exadata_v2_oracle_grid_consolidation

Thursday Oct 15, 2009

Overview and Significance of Results

Oracle and Sun's Flash Cache technology combines new features in Oracle with the Sun Storage F5100 to improve database performance. In Oracle databases, the System Global Area (SGA) is a group of shared memory areas that are dedicated to an Oracle “instance” (Oracle processes in execution sharing a database). All Oracle processes use the SGA to hold information. The SGA is used to store incoming data (data and index buffers) and internal control information that is needed by the database. The size of the SGA is limited by the size of the available physical memory.

This benchmark tested and measured the performance of a new Oracle Database 11g (Release 2) feature that extends the SGA size and caching beyond physical memory to a large flash memory storage device such as the Sun Storage F5100 flash array.

One particular benchmark test demonstrated a dramatic performance improvement (almost 5x) using the Oracle Extended SGA feature on flash storage by reaching SGA sizes in the hundreds of GB range, at a more reasonable cost than equivalently sized RAM and with much faster access times than disk I/O.

The workload consisted of a high volume of SQL select transactions accessing a very large table in a typical business oriented OLTP database. To obtain a baseline, throughput and response times were measured applying the workload against a traditional storage configuration and constrained by disk I/O demand (DB working set of about 3x the size of the data cache in the SGA). The workload was then executed with an added Sun Storage F5100 Flash Array configured to contain an Extended SGA of incremental size.

The tests have shown scaling throughput along with increasing Flash Cache size.

Table of Results

F5100 Extended SGA Size (GB)   Query Txns / Min   Avg Response Time (Secs)   Speedup Ratio
No                             76338              0.118                      N/A
25                             169396             0.053                      2.2
50                             224318             0.037                      2.9
75                             300568             0.031                      3.9
100                            357086             0.025                      4.6
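The Speedup Ratio column can be reproduced from the throughput column as shown in this small sketch. The published ratios appear to be truncated to one decimal place rather than rounded (357086 / 76338 is about 4.68, reported as 4.6), so the sketch truncates as well; that reading is an inference from the numbers, and the names used are hypothetical.

```python
import math

BASELINE_TPM = 76338  # query transactions per minute with no Extended SGA

# Extended SGA size in GB -> query transactions per minute, from the table above.
RESULTS = {25: 169396, 50: 224318, 75: 300568, 100: 357086}


def speedup_ratio(tpm, baseline=BASELINE_TPM):
    """Throughput relative to the baseline, truncated to one decimal place."""
    return math.floor(tpm / baseline * 10) / 10


if __name__ == "__main__":
    for size_gb, tpm in RESULTS.items():
        print(size_gb, speedup_ratio(tpm))
```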




Configuration Summary

Server Configuration:

    Sun SPARC Enterprise M5000 Server
    8 x SPARC64 VII 2.4GHz Quad Core
    96 GB memory

Storage Configuration:

    8 x Sun Storage J4200 Arrays, 12x 146 GB 15K RPM disks each (96 disks total)
    1 x Sun Storage F5100 Flash Array

Software Configuration:

    Oracle 11gR2
    Solaris 10

Benchmark Description

The workload consisted of a high volume of SQL select transactions accessing a very large table in a typical business oriented OLTP database.

The database consisted of various tables: Products, Customers, Orders, Warehouse Inventory (Stock) data, etc. and the Stock table alone was 3x the size of the db cache size.

To obtain a baseline, throughput and response times were measured applying the workload against a traditional storage configuration and constrained by disk I/O demand. The workload was then executed with an added Sun Storage F5100 Flash Array configured to contain an Extended SGA of incremental size.

During all tests, the in memory SGA data cache was limited to 25 GB.

The Extended SGA was allocated on a "raw" Solaris Volume created with the Solaris Volume Manager (SVM) on a set of devices (Flash Modules) residing on the Sun Storage F5100 flash array.

Key Points and Best Practices

In order to verify the performance improvement brought by extended SGA, the feature had to be tested with a large enough database size and with a workload requiring significant disk I/O activity to access the data. For that purpose, the size of the database needed to be a multiple of the physical memory size, avoiding the case in which the accessed data could be entirely or almost entirely cached in physical memory.

The above represents a typical “use case” in which the Flash Cache Extension is able to show remarkable performance advantages.

If the DB dataset is already entirely cached, or the DB I/O demand is not significant or the application is already saturating the CPU for non database related processing, or large data caching is not productive (DSS type Queries), the Extended SGA may not improve performance.

It is also relevant to know that additional memory structures needed to manage the Extended SGA are allocated in the “in memory” SGA, therefore reducing its data caching capacity.

Increasing the Extended Cache beyond a specific threshold, dependent on various factors, may reduce the benefit of widening the Flash SGA and actually reduce the overall throughput.

This new cache is somewhat similar architecturally to the L2ARC on ZFS. Once written, flash cache buffers are read-only, and updates are only done into main memory SGA buffers. This feature is expected to primarily benefit read-only and read-mostly workloads.

A typical sizing of database flash cache is 2x to 10x the size of SGA memory buffers. Note that header information is stored in the SGA for each flash cache buffer (100 bytes per buffer in exclusive mode, 200 bytes per buffer in RAC mode), so the number of available SGA buffers is reduced as the flash cache size increases, and the SGA size should be increased accordingly.

Two new init.ora parameters have been introduced, illustrated below:

    db_flash_cache_file = /lfdata/lffile_raw
    db_flash_cache_size = 100G
The db_flash_cache_file parameter takes a single file name, which can be a file system file, a raw device, or an ASM volume. The db_flash_cache_size parameter specifies the size of the flash cache. Note that for raw devices, the partition being used should start at cylinder 1 rather than cylinder 0 (to avoid the disk's volume label).
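The SGA overhead of the flash cache headers can be estimated with a short calculation. This is a rough sketch: the per-buffer header sizes are the ones quoted above, but the 8 KB database block size is an assumption (the post does not state the block size used), and the function name is hypothetical.

```python
# Header bytes kept in the SGA per flash cache buffer, as quoted in this post.
HEADER_BYTES = {"exclusive": 100, "rac": 200}


def flash_header_overhead(flash_cache_bytes, block_size=8192, mode="exclusive"):
    """Return the SGA bytes consumed by headers for a flash cache of the given size.

    block_size is an assumed database block size, one flash cache buffer per block.
    """
    n_buffers = flash_cache_bytes // block_size
    return n_buffers * HEADER_BYTES[mode]


if __name__ == "__main__":
    cache = 100 * 2**30  # db_flash_cache_size = 100G
    for mode in HEADER_BYTES:
        overhead = flash_header_overhead(cache, mode=mode)
        print(f"{mode}: {overhead / 2**30:.2f} GiB of SGA used for headers")
```

Under these assumptions a 100G flash cache costs a little over 1 GiB of SGA in exclusive mode and twice that in RAC mode, which illustrates why the SGA size should grow along with the flash cache.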

See Also

Disclosure Statement

Results as of October 10, 2009 from Sun Microsystems.

Wednesday Oct 14, 2009

Here is a BestPerf blog index to a variety of benchmarks announced at Oracle Open World and others talked about at the conference.

Colors used:

Benchmark
Best Practices
Other

ORACLEOPENWORLD

CMT Servers

Oct 11, 2009 * TPC-C World Record Sun - Oracle *
Oct 13, 2009 Sun T5440 Oracle BI EE Sun T5440 World Record
Oct 13, 2009 SPECweb2005 Sun T5440 World Record, Solaris Containers and Sun Storage F5100
Sep 01, 2009 String Searching - Sun T5240 & T5440 Outperform IBM Cell Broadband Engine
Aug 27, 2009 Sun T5240 Beats 4-Chip IBM Power 570 POWER6 System on SPECjbb2005
Aug 26, 2009 Sun T5220 Sets Single Chip World Record on SPECjbb2005
Aug 12, 2009 SPECmail2009 on Sun T5240 and Sun Java System Messaging Server 6.3
Jul 23, 2009 World Record Performance of Sun CMT Servers
Jul 22, 2009 Why does 1.6 beat 4.7?
Jul 21, 2009 Zeus ZXTM Traffic Manager World Record on Sun T5240
Jul 21, 2009 Sun T5440 World Record SAP-SD 4-Processor Two-tier SAP ERP 6.0 EP4 (Unicode)

SPARC64 Servers

Oct 13, 2009 SAP 2-tier SD Benchmark on Sun M9000/32 SPARC64 VII
Oct 13, 2009 Oracle PeopleSoft Payroll Sun M4000 and Sun Storage F5100 World Record Performance
Oct 12, 2009 Best Practices: M4000 Sun Storage F5100 is a good option for Peoplesoft Payroll
Oct 13, 2009 Oracle Hyperion Sun M5000 and Sun Storage 7410
Oct 13, 2009 SPECcpu2006 Results On MSeries Servers, New SPARC64 VII

X86 Servers

Oct 13, 2009 SAP 2-tier SD-Parallel on Sun Blade X6270 1-node, 2-node and 4-node
Aug 28, 2009 Sun X4270 World Record SAP-SD 2-Processor Two-tier SAP ERP 6.0 EP 4 (Unicode)
Oct 02, 2009 Sun X4270 VMware VMmark benchmark achieves excellent result
Sep 22, 2009 Sun X4270 Virtualized for Two-tier SAP ERP 6.0 EP4 (Unicode) Standard Sales and Distribution Benchmark

HPC Benchmarks

Oct 13, 2009 Halliburton ProMAX Oil & Gas Appl on Sun 6048/X6275 Cluster and Oracle Database
Oct 13, 2009 MCAE ABAQUS faster on Sun F5100 and Sun X4270 - Single Node World Record
Oct 12, 2009 MCAE ANSYS faster on Sun F5100 and Sun X4270
Oct 12, 2009 MCAE MCS/NASTRAN faster on Sun F5100 and Fire X4270
Oct 13, 2009 CP2K Life Sciences, Ab-initio Chem - Sun C48 with Sun Blade X6275 - QDR InfiniBand
Oct 09, 2009 X6275 Cluster Demonstrates Performance and Scalability on WRF 2.5km CONUS Dataset

Specific Storage Benchmarks

Oct 12, 2009 SPC-2 Sun Storage 6180 RAID 5 & RAID 6 Over 70% Better $/Performance than IBM
Oct 12, 2009 SPC-1 Sun Storage 6180 Over 70% Better $/Performance than IBM
Oct 12, 2009 1.6 Million 4K IOPS in 1RU on Sun Storage F5100 Flash Array

Additional CMT Server Benchmarks

Jul 21, 2009 1.6 GHz SPEC CPU2006 - Rate Benchmarks
Jul 21, 2009 Sun Blade T6320 World Record SPECjbb2005 performance
Jul 21, 2009 Sun T5440 SPECjbb2005 Beats IBM POWER6 Chip-to-Chip

Tuesday Oct 13, 2009

The Sun SPARC Enterprise M5000 server with SPARC64 VII processors (configured with 4 CPUs) and Sun Storage 7410 Unified Storage System has achieved exceptional performance for Oracle Hyperion Essbase 11.1.1.3 and Oracle 11g database with hundreds of GB of data, a 15-dimension database, and millions of members, running on free and open Solaris 10.

  • The Sun Storage 7410 Unified Storage System provides more than 20% improvement out of the box compared to a mid-size fiber channel disk array for default aggregation and user based aggregation.

  • The Sun SPARC Enterprise M5000 server with Sun Storage 7410 Unified Storage System and Oracle Hyperion Essbase 11.1.1.3 running on Solaris 10 OS provides < 1sec query response times for 20K users in a 15 dimension database.

  • The Sun Storage 7410 Unified Storage System and Oracle Hyperion Essbase provide the best combination for large Essbase databases, leveraging ZFS and taking advantage of high bandwidth for faster load and aggregation.

Performance Landscape

System         Processor                OS        Storage               Dataload   Def.Agg   UserAgg
Sun SE M5000   4 x 2.4GHz SPARC64 VII   Solaris   Sun Storage 7410      120 min    448 min   17.5 min
Sun SE M5000   4 x 2.4GHz SPARC64 VII   Solaris   Sun StorageTek 6140   128 min    526 min   24.7 min

Results and Configuration Summary

Hardware Configuration:

    1 x Sun SPARC Enterprise M5000 (2.4 GHz/32GB)
    1 x Sun StorageTek 6140 (32 x 146GB)
    1 x Sun Storage 7410 (24TB disk)

Software Configuration:

    Solaris 10 5/09
    Installer V 11.1.1.3
    Oracle Hyperion Essbase Client v 11.1.1.3
    Oracle Hyperion Essbase v 11.1.1.3
    Oracle Hyperion Essbase Administration services 64-bit
    Oracle Weblogic 9.2MP3 -- 64 bit
    Sun's JDK 1.5 Update 19 -- 64-bit
    Oracle RDBMS 11.1.0.7 64-bit
    HP's Mercury Interactive QuickTest Professional 9.0

Benchmark Description

Oracle Hyperion is an OLAP-based analytics application used to analyze highly dimensional, detailed business needs and plans, for example "what-if" analysis to look into the future, multi-user scenario modeling, planning, and customer buying patterns.

The objective of the benchmark is to collect data for the following Oracle Hyperion Essbase benchmark key performance indicators (KPI):
  • Database build time: Time elapsed to build a database including outline and data load.

  • Database Aggregation build time: Time elapsed to build aggregation.

  • Analytic Query Time: With user load increasing through 500, 1000, 2000, 10000, and 20000 users, track the time required to process each query and hence the average analytic query time.

  • Analytic Queries per minute: Number of queries handled by the Essbase server per minute. Track resource usage, i.e. CPU and memory.

The benchmark is based on the data set used by Product Assurance for 2005 Essbase 7.x testing.

    40 flat files of 1.2 GB each, 49.4 GB in total
    10 million rows per file, 400 million rows total
    28 columns of data per row
    Database outline has 15 dimensions (five of them are attribute dimensions)
    Customer dimension has 13.3 million members

Key Points and Best Practices

  • The Sun Storage 7410 was configured with iSCSI.

See Also

Disclosure Statement

Oracle Hyperion Enterprise, www.oracle.com/solutions/mid/oracle-hyperion-enterprise.html, results 10/13/2009.

Tuesday Oct 13, 2009

The Oracle BI EE workload was run on two Sun SPARC Enterprise T5440 servers and achieved world record performance.
  • Two Sun SPARC Enterprise T5440 servers with four 1.6 GHz UltraSPARC T2 Plus processors delivered the best performance of 50K concurrent users on the Oracle BI EE 10.1.3.4 benchmark with Oracle 11g database running on free and open Solaris 10.

  • The two node Sun SPARC Enterprise T5440 servers with Oracle BI EE running on Solaris 10 using 8 Solaris Containers shows 1.8x scaling over Sun's previous one node SPARC Enterprise T5440 server result with 4 Solaris Containers.

  • The two node SPARC Enterprise T5440 servers demonstrated the performance and scalability of the UltraSPARC T2 Plus processor: 50K users can be serviced with 0.2776 sec response time.

  • The Sun SPARC Enterprise T5220 server was used as an NFS server with 4 internal SSDs and the ZFS file system which showed significant I/O performance improvement over traditional disk for Business Intelligence Web Catalog activity.

  • IBM has not published any POWER6 processor based results on this important benchmark.

Performance Landscape

System                           Chips   GHz   Processor Type       Users
2 x Sun SPARC Enterprise T5440   8       1.6   UltraSPARC T2 Plus   50,000
1 x Sun SPARC Enterprise T5440   4       1.6   UltraSPARC T2 Plus   28,000
5 x Sun Fire T2000               1       1.2   UltraSPARC T1        10,000

Results and Configuration Summary

Hardware Configuration:

    2 x Sun SPARC Enterprise T5440 (1.6GHz/128GB)
    1 x Sun SPARC Enterprise T5220 (1.2GHz/64GB) and 4 SSDs (used as NFS server)

Software Configuration:

    Solaris10 05/09
    Oracle BI EE 10.1.3.4
    Oracle 11gR1

Benchmark Description

The objective of this benchmark is to highlight how Oracle BI EE can support pervasive deployments in large enterprises, using minimal hardware, by simulating an organization that needs to support more than 25,000 active concurrent users, each operating in mixed mode: ad-hoc reporting, application development, and report viewing.

The user population was divided into a mix of administrative users and business users. A maximum of 28,000 concurrent users were actively interacting and working in the system during the steady-state period. The tests executed 580 transactions per second, with think times of 60 seconds per user, between requests. In the test scenario 95% of the workload consisted of business users viewing reports and navigating within dashboards. The remaining 5% of the concurrent users, categorized as administrative users, were doing application development.

The benchmark scenario used a typical business user sequence of dashboard navigation, report viewing, and drill down. For example, a Service Manager logs into the system and navigates to his own set of dashboards, viz. "Service Manager". The user then selects the "Service Effectiveness" dashboard, which shows him four distinct reports: "Service Request Trend", "First Time Fix Rate", "Activity Problem Areas", and "Cost Per completed Service Call - 2002 till 2005". The user then proceeds to view the "Customer Satisfaction" dashboard, which also contains a set of 4 related reports. He then proceeds to drill down on some of the reports to see the detail data. Then the user proceeds to more dashboards, for example "Customer Satisfaction" and "Service Request Overview". After navigating through these dashboards, he logs out of the application.

This benchmark did not use a synthetic database schema. The benchmark tests were run on a full production version of the Oracle Business Intelligence Applications with a fully populated underlying database schema. The business processes in the test scenario closely represents a true customer scenario.

See Also

Disclosure Statement

Oracle BI EE benchmark results 10/13/2009, see

Tuesday Oct 13, 2009

The Sun SPARC Enterprise T5440 server with 1.6GHz UltraSPARC T2 Plus with Solaris Containers, Sun Flash Open Storage, and Sun JAVA System Web Server 7.0 Update 5 achieved World Record SPECweb2005.
  • Sun has obtained a World Record SPECweb2005 performance result of 100,209 SPECweb2005 on the Sun SPARC Enterprise T5440, running Solaris 10 10/09, Sun JAVA System Web Server 7.0 Update 5, and Java Hotspot™ Server VM.

  • This result demonstrates performance leadership of the Sun SPARC Enterprise T5440 server and its scalability, by using Solaris Containers to consolidate multiple web serving environments, and Sun OpenStorage Flash technology to store large datasets for fast data retrieval.

  • The Sun SPARC Enterprise T5440 delivers 21% greater SPECweb2005 performance than the HP DL370 G6 with 3.2GHz Xeon W5580 processors.

  • The Sun SPARC Enterprise T5440 delivers 40% greater SPECweb2005 performance than the HP DL 585 G5 with four 3.114 GHz Opteron 8393 SE processors.

  • The Sun SPARC Enterprise T5440 delivers 2x the SPECweb2005 performance of the HP DL 580 G5 with four 2.66GHz Xeon X7460 processors.

  • There are no IBM Power6 results on the SPECweb2005 benchmark.

  • This benchmark result clearly demonstrates that the Sun SPARC Enterprise T5440 running Solaris 10 10/09 and Sun Java System Webserver 7.0 Update 5 can support thousands of concurrent web server sessions, and that the Sun solution is an industry leader in web serving.

Performance Landscape

Server        Processor        SPECweb2005   Banking*   Ecomm*    Support*   Webserver        OS
Sun T5440     4x 1.6 T2 Plus   100,209       176,500    133,000   95,000     Java WebServer   Solaris
HP DL370 G6   2x 3.2 W5580     83,073        117,120    142,080   76,352     Rock             RedHat Linux
HP DL585 G5   4x 3.11 O8393    71,629        117,504    123,072   56,320     Rock             RedHat Linux
HP DL580 G5   4x 2.66 X7460    50,013        97,632     69,600    40,800     Rock             RedHat Linux

* Banking - SPECweb2005-Banking
   Ecomm - SPECweb2005-Ecommerce
   Support - SPECweb2005-Support

Results and Configuration Summary

Hardware Configuration:

  1 Sun SPARC Enterprise T5440 with

  • 4 x UltraSPARC T2 Plus Processor 8 core, 64 threads, 1.6 GHz
  • 254 GB memory
  • 6 x 4Gb PCI Express 8-Port Host Adapter (SG-XPCIE8SAS-E-Z)
  • 1 x Sun Storage F5100 Flash Array (TA5100RASA4-80AA)
  • 1 x Sun Storage F5100 Flash Array (TA5100RASA4-40AA)

Server Software Configuration:

  • Solaris 10 10/09
  • JAVA System Web Server 7.0 Update 5
  • Java Hotspot™ Server VM

Network configuration:

  • 1 x Arista DCS-7124s 24-10GbE port  switch
  • 1 x Cisco 2970 series (WS-C2970G-24TS-E) switch for the three 1 GbE networks

Back-end Simulator:

  1 Sun Fire X4270 with

  • 2 x 2.93 GHz Intel X5570 Quad core
  • 48GB memory
  • Solaris 10 10/09
  • JSWS 7.0 Update 5
  • Java Hotspot™ Server VM

Clients:

  8 Sun Blade™ T6320

  • 1 x 1.417 GHz UltraSPARC-T2
  • 64 GB memory
  • Solaris 10 5/09
  • Java Hotspot™ Server VM

  8 Sun Blade™ 6270

  • 2 x 2.93 GHz Intel X5570 Quad core
  • 36 GB memory
  • Solaris 10 5/09
  • Java Hotspot™ Server VM

Benchmark Description

SPECweb2005, successor to SPECweb99 and SPECweb99_SSL, is an industry standard benchmark for evaluating Web Server performance developed by SPEC. The benchmark simulates multiple user sessions accessing a Web Server and generating static and dynamic HTTP requests. The major features of SPECweb2005 are:

  • Measures simultaneous user sessions
  • Dynamic content: currently PHP and JSP implementations
  • Page images requested using 2 parallel HTTP connections
  • Multiple, standardized workloads: Banking (HTTPS), E-commerce (HTTP and HTTPS), and Support (HTTP)
  • Simulates browser caching effects
  • File accesses more accurately simulate today's disk access patterns

Key Points and Best Practices

  • The server was divided into four Solaris Containers and a single web server instance was executed in each container.
  • Four processor sets were created (with varying numbers of threads depending on the workload) to run the web server in. This was done to reduce memory access latency using the physical memory closest to the processor.  All interrupts were run on the remaining threads.
  • Each web server is executed in the FX scheduling class to improve performance by reducing the frequency of context switches.
  • Two Sun Storage F5100 Flash Arrays (holding the target file set and logs) were shared by the four containers  for fast data retrieval.   
  • Use of Solaris Containers highlights the consolidation of multiple web serving environments on a single server.
  • Use of the Sun Ext I/O Expansion unit and Sun Storage F5100 Flash Arrays highlight the expandability of the server.

    Disclosure Statement

    Sun SPARC Enterprise T5440 (32 cores, 4 chips) 100,209 SPECweb2005, was submitted to SPEC for review on October 13, 2009.  HP ProLiant DL370 G6 (8 cores, 2 chips) 83,073 SPECweb2005. HP ProLiant DL585 G5 (16 cores, 4 chips) 71,629 SPECweb2005. HP ProLiant DL580 G5 (24 cores, 4 chips) 50,013 SPECweb2005. SPEC, SPECweb reg tm of Standard Performance Evaluation Corporation. Results from www.spec.org as of Oct 10, 2009.

    Tuesday Oct 13, 2009

    The Sun SPARC Enterprise M4000 server, combined with Sun FlashFire technology in the form of the Sun Storage F5100 flash array, has produced world-record performance on the PeopleSoft Payroll (North America) 9.0 benchmark.

    • A Sun SPARC Enterprise M4000 server with four new 2.53GHz SPARC64 VII processors and a Sun Storage F5100 flash array is 33% faster than the HP rx6600 (4 x 1.6GHz Itanium2 processors) on the PeopleSoft Payroll (NA) 9.0 benchmark. The Sun solution used the Oracle 11g database running on Solaris 10.

    • The Sun SPARC Enterprise M4000 server with four 2.53GHz SPARC64 VII processors and the Sun Storage F5100 flash array is 35% faster than the 2027 MIPS IBM Z990 (6 Z990 Gen1 processors) on the PeopleSoft Payroll (NA) 9.0 benchmark. The Sun result used the Oracle 11g database running on Solaris 10; the IBM result used IBM DB2 for Z/OS 8.1 as the database.

    • The Sun SPARC Enterprise M4000 server with four 2.53GHz SPARC64 VII processors and a Sun Storage F5100 flash array processed 250K employee payroll checks using PeopleSoft Payroll (NA) 9.0 and Oracle 11g running on Solaris 10. Four different execution strategies were run, with an average improvement of 25% compared to HP's results on the rx6600. Sun achieved these results with 8 concurrent jobs using only 25% CPU utilization, while HP required 16 concurrent jobs at 88% CPU utilization.

    • The Sun SPARC Enterprise M4000 server combined with Sun FlashFire technology processed 8 sequential jobs under a single run control in a total time of 527.85 minutes, an improvement of 20% over HP's time of 633.09 minutes.

    • The Sun SPARC Enterprise M4000 server combined with Sun FlashFire technology demonstrated a speedup of 81% going from 1 to 8 streams on the PeopleSoft Payroll (NA) 9.0 benchmark using the Oracle 11g database.

    • Sun FlashFire technology dramatically improves I/O performance for the PeopleSoft Payroll benchmark, providing a significant performance boost over the best optimized configuration of FC disks (60+).

    • The Sun Storage F5100 Flash Array is a high-performance, high-density solid-state flash array. It provides a read latency of only 0.5 msec, about 10 times lower than the typical disk latency of 5 msec measured on this benchmark.

    • Sun estimates that the MIPS rating for a Sun SPARC Enterprise M4000 server is over 2742 MIPS.
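The percentage claims above can be cross-checked against the run times reported in the 250K Employees results and in the Disclosure Statement. The following is a minimal sketch, assuming that "X% faster" means the ratio of the two run times minus one; the source does not state the formula used, and the names `percent_faster`, `RUN1_MIN` and `RUN3_MIN` are illustrative only. For the HP rx6600, the Run 1 time cited in the Disclosure Statement (105.70 min) is used.

```python
# Run times in minutes, copied from the 250K Employees results.
RUN1_MIN = {"Sun M4000": 79.35, "HP rx6600": 105.70, "IBM Z990": 107.34}
RUN3_MIN = {"Sun M4000": 527.85, "HP rx6600": 633.09}


def percent_faster(baseline_min: float, new_min: float) -> float:
    """Return how much faster new_min is than baseline_min, in percent.

    Assumption: "faster" is the ratio of run times minus one.
    """
    return (baseline_min / new_min - 1.0) * 100.0


def latency_ratio(slow_msec: float, fast_msec: float) -> float:
    """Return how many times lower the fast latency is than the slow one."""
    return slow_msec / fast_msec


if __name__ == "__main__":
    sun_run1 = RUN1_MIN["Sun M4000"]
    sun_run3 = RUN3_MIN["Sun M4000"]
    print(f"vs HP rx6600, Run 1: {percent_faster(RUN1_MIN['HP rx6600'], sun_run1):.1f}%")
    print(f"vs IBM Z990, Run 1:  {percent_faster(RUN1_MIN['IBM Z990'], sun_run1):.1f}%")
    print(f"vs HP rx6600, Run 3: {percent_faster(RUN3_MIN['HP rx6600'], sun_run3):.1f}%")
    print(f"flash vs disk read latency: {latency_ratio(5.0, 0.5):.0f}x")
```

Under this assumption the three comparisons round to 33%, 35% and 20%, in line with the figures quoted in the bullets.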

    Performance Landscape

    250K Employees

    System     Processor               OS/Database         Run 1 (min)  Run 2 (min)  Run 3 (min)  Version
    Sun M4000  4x 2.53GHz SPARC64 VII  Solaris/Oracle 11g        79.35       288.47       527.85  9.0
    HP rx6600  4x 1.6GHz Itanium2      HP-UX/Oracle 11g          81.17       350.16       633.25  9.0
    IBM Z990   6x Gen1 2027 MIPS       Z/OS/DB2                 107.34       328.66       544.80  9.0
    HP rx6600  4x 1.6GHz Itanium2      HP-UX/Oracle 11g         105.70       369.59       633.09  9.0

    Note: IBM benchmark documents show that 6 Gen1 processors are rated at 2027 MIPS. This configuration contained 13 Gen1 processors, but only 6 were available for testing.
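The source does not say how Sun derived its estimate of over 2742 MIPS for the M4000. One plausible reconstruction, offered here only as a sketch, is to scale the IBM rating of 2027 MIPS by the ratio of the Run 1 times, assuming throughput is inversely proportional to run time. The function name `equivalent_mips` is hypothetical.

```python
# Values copied from the 250K Employees results and the note on the IBM rating.
IBM_MIPS = 2027.0        # rating of the 6 Gen1 processors used in the IBM run
IBM_RUN1_MIN = 107.34    # IBM Z990 Run 1 time, minutes
SUN_RUN1_MIN = 79.35     # Sun M4000 Run 1 time, minutes


def equivalent_mips(ref_mips: float, ref_minutes: float, minutes: float) -> float:
    """Scale a reference MIPS rating by the ratio of run times.

    Assumption: the same workload finishing in less time implies a
    proportionally higher MIPS-equivalent rating.
    """
    return ref_mips * ref_minutes / minutes


if __name__ == "__main__":
    estimate = equivalent_mips(IBM_MIPS, IBM_RUN1_MIN, SUN_RUN1_MIN)
    print(f"Estimated M4000 rating: {estimate:.0f} MIPS")
```

With these inputs the scaled value comes out at roughly 2742, which agrees with the quoted estimate, though this agreement does not confirm that Sun used the same method.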

    500K Employees

    System     Processor           OS/Database       Run 1 (min)  Run 2 (min)  Run 3 (min)  Version
    HP rx7640  8x 1.6GHz Itanium2  HP-UX/Oracle 11g       133.63       712.72      1665.01  9.0

    Results and Configuration Summary

    Hardware Configuration:

      1 x Sun SPARC Enterprise M4000 (4 x 2.53 GHz/32GB)
      1 x Sun Storage F5100 Flash Array (40 x 24GB FMODs)
      1 x Sun Storage J4200 (12 x 450GB SAS 15K RPM)

    Software Configuration:

      Solaris 10 5/09
      Oracle PeopleSoft HCM 9.0
      Oracle PeopleSoft Enterprise (PeopleTools) 8.49
      Micro Focus Server Express 4.0 SP4
      Oracle RDBMS 11.1.0.7 64-bit
      HP's Mercury Interactive QuickTest Professional 9.0

    Benchmark Description

    The PeopleSoft 9.0 Payroll (North America) benchmark is a performance benchmark established by PeopleSoft to demonstrate system performance for a range of processing volumes in a specific configuration. This information may be used to determine the software, hardware, and network configurations necessary to support processing volumes. This workload represents large batch runs typical of OLTP workloads during a mass update.

    The benchmark measures the run times of five application business processes for a database representing a large organization. The five processes are:

    • Paysheet Creation: Generates a payroll data worksheet for employees, consisting of standard payroll information for each employee for a given pay cycle.

    • Payroll Calculation: Reads the Paysheets and calculates checks for those employees.

    • Payroll Confirmation: Takes information generated by Payroll Calculation and updates the employees' balances with the calculated amounts.

    • Print Advice Forms: Takes the information generated by Payroll Calculation and Payroll Confirmation and produces an Advice for each employee to report earnings, taxes, deductions, etc.

    • Create Direct Deposit File: Takes the information generated by the above processes and produces an electronic transmittal file used to transfer payroll funds directly into an employee's bank account.

    For the benchmark, we collect at least four data points with different numbers of job streams (parallel jobs). This batch benchmark allows a maximum of eight job streams to be configured to run in parallel.

    Key Points and Best Practices

    See Also

    Disclosure Statement

    Oracle PeopleSoft Payroll (NA) 9.0 benchmark, Sun M4000 (4 2.53GHz SPARC64) 79.35 min, IBM Z990 (6 gen1) 107.34 min, HP rx6600 (4 1.6GHz Itanium2) 105.70 min, www.oracle.com/apps_benchmark/html/white-papers-peoplesoft.html Results 10/13/2009.

    This blog copyright 2009 by John Henning