Thursday Apr 17, 2008

For the first time, MySQL includes Dtrace probes in the 6.0 release. On platforms that support Dtrace you can still find out a lot about what's happening, both in the Operating System kernel and in user processes, even without probes in the application. But carefully placed Dtrace probes inserted into the application code can give you a lot more information about what's going on, because they can be mapped to the application functionality. So far only a few probes have been included, but expect more to be added soon.

I decided to take the new probes for a spin. Oh, and rather than do it on a Solaris system, I figured I'd give it a shot on my Intel Core 2 Duo MacBook Pro, since Mac OS X 10.5 (Leopard) supports Dtrace.

To begin with I pulled down and built MySQL 6.0.5 from BitKeeper, thanks to some help from Brian Aker. Here's where, and here's how. I needed to make a couple of minor changes to the dtrace commands in the Makefiles to get it to compile - hopefully that will be fixed pretty soon in the source. Before long I had mysqld running. A quick "dtrace -l | grep mysql" confirmed that I did indeed have the Dtrace probes.

There are lots of interesting possibilities, but for now I'll just run a couple of simple scripts to illustrate the basics.

First, I ran this D script:

#!/usr/sbin/dtrace -s

:mysqld::
{
	printf("%d\n", timestamp);
}

Running "select count(*) from latest" from mysql in another window yielded the following:

dtrace: script './all.d' matched 16 probes
CPU     ID                    FUNCTION:NAME
  0  18665 _ZN7handler16ha_external_lockEP3THDi:external_lock 26152931955098

  0  18673 _Z13handle_selectP3THDP6st_lexP13select_resultm:select_start 26152931997414

  0  18665 _ZN7handler16ha_external_lockEP3THDi:external_lock 26153878060162

  0  18672 _Z13handle_selectP3THDP6st_lexP13select_resultm:select_finish 26153878082583

So the count(*) took out a couple of locks, and we got to see the start and end of the select with a timestamp (in nanoseconds). Just for interest, I ran the count(*) a second time, and this time none of the probes fired - the query was being satisfied from the query cache.
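
As a quick check on the units, the two select timestamps can be subtracted with a few lines of Python. This is only a sketch: the values are copied from the output above, and it relies on the Dtrace timestamp variable counting nanoseconds. The function and constant names are my own.

```python
# Timestamps copied from the select_start / select_finish lines above.
SELECT_START_NS = 26152931997414
SELECT_FINISH_NS = 26153878082583

NS_PER_SECOND = 1_000_000_000


def elapsed_seconds(start_ns: int, finish_ns: int) -> float:
    """Convert a pair of nanosecond timestamps to elapsed seconds."""
    return (finish_ns - start_ns) / NS_PER_SECOND


elapsed = elapsed_seconds(SELECT_START_NS, SELECT_FINISH_NS)
print(f"select took {elapsed:.3f} s")  # prints "select took 0.946 s"
```

So the uncached count(*) took a little under a second.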

Next I decided to try "show indexes from latest;". The result was as follows:

mysql> show indexes from latest;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| latest |          0 | latest_x |            1 | hostid      | A         |      356309 |     NULL | NULL   | YES  | BTREE      |         |               | 
| latest |          0 | latest_x |            2 | exdate      | A         |      356309 |     NULL | NULL   | YES  | BTREE      |         |               | 
| latest |          0 | latest_x |            3 | extime      | A         |      356309 |     NULL | NULL   | YES  | BTREE      |         |               | 
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.15 sec)

Here's the result from Dtrace:

  0  18673 _Z13handle_selectP3THDP6st_lexP13select_resultm:select_start 27114155991557

  0  18670 _ZN7handler12ha_write_rowEPh:insert_row_start 27114308499688

  0  18669 _ZN7handler12ha_write_rowEPh:insert_row_finish 27114308553968

  0  18670 _ZN7handler12ha_write_rowEPh:insert_row_start 27114308565086

  0  18669 _ZN7handler12ha_write_rowEPh:insert_row_finish 27114308588605

  0  18670 _ZN7handler12ha_write_rowEPh:insert_row_start 27114308598685

  0  18669 _ZN7handler12ha_write_rowEPh:insert_row_finish 27114308622164

  0  18672 _Z13handle_selectP3THDP6st_lexP13select_resultm:select_finish 27114308714705

So three rows were inserted, presumably into a temporary table, corresponding to the three index columns. Dtrace shows that the query cache isn't used when you rerun this particular query.

Next I ran the following D script:

#!/usr/sbin/dtrace -s

:mysqld::*_start
{
	self->ts = timestamp;
}

:mysqld::*_finish
/self->ts/
{
	printf("%d\n", timestamp - self->ts);
}

and reran the "show indexes" command. Here's the result:

CPU     ID                    FUNCTION:NAME
  1  18669 _ZN7handler12ha_write_rowEPh:insert_row_finish 61634

  1  18669 _ZN7handler12ha_write_rowEPh:insert_row_finish 22040

  1  18669 _ZN7handler12ha_write_rowEPh:insert_row_finish 21058

  1  18672 _Z13handle_selectP3THDP6st_lexP13select_resultm:select_finish 88933

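This time we see just one line for each insert and select, including an elapsed time (in nanoseconds) rather than a timestamp. One caveat: all of the probes share the single self->ts variable, so each insert_row_start overwrites the value saved at select_start. The insert figures are accurate, but the select_finish figure measures the time since the last insert_row_start, not the duration of the whole select (which is why it is so much smaller than the 0.15 sec reported by mysql).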
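
To see what a single saved timestamp does when probes nest, here is a small Python simulation run against the timestamps from the earlier "show indexes" trace. It is a sketch rather than a Dtrace script: shared_variable() mimics one self->ts being overwritten by every *_start probe, and per_operation() keeps one saved timestamp per operation. Both function names are hypothetical.

```python
# (probe name, timestamp in ns) pairs copied from the earlier "show indexes" trace.
TRACE = [
    ("select_start", 27114155991557),
    ("insert_row_start", 27114308499688),
    ("insert_row_finish", 27114308553968),
    ("insert_row_start", 27114308565086),
    ("insert_row_finish", 27114308588605),
    ("insert_row_start", 27114308598685),
    ("insert_row_finish", 27114308622164),
    ("select_finish", 27114308714705),
]


def shared_variable(trace):
    """Mimic the D script: every *_start overwrites one saved timestamp."""
    ts = None
    results = []
    for name, now in trace:
        if name.endswith("_start"):
            ts = now
        elif name.endswith("_finish") and ts:
            results.append((name, now - ts))
    return results


def per_operation(trace):
    """Keep one saved timestamp per operation (select, insert_row, ...)."""
    started = {}
    results = []
    for name, now in trace:
        operation, _, phase = name.rpartition("_")
        if phase == "start":
            started[operation] = now
        elif phase == "finish" and operation in started:
            results.append((name, now - started.pop(operation)))
    return results


for label, results in (("shared", shared_variable(TRACE)),
                       ("per-operation", per_operation(TRACE))):
    print(label)
    for name, elapsed_ns in results:
        print(f"  {name:<18} {elapsed_ns:>12,} ns")
```

With the shared variable, select_finish comes out at 116,020 ns; tracked separately, it is 152,723,148 ns, which agrees with the 0.15 sec reported by mysql. In a D script the same effect could be had with separate thread-local variables for the select and insert probes.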

Even in the more compact form, this would get a bit verbose for some operations, though, so I ran the following D script:

#!/usr/sbin/dtrace -s

:mysqld::*_start
{
	self->ts = timestamp;
}

:mysqld::*_finish
/self->ts/
{
	@completion_time[probename] = quantize(timestamp - self->ts);
}

After repeating the "show indexes", Dtrace returned the following after a Control-C:

dtrace: script './all3.d' matched 12 probes
^C

  insert_row_finish                                 
           value  ------------- Distribution ------------- count    
            8192 |                                         0        
           16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@              2        
           32768 |@@@@@@@@@@@@@                            1        
           65536 |                                         0        

  select_finish                                     
           value  ------------- Distribution ------------- count    
           32768 |                                         0        
           65536 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1        
          131072 |                                         0        

This actually gives more info, but in a fairly compact form. Thanks to the quantize() function, we see a histogram with a count for each operation alongside the time taken. Each row is a power-of-two bucket, so two of the inserts completed in somewhere between 16384 and 32767 nanoseconds, and one in somewhere between 32768 and 65535 nanoseconds.
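
For anyone curious how a value ends up in a particular row, the bucketing can be sketched in Python. This is my own approximation of the power-of-two grouping for non-negative values, not Dtrace's implementation, and quantize_bucket() is a hypothetical name. The sample values are the insert times printed by the previous script (a different run from the histogram above, so only the shape is comparable).

```python
from collections import Counter


def quantize_bucket(value: int) -> int:
    """Return the lower bound of the power-of-two bucket holding value."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    if value == 0:
        return 0
    # The highest set bit gives the largest power of two <= value.
    return 1 << (value.bit_length() - 1)


# Elapsed times (ns) printed by the previous script for insert_row_finish.
insert_times_ns = [61634, 22040, 21058]

histogram = Counter(quantize_bucket(t) for t in insert_times_ns)
for bucket in sorted(histogram):
    print(f"{bucket:>8} {histogram[bucket]}")
```

Two values land in the 16384 row and one in the 32768 row, the same shape as the insert_row_finish histogram.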

Finally, I ran a "show tables;", which returned 48 rows, and the following from Dtrace:

dtrace: script './all3.d' matched 12 probes
^C

  select_finish                                     
           value  ------------- Distribution ------------- count    
           32768 |                                         0        
           65536 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1        
          131072 |                                         0        

  insert_row_finish                                 
           value  ------------- Distribution ------------- count    
            4096 |                                         0        
            8192 |@@@@@@@@@@@@                             14       
           16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@             33       
           32768 |@                                        1        
           65536 |                                         0        

The report shows the 48 inserts, one for each row returned. It would have been a lot more difficult to read with the first couple of D scripts, but the quantize() function offers a nice summary for no effort.

This is barely scratching the surface of what Dtrace can do, of course. But hopefully I've whetted your appetite. Why don't you check it out for yourself?

Wednesday Apr 09, 2008

Sun engineers blog on the new multi-chip UltraSPARC T2 Plus systems

Today Sun is announcing new CMT-based systems, hard on the heels of the UltraSPARC T2 systems launched in October 2007 (the Sun SPARC Enterprise T5120 and T5220 systems). Whereas previous Sun CMT systems were based around a single-socket UltraSPARC T1 or T2 processor, the new systems incorporate two processors, doubling the number of cores and the number of hardware threads compared to UltraSPARC T2-based systems. Each UltraSPARC T2 Plus chip includes 8 hardware strands in each of 8 cores, so the Operating System sees a total of 128 CPUs. The new systems deliver an unprecedented amount of CPU capacity in a package this size, as evidenced by the very impressive benchmark results published today.

Systems come in both 1U and 2U packaging: the 1U Sun SPARC Enterprise T5140 ships with two UltraSPARC T2 Plus chips, each with 4, 6, or 8 cores at 1.2 GHz, and the 2U Sun SPARC Enterprise T5240 ships with two UltraSPARC T2 Plus chips, each with 6 or 8 cores at 1.2 GHz, or 8 cores at 1.4 GHz. For more information about the systems, a whitepaper is available which provides details on the processors and systems.

Once again, some of the engineers who have worked on these new systems have shared their experiences and insights in a series of wide-ranging blogs (for engineers' perspectives on the UltraSPARC T2 systems, check out the CMT Comes of Age blog). These blogs will be cross referenced here as they are posted. You should expect to see more appear in the next day or two, so plan on visiting again later to see what's new.

Here's what the engineers have to say:

  • UltraSPARC T2 Plus Server Technology. Tim Cook offers insights into what drove processor design toward CMT. Marc Hamilton serves up a brief overview of CMT for those less familiar with the technology. Dwayne Lee touches on the UltraSPARC T2 Plus chip. Josh Simons offers us a look under the hood of the new servers. Denis Sheahan provides an overview of the hardware components of the UltraSPARC T2 Plus systems, then follows it up with details of the memory and coherency of the UltraSPARC T2 Plus processor. Lawrence Spracklen introduces the crypto acceleration on the chip. Richard Elling talks about RAS (Reliability, Availability, and Serviceability) in the systems, and Scott Davenport describes their predictive self-healing features.
  • Virtualization. Honglin Su announces the availability of the Logical Domains 1.0.2 release, which supports the UltraSPARC T2 Plus platforms. Eric Sharakan offers further observations on LDoms on T5140 and T5240. Ning Sun discusses a study designed to show how LDoms with CMT can improve scalability and system utilization, and points to a Blueprint on the issue.
  • Solaris Features. Steve Sistare outlines some of the changes made to Solaris to support scaling on large CMT systems.
  • System Performance. Peter Yakutis offers insights into PCI-Express performance. Brian Whitney shares some Stream benchmark results. Alan Chiu explains 10 Gbit Ethernet performance on the new systems. Charles Suresh gives some fascinating background into how line speed was achieved on the 10 Gbit Ethernet NICs.
  • Application Performance. What happens when you run Batch Workloads on a Sun CMT server? Giri Mandalika's blog shares experiences running the Oracle E-Business Suite Payroll 11i workload. You might also find Satish Vanga's blog interesting - it focusses on SugarCRM running on MySQL (on a single-socket T5220). Josh Simons reveals the credentials of the new systems for HPC applications and backs it up by pointing to a new SPEComp2001 world record. Joerg Schwarz considers the applicability of the UltraSPARC T2 Plus servers for Health Care applications.
  • Web Tier. CVR explores the new World Record SPECweb2005 result on the T5220 system, and Walter Bays teases out the subtleties of the SPEC reporting process.
  • Java Performance. Dave Dagastine announces a World Record SPECjbb2005 result.
  • Benchmark Performance. The irrepressible bmseer details a number of world record results, including SPECjAppServer2004, SAP-SD 2 tier, and SPECjbb2005.
  • Open Source Community. Josh Berkus explores the implications for PostgreSQL using virtualization on the platform. Jignesh Shah discusses the possibilities with Glassfish V2 and PostgreSQL 8.3.1 on the T5140 and T5240 systems.
  • Sizing. Walter Bays introduces the CoolThreads Selection Tool (cooltst) v3.0 which is designed to gauge how well workloads will run on UltraSPARC T2 Plus systems.

Check out also the Sun CMT Wiki.

Tuesday Feb 26, 2008

In this blog I'm sharing the results of a series of tests designed to explore the impact of various MySQL and, in particular, InnoDB tunables. Performance engineers from Sun have previously blogged on this subject - the main difference in this case is that these latest tests were based on Linux rather than Solaris.

It's worth noting that MySQL throughput doesn't scale linearly as you add large numbers of CPUs. This hasn't been a big issue to most users, since there are ways of deploying MySQL successfully on systems with only modest CPU counts. Technologies that are readily available and widely deployed include replication, which allows horizontal scale-out using query slaves, and memcached, which is very effective at reducing the load on a MySQL server. That said, scalability is likely to become more important as people increasingly deploy systems with quad-core processors, with the result that even two processor systems will need to scale eight ways to fully utilize the available CPU resources.

The obvious question is whether performance and scalability are going to attract the attention of a joint project involving the performance engineering groups at MySQL and Sun. You bet! Fruitful synergies should be possible as the two companies join forces. And in case you're wondering, Linux will be a major focus, not just Solaris - regard this blog as a small foretaste. Stay tuned in the months to come...

Test Details

On to the numbers. The tests were run on a Sun Fire X4150 server with two quad-core Intel Xeon processors (8 cores in total) and a Sun Fire X4450 server with four quad-core Intel Xeon processors (16 cores in total) running Red Hat Enterprise Linux 5.1. The workload was Sysbench with 10 million rows, representing a database about 2.5Gbytes in size, using the current 64-bit Community version of MySQL, 5.0.51a. My colleague Neel has blogged on the workload and how we used it. The graphs below do not list throughput values, since the goal was only to show relative performance improvements.

The first test varied innodb_thread_concurrency. In MySQL 5.0.7 and earlier, a value greater than 500 was required to allocate an unlimited number of threads. As of MySQL 5.0.18, a value of zero means unlimited threads. In the graph below, a value of zero clearly delivers better throughput beyond 4 threads for the read-only test.

The read-write test, however, benefits from a setting of 8 threads. These graphs show the throughput on the 8-core system, although both the 8- and the 16-core systems showed similar behavior for each of the read-only and the read-write tests.

The following graphs show the effect of increasing the InnoDB buffer cache with the innodb_buffer_pool_size parameter. The first graph shows read-only performance and the second shows read-write performance. As you would expect, throughput increases significantly as the cache increases in size, but eventually reaches a point where no benefit is derived from further increases.

Finally, we've seen that throughput is affected by the amount of memory we assign to the InnoDB buffer cache. But since the default Linux file system, ext3, also caches pages, why not let Linux do the caching rather than InnoDB? To test this, we tried comparing throughput with and without Linux file system caching. Setting the innodb_flush_method parameter to O_DIRECT will cause MySQL to bypass the file system cache. The results are shown in the graph below. Clearly the file system cache makes a difference, because throughput with the InnoDB buffer cache set to 1024 Mbytes supported by the file system cache is about as good as throughput with no file system caching and the InnoDB buffer cache set to 2048 Mbytes. But while the Linux file system cache can help protect you somewhat if you undersize your InnoDB buffer cache, for optimal performance, it's important to give the InnoDB buffer cache as much memory as it needs. Bypassing the Linux file system cache may not be a good idea unless you have properly sized the InnoDB buffer cache - disk read activity was very high when the buffer cache was too small and the file system cache was being bypassed. We also found that the CPU cost per transaction was higher when the InnoDB buffer cache was too small. That's not surprising, since the code path is longer when MySQL has to go outside the buffer cache to retrieve a block.

We tested a number of other parameters but found that none were as significant for this workload.

So to summarize, two key parameters to focus on are innodb_buffer_pool_size and innodb_thread_concurrency. Appropriate settings for these parameters are likely to help you ensure optimal throughput from your MySQL server.
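
For reference, the parameters discussed above live in the [mysqld] section of my.cnf. The fragment below is only an illustration: the values are taken from the settings mentioned in this blog, not recommendations, and the right numbers depend on your own data set, memory, and workload.

```ini
[mysqld]
# Illustrative values only; size these for your own data set and workload.
# 2048M is one of the buffer pool sizes mentioned above for the
# roughly 2.5 Gbyte test database.
innodb_buffer_pool_size = 2048M

# 0 removes the thread limit (best for the read-only test above);
# 8 gave better read-write throughput in these tests.
innodb_thread_concurrency = 0

# Bypass the Linux file system cache. Only sensible once the buffer
# pool has been properly sized.
innodb_flush_method = O_DIRECT
```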

Allan

Thursday Jan 17, 2008

Given the timing of my recent blog, Are Proprietary Databases Doomed?, I've been asked if I knew in advance about Sun's recent MySQL acquisition. Not at all! I was just as surprised and delighted as most others in the industry when I saw the news.

In the blog I outlined counter strategies that proprietary database companies might use to respond to the rise of Open Source Databases (OSDBs). One strategy was acquisition and I noted that MySQL, being privately held, was probably the most vulnerable.

The good news is that MySQL is no longer vulnerable. Sun has an unparalleled commitment to open source. No other organization has contributed anything like the quantity and quality of code, with Solaris, Java, OpenOffice, GlassFish, and other software now freely available under open source licenses. Sun also has an established track record with OSDBs such as PostgreSQL, and JavaDB, Sun's distribution of Derby. The MySQL acquisition does not represent a change of direction for Sun, rather the extension of an existing strategy.

The real surprise is that one of Oracle, IBM, or Microsoft didn't get there first. Any of them could have swallowed MySQL without burping. I'm betting there are people today wondering why on earth they let Sun steal a march on them.

Whatever the reason, the final outcome is great news for anyone who appreciates the value of free open source software. MySQL couldn't be in safer hands.

Allan

Monday Dec 10, 2007

Times of change are upon the database market. The major established database companies are being challenged by open source upstarts like MySQL and PostgreSQL. For years, Open Source Databases (OSDBs) have been quietly increasing their penetration, but until recently they have lacked the capabilities to seriously threaten proprietary databases like Oracle, IBM's DB2, and Microsoft's SQL Server.

All that has changed. OSDBs now boast the necessary features and robustness to support commercial databases hundreds of Gigabytes in size. And a growing trickle of competitive benchmark results shows them performing more than acceptably well against their better-established cousins, while offering significant benefits in Total Cost of Ownership (TCO).

What does this mean for proprietary databases? Are they doomed? And more importantly, are there opportunities for end users to benefit from the rise of OSDBs? I will explore these topics in a multi-part blog:

  1. Feature Stagnation In The Traditional Database Market
  2. License Costs: the Soft Underbelly of Proprietary Databases
  3. The Looming Open Source Database Tsunami
  4. The Perfect Storm for Proprietary Databases
  5. Proprietary Counter Strategies
  6. Conclusion

The standard disclaimer applies as always: these are my opinions and not necessarily those of Sun or anyone else.

1. Feature Stagnation In The Traditional Database Market

When I joined Sun in the late 80s, choosing a database was still an important issue for end users. Large customers routinely issued tenders for databases as well as for computer systems, and, to help in the selection process, customers often staged performance bakeoffs between competing database vendors using home-grown benchmarks.

In the 90s, fierce competition led to a rapid explosion in features as well as dramatic improvements in performance. Sun in particular invested a lot of engineering effort in working with the major database companies to improve performance and scalability. At the same time, a variety of new technologies appeared, many claiming they would knock the relational database from its throne. Distributed databases, object relational, shared nothing, and in-memory database implementations all made cameo appearances. Relational databases simply absorbed their best features and continued to rule. Simple database features like triggers and stored procedures gave way to more sophisticated technologies like replication, online backup, and cluster support.

By the turn of the millennium, relational databases had already pretty much met the essential requirements of end users, and proprietary database companies were either pointing their vacuum cleaners toward other interesting money piles, or losing the plot entirely and sailing off the edge of the world. Today, database releases continue to tout new features, but they're frosting on the cake rather than essentials. No-one issues a tender for a database unless they have unusual requirements. No-one loses their job because they chose the wrong database. And it's been that way for years.

Put very simply, the database has arguably become a commodity.

2. License Costs: the Soft Underbelly of Proprietary Databases

Databases may have become commodities, but selling them is still very profitable. Historically, as CPU performance increased with faster clock speeds, users continued to pay the same price for database licenses on the newer, more powerful systems. But that all changed as the industry moved to multi-core CPUs. As we will see, the licensing policies adopted by proprietary database companies have ensured that license charges have increased steeply as a result of this revolution in processor chip technology.

Some years ago, chip manufacturers began turning to multi-core CPU designs as a way of continuing to drive improvements in CPU performance. As it becomes more difficult to increase the transistor density of CPU chips and increase clock speeds, multi-core chips offer a simple alternative by packing more than one core on a single chip running at a lower clock speed. At the same time, proprietary database vendors began basing license charges on the number of cores in a system.

A practical example is Sun's dual-core UltraSPARC-IV chip. It replaced the single-core UltraSPARC-III chip at the same clock speed. By delivering two cores instead of one, the UltraSPARC-IV offered twice the performance of its predecessor. A typical system was the popular UltraSPARC-IV-based Sun Fire V490 which included four dual-core chips (eight cores). This system replaced the Sun Fire V480 with four single-core UltraSPARC-III chips. The customer received twice the CPU performance for the same hardware price.

Not so for the Oracle database price, though. Based on per-core licensing, the new system was now treated as an 8-core system instead of a 4-core system as previously. And worse, users were forced to a significantly more expensive database edition if they deployed systems with more than four cores. So, compared to the V480, the V490 now attracted a much higher per-core charge, on top of the requirement to pay for twice as many core licenses.

The following table illustrates the extraordinary windfall received by Oracle:

System                            V480   V490
Chips                             4      4
Cores                             4      8
Relative Performance              1.0    2.0
Relative Hardware Price           1.0    1.0
Database Core Licenses Required   4      8
Relative Core License Price       1.0    2.7
Relative Database License Price   1.0    5.3

In the face of considerable pushback from the industry, Oracle responded with "discounts" for the second core in a chip. In the case of the V490, that meant the discounted database price was still four times the license charge for the V480!
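
The arithmetic behind these figures can be sketched in a few lines of Python. Two assumptions are mine and not stated above: the rounded 2.7 core price ratio is taken to be exactly 8/3 (which reproduces both the 2.7 and the 5.3 in the table), and the "discount" is modelled as a 0.75 per-core factor, the value implied by the "four times" figure.

```python
from fractions import Fraction

# Assumption: the table's rounded 2.7 is really 8/3; that ratio reproduces
# both the 2.7 and the 5.3 shown in the table.
CORE_PRICE_RATIO = Fraction(8, 3)


def relative_license_price(cores, core_price, core_factor=Fraction(1)):
    """Price relative to the V480 (4 cores at a relative core price of 1.0)."""
    v480_price = 4 * Fraction(1)
    return cores * core_factor * core_price / v480_price


undiscounted = relative_license_price(8, CORE_PRICE_RATIO)
# Assumption: a 0.75 per-core factor, inferred from "four times" in the text.
discounted = relative_license_price(8, CORE_PRICE_RATIO, Fraction(3, 4))

print(f"V490 undiscounted: {float(undiscounted):.1f}x")  # 5.3x
print(f"V490 discounted:   {float(discounted):.1f}x")    # 4.0x
```

Either way, twice the cores at the same hardware price translated into at least four times the database license charge.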

So while users continued to enjoy more powerful and more feature-rich hardware at the same or lower price, they were paying a lot more to use the same database software on the new hardware.

The situation prompted comments like those from Stephen Elliot, an enterprise systems analyst at IDC, who anticipated increased pressure on Oracle to be more flexible with pricing, and reported that "Oracle is becoming the lone force on the processor issue". Since that time, under pressure from Microsoft, which introduced per-chip database licensing back in Feb 2005, Oracle has continued to tweak its licensing model, but so far without making wholesale changes.

It should be noted that Oracle is not alone in following this path - similar anomalies apply to the pricing of IBM's DB2 database. Nor is Microsoft the only vendor to embrace a per-chip pricing strategy - as long ago as 2005, the Register reported that BEA was adopting the same per-socket licensing model as Microsoft and VMware, and noted that "the software maker's move puts it in prime fighting position against Oracle and IBM, which have been slow to adjust their pricing models for new chips from AMD, Intel and others." According to Ashlee Vance in July 2007, "Most software vendors have had the decency to settle on a per-socket basis for their pricing schemes, ignoring the number of cores per chip. Meanwhile, IBM and Oracle, the vendors with the most to lose, prefer to keep you in a state of pricing confusion."

Anecdotally, some companies are finding that database licenses have become their single biggest IT cost. The impact is probably greater on small and medium-sized companies that don't have the same ability to command the hefty discounts that larger companies typically enjoy from database vendors. A colleague related a story that illustrates the issue. His brother worked for a 200-person company that decided it was time to upgrade their database applications. They set out to deploy a well-known proprietary database until they discovered that the database license fee was going to exceed their entire current annual IT budget! They ended up deploying an open source database instead.

3. The Looming Open Source Database Tsunami

In August 2007, Tom Daly revealed the results of a SPECjAppServer2004 benchmark based on an entirely open source software stack, with PostgreSQL running on a Sun Fire T2000 server and the Glassfish Application Server on two Sun Fire X4200 servers. The announcement was revealing for two reasons:
  • It showed PostgreSQL capable of supporting more than 6,000 concurrent users on a commodity hardware platform
  • The benchmark result was within 10% of a published result from an HP/Oracle configuration costing more than three times as much. The major reason for the huge price difference was the cost of Oracle (at $110,000, compared to $0 for PostgreSQL).

    For the record, it should be noted that the SPECjAppServer2004 benchmark does not include a pricing metric, so these are not official prices. Nonetheless, since benchmark configurations clearly cost actual money, it seems reasonable to assess the prices involved. In this case, all hardware and software prices were drawn from publicly-available sources.

Open source databases still do not scale as well as proprietary databases, but they now perform well enough to manage a broad range of challenging applications. End users who previously saw OSDBs as primarily suitable for simple low-volume applications are now able to reasonably consider them for departmental database server deployments.

Do OSDBs have the features needed for serious deployment, though? Twelve months ago, Forrester Research released a report suggesting that 80 percent of applications typically only use 30 percent of the features found in commercial databases, and that the open source databases deliver those features today. While Forrester noted that OSDBs still lag for mission critical applications, those holes are likely to be plugged as bigger players announce 24x7 technical support and service (as has already happened, for example, with PostgreSQL).

A recent survey of Oracle users showed 20% having open source databases larger than 50 Gbytes and two thirds citing cost as the driver to adoption of open source. Open source database adoption is still relatively small. Does that mean OSDBs should be dismissed? Not according to a Gartner analyst, quoted last year as saying "We think it is a big deal. Granted, in the DBS market right now, they are very small players. Remember about 10 years ago, Linux in the market was a very small player? Not so much, anymore."

The comparison may be apt. With the combination of essential features, improved performance, robust support, and compelling price, OSDBs today bear a striking resemblance to Linux a few years ago. Many believe that the wave looming on the horizon is a tsunami. Time alone will tell.

4. The Perfect Storm for Proprietary Databases

Underneath major end-user applications like ERP and data warehouse software, every major hardware and software component is now subject to commodity pricing. As we have seen, each year hardware prices continue to decline while processing power increases. At the same time, leading operating systems like Sun's Solaris and Linux are now open source and can be deployed for free. The same is true of most other components of the software stack, including virtualization software, databases, application servers, web servers, and other middleware. Even Sun's UltraSPARC T1 and T2 chips and RTL have been open sourced, allowing community members to build on proven hardware at a much lower cost.

Why, then, is proprietary database software becoming more expensive while everything else reduces in price? End users normally expect to benefit from the cost savings resulting from improvements in technology. I am writing this blog, for example, on an affordable computer that would easily outperform expensive commercial systems from just 10 years ago.

It seems difficult to resist the conclusion that proprietary database companies have managed to redirect a good chunk of these savings away from end users and into their own coffers. Successful as this strategy has been, though, it could ultimately backfire. The more expensive proprietary databases become, the more attractive lower cost alternatives appear.

A number of forces are currently at work in the market:

  • The momentum around open source software has continued to build, and many open source products, while not as capable as proprietary alternatives, have become "good enough" to replace them. This is also true of open source databases.
  • Large technology suppliers are beginning to bundle OSDBs, with the result that customers are able to take out support contracts with established companies as well as startups. Sun, for example, ships and supports PostgreSQL.
  • Benchmarks are beginning to feature OSDBs. Thus OSDBs are starting down a path that has been trod by proprietary databases over many years. Benchmarks can be expected to highlight the capabilities of OSDBs, accelerate the process of OSDB performance improvement, and, increasingly, expose the price difference between OSDBs and proprietary databases. And as the scalability of OSDBs increases, benchmarks will be published on larger systems, opening up an even wider gap in database pricing.
  • The sweet spot in the hardware market is a two- to four-chip server. With the advent of quad-core chips from Intel and AMD, and the 8-core UltraSPARC T1 and T2 chips released by Sun, such systems have become powerful enough to carry out processing that required much larger and more expensive systems in the past. The pricing chasm between low cost hardware and high cost proprietary databases continues to widen.

    For example, the UltraSPARC T1-based servers, the Sun Fire T1000 and T2000, shipped with 8 cores on a single chip. The recently-released single-chip, 8-core UltraSPARC T2 servers, which deliver twice the performance at roughly the same cost, now attract a DB2 license charge that has increased 66% compared to the T1 platforms. So IBM has taken the opportunity to significantly hike the price of the database software on that platform, even though the number of chips and number of cores has not changed.

    At the same time, 8-core UltraSPARC T1 servers attracted a 2-core license charge from Oracle (a very reasonable pricing decision on Oracle's part - see this table which is referenced from here). At the time of writing, Oracle has not announced a final decision on pricing for the T2 platform, but it is unlikely that Oracle will be able to resist the temptation to emulate or even outdo the cynicism of IBM. Why such a pessimistic expectation? Because the same table referred to above announces that the 1.4GHz UltraSPARC version of the older T1 servers will be subject to a 0.5 multiplier instead of the 0.25 multiplier that applies to the 1.0 GHz and 1.2 GHz versions of the same platform. So a simple 17% clock speed increase in the hardware, with no other changes at all, prompted Oracle to double the license charge!

    To be fair to Oracle, since the UltraSPARC T1 and T2 platforms are single chip systems, customers can purchase Standard Edition and Standard Edition One licenses for them at greatly reduced prices. But these editions do not offer the full Oracle feature set, and in particular they do not offer the parallel processing capabilities essential for efficient processing on the 32-way T1 and 64-way T2 platforms; if customers want parallel capabilities they must purchase the vastly more expensive Enterprise Edition with its core-based licensing and inexplicable and inconsistent "discounts".

  • Web 2.0 is gathering momentum, and the new breed of companies leading the charge are largely ignoring proprietary databases for their deployments. Probably much of the scepticism relates to the cost implications.
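
To make the licensing arithmetic in the UltraSPARC T1 example above concrete, here is a minimal sketch. The function names are made up for illustration, and the calculation uses only the figures quoted in the text (8 cores, multipliers of 0.25 and 0.5, clock speeds of 1.2 GHz and 1.4 GHz). It does not model any vendor's rounding rules or list prices.

```python
# Illustrative only: core-multiplier license counts for an 8-core UltraSPARC T1,
# using the multipliers and clock speeds quoted in the text.

def licenses_required(cores: int, multiplier: float) -> float:
    """Licenses charged for a chip with `cores` cores under a per-core multiplier."""
    return cores * multiplier


def percent_increase(old: float, new: float) -> float:
    """Relative increase from `old` to `new`, as a percentage."""
    return (new - old) / old * 100.0


slower_t1 = licenses_required(8, 0.25)  # 1.0 GHz and 1.2 GHz versions
faster_t1 = licenses_required(8, 0.5)   # 1.4 GHz version

print(f"1.2 GHz T1: {slower_t1:g} licenses")
print(f"1.4 GHz T1: {faster_t1:g} licenses")
print(f"Clock speed increase: {percent_increase(1.2, 1.4):.0f}%")
print(f"License increase:     {percent_increase(slower_t1, faster_t1):.0f}%")
```

The same 8-core chip goes from a 2-license charge to a 4-license charge for a clock speed bump of roughly 17%.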

In the past there has been no real alternative to proprietary databases. That has changed, at least for new applications that have no legacy database dependencies. The relative proportion of hardware and software purchase prices has been changing, too, and for some years software has gradually been consuming ever-larger slices of the pie. In recent times the pace has accelerated, though, as commodity-priced hardware has become much more powerful and database prices have increased.

For the most part, customers still have not entirely woken up to these trends. But if they ever do, we may see a significant shift in the database market. Is it possible that the perfect storm for proprietary databases is brewing?

5. Proprietary Counter Strategies

Proprietary database companies are not without ways of responding to the challenge from OSDBs. Here are a few possibilities.
  • Strategy 1: Resistance is Futile - You Will Be Assimilated. Picking off your competitors can get a lot easier when they are open source companies, because most of them struggle to address a major discrepancy between their penetration and their annual revenue. Putting it another way, they have plenty of users but very little revenue to show for it. Hey, their product is free, after all! So far, not many people have figured out how to become absurdly rich by giving away software.

    Buying a competitor gives you access to their Intellectual Property (IP) and their customer base. Sometimes it simply eliminates a competitor from the marketplace. Either way, playing the Borg can be an effective way of reshaping the market in your favor.

    Note that Oracle has already made some raids across the border, having acquired InnoBase, maker of InnoDB, MySQL's most popular transactional engine, and Sleepycat Software, maker of Berkeley DB, another transactional engine used with MySQL. In response, MySQL has scrambled to introduce Falcon, a transactional database engine of its own.

    Any of the major proprietary database companies could reasonably play the role of the Borg in this scenario, though, since all of them have very deep pockets. MySQL is probably the most vulnerable to takeover, since it's privately held. PostgreSQL may be more difficult to silence, since it is developed by an active community rather than a single company. But in either case, even if you pick off the company or the key community contributors, you haven't removed the IP from the market because the database is open source.

  • Strategy 2: Bait And Switch. Offer a cut-down version of your own proprietary database for free, primarily targeting developers and companies doing pilot implementations. The idea is to make it easy for people to develop on your platform, then charge like wounded bulls when they deploy in earnest.

    All of the major proprietary databases have free cut-down versions. Oracle Lite supports databases limited to 4 Gbytes of disk and 64 concurrent connections. Microsoft's SQL Server Express Edition supports one CPU only, 1 Gbyte of memory, and 4 Gbytes of disk. IBM's DB2 Express-C supports 2 processor cores, 2 Gbytes of memory, and unlimited disk.

    These database editions are free, but they are not open source. The pricing policy could change overnight. And, as outlined above, each has restrictions that limit their usefulness for deployment.

  • Strategy 3: Revenue Pull-Through. Include the database as a bundle with other pieces of your software stack. Focus the customer's attention on buying something else, and chances are they won't notice or won't care that they've bought your database as well.

  • Strategy 4: Business As Usual. If you wait long enough, perhaps open source databases will stumble or be acquired by someone. Maybe their fall will be as meteoric as their rise. Or maybe the Borg will show up and assimilate them before they build too much more momentum. Either way, it will be one less competitor to worry about.

    If you think a wait-and-see strategy sounds implausible, history shows that when they can't make up their minds how to respond, a lot of companies (and countries for that matter) do little more than sit on their hands. Australia's first ice skating gold medalist, Steven Bradbury, demonstrated how to win this way in spectacular fashion at the 2002 Winter Olympics. (Actually the last comparison is not entirely fair to Steven Bradbury - although he did win because the other contestants all stumbled, Bradbury's presence in the final was an achievement in itself that clearly demonstrated his ability and commitment.)

  • Strategy 5: Reduce Prices. Much of the imperative for a migration to OSDBs will be removed if proprietary database companies drop their prices significantly. The excellence of proprietary databases is certainly not under question - I can personally attest to the performance, scalability, rich feature set, and robustness of both Oracle and DB2, for example. The search for alternative databases is largely driven by the need for pricing relief. OPEC discovered in the 1970s that inflated oil prices led to both energy conservation and a search for alternative energy sources. When OPEC soon reduced prices again, much of the impetus behind alternative energy disappeared in the West (sadly).

    This strategy is only feasible if proprietary database companies derive most of their revenue from other sources. Oracle is probably the most vulnerable here.

My vote for the Strategy Most Likely To Succeed is a tie between Revenue Pull-Through and Reduce Prices. Oracle is arguably becoming the most successful proponent of the pull-through strategy. Oracle wants to supply you with a full software stack, including an OS, virtualization software, a broad range of middleware, a database, and end user applications. The largest component of Oracle's revenue currently still comes from database licenses, but the company is working hard to reduce that dependency. Until that happens, reducing prices across the board will be challenging for Oracle. If Oracle succeeds with a pull-through strategy, it doesn't mean that OSDBs will fail, of course. It simply means that Oracle is less likely to sustain major damage from their success.

Price reductions, if they are large enough and sustained enough, are likely to do more to slow down OSDB penetration. But I suspect that proprietary companies, if they are to do it at all, will need to reduce prices soon; if enough momentum builds around OSDBs we will reach a tipping point where it won't matter any more (witness the rise of Linux).

6. Conclusion

Are proprietary databases doomed, then? Not at all. Even if proprietary database companies pull no surprises, they won't fade away anytime soon. Too much legacy application software currently depends on them. Until ISV applications - like SAP's R/3, for example - support MySQL and PostgreSQL, end users will be wedded to proprietary databases. (Note, though, that SAP does support its own free and open source MaxDB database with R/3.) As Oracle builds its software portfolio, too, more applications will ship with the Oracle database bundled. And for the foreseeable future, proprietary databases will be the platform of choice for the largest mission-critical database deployments.

Make no mistake, though, open source databases are coming. For established companies it's more likely to be an evolution than a revolution. We will probably see a gradual OSDB surround, where new applications and deployments are increasingly based on OSDBs, driven by the cost savings. In emerging markets, though, it's looking more like a revolution. Last year I met with a small number of high-adrenaline companies in India, a market undergoing very rapid growth. They were openly dismissive of proprietary databases. One company had a small installation that was described as a "legacy" application due for replacement by an open source database. This is the scenario playing out today at high fliers like Google, Facebook, YouTube, and Flickr.

How can you take advantage of the rise of OSDBs? Here are some suggestions:

  • If you're considering a new database deployment, examine the possible cost savings of an OSDB.
  • If you're an established proprietary database user, don't simply throw out your database. Take the time to establish the feasibility and quantify the benefits of an OSDB solution before making a change.
  • If you're unhappy about the prices you're paying for database software, let your supplier know - the more senior the contact, the better. Suppliers do listen to their customers! As a side note, if you're a proprietary database customer looking at OSDBs, you don't need to make a big secret of it. I know of situations where proprietary database suppliers offered deep discounts to keep a customer away from OSDBs!

Perhaps the last word should go to The Economist. The following observations, reported in January 2002, may well prove prescient: "if software firms continue to think they can cash in on every new increase in computer performance, they will only encourage more and more customers to defect. And today, unlike a decade ago, open-source software has become just too good to be ignored."

Allan

What do you think? Feel free to let me know at Allan.Packer@Sun.COM.

Thursday Oct 11, 2007

Sun engineers give the inside scoop on the new UltraSPARC T2 systems

[ Update Jan 2008: Sun SPARC Enterprise T5120 and T5220 servers were awarded Product of the Year 2007. ]

Sun launched the Chip-Level MultiThreading (CMT) era back in December 2005 with the release of the highly successful UltraSPARC T1 (Niagara) chip, featured in the Sun Fire T2000 and T1000 systems. With 8 cores, each with 4 hardware strands (or threads), these systems presented 32 CPUs and delivered an unprecedented amount of processing power in compact, eco-friendly packaging. The systems were referred to as CoolThreads servers because of their low power and cooling requirements.

Today Sun introduces the second generation of Niagara systems: the Sun SPARC Enterprise T5120 and T5220 servers and the Sun Blade T6320. With 8 hardware strands in each of 8 cores plus networking, PCI, and cryptographic capabilities, all packed into a single chip, these new 64-CPU systems raise the bar even higher.

The new systems can probably be best described by some of the engineers who have developed them, tested them, and pushed them to their limits. Their blogs will be cross-referenced here, so if you're interested to learn more, come back from time to time. New blogs should appear in the next 24 hours, and more over the next few weeks.

Here's what the engineers have to say.

  • UltraSPARC T2 Server Technology. Dwayne Lee gives us a quick overview of the new systems. Denis Sheahan blogs about UltraSPARC T2 floating point performance, offers a detailed T5120 and T5220 system overview, and shares insights into lessons learned from the UltraSPARC T1. Josh Simons offers us a glimpse under the covers. Stephan Hoerold gives us an illustration of the UltraSPARC T2 chip. Paul Sandhu gives us some insight into the MMU and shared contexts. Tim Bray blogs about the interesting challenges posed by a many-core future. Darryl Gove talks about T2 threads and cores. Tim Cook compares the UltraSPARC T2 to other recent SPARC processors. Phil Harman tests memory throughput on an UltraSPARC T2 system. Ariel Hendel, musing on CMT and evolution, evidences a philosophical bent.
  • Performance. The inimitable bmseer gives us a bunch of good news about benchmark performance on the new systems - no shortage of world records, apparently! Peter Yakutis offers detailed PCI-E I/O performance data. Ganesh Ramamurthy muses on the implications of UltraSPARC T2 servers from the perspective of a senior performance engineering director.
  • System Management. Find out about Integrated Lights Out Management (ILOM) from Tushar Katarki's blog.
  • Networking. Alan Chiu gives us some insights into 10 Gigabit Ethernet performance and tuning on the UltraSPARC T2 systems.
  • RAS. Richard Elling carries out a performability analysis of the T5120 and T5220 servers.
  • Clusters. Ashutosh Tripathi discusses Solaris Cluster support in LDoms I/O domains.
  • Virtualization. Learn about Logical Domains (LDoms) and the release of LDoms 1.0.1 from Honglin Su. Eric Sharakan has some more to say about LDoms and the UltraSPARC T2. Ashley Saulsbury presents a flash demo of 64 Logical Domains booting on an UltraSPARC T2 system. Find out why Sun xVM and Niagara 2 are the ultimate virtualization combo from Marc Hamilton.
  • Security Performance. Ning Sun discusses Cryptography Acceleration on T2 systems. Glenn Brunette offers us a Security Geek's point of view on the T5x20 systems. Lawrence Spracklen has several posts on UltraSPARC T2 cryptographic acceleration. Martin Mueller proposes an UltraSPARC T2 system deployment designed to deliver a high performance, high security environment.
  • Application Performance. Dileep Kumar talks about WebSphere Application Server performance with UltraSPARC T2 systems. Tim Bray shares some hands-on experiences testing a T5120.
  • Java Performance. Dave Dagastine offers us insights into the HotSpot JVM on the T2 and Java performance on the new T2 servers.
  • Web Applications. Murthy Chintalapati talks about web server performance. Constantin Gonzalez explores the implications of UltraSPARC T2 for Web 2.0 workloads. Shanti Subramanyam tells us that Cool Stack applications (including the AMP stack packages) are pre-loaded on all UltraSPARC T2-based servers.
  • Open Source Community. Barton George explores the implications of UltraSPARC T2 servers for the Ubuntu and Open Source community.
  • Open Source Databases. Luojia Chen discusses MySQL tuning for Niagara servers.
  • Customer Use Cases. Stephan Hoerold gives us some insight into experiences of Early Access customers. Stephan also shares what happened when STRATO put a T5120 to the test. It seems like STRATO also did some experimentation with the system.
  • Sizing. I've posted an entry on Sizing UltraSPARC T2 Servers.
  • Solaris features. Scott Davenport blogs on Predictive Self-Healing on the T5120. Steve Sistare gives us a lot of insight into features in Solaris to optimize the UltraSPARC T2 platforms. Walter Bays salutes the folks who reliably deliver consistent interfaces on the new systems.
  • HPC & Compilers. Darryl Gove talks about compiler flags for T2 systems. Josh Simons talks about the relevance of the new servers to HPC applications. Ruud van der Pas measures T2 server performance with a suite of single-threaded technical-scientific applications. In another blog entry, Darryl Gove introduces us to performance counters on the T1 and T2.
  • Tools. Darryl Gove points to the location of free pre-installed developer tools on UltraSPARC T2 systems. Nicolai Kosche describes the hardware features added to UltraSPARC T2 to improve the DProfile Architecture in Sun Studio 12 Performance Analyzer. Ravindra Talashikar brings us Corestat for UltraSPARC T2, a tool that measures core utilization to help users better understand processor utilization on UltraSPARC T2 systems.

Finally

Go check out the new UltraSPARC T2 systems, and save energy and rack space in the process.

Enjoy!

Wednesday Oct 10, 2007

The newly released Sun SPARC Enterprise T5120 and T5220 servers and the Sun Blade T6320 present an interesting challenge to the end user: how do you deploy a system that looks like an entry level server (it's either one or two rack units in size and comes with a single processor chip), yet has more processing power than the fastest 64-CPU Enterprise 10000 (Starfire) shipped by Sun? Oh, and in case you're wondering, the Starfire comparison is good for delivered application throughput, too, not just theoretical speeds and feeds.

The first issue is to figure out how you're going to use up all the CPU. There are a number of possibilities, including:

  1. You deploy a single application that consumes the entire system. This single application might have multiple threads, such as the Java VM, or multiple processes, like Oracle. When a T2000 is dedicated to a single application, such as Oracle, for example, best practice is to treat it like a standard 12-16 CPU system and tune accordingly. So a good starting point is probably to tune a T5120 or T5220 as a 24-32 CPU system. You will want to monitor the proportion of idle CPU with vmstat or mpstat (or corestat, if you'd like more information about how busy the cores are). If there's a lot of idle CPU, then you might need to tune for more CPUs.

    A single application wasn't the most common way of consuming 32-thread UltraSPARC T1 servers like the Sun Fire T2000, though. And it's even less likely to be typical on the 64-thread T2 servers, which are a little more than twice as powerful as T1 servers.

    Why isn't it typical to consume a T1-based system with a single application? The most common reason is that a single application often doesn't require that much CPU. Sometimes, too, a single application instance doesn't scale well enough to consume all 32 CPUs. We've particularly seen this with open source applications with mostly 1- or 2-CPU deployments. Configuring multiple application instances can sometimes overcome this limitation.

    It's worth noting that application developers will increasingly find themselves needing to solve this issue in the future. With all chip suppliers moving to quad-core implementations, it will soon be necessary for applications to perform well with 4 to 8 CPUs just to consume the CPU resource of a 1- or 2-chip system. Early adopters of T1000 and T2000 systems are in good shape, because it's likely they've already made this transition.

  2. You consume the entire system by deploying multiple applications. These applications can, in turn, be multi-threaded, multi-process, or multi-user. Virtualization can be an attractive way of managing multiple applications, and there are two available technologies on T2-based servers: Solaris Zones and Logical Domains (LDoms). They are complementary technologies, too, so you can use either, or both together. Domains will already be familiar to many - Sun users have been carving up their systems into multiple domains since the days of the Starfire. The LDom implementation is different, but the concepts are very similar. Check out this link for pointers to blogs on LDoms.
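
For the idle-CPU monitoring mentioned under the first option, a rough sketch of summarizing mpstat output might look like the following. The sample text is invented for illustration (four CPUs only, to keep it short), and the helper name `average_idle` is hypothetical. The sketch assumes the usual Solaris mpstat layout, with a header row starting with CPU and an idl column, and it finds that column from the header rather than hard-coding its position.

```python
# Sketch: average the "idl" column from one mpstat report.
# SAMPLE is made-up data in the usual Solaris mpstat column layout.

SAMPLE = """\
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    3   0   12   210  105  180    4    9    2    0   450   40  10   0  50
  1    1   0    8    30    2  150    3    8    1    0   400   30  10   0  60
  2    0   0    5    25    1   90    1    5    0    0   200   15   5   0  80
  3    0   0    4    20    1   60    1    3    0    0   120    5   5   0  90
"""


def average_idle(mpstat_text: str) -> float:
    """Return the mean idle percentage across all CPU rows in the report."""
    idle_values = []
    idl_col = None
    for line in mpstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "CPU":
            # Header row: remember where the idle column sits.
            idl_col = fields.index("idl")
            continue
        if idl_col is not None and len(fields) > idl_col:
            idle_values.append(float(fields[idl_col]))
    if not idle_values:
        raise ValueError("no mpstat data rows found")
    return sum(idle_values) / len(idle_values)


print(f"Average idle across CPUs: {average_idle(SAMPLE):.0f}%")
```

In practice you would feed this a report captured under a representative load rather than a hard-coded sample, and look at several intervals instead of one.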

Caveats?

In my blog on Sizing T1 Servers back in 2005, I made a number of suggestions about sizing and consolidation that also apply to the new systems. I also noted two caveats related to performance. The first related to floating-point intensive workloads. This caveat no longer applies on T2 servers - the floating point units included in each of the 8 cores deliver excellent floating point performance. The second caveat related to single-thread performance and the importance of understanding whether an application would run well in a multi-threaded environment. Is there, for example, a significant dependence on long-running single-threaded batch jobs? This question must still be asked of T2 servers, although the single-threaded performance of the T2 is improved relative to the T1. The Cooltst tool was created to help identify single-threaded behavior with applications running on existing Solaris (SPARC and x64/x86) and Linux systems (x64/x86). A new version of Cooltst will soon be available that supports T2 systems as well. For optimal throughput with T2-based servers, single-threaded applications should either be broken up, deployed as multiple application instances, or mixed with other applications.

The bottom line is that T5x20 servers will soon be replacing much larger systems, and delivering significant reductions in energy, cooling, and storage requirements.

Sunday May 07, 2006

CoolThreads Consolidation: The Easy Way

Solaris 10 and a CoolThreads server make a potent combination. Along with the raw horsepower, the low wattage, and the miserly rack requirements of the CoolThreads server, you get the robust, feature-rich, open source Solaris 10 operating system.

So far so good, but if you're new to Solaris 10 there's a lot to learn. Solaris Containers and Resource Management make a big difference for consolidation, but they take some getting used to. And then you need to figure out the implications of having four threads per core, for example, when configuring the Sun Fire T1000 or T2000. Is there a way of easing the transition for sysadmins who already have too much to think about?

Enter the Consolidation Tool for Sun Fire Servers V1.0, Sun Fire T1000 and T2000 Edition! This GUI tool is designed to simplify the task of consolidating applications onto CoolThreads servers. It also provides a friendly introduction to Solaris Containers, including Zones, resource pools, psets, and the FSS (Fair Share Scheduler) scheduling class.

Consolidation Tool Introduction

The Consolidation Tool is free (it's open source under the GPL), unsupported, lightweight, and focused on Solaris 10 and the Sun Fire T1000/T2000. It's ideal for the systems administrator who is considering migrating applications from multiple Xeon boxes running Linux to a T2000 running Solaris 10. The tool can also be used to offer the technically-minded sysadmin a simple introduction to the command line syntax needed to build zones/pools/psets (via the commands script it creates). Note that the Consolidation Tool expects to work with a new Solaris installation - it does not attempt to manage systems that are already using containers.

Consolidation Tool Overview

The tool provides a simple, easy-to-use interface with context sensitive help. The user can choose between a Basic and an Expert mode, with the latter providing more control over the final configuration at the cost of greater complexity. Intelligent defaults are provided in both modes. The tool eases the user into defining and creating Solaris Containers without assuming any previous knowledge of that technology. It will deploy applications into processor sets where appropriate, and allocate "CPUs" (i.e. hardware threads) in a way that ensures all of a core's threads end up in the same processor set. The tool asks a series of user-friendly questions to determine whether to use full-root Zones, sparse Zones, or no Zones at all. The tool also optionally installs versions of key public domain software into the newly-established Zones on the target CoolThreads system.
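
To illustrate what "all of a core's threads end up in the same processor set" means, here is a small sketch. It is not the tool's actual code. The function name and the application names are hypothetical, and it assumes CPU IDs are numbered so that core c owns IDs 4c through 4c+3, which you should verify on your own system.

```python
# Sketch: hand out whole cores to processor sets so that no core's
# hardware threads are split between two sets.

THREADS_PER_CORE = 4  # UltraSPARC T1: four hardware threads per core


def allocate_psets(cores_wanted: dict[str, int], total_cores: int = 8) -> dict[str, list[int]]:
    """Map each pset name to the CPU IDs of the whole cores assigned to it.

    Assumption: core c owns CPU IDs c*4 .. c*4+3.
    """
    if sum(cores_wanted.values()) > total_cores:
        raise ValueError("requested more cores than the system has")
    psets: dict[str, list[int]] = {}
    next_core = 0
    for name, ncores in cores_wanted.items():
        cpus: list[int] = []
        for core in range(next_core, next_core + ncores):
            first = core * THREADS_PER_CORE
            cpus.extend(range(first, first + THREADS_PER_CORE))
        psets[name] = cpus
        next_core += ncores
    return psets


# Hypothetical consolidation: 2 cores for a web tier, 3 for a database.
for name, cpus in allocate_psets({"web": 2, "db": 3}).items():
    print(name, cpus)
```

Any cores not requested (three in this example) stay out of the named sets, which is the sort of leftover capacity you would leave in the default pool.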

The tool prepares a report summarizing the planned deployment, a commands script that is used to create the Zones, pools, and psets and install any specified public domain applications, and a file that stores the configuration data. This approach means that it isn't necessary to run the tool on the target CoolThreads system. Instead you can configure the consolidation environment in advance on a client of your choice. The final step is to run the commands script on the target Sun Fire T1000/T2000 system. Note that if you have elected within the tool to install public domain applications, you will need to put the full distribution onto the target system so that the script can find the public domain applications when you run it.

The tool can be run on any of the following client operating systems:

  • Solaris on SPARC
  • Solaris on x64/x86
  • Linux on Intel/AMD
  • MacOS X on PowerPC

Where Can I Get It?

You can find the tool on BigAdmin and also under Cool Tools at OpenSPARC.net. Both locations will let you download a presentation introducing the tool, and point you to the tool download location at the Sun Download Center. Download options include:
  • A 20MB tar.gz file, which provides the tool plus the necessary libraries for all clients, but none of the public domain packages
  • A 130MB tar.gz file with the full distribution, which includes the tool, its dependent libraries, and several public domain applications.
While you're at it, take the time to check out the OpenSPARC.net Cool Tools page, which also features other useful tools.

Feedback and Discussion

If you'd like to offer feedback on the Consolidation Tool, you can do so at consol-tool-feedback@sun.com. This is an auto-responder alias, so don't expect a reply (other than confirmation that your email has been received). If you would like to discuss the tool with other users, check out the Cool Tools Forum.

Thanks, Allan

Tuesday Dec 20, 2005

A couple of Sun Blueprints have recently been released that offer some helpful insights into application consolidation on CoolThreads Servers.

The first, Consolidating the Sun Store onto Sun Fire T2000 Servers, documents the process of migrating the online Sun Store from a Sun Enterprise 10000 with thirty-eight 400MHz CPUs onto a pair of Sun Fire T2000 servers (they used two for high availability). The resulting environment took advantage of Solaris 10 Containers. They saw an overall reduction of approximately 90 percent in both input power and heat output! Pretty cool (literally)! The space savings were even more significant.

The second blueprint, Web Consolidation on the Sun Fire T1000 using Solaris Containers, gives a detailed hands-on description of the process of building a web-tier consolidation platform on a Sun Fire T1000 with Containers.

If you're planning a CoolThreads consolidation, go check them out. I think you'll find both papers useful.

Tuesday Dec 06, 2005

The Sun Fire T1000/T2000 (aka "CoolThreads") server offers a lot of horsepower in a single chip: up to eight cores running at either 1000MHz or 1200MHz, each core with four hardware threads. But how should this SMP-in-a-chip be sized appropriately for real-world applications?

The published benchmarks show that the application throughput delivered by a single T2000 server is equivalent to the throughput delivered by multiple Xeon systems. And this isn't just marketing hype, either; the UltraSPARC T1 processor is a genuine breakthrough technology. But what are the practical considerations involved in replacing several Xeon servers with a single T1000 or T2000?

Preparing for CoolThreads

For starters, it's important to understand the design point of the UltraSPARC T1. If you need blazing single-thread performance, this isn't the system for you - the chip simply wasn't designed that way. And if you think that's bad, then I'm sorry to say your future is looking a little bleak. Every processor designer in the industry is moving to multiple cores, and one implication is that single thread performance will no longer be getting all the attention. Performance will be served up in smaller packages.

The UltraSPARC T1 is a chip oriented for throughput computing. With the multi-threading capabilities of this chip Sun has done two things. The first is to push the envelope much further than anyone else anticipated. Not everyone will applaud this strategy, of course. (And just for fun, note the reactions carefully, and deduct points from competitors who bad-mouth Sun's strategy now, and later end up copying it!) More importantly, though, Sun has issued notice about the way applications need to be designed. In a world that increasingly delivers CPU power through multiple cores and threads, single-threaded applications don't make a whole lot of sense any more. The sooner you multi-thread your applications, the better off you'll be, regardless of your hardware vendor of choice.

That doesn't mean you'll be forced to rearchitect your applications before you can use the T1000/T2000, though. You can proceed provided your planned deployment has one or more of the following characteristics, any of which will allow it to take advantage of UltraSPARC T1's multiple cores and threads:

  • Multiple applications
  • Multiple user processes
  • Multi-threaded applications
  • Multi-process applications

In general, commercial software that runs well on SMP (Symmetric Multi-Processor) systems will run well on T1000/T2000 (because one or more of the above already apply). Note that the Java Virtual Machine (JVM) is already multi-threaded.

When to Walk Away

The other major consideration is floating point performance. The UltraSPARC T1 is not designed for floating-point intensive applications. This isn't as disastrous as it might sound. It turns out that a vast range of commercial applications, from ERP software like SAP through Java application servers, do very little floating point and run just fine on the T1000/T2000. If you're in any doubt about how to figure out the proportion of floating point instructions in your application, help is on the way. More on this in a future blog.

Sizing

If you made it past the single-threaded and floating point questions, you're ready for some serious sizing. The first step is to see how busy your current servers are. Suppose you plan to consolidate applications from six Xeon servers onto a Sun Fire T2000 server. If the CPUs on each system are typically 30% busy and peak at 50%, then you will be migrating a peak load equivalent to three fully-utilized servers.
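
The arithmetic behind that example is simple enough to sketch. The function name is made up, and the calculation assumes all six servers have the same capacity and hit their peaks at the same time, which is the pessimistic case.

```python
# Sketch: convert per-server CPU utilization into "fully-utilized server" equivalents.

def equivalent_busy_servers(n_servers: int, utilization: float) -> float:
    """Load expressed as a number of 100%-busy servers (utilization is 0.0-1.0)."""
    return n_servers * utilization


typical = equivalent_busy_servers(6, 0.30)
peak = equivalent_busy_servers(6, 0.50)

print(f"Typical load: {typical:.1f} fully-utilized servers")
print(f"Peak load:    {peak:.1f} fully-utilized servers")
```

So the target system needs to absorb roughly 1.8 servers' worth of work most of the time, and 3 servers' worth at peak.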

By far the best way to test the relative performance of the T1000/T2000 and your current servers is to run your own application on both. If that isn't possible, a crude starting point might be to compare published performance on a real-world workload. Check out the published T1000/T2000 benchmarks for further information. If you can't directly compare your intended applications, try to find something as close as possible (e.g. the CPU, network, and storage I/O resource usage should look at least vaguely similar to your actual workload). Benchmarks that use real ISV application code (e.g. SAP and Oracle Applications) are going to be more relevant to a throughput platform like the T1000/T2000 than artificial benchmarks designed to measure the performance of a traditional CPU. One important warning: don't try to draw final conclusions if you're not comparing the same application on both platforms! Extrapolations don't work well when the technologies are radically different (and the UltraSPARC T1 is simply different to anything else out there).

The next step is to figure out how to deploy the applications. You have four, six, or eight cores at your disposal (depending on the T1000/T2000 platform you've chosen). Should you simply let Solaris worry about the scheduling? Or should you figure out your resource management priorities in advance and carve up the available resources before deploying the applications? You might want to refer to my blog about Consolidating Applications onto a CoolThreads Server for more information on this topic.

Once you're ready to deploy, make sure you do some serious load testing before going live. Don't make the mistake of rushing into production without first finding out how well your application scales on the T1000/T2000 platform. I don't know about you, but I hate nasty surprises! And if you do encounter scaling issues, don't forget that Solaris 10 Dtrace is your friend. And check out DProfile, too.

Once you get your head around this technology, you're going to enjoy it! And that's even without mentioning the power, cooling, and rack space savings...

Allan

PS. If you're looking for more CoolThreads info direct from Sun engineers, Richard McDougall has put together an excellent overview of other relevant blogs.

The ground-breaking Sun Fire T1000/T2000 servers, based on the UltraSPARC T1 (Niagara) chip, can provide an excellent platform for application and workload consolidation. An obvious target, for example, might be to consolidate several 1- or 2-CPU Xeon systems onto a single T1000/T2000 (aka "CoolThreads") server, with immediate savings in power consumption and rack space as well as in system administration costs. A wide range of software can be immediately run on the T1000/T2000, thanks to the large portfolio of both proprietary and public domain applications available for SPARC/Solaris.

Solaris 10 offers a number of features that are especially valuable in consolidation. Virtual servers can be created thanks to the Container technology bundled with Solaris 10. There are several key components to containers. The first is zones. A zone is a virtual Solaris instance, and one or more can be created to provide secure application environments with no access to or from other zones running on the same system. Zones do not automatically imply resource management; that's where resource pools come in. A pool can be created with its own dedicated CPU resources and scheduling class, and one or more zones can be optionally bound to the pool to take advantage of those resources. Psets (processor sets) can be created and associated with each pool where it is important to dedicate CPUs to a pool.

So the bottom line is that an arbitrary number of containers can be created on a CoolThreads system, each with its own secure environment and (optionally) its own dedicated CPU resource. The end result is a virtual server into which applications can be deployed. Each T1000/T2000 server employs a single chip that comes with four, six, or eight cores, and four hardware threads per core. For an 8-core system, Solaris sees 32 "CPUs" or virtual processors (eight cores multiplied by four hardware threads). For the purposes of consolidation, we recommend creating psets with multiples of four CPUs, each group of four CPUs corresponding to a single core. (If you only ever create psets with four CPUs, or multiples of four, Solaris will always give you four contiguous CPUs that map directly to the four threads in a single core.) If a zone is not likely to require the resources of a full core, other zones can be bound to the same pool, thereby sharing the resources of the pset associated with that pool.
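
To make the CPU numbering concrete, here is a small Python sketch of how the virtual processors on an 8-core system might be grouped into core-sized psets. It assumes, as described above, that CPU IDs are contiguous and that each run of four IDs belongs to one core. The function names are hypothetical and nothing here calls a Solaris API, so check the real layout on your own system (for example with psrinfo) before relying on it.

```python
THREADS_PER_CORE = 4  # hardware threads per UltraSPARC T1 core


def virtual_cpus(cores):
    """Number of "CPUs" Solaris sees for a chip with the given core count."""
    return cores * THREADS_PER_CORE


def core_aligned_psets(cores, cores_per_pset):
    """Split the CPU IDs into psets that each cover whole cores.

    Assumes CPU IDs run from 0 upward and that IDs 4n..4n+3 share core n.
    """
    if cores_per_pset < 1 or cores % cores_per_pset != 0:
        raise ValueError("cores_per_pset must divide the core count evenly")
    size = cores_per_pset * THREADS_PER_CORE
    total = virtual_cpus(cores)
    return [list(range(start, start + size)) for start in range(0, total, size)]


# An 8-core chip carved into four psets of two cores (eight CPUs) each.
print(virtual_cpus(8))
for pset in core_aligned_psets(8, 2):
    print(pset)
```

Requiring the pset size to divide the core count evenly is a simplification for the sketch; in practice you could mix pset sizes, as long as each one is a multiple of four CPUs.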

OK, so the technology is definitely there, but how do you get started with it? The good news is that we have developed a freeware tool, called the Sun Fire Consolidation Tool, to simplify the task of creating containers (zones, pools, psets). It is designed for system administrators who haven't yet been exposed to the intricacies of creating zones, resource pools, etc. Taking advantage of an easy-to-use GUI, the tool creates a script with all the necessary commands. The script can simply be run on the target system to create the requested containers, but it also comes complete with detailed comments illustrating the necessary syntax for anyone interested in learning how to use the commands. It therefore caters to both the casual user and the system administrator wanting to get a head start in mastering the nuances of container management. The tool also optionally installs a number of popular public domain software applications into the newly-created containers.

The tool now has its own BigAdmin page. You can also find a tool webpage at OpenSPARC.net, where it is one of the freely available Cool Tools.

Finally, the UltraSPARC T1 processor ushers in a whole new world of price/performance. And for maximum benefit, the inexpensive power of the T1000/T2000 hardware can be combined with the inexpensive power of open source software. Open source software continues to gain respect, and now covers almost the entire software stack, from web servers, application servers, and databases, through to the highly-regarded Solaris Operating System.

In summary, the new Sun Fire T1000/T2000 servers are an obvious platform for server consolidation. And probably the best way to make a start is with a pilot.

Allan


This blog copyright 2008 by allanp