BM Seer Unofficial thoughts from an anonymous Sun employee

Can I use 64 threads in a chip?

Wednesday Aug 08, 2007

Can someone really use 64 threads in a chip? The answer is simple: when you look out into your datacenter, do you see racks of servers or just a single naked core sitting alone in the back corner? :)

If you see racks of servers, you are running lots and lots of threads. Think of it this way: if you have a bunch of dual-core, single-socket 1RU servers filling a rack, you have around 80 threads in a rack; with 2-socket servers you have 160, and with quad-core 2-socket servers you have 320 threads.
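A quick back-of-the-envelope sketch of that arithmetic. The figure of 40 servers per rack is my assumption (it is what makes dual-core, single-socket boxes come out to 80 threads), and the function and parameter names are purely illustrative:

```python
# Assumption: about 40 1RU servers fit in a rack (80 threads / 2 cores per server).
SERVERS_PER_RACK = 40


def threads_per_rack(sockets, cores_per_socket, threads_per_core=1,
                     servers=SERVERS_PER_RACK):
    """Hardware threads in one rack of identical servers."""
    return servers * sockets * cores_per_socket * threads_per_core


if __name__ == "__main__":
    print(threads_per_rack(sockets=1, cores_per_socket=2))  # dual-core, 1 socket
    print(threads_per_rack(sockets=2, cores_per_socket=2))  # dual-core, 2 sockets
    print(threads_per_rack(sockets=2, cores_per_socket=4))  # quad-core, 2 sockets
    # A single UltraSPARC T2 chip: 8 cores with 8 threads each.
    print(threads_per_rack(sockets=1, cores_per_socket=8,
                           threads_per_core=8, servers=1))
```

The last call is the point of the comparison: one chip with 8 cores and 8 threads per core already carries most of the thread count of the smallest rack configuration.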

Now how would you judge the performance of a single rack (with 80-320 threads)? Would you run one copy of "gzip" or "tar", compare that to your laptop, and say that rack is slow? Of course not. You'd run a whole bunch of them.

So when you are performance testing an UltraSPARC T1 or UltraSPARC T2 server, throw lots of work at it and it will have no problem. There is massive parallelism in every datacenter with racks of servers. Perfect for UltraSPARC T1/T2. Every datacenter with web tiers, application tiers, and databases behind those tiers runs tons of threads. And remember, the UltraSPARC T1 set leading performance records at its introduction and, even as of last week, continues to set them at every tier.
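Here is a minimal sketch of what "throw lots of work at it" can look like in a test harness: run many copies of the same job at once and measure jobs per second, not the time of a single copy. It uses Python's zlib as a stand-in for "gzip", and the names compress_job and run_throughput are made up for illustration. It also assumes CPython, where zlib.compress releases the interpreter lock while compressing so the copies can spread across hardware threads.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_job(payload):
    """One unit of work: compress a buffer and return the compressed size."""
    return len(zlib.compress(payload, 6))


def run_throughput(copies, payload):
    """Run `copies` jobs concurrently; return (jobs completed, jobs per second)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=copies) as pool:
        sizes = list(pool.map(compress_job, [payload] * copies))
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return len(sizes), len(sizes) / elapsed


if __name__ == "__main__":
    data = bytes(range(256)) * 4096  # roughly 1 MB of sample input
    for copies in (1, 8, 64):
        done, rate = run_throughput(copies, data)
        print(f"{copies:3d} copies: {done} jobs, {rate:.1f} jobs/sec")
```

On a machine with many hardware threads, the jobs-per-second figure should keep climbing as the number of copies grows; a single copy tells you very little about the box.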

Intelligence test :) Would you judge performance of an UltraSPARC T2 by running a single "gzip" or "tar"?


"Estimated" what does that mean for Sun's UltraSPARC T2

Wednesday Aug 08, 2007

Why does Sun designate yesterday's performance results as "estimates"? Why that word? Did some Sun marketeer just throw a dart and pick a big number? No. All UltraSPARC T2 SPEC CPU and SPEC OMP metrics quoted are from full “reportable” runs, but are nevertheless designated as “estimates” because they use pre-production systems. Sun customer systems, to be announced later, are expected to perform similarly. SPEC rules do allow comparing these preliminary scores with published results.

Is Sun the only vendor to use this clause? No. Intel and AMD have a long history of using preliminary numbers at chip announcements to get the word out about their performance. Sun is just following their lead, and trumping their performance :)

Ok, back to why the word "estimates?" The SPEC CPU committee voted to use that specific word for preliminary scores. Members include IBM, Intel, AMD, HP, .... And every employee of a member company must follow the rules.

    By license agreement, SPEC members and customers agree to run and report results as specified in each benchmark suite's documentation. (from the SPEC FAQ)

Postings on Sun's UltraSPARC T2 performance:
http://blogs.sun.com/bmseer/entry/performance_of_the_new_sun
http://blogs.sun.com/bmseer/entry/ultrasparc_t2_more_floating_point
http://blogs.sun.com/sprack/entry/ultrasparc_t2_world_class_crypto
OpenSPARC T2:
http://blogs.sun.com/d/entry/ultrasparc_t2_documentation_available
Ubuntu (already booted on UltraSPARC T2):
Ubuntu & Canonical & UltraSPARC T1 (May06).

As a Sun employee I try my best to follow every rule when talking about results in public, but I'm an engineer, so sometimes it is hard to follow all the legalese. I try to correct things as soon as I see an error, and I do my best to remind other Sun bloggers to put in the proper disclosure statement for SPEC & TPC benchmark results. Though quite honestly, I wish SPEC & TPC would streamline the rules, make them more consistent, and minimize the lengthy disclosure statements.

Of course because Sun is in the lead and because I made some suggestions, I'm sure this entry will be fully scrutinized by every competitor. If I made errors let me know in the comments and I will correct them.

Disclosure Statement

SPEC, SPECint, SPECfp, and SPEComp are registered trademarks of Standard Performance Evaluation Corporation. Results from www.spec.org as of August 6, 2007. Actually this one is short because I didn't put any specific results in this posting; the ones at the links have the more extensive disclosures because they show scores & results.


Solaris and Sun Studio compiler important to UltraSPARC T2 announcements & benchmarks

Tuesday Aug 07, 2007

Beyond UltraSPARC T2, what other technologies matter? There are two more keys to Sun providing such effective performance in the new single-chip Sun UltraSPARC T2 64-thread processor: Solaris (and now of course OpenSolaris) and the Sun Studio compilers. Here is a nice slide on the hardware history of SPARC. I borrowed this one from the "SPARC History" entry on Sun's On the Record blog (blogs.sun.com/ontherecord).

An important thing to remember is that besides Sun's long history with SPARC, we've also led the way in parallelism. Over 15 years ago, Solaris supported 64-way SPARC systems and provided near-linear scaling. For those of you old enough to remember, at that time IBM, SGI, HP, and everyone else thought there was no way Sun could produce effective 64-way systems. They were wrong, and now our competitors have finally all introduced systems with lots of processors and/or threads.

Solaris and Sun Studio compilers have a LONG history and lots of experience with industrial-strength applications with lots of threads.

Solaris and Sun Studio compilers were great at scaling to 64-way systems 15 years ago. With a lot more experience and hard work, we are even better at scaling and will scale to lots more threads right now. Many thanks to all of those compiler & OS engineers!

Postings on Sun's UltraSPARC T2 performance:
http://blogs.sun.com/bmseer/entry/performance_of_the_new_sun
http://blogs.sun.com/bmseer/entry/ultrasparc_t2_more_floating_point
http://blogs.sun.com/sprack/entry/ultrasparc_t2_world_class_crypto
OpenSPARC T2:
http://blogs.sun.com/d/entry/ultrasparc_t2_documentation_available

...I've focused on Solaris, but there are options, for example Ubuntu. Ubuntu has already booted on the UltraSPARC T2.

As a reminder, Ubuntu and Canonical proved it on an UltraSPARC T1 almost 14 months ago; see this article on that work.


UltraSPARC T2: more floating-point performance

Tuesday Aug 07, 2007

This posting covers more about floating-point on the Sun UltraSPARC T2. The previous posting discussed SPECfp_rate2006 scores and the open-sourcing of the UltraSPARC T2 design.

In the UltraSPARC T2 there are eight floating-point units, which are well suited for scientific applications. Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz beats all single-chip scores, showing 14230 (est.)/15081 (est.) SPECompMbase2001/SPECompMpeak2001.

How do these preliminary runs (we must use the term "estimated" by SPEC rules) compare to SPECompMbase2001/SPECompMpeak2001 scores?

  • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip IBM p520 POWER5+ 1.9GHz processor published result by about 75% on base and 85% on peak.
  • ...Sun is waiting for POWER6 4.7GHz results; maybe UltraSPARC T2 results will scare IBM from ever publishing a single-chip result?

Benchmark description:

The SPEC OMP benchmark is a test of the performance of 9 high-performance computing applications. It is used to compare the performance of shared-memory servers. All C/C++ and Fortran applications in this suite use the OpenMP programming model, which provides a portable, scalable model for developing parallel applications for platforms ranging from the desktop to the supercomputer.

The OpenMP Application Program Interface (API) supports multi-platform shared-memory parallel programming in C/C++ and Fortran on all architectures, from the largest Unix servers to the small Windows NT platforms.

Disclosure statement:

All UltraSPARC T2 SPEC CPU metrics quoted are from full “reportable” runs, but are nevertheless designated as “estimates” because they use preproduction systems. SPEC and SPEComp are registered trademarks of Standard Performance Evaluation Corporation. Sun UltraSPARC T2 1.4GHz (1 chip, 8 cores, 64 threads) 14230 (est)/ 15081 (est) SPECompMbase2001/SPECompMpeak2001. Competitive results from www.spec.org as of August 6, 2007. IBM p520 1.9GHz (1 chip, 2 cores, 4 threads) published 8141/8174 SPECompMbase2001/SPECompMpeak2001.


Performance of the new Sun UltraSPARC T2

Tuesday Aug 07, 2007

Sun UltraSPARC T2 is an amazing chip and very fast! The UltraSPARC T2 features several industry firsts:

  • Eight cores and 64 threads
  • Integrated 10 GbE networking and I/O
  • Dedicated cryptographic and floating-point units per core
  • 10 cryptographic functions supported in hardware
  • Open-source design: www.opensparc.net

Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz beats all single-chip scores, showing 78.3 est. SPECint_rate2006. How do these preliminary runs (we must use the term "estimated" by SPEC rules) compare to SPECint_rate2006 results?

  • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip IBM POWER6 4.7GHz processor published result by 29%.
  • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip estimated scores of the AMD Barcelona by 23%.
  • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip published scores of the 2.66GHz Intel X5355 (Clovertown) by 48%.

Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz beats all single-chip scores, showing 62.3 est. SPECfp_rate2006. How do these preliminary runs (we must use the term "estimated" by SPEC rules) compare to SPECfp_rate2006 results?

  • These Sun UltraSPARC T2 1.4GHz processor scores beat the best published single-chip IBM POWER6 4.7GHz processor result by 7%.
  • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip estimated scores of the AMD Barcelona by 11%.
  • These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip published scores of the 2.66GHz Intel X5355 (Clovertown) by 66%.

Performance per core doesn't matter and GHz doesn't matter; what matters is the number of cores, the efficiency, and the design of the chip! Competitors are saying that UltraSPARC T2 is proprietary... this makes no sense. Both UltraSPARC T1 and UltraSPARC T2 are open-source designs (www.opensparc.net). You do not find the latest designs from Intel, AMD, or IBM available as open source.

Disclosure Statement:

All Sun UltraSPARC T2 SPEC CPU metrics quoted are from full “reportable” runs, but are nevertheless designated as “estimates” because they use preproduction systems. SPEC, SPECint, and SPECfp are registered trademarks of Standard Performance Evaluation Corporation. Sun UltraSPARC T2 1.4GHz (1 chip, 8 cores, 64 threads) 78.3 est. SPECint_rate2006, 62.3 est. SPECfp_rate2006. Competitive results from www.spec.org as of August 6, 2007. IBM POWER6 4.7GHz (1 chip, 2 cores, 4 threads) 60.9 SPECint_rate2006, 58.0 SPECfp_rate2006. AMD Barcelona 2.6 GHz (1 chip, 4 cores, 4 threads) 63.9 est. SPECint_rate2006, 56.3 est. SPECfp_rate2006. Barcelona estimates based upon "The Register" article stating 2.6GHz quad is 21% and 50% faster than Intel 2.66 system. Fujitsu RX300 Intel X5355 2.66 GHz (1 chip, 4 cores, 4 threads) 52.8 SPECint_rate2006, 47.5 SPECfp_rate2006.
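For anyone who wants to check the percentages, here is a small sketch that recomputes them from the figures in the disclosure statement. The helper names are illustrative, and the scores are copied as printed. One thing the arithmetic surfaces: the Intel SPECfp_rate2006 figure as printed (47.5) works out to a margin of about 31%, while the Barcelona estimate (56.3 divided by 1.50) implies an Intel score of roughly 37.5, which is the value that yields 66%. The sketch keeps the printed figure and shows both.

```python
def margin_pct(score, baseline):
    """Whole-number percentage by which `score` exceeds `baseline`."""
    return round((score / baseline - 1.0) * 100)


# Figures as printed in the disclosure: (SPECint_rate2006, SPECfp_rate2006).
T2 = (78.3, 62.3)  # UltraSPARC T2 1.4GHz estimates
COMPETITORS = {
    "IBM POWER6 4.7GHz": (60.9, 58.0),
    "AMD Barcelona 2.6GHz (est.)": (63.9, 56.3),
    "Intel X5355 2.66GHz": (52.8, 47.5),
}


def all_margins():
    """Map each competitor to the T2's (int margin %, fp margin %)."""
    return {
        name: (margin_pct(T2[0], int_score), margin_pct(T2[1], fp_score))
        for name, (int_score, fp_score) in COMPETITORS.items()
    }


def implied_intel_fp():
    """Intel fp score implied by 'Barcelona is 50% faster' and the 56.3 estimate."""
    return round(56.3 / 1.50, 1)


if __name__ == "__main__":
    for name, (int_m, fp_m) in all_margins().items():
        print(f"{name}: +{int_m}% int_rate, +{fp_m}% fp_rate")
    fp = implied_intel_fp()
    print(f"Implied Intel fp score {fp}: +{margin_pct(T2[1], fp)}% fp_rate")
```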

Reminder: The Niagara 2 score was obtained from a full "reportable" SPEC run, but is designated as an "estimate" because a pre-production system was used.

...more information on the UltraSPARC T2 later today.


Lots hitting the wires: UltraSPARC T2, the next generation

Monday Aug 06, 2007

Many news sources are now covering UltraSPARC T2, the new high-performance chip from Sun. This new UltraSPARC T2 chip leads in many ways. I'll cover the performance numbers tomorrow.

For now:
http://www.computerworld.com.au/index.php/id;898889798
http://www.reuters.com/article/technologyNews/idUSN0625780420070806
http://www.channelweb.co.uk/vnunet/news/2195718/sun-lifts-lid-niagara-processor
etc.

For some of my previous comments:
http://blogs.sun.com/bmseer/entry/news_trickles_out_on_niagara2

Please remember that the previous generation chip, the UltraSPARC T1, just set an application-tier world record (all details at link). How many times has the "old" chip with half as many threads set a world record weeks before the new one is announced?

A final note: I venture that this chip is going to lead for the database tier, the application tier, and of course the web tier. Oh, and don't forget HPC. Yes, it is that versatile.
