BM Seer Unofficial thoughts from an anonymous Sun employee

Sun's T5440 covered in InformationWeek

Thursday Nov 06, 2008

InformationWeek covers Sun's T5440 server (CMT).

The $132,995 T5440 server will excel anywhere throughput is more important than CPU speed--in those areas, the T5440 is rewriting the record book. Database workloads--in particular, Oracle and SAS environments--are well-suited to the T5440's CMT technology.
Read more at:
http://www.informationweek.com/news/hardware/showArticle.jhtml?articleID=211800249

Also, there is an interesting blog posting on Web 2.0 performance on CMT (Olio is the benchmark kit).

Remember, it is about system performance, not thread performance, just as automobile performance is about the whole car, not just horsepower per cylinder.

I know competitor commenters and other detractors always point to per-thread performance (even ignoring core performance) when they talk about Sun's various CMT servers. Can someone please show me a datacenter at any company that is only running on one thread? :)

Comments:

"Oracle and SAS environments--are well-suited to the T5440's CMT technology."

OK, this time we won't mention the response times and single-thread performance (just for you), but what's the point in having cheap kit with good throughput capability if it costs bazillions in Oracle licensing costs?

Posted by Alex on November 06, 2008 at 10:58 AM PST #

Alex, Good to hear from you again.

The quick growth of CMT shows that customers don't have a response time issue with it. The expensive IBM cores can't keep up with the T5440, even when you have 16 cores all working together in an IBM p570.

I'm still amazed that IBM can only fit 16 cores in 16 RUs. That is very low density for a $1M USD p570.

Posted by BM Seer on November 06, 2008 at 01:16 PM PST #

16 cores in 8U now actually, and 32 in 16U.

Surely you haven't forgotten already?

Posted by Alex on November 06, 2008 at 02:16 PM PST #

Yes, Alex, but you always leave out critical details.

The IBM 4.7GHz Power6 p570 (16-core DC, 16 RU, $1M USD) is beaten by the T5440. Yes, the IBM *3.6GHz Power6* (~25% less GHz) is only 8RU, but those QCModules are even SLOWER. Sure, you can put them in a smaller package, but the heat generated means you can't get full GHz. Also, IBM is avoiding benchmark comparisons to the T5440 to avoid embarrassment.

Posted by BM Seer on November 06, 2008 at 02:33 PM PST #

lol, you really are a marketing droid, aren't you!

Posted by Alex on November 06, 2008 at 11:20 PM PST #

1. IBM has x86 servers which compete with Sun's CMT (and they are still faster and more cost effective, check SAP-SD)
2. Oracle Standard Edition is charged per socket; only Enterprise Edition is charged per core

Posted by Triffids on November 07, 2008 at 12:42 AM PST #

Alex: I'm not in marketing, I don't do marketing. Your comments actually look a lot more like FUD marketing than most I've seen. You make a small comment without facts and then get proven wrong.

Triffids: OK, I actually showed benchmarks in previous posts with CMT beating very large x86 servers (quad-core and six-core, with lots of sockets).

Posted by BM Seer on November 07, 2008 at 07:45 AM PST #

Hm ... I can see from your posts that the 8-socket/48-core IBM x3950 is much faster in SAP-SD than the biggest CMT system. In the TPC-E reports, the 64-core x3950 is cheaper than CMT.

Posted by Triffids on November 07, 2008 at 08:12 AM PST #

hi:

I quote your stuff all the time - generates wonderfully literate email from IBMers! Keep it up, please!

Right now I want to know more about what happens to Domino performance on a T5220 when you add other workloads.

I have nothing on that, and I think multi-workload benchmarking is going to become much more important as the machines scale far past the single-application workloads most companies have these things doing now.

Can you point me at anything that might help? A real, business oriented, multi-application workload benchmark of some kind? with results.. ?

E.g., what happens if you run Notesbench, two or more of the Java workloads, two or more of the SAP loads, and some Sun Ray OpenOffice/Web users all at the same time on a T5440? How about somebody's 32 (!) way Opterons? Etc.

Posted by Paul Murphy on November 07, 2008 at 10:19 AM PST #

Just for a moment, re-read this thread, and ask yourself who here is posting FUD. Who here is posting incorrect 'facts' ? Who here is being proven wrong ?

I'll give you a clue. It isn't me!

Posted by Alex on November 07, 2008 at 12:14 PM PST #

Paul: I follow your blog as well. Can you believe that IBM's chief performance blogger (who never gives any real benchmark analysis) thought we were the same person? IBM...connections to reality?

Sun used to do lots of benchmarks on Lotus Domino, but it seems that benchmark organization fell apart. Too bad. I've been doing some internal tests for a customer that show that consolidation of different apps on CMT, even with just Solaris (no additional virtualisation software), scales well. Now, I know there are cases where additional virtualisation software is needed.

I just posted a test we did for a customer consolidating a lot of different MySQL jobs onto the same CMT. Let me look around for work that has been done inside Sun on multiple-workload virtualisation, and I'll post what I can make public. Again, I don't expect issues given the testing others and I have done.

Alex: I have yet to find any real data in any of your comments. See any of my blog entries on the huge number of benchmarks that support my opinions.

Posted by BM Seer on November 08, 2008 at 11:02 AM PST #

comparing on one thread is just an aspect.

Well, you may consider as much when you want to compare

Posted by Martin on November 09, 2008 at 04:48 AM PST #

I wonder how come Sun doesn't have even a single SPECpower_ssj2008 result?

So much FUD about performance per Watt and no audited results?

Posted by Mike on November 09, 2008 at 07:20 PM PST #

Mike:

We've covered the many issues with SPECpower_ssj, see:
http://blogs.sun.com/bmseer/tags/specpower

Also, have you noticed they all use LV-DIMMs? A friend at a memory supplier told me that the number of people who actually buy them (besides benchmarkers :) I guess) is tiny (insignificant?).

Sun is the only vendor to publish watts on certified benchmarks for over 3 years - all other vendors have been avoiding this. see: http://blogs.sun.com/bmseer/entry/configs_used_for_specpower_ssj

Posted by BM Seer on November 10, 2008 at 05:54 AM PST #

Thanks for pointing that out, but I see lots of "estimations" there.
To certify the result for SPECpower_ssj2008 you need a calibrated power meter, not just "70% maximum PSU rating". So, I believe that using SPECjbb results and "estimating" that Sun system is more efficient than other vendors' is not correct.

Posted by Mike on November 10, 2008 at 08:14 AM PST #

I've measured watts like our "estimates" at customer sites, and most were actually higher. Sun measures watts with calibrated power meters (very accurate) on all of our systems. Competitors AVOID doing this!

If you are a customer of computer vendors, then you can measure watts on systems in your datacenter (not idle, please) with realistic configurations.

If you work for a vendor, then encourage your company to post measured watts on the SAME configurations used in performance benchmarks.

I think it is silly to have systems optimized for a power-performance benchmark and completely different configurations optimized for performance.

Posted by BM Seer on November 10, 2008 at 08:23 AM PST #

I think the results of SPECpower_ssj should be displayed with the "performance" column too, not just "power/performance". This should help avoid irrelevant configurations.
If Sun does actually measure the power requirements of their systems, why not audit the results against SPECpower then?
I work for a large Sun customer, but we do not own a calibrated power meter.

By the way, regarding those SPEComp results: isn't it very much like TPC-C, too easily parallelized to be a meaningful measure of real-world performance? From my limited experience with parallel programming, it is much easier to achieve good speedup on 8-16 threads than on 1024. Remember Amdahl's law? What is your opinion?
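
Amdahl's law, mentioned above, bounds the speedup of a program by its serial share: with a parallel fraction p and n threads, the ideal speedup is 1 / ((1 - p) + p / n). A minimal Python sketch follows. The function name and the sample parallel fractions (0.95 and 0.99) are illustrative assumptions, not figures from SPEComp or from any Sun system.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of the run time that can be parallelized (0..1).
    threads: number of threads working on the parallel share (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only for illustration.
    for p in (0.95, 0.99):
        for n in (8, 16, 1024):
            print(f"p={p:.2f} threads={n:5d} speedup={amdahl_speedup(p, n):8.2f}")
```

With p = 0.95 the speedup can never exceed 1 / (1 - p) = 20, however many threads are added, which is why good scaling on 8-16 threads says little about scaling on 1024.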

Posted by Mike on November 10, 2008 at 07:05 PM PST #

Mike:

Please dive into the issues with SPECpower_ssj in the many posts here. You'll find that the HP DL580 in SPECpower_ssj is totally different from the HP DL580 used in any other benchmark, so there is no way to list results together. As my neighbor says "...ya think that Chevrolet Impala that raced Talladega was anything like that one 'cross the street, shoot boy!"

You can start measuring power on individual smaller servers in your datacenter now. Good power meters are cheap:
https://www.wattsupmeters.com/secure/products.php
They take account of power factors and are accurate to within 1.5%, which is likely tighter than your performance variation. Inside Sun we use more accurate meters, but as you can see 1.5% isn't bad at all, and I'm trying to keep it cheap for you to get started.
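
To get a feel for what a 1.5% meter tolerance means for a performance-per-watt figure, here is a minimal Python sketch. The throughput and wattage values are hypothetical placeholders, not measurements from any system, and the tolerance is treated as a simple symmetric error on the wattage reading.

```python
def perf_per_watt_bounds(throughput: float, measured_watts: float,
                         meter_tolerance: float = 0.015):
    """Return (low, nominal, high) performance per watt.

    The true power draw is assumed to lie within +/- meter_tolerance
    of the meter reading, so perf/watt is lowest when the true draw
    is at the top of that range and highest at the bottom.
    """
    if measured_watts <= 0:
        raise ValueError("measured_watts must be positive")
    if not 0.0 <= meter_tolerance < 1.0:
        raise ValueError("meter_tolerance must be in [0, 1)")
    max_watts = measured_watts * (1.0 + meter_tolerance)
    min_watts = measured_watts * (1.0 - meter_tolerance)
    return (throughput / max_watts,
            throughput / measured_watts,
            throughput / min_watts)


if __name__ == "__main__":
    # Hypothetical numbers: 100,000 operations/s at a 1,000 W reading.
    low, nominal, high = perf_per_watt_bounds(100_000.0, 1_000.0)
    print(f"perf/watt between {low:.2f} and {high:.2f} (nominal {nominal:.2f})")
```

The resulting spread is about 1.5% either side of the nominal figure, so if run-to-run performance varies by more than that, the meter is not the dominant source of uncertainty.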

Sure, you can get more accurate meters, but as you'll find out, 32GB of regular DIMMs draws hundreds of watts more than 8GB of LV-DIMMs. Please read this posting:
http://blogs.sun.com/bmseer/entry/hp_watts_up_with_no

SPEComp is designed to measure types of applications in HPC computing. Speedups realized in it are similar to speedups you see in scientific computing codes. It was updated in 2001.

TPC-C is simplistic and has not changed significantly since 1992. Sure, there have been updates, but in my mind it does not reflect today's complex workloads. Read this blog from 2005:
http://blogs.sun.com/bmseer/entry/tpc_c_how_old_is

Real world performance is determined by benchmark fit to the characteristics of real-world applications.

Disclosure statement:

TPC-C is a trademark of the Transaction Processing Performance Council (TPC). More info: www.tpc.org. SPEC and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation.

Posted by BM Seer on November 11, 2008 at 07:15 AM PST #
