BM Seer Unofficial thoughts from an anonymous Sun employee

IBM's new POWER6: more cores, tiny bump in GHz

Wednesday Oct 08, 2008

IBM announced a new POWER6 quad-core, but it won't be available until late November (the 21st)? Wow, that is really pre-announcing a long way ahead.

Let's cut through some of the over-marketing-hype...

Comments:

IBM is slowly working toward quad-core chips...

a. Who put out the first commercial multicore processor chip?
Sun - Duh no; it was a little blue company

b. Is putting out a commercially viable multicore processor more difficult than putting out a commercially viable high-speed processor? I don't know, but I think it is, unless all the cores share functional resources beyond the caches.

IBM avoids any comparison to Sun CMT
The reason is that the T series CMT servers are not comparable to the IBM p570 & p560 servers; the entry p520 and p550 are more comparable. Before you start saying "show me the data": I find data on CMT servers very difficult to find. Do detailed whitepapers exist somewhere on Sun's site describing RAS, the virtualization overheads of LDOMs (unless there are none, which is difficult to believe), and relative performance between generations of servers, similar to rPerf? (I know Sun has M Values, but I have never seen public documents released, periodically or ever, to show these values.)

IBM again plays games when talking about old Sun systems to un-released IBM servers. IBM's comparison is with the M5000...
Sorry I thought the M5000 is a current server; am I mistaken or was it a Freudian slip on your part?
"With the new Power 560 Express clients can save up to 80 percent of the energy by consolidating 13 Sun Fire V490 servers on a single Power 560 server with PowerVM compared to consolidating those same servers on four Sun SPARC Enterprise M5000 servers with Dynamic System Domains."

IBM considers the p570 equivalent as the M8000. So on the M class servers why don't Sun do progressive benchmarks showing the scaling performance of the servers with 4, 8 & 16 processors with same ratio of memory on the same benchmark?

IBM uses max watts for site planning guides
When you are designing the data center power & cooling requirements I would take the max rated power and cooling requirements of the servers (just in case I upgrade) rather than the typical power and cooling requirements. Unless you have a capping mechanism - do the CMT or SPARC64 servers have such functionality? IBM has EnergyScale to do just that on the p boxes.

And in your comparisons can we stick to one server model and compare based on cores or processors and not switch back and forth.

Incidentally, the M9000 performance benchmark on SAP SD2T was impressive but I am curious why Sun stopped the benchmark with 67% CPU utilization and why there has not been a subsequent update on the result with the typical 97+% utilization we see from all vendors including Sun for other systems.

Venki

Posted by Venki on October 08, 2008 at 08:39 AM PDT #

IBM only has Dual core chips, now quad-core modules. Way behind
Sun going to 8-cores and 16-cores on a chip!

IBM avoids any comparison to Sun CMT: because 2-chip Sun T5240 OUTPERFORMS
p570 (8RU server)! IBM can't beat the performance so they always
misdirect people to comparing to cheaper IBM servers that are VERY VERY
SLOW compared to CMT. Customers are pulling out p570s and replacing
with lower-cost, faster, smaller Sun T5240. IBM, read and weep.
Sun publishes benchmarks and doesn't rely on making up numbers like rperf.

Venki actually IBM's comparison is: "With the new Power 560 Express ...by consolidating 13 Sun Fire V490 servers"

Venki writes: "Sun do progressive benchmarks showing the scaling performance of the servers with 4, 8 & 16 processors with same ratio of memory on the same benchmark?" How about full system performance? Customers tire of IBM always
talking about core performance and avoiding system performance.

IBM is still avoiding measuring watts; looking at max watts is a marketing paper game. Show us DATA!

Venki writes: "Incidentally, the M9000 performance benchmark on SAP SD2T was impressive", yep it has more headroom. Too bad IBM is maxed out.

Posted by BM Seer on October 08, 2008 at 10:45 AM PDT #

IBM only has Dual core chips, now quad-core modules. Way behind
Sun going to 8-cores and 16-cores on a chip!

But at 1.4GHz; where are the Sun processors at 4+GHz (adding the individual cores' frequencies doesn't count)? IBM has built 8 cores on a chip (Cell) and even a 1025-core specialty processor (Kilocore1025) for Raptor, and that too at very low power.
http://www.itechnews.net/2006/04/06/ibm-previewed-kilocore1025-cpu-with-1025-cores/
So can IBM do it - YES THEY CAN!
And according to reports on the web, the Power7, which beat out Rock for DARPA's PERCS project, will be an 8-core chip.
http://www.theregister.co.uk/2008/07/11/ibm_power7_ncsa/

Look who is talking of comparing CHEAPER servers to ENTERPRISE systems! The Sun T5xx0 servers do not have the RAS or virtualization features of the IBM p570 servers - or for that matter of the p520 and p550. For example wrt RAS: Instruction Retry (available in Fujitsu M-class), Alternate CPU Retry, CPU deallocation, are all absent from the US-T2 processors.

As for customers pulling out p570s and replacing them with T5xx0 servers - where are the FACTS? Gartner says that Sun continues to lose market share (by revenue) and IBM p has gained market share (by revenue).

"Gartner is still only counting its true AIX boxes in this category and believes IBM sold some $1.47 billion in Power Systems iron running AIX, up 28.9 percent and giving it 35.1 percent of the Unix pie. This gave Big Blue slightly more Unix share than Sun, which had $1.41 billion in sales, down 8.8 percent and giving Sun a 33.8 percent slice of the Unix pie. HP came in third in the Unix space, with $1.07 billion in sales, up 12.8 percent but only giving HP a 25.6 percent of the Unix market in Q2. Fujitsu-Siemens had just under $100 million in sales (down 23.1 percent), followed by Bull with $71.7 million in sales (up 120 percent) and all other Unix vendors with $60 million in sales (up 145 percent)."
http://www.itjungle.com/bns/bns082508-story01.html

I doubt that IBM would have achieved such results if people were yanking out IBM p servers and replacing them with T5xx0 servers; logically, IBM should have had a revenue drop and not a gain.
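The quoted Gartner figures can be sanity-checked with a few lines of Python. This is only a rough sketch: the revenue numbers are the ones quoted above, "just under $100 million" for Fujitsu-Siemens is approximated as 0.10 (an assumption), and the function name `market_shares` is made up for illustration.

```python
# Q2 2008 Unix server revenue in billions of USD, as quoted from the article above.
# "just under $100 million" for Fujitsu-Siemens is approximated as 0.10 (assumption).
revenue = {
    "IBM": 1.47,
    "Sun": 1.41,
    "HP": 1.07,
    "Fujitsu-Siemens": 0.10,
    "Bull": 0.0717,
    "Other": 0.06,
}


def market_shares(rev):
    """Return each vendor's share of the summed revenue, in percent."""
    total = sum(rev.values())
    return {vendor: 100.0 * amount / total for vendor, amount in rev.items()}


shares = market_shares(revenue)
for vendor, pct in shares.items():
    print(f"{vendor:16s} {pct:5.1f}%")
```

The computed shares for IBM, Sun and HP should land within a fraction of a point of the quoted 35.1, 33.8 and 25.6 percent, so the revenue and share figures in the quote are at least consistent with each other.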

You did not complete the sentence up to the full stop (just to help you out I have CAPITALIZED the part where IBM uses the word COMPARE and to which SERVER).
"With the new Power 560 Express clients can save up to 80 percent of the energy by consolidating 13 Sun Fire V490 servers on a single Power 560 server with PowerVM COMPARED TO CONSOLIDATING THOSE SAME SERVERS ON FOUR SPARC ENTERPRISE M5000 SERVERS with Dynamic System Domains."

Was this a Freudian slip on your part, calling an M5000 an OLD SERVER?

I didn't mention cores, I said (CAPITALIZING for your benefit)
"So on the M class servers why don't Sun do progressive benchmarks showing the scaling performance of the servers with 4, 8 & 16 PROCESSORS with same ratio of memory on the same benchmark?"
I am talking about system performance! And about how a SYSTEM scales as MORE PROCESSORS are added to it (keeping other factors as constant as possible).

I didn't deny that IBM uses MAX WATTS. I just said that to design my data center requirements I need to know the max.
But I DID NOT SEE an answer from you on whether Sun provides a feature in the T5xx0 or Mx000 servers to LIMIT THE MAXIMUM POWER DRAWN? Still waiting....

Also still waiting for WHY it was capped at 67% utilization - a bottleneck maybe in the server design - memory, I/O or Solaris kernel scaling?

Lastly, why does Sun fail to publish
a. the relative performance (M Values) of its servers across processors/generations like IBM p does with rPERF data?
b. detailed information on the LDOMs overhead and the RAS functionality built into its T5xx0 servers? Please provide the links to such documents (if they exist for public consumption).

Venki

Posted by Venki on October 08, 2008 at 12:14 PM PDT #

As a Sun shareholder, I'd like to see a more mature and complete response to Venki than "read and weep". If it's true that a single Power 560 server is equivalent to 4 Sun M5000 servers, then this is bad news for me and my shares!

I agree with Venki that comparing Power processors (fast single thread performance) with the Sun T2 processor doesn't make any sense at all. Face it, they are different beasts with different abilities. It would be nice if we could compare with Rock but it's a bit er late?

Finally, I wouldn't complain about IBM pre-releasing news given Sun's record of hype. Where is my download of xVM Server? I believe you announced it a few weeks ago? Where is Rock? Where are servers with SSDs? Pot calling the kettle black?

Let's get past the rhetoric and start some real analysis here please. Thank you.

Posted by Kevin Hutchinson on October 09, 2008 at 10:05 AM PDT #

Kevin, it is simple: Venki went off topic. --- Why do IBM supporters always do that? This was about the shortcomings of the Power6 systems that we have to wait until very late this year to see. I wait eagerly for POWER6 results on different MHz DIMMs; I wait for measured watts on Power6 on EVERY benchmark.

Venki since you have so much you want to claim, why don't you post it on
your blog and I'll comment.

If I am a customer running Java Serving, SAP, Oracle, MySQL, Web serving etc., I look at what a whole system (example: car) does; I don't look at the performance of a single sub-component (example: piston ring). Why doesn't IBM publish more whole-system benchmarks and show measured power on each? As an IBM shareholder, doesn't it scare you that IBM avoids real data in public?

IBM can't beat CMT on system-wide performance so...
* IBM changes the topic
* IBM avoids real data that points to performance
* IBM avoids real measured watts on known applications or benchmarks
* IBM name calls CMT servers different, because they can't beat it on
SYSTEM performance.

I could do more real analysis if IBM allowed any data in public. But it is much easier for Venki to use rhetoric.

Venki you:
* call CMT servers different without data, let's see whole system perf
on realistic workload or benchmarks
* Venki please post processor scaling on p560, or p570 with different
GHz on INDUSTRY standard benchmarks
* or publish the exact definition of rPerf, lack of clear definition allows IBM to just make up numbers, please show the definition.
* publish the overhead of any IBM virtualization platform, I've posted
pointers that show big overheads, then mysteriously IBM marketing deletes to cover up any data on IBM websites.

Notice the un-released IBM p560 has many shortcomings: it costs what, $250K even with mid-size memory? It can only support 96GB of fast memory, or you lose a whole lot of memory perf and use 500MHz DIMMs.

Posted by BM Seer on October 09, 2008 at 10:45 AM PDT #

Kevin

On CMT, here is some real data... with explicit comparisons to IBM POWER6 where possible, but often I can't compare to IBM because IBM avoids results on many of these benchmarks and AVOIDS MEASURED WATTS on all of the rest.
http://blogs.sun.com/allanp/entry/sun_s_cmt_goes_multi
http://blogs.sun.com/bmseer/entry/2_chip_spec_cpu2006_rate
http://blogs.sun.com/bmseer/entry/lotus_domino_r6inotes_world_record
http://blogs.sun.com/bmseer/entry/specjappserver2004_world_record_single_application1
http://blogs.sun.com/bmseer/entry/sap_sd_2_tier_ecc
http://blogs.sun.com/bmseer/entry/sun_sparc_enterprise_t5240_world
http://blogs.sun.com/bmseer/entry/sun_s_even_faster_specweb2005
...real analysis, real data, real questions in each.

NO rhetoric, just data -- something lacking from big blue or its supporters...

Posted by BM Seer on October 09, 2008 at 11:02 AM PDT #

Wow,

A lot of FUD being thrown around here, especially from BMSeer. It just funny reading nonsense assertions.

1. P560 is supposed to be a cut down p570 that is available for less price. Not sure why you are bringing this up.
2. P560 still supports more memory than any CMT Sun server.
3. People replacing P570 with Sun CMT? Where are you getting the data for this? Last time I looked, IBM was extending their lead in Unix servers.
4. Could you show me any data that "Sun have more configurable systems than IBM and a lot more expansion possibilities".

Posted by Thu on October 09, 2008 at 04:38 PM PDT #

You blogged and made observations/comments that do not bear out, and I pointed them out; your blog also allows for comments. If you don't want questions, remove that ability from your blog.

Your comments to which I responded.. I don't see I was off topic...
* IBM is slowly working toward quad-core chips...
* IBM avoids any comparison to Sun CMT...
* IBM uses max watts for site planning guides ...
so how was I off topic...
I added one question on which I was curious - on why the M9000 benchmark was capped at 67% utilization.

You responded to these supposedly off-topic comments in your first response, but in your second response I am off topic and you are asking me to post benchmark data - I think you need to ask IBM to do that and not me.

I just questioned why Sun does not do a benchmark which allows (potential & existing) customers to evaluate the scaling of a server. If I am running any of these "Java Serving, SAP, Oracle, MySQL, Web serving etc."
don't you think this would be useful information for me to evaluate and size my requirements on Sun servers.

Whatever numbers IBM makes up for rPERF, IBM publishes them on their public website, and that forces IBM to stand behind them in some manner. This allows customers who have customized code to make some evaluation of what their upgrade requirements will be when they move from an old server to a new-generation server. We don't have a similar matrix from Sun - I didn't ask for audited benchmark values, just a public document showing relative performance across various generations and types of SPARC processor-based servers.

CMT-based servers do not
a. Have high single threaded performance
b. Work very well when there are fewer independent threads (say less
than 16)
c. Support robust virtualization (with independent OS instances) with dedicated I/O (say 5 partitions)
d. Support hot-pluggable components such as PCI adapters
e. Support instruction retry in the processor, alternate processor recovery
And if I have to license Oracle Enterprise software, fewer processors (cores) help, and CMT servers, because of the "not so important" factor of per-core performance, tend to be expensive.

Venki

Posted by Venki on October 09, 2008 at 09:15 PM PDT #

BM Seer, please do not compare Niagara with POWER6. The Niagara core is actually an UltraSPARC II ( http://h10018.www1.hp.com/wwsolutions/misc/docs/2004_Server_Processor_of_Year.pdf ). I have tested for myself that an Intel Xeon CPU is much faster than Niagara on some complex Java workloads. I think Niagara is suitable for router, firewall or HTTP servers. Nothing more than those applications. If Niagara is better than POWER, then why should SUN endorse Fujitsu's SPARC64 to fill the midrange to high-end space?

Posted by Heatphlux on October 10, 2008 at 08:35 AM PDT #

Sun UltraSPARC IV is actually two cores of UltraSPARC III on the same die, and it will be end-of-life at the end of this year. SPARC64 is an OEM product from Fujitsu. Niagara is based on UltraSPARC II and was acquired from Afara Websystems, not from Sun's internal R&D (Hey, where are UltraSPARC V and Gemini?). The biggest problem with Niagara is that those 8 cores share a small amount of L2 cache (4 MB) and the threads do not run simultaneously.

Posted by Heatphlux on October 10, 2008 at 09:05 AM PDT #

Heatphlux do you work for IBM... or just learn from them???

http://en.wikipedia.org/wiki/Fear,_uncertainty_and_doubt

'FUD was first defined by Gene Amdahl after he left IBM to found his own company, Amdahl Corp.: "FUD is the fear, uncertainty, and doubt that IBM sales people instill in the minds of potential customers who might be considering Amdahl products."'

"Fear, uncertainty and doubt (FUD) is a tactic of rhetoric and fallacy used in sales, marketing, public relations[1][2] and politics. FUD is generally a strategic attempt to influence public perception by disseminating negative (and vague) information."

... so why did Heatphlux point to articles and not actual prices, shipping
systems, benchmarks, measured watts??????

Thu: you are wrong about me using fud, let's look at Thu's statements, shall
we....

Thu do you work for IBM... or just learn from them???

"It just funny reading nonsense assertions."

Thu you say: "1. P560 is supposed to be a cut down p570 that is available for less price. Not sure why you are bringing this up."
you say: "2. P560 still supports more memory than any CMT Sun server."

Not a lot less price:
IBM 560 3.6GHz power6 16 cores with 128 GB of memory = $209,000
IBM 560 3.6GHz power6 16 cores with 256 GB of memory = $353,000
both with ..2 x 146GB SAS 15K + DVD-ROM + AIX 6.3 + 3YR SWMA

Oh... but there is a BIG DROP in MEMORY performance :(
for 16 cores one can only have 96GB of the fast 2GB @ 667MHz
Sun CMT supports more than 96GB...

for 16 cores one must drop to 4GB @ 533MHz DIMMs to
get 128GB. OUCH, that hits performance. Max 192GB for this midslow
memory

for 16 cores one must drop to 8GB @ 400MHz DIMMs to
get 256GB. OUCH, ANOTHER performance hit. Max 384GB with these ultraslow
DIMMs. I want published SPEC benchmarks to see the effect of this
slow memory.
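For anyone who wants to check the memory arithmetic, here is a rough Python sketch. It only uses the DIMM sizes, speeds, maximum capacities and prices listed in this comment; the helper name `dimm_count` is made up for illustration, and the slot count it prints is an inference from those figures, not something taken from an IBM spec sheet.

```python
# (DIMM size in GB, speed in MHz, maximum memory in GB) as listed in this comment.
configs = [
    (2, 667, 96),
    (4, 533, 192),
    (8, 400, 384),
]


def dimm_count(dimm_gb, max_gb):
    """Number of DIMMs needed to reach max_gb using dimm_gb-sized DIMMs."""
    return max_gb // dimm_gb


counts = [dimm_count(dimm_gb, max_gb) for dimm_gb, _, max_gb in configs]
for (dimm_gb, mhz, max_gb), n in zip(configs, counts):
    print(f"{dimm_gb} GB @ {mhz} MHz: max {max_gb} GB -> {n} DIMMs")

# Price step between the two quoted 16-core configurations (USD).
price_128gb, price_256gb = 209_000, 353_000
extra_cost_per_gb = (price_256gb - price_128gb) / (256 - 128)
print(f"Extra cost per additional GB: ${extra_cost_per_gb:.0f}")
```

All three maximums imply the same number of DIMMs, so the capacity limit at each speed appears to come purely from DIMM size, and the only way to get more memory is to accept the slower parts.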

IBM PLEASE MEASURE WATTS ON ALL SPEC, TPC, and ISV benchmarks!

Thu you say: "3. People replacing P570 with Sun CMT? Where are you getting the data for this? Last time I looked, IBM was extending their lead in Unix servers."

OH please show exact sales figures for POWER6 (IBM sells other kinds of
UNIX servers). This is a very vague FUD argument.

Thu you say: "4. Could you show me any data that 'Sun have more configurable systems than IBM and a lot more expansion possibilities'."

next posting... comments on this

And by the way do the IBM supporters really think shouting in comments is convincing anyone of anything.

Posted by BM Seer on October 10, 2008 at 11:19 AM PDT #

I didn't get any answers to my questions. So let me ask you some more...

What other Unix servers does IBM sell?
The Gartner figures mentioned in the article below clearly state that they are only considering AIX servers (AIX as far as I know only runs on Power) and that IBM AIX has gained market share while Sun has lost market share and dropped to second place.
http://www.itjungle.com/bns/bns082508-story01.html
Ohhhh, please don't tell me about the x64 blades running Solaris x86.

Where is the proof that customers are replacing IBM p570s with CMT servers? And as you like to ask for facts, please back up your contention with facts and not FUD (rhetoric and fallacy used in sales, marketing, public relations...to influence public perception by disseminating negative and vague information). Please show us the media links and/or industry reports talking about this dramatic change - reflected in market share where IBM AIX servers have lost marketshare qtr on qtr and Sun has subsequently gained marketshare. Or would you like to simply list major customers who are mass replacing their IBM p570 servers with CMT servers? (enough to dent marketshare figures)

Ever thought IBM included 32MB L3 cache in its Power6 processor design to take care of the fact that they can then use slower memory; but Sun dropped L3 so it has to compensate with faster memory (which by the way will generate more heat).

You like to ignore the questions people put to you by saying they are off topic, ignoring them completely or just ranting about FUD. Surprise me ... with straight answers to the questions asked.

Venki

Posted by Venki on October 10, 2008 at 01:01 PM PDT #

Why are customers replacing POWER6 with Sun's CMT? NO FUD: PERFORMANCE DATA AS MEASURED BY IBM AND PUBLISHED BY IBM, beaten by Sun's published data on real systems. SYSTEM-WIDE PERFORMANCE, without the typical unmeasured IBM FUD of "but the thread-widget is slower..." or "we have an L3 cache". If it helps so much,
why doesn't IBM's published public benchmark info show it?

System performance on full benchmarks is better....

1) One Sun SPARC Enterprise T5240 server (two 1.4 GHz UltraSPARC T2 Plus chips) in the application tier demonstrated 2.8X better performance over the IBM Power6 570 result of 1197.51 JOPS@Standard which used two 4.7 GHz IBM POWER6 chips. The Sun SPARC Enterprise T5240 server has 3.9X better power-performance than the IBM Power6 570.

2) The Sun SPARC Enterprise T5240 server (2RU) with a 1.4 GHz UltraSPARC T2 Plus processor outperformed the 8-core IBM System p570 (8 RU) with four 4.7 GHz POWER6 dual-core processors by 4%.

The Sun SPARC Enterprise T5240 server with two 1.4 GHz UltraSPARC T2 Plus processors is the first two processor system to exceed 20,000 SAPS.

3) A Sun SPARC Enterprise T5240 server equipped with two UltraSPARC T2 Plus processors at 1.4GHz, delivered a World Record 2-chip result of 373,405 SPECjbb2005 bops, 23338 SPECjbb2005 bops/JVM. The Sun SPARC Enterprise T5240 consumed an average of 770 Watts of power to obtain this result.

One Sun SPARC Enterprise T5240 server (two 1.4 GHz UltraSPARC T2 Plus chips) demonstrated 11% better performance over the IBM p550 result of 333,779 SPECjbb2005 bops, 83445 SPECjbb2005 bops/JVM which uses four 4.2GHz POWER6 chips. The Sun T5240 server has 3.25X better SWaP and 50% better power-performance than the IBM p550.

One Sun SPARC Enterprise T5240 server (two 1.4 GHz UltraSPARC T2 Plus chips) demonstrated 2.1X better performance over the IBM p570 result of 175,474 SPECjbb2005 bops, 87737 SPECjbb2005 bops/JVM which uses two 4.7GHz POWER6 chips. The T5240 server has 3X better power-performance than the IBM p570.
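The performance ratios claimed in points 1) to 3) can be recomputed from the raw results given in this comment and its disclosure statement. The Python below is just a sketch for checking the arithmetic; the variable names are made up for illustration, and the power-performance and SWaP ratios are not recomputed because the inputs for them are not all listed here.

```python
def ratio(ours, theirs):
    """How many times larger `ours` is than `theirs`."""
    return ours / theirs


# SPECjAppServer2004 JOPS@Standard: T5240 vs IBM p570.
jappserver_vs_p570 = ratio(3331.31, 1197.51)

# SAP SD two-tier benchmark users: T5240 vs 8-core IBM p570.
sap_vs_p570 = ratio(4170, 4010)

# SPECjbb2005 bops: T5240 vs IBM p550 and vs IBM p570.
jbb_vs_p550 = ratio(373405, 333779)
jbb_vs_p570 = ratio(373405, 175474)

print(f"SPECjAppServer2004 vs p570: {jappserver_vs_p570:.2f}x")
print(f"SAP SD users vs p570:       {(sap_vs_p570 - 1) * 100:.1f}% more")
print(f"SPECjbb2005 vs p550:        {(jbb_vs_p550 - 1) * 100:.1f}% more")
print(f"SPECjbb2005 vs p570:        {jbb_vs_p570:.2f}x")
```

The recomputed ratios line up with the 2.8X, 4%, 11% and 2.1X figures quoted above, allowing for rounding.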

...more to come from SUN.

Venki, please explain this poor performance on IBM's fast 4.7GHz POWER6, supposed-fast-single-thread CPUs with L3 cache, that all add up to make slow expensive systems.

I expect little or no more published info on any additional IBM p560 benchmarks, and I DO NOT expect to see IBM measure watts on their benchmarks; as you can see, it isn't good, and IBM avoids any real data that puts them in a bad light.

Required Disclosure statement
-----------------------------
SPECjAppServer2004
Sun SPARC Enterprise T5240 (16 cores, 2 chip) 3331.31 SPECjAppServer2004 JOPS@Standard.
IBM p570(4 cores, 2 chips) 1197.51 SPECjAppServer2004 JOPS@Standard.
IBM p550(4 cores, 2 chips) 1197.51 SPECjAppServer2004 JOPS@Standard.
SPEC, SPECjAppServer reg tm of Standard Performance Evaluation Corporation.
Results from www.spec.org as of 04/09/2008.

Two-tier SAP Standard Sales and Distribution (SD) standard SAP ERP 2005 application benchmark: SPARC Enterprise Model T5240, 2 processors / 16 cores / 128 threads, UltraSPARC T2 Plus, 1.4 GHz, 8 KB(D) + 16 KB(I) L1 cache per core, 4 MB L2 cache per processor, 128 GB main memory; Number of benchmark users & comp.: 4,170 SD (Sales & Distribution); Average dialog response time: 1.97 seconds; Throughput: Fully Processed Order Line items/hour: 418,000, Dialog steps/hour: 1,254,000; SAPS: 20,900; Average DB request time (dia/upd): 0.085 sec / 0.238 sec; CPU utilization of central server: 99%; Operating System central server: Solaris 10; RDBMS: Oracle 10g; SAP ECC Release: 6.0. The SAP certification number was not available at press time and can be found at the following Web page: www.sap.com/benchmark. SPARC Enterprise Model T5120 (1-way, 1 proc, 8 cores, 64 threads) 1 x 1.4 GHz UltraSPARC T2, 64GB memory, 2175 SD Benchmark users, 1.91 sec avg response time, Cert#2007059, Oracle 10g, Solaris 10; SPARC Enterprise Model T2000 | Sun Fire T2000 (1-way, 1 proc, 8 cores, 32 threads) 1 x 1.4 GHz UltraSPARC T1, 64GB memory, 1100 SD Benchmark users, 1.91 sec avg response time, Cert#2007051, Oracle 10g, Solaris 10; IBM System p 570 (4-way, 4 processors, 8 cores, 16 threads) 4 x 4.7 GHz POWER6, 64GB memory, 4010 SD Benchmark users, 1.96s avg resp time, Cert#2007038, Oracle 10g, AIX 5L Version 5.3; IBM System p 550 (4-way, 4 processors, 8 cores, 16 threads) 4 x 4.2 GHz POWER6, 64GB memory, 3104 SD Benchmark users, 1.91s avg resp time, Cert#2008002, DB2 9.5, Redhat Enterprise Linux 5; IBM System p 570 (2-way, 2 processors, 4 cores, 8 threads) 2 x 4.7 GHz POWER6, 32GB memory, 2035 SD Benchmark users, 1.99s avg resp time, Cert#2007037, Oracle 10g, AIX 5L Version 5.3;

SAP, R/3, mySAP reg TM of SAP AG in Germany and other countries. More info www.sap.com/solutions/benchmark.

SPECjbb2005 Sun SPARC Enterprise T5240 (2 chips, 16 cores) 373405 SPECjbb2005 bops, 23338 SPECjbb2005 bops/JVM. IBM p550 (4 chips, 8 cores) 333779 SPECjbb2005 bops, 83445 SPECjbb2005 bops/JVM. IBM p570 (2 chips, 4 cores) 175474 SPECjbb2005 bops, 87737 SPECjbb2005 bops/JVM. IBM p560Q (8 chips, 16 cores) 226291 SPECjbb2005 bops, 28286 SPECjbb2005 bops/JVM.
SPEC, SPECjbb reg tm of Standard Performance Evaluation Corporation Results from http://www.spec.org as of 04/09/08

Power References:

IBM p6 570 power specifications from 80% of maximum report power consumption published here, 06/07/07, posted at
ftp://ftp.software.ibm.com/common/ssi/rep_sp/n/PSB01628USEN/PSB01628USEN.PDF
IBM p5 power specifications calculated by applying 70% of the power numbers published in "Facts and Features Report", 3/10/06, posted at
http://www-03.ibm.com/servers/eserver/pseries/hardware/factsfeatures.html

Posted by BM Seer on October 10, 2008 at 02:22 PM PDT #

Relax ... relax ...

That is not FUD. Those are my personal experiences. I was an employee of a SUN partner last year. I didn't have an answer for my ISV regarding that issue (why is Xeon better than Niagara on a complex Java workload?). That's why I tried to find data by myself and realized that the Niagara core is basically an UltraSPARC II core. No wonder ...

You can check this URL also http://tweakers.net/reviews/649/8/database-test-sun-ultrasparc-t1-vs-punt-amd-opteron-pagina-8.html

Posted by Heatphlux on October 10, 2008 at 04:53 PM PDT #

Heatphlux,

You really need to separate out HW & SW issues; most open source developers were only focusing on a few cores, and the software shows it. Many people are now working to split locks and make sure that MySQL & PostGres scale to lots of cores, because EVERYONE is doing lots of cores these days. So you might be better off judging system performance by applications and benchmarks that have already eliminated software bottlenecks. I showed benchmarks above.

Don't worry MySQL and PostGres are rapidly growing up to work on bigger systems. YEAH!

Posted by BM Seer on October 10, 2008 at 05:10 PM PDT #

Heatphlux you are pointing to 2-year old SW and older T2000 not UltraSPARC T2 Plus, so don't draw too many conclusions about the present.

Basically an UltraSPARC II core is a bad conclusion too. Why not look at system performance now?

... ok way too late now, gonna run.

Posted by BM Seer on October 10, 2008 at 05:23 PM PDT #

I still remember when Apple switched from Power to Intel how they were
saying how the Intel CPUs are 2 times faster.

ahhh the Power vs Sparc battle..... you know why Power will eventually lose the battle?

because AIX SUCKS.

Posted by Z on October 10, 2008 at 06:42 PM PDT #

@BM Seer : We were using multicore Xeons as a comparison to Niagara. We didn't need to tune the application on Xeon and the performance was very good.

Sun sales people always talked about throughput instead of response time, but the customer really needs very good response time. One more thing: previously the customer used the UltraSPARC III and was looking for a replacement. The story ended up in a lot of confusion. Why is Xeon performing better than Sun's own Niagara in terms of response time? Do you want me to say this to the customer: "please wait until your load is bigger than XXX transactions per second, then you will see Niagara is better than Xeon"?

That's why, after reading some textbooks and other information, I conclude that individual core performance is still very important.

@Z : We are discussing CPUs, not OSes. For OS, I think Linux is the best UNIX (clone) at this moment.

Posted by Heatphlux on October 10, 2008 at 08:31 PM PDT #

@BM Seer : I forgot to say something ...

We tested the application, a Java application with WebLogic as the container, running on Sun's UltraSPARC T2 (8 cores per die and 8 threads per core). Sun support told the ISV that they needed to turn off 6 of the threads to get better performance (response time) than in the previous test (8 threads).

I'm not a CPU designer, but with only 4MB of L2 cache for all 8 cores, I can understand the performance result. Even my desktop PC has 2 MB of L2 cache (it is a Pentium 4).

Again, you can not compare UltraSPARC Tx (Niagara) with other general purpose CPUs like Xeon, POWER, SPARC64 and Itanium.

I found one of Niagara's cousins on the market : http://www.razamicroelectronics.com/products/XLR_732.htm . The company owner was also one of the shareholders in Afara Websystems prior to the Sun acquisition. There is one interesting statement in its product description (in the PDF file) :

" The XLR700 series scalable communication processors from RMI®
are designed to address IP networking, VoIP, wireless LAN, 3G wireless,
broadband, storage, routing and switching, security, and telecommunication application "

Posted by Heatphlux on October 10, 2008 at 09:07 PM PDT #

The question was not "why customers are switching from p570 to CMT" (this ain't happening) the question was WHERE IS YOUR PROOF TO YOUR FUD CONTENTION THAT CUSTOMERS ARE SWITCHING FROM IBM p570s TO SUN CMT SERVERS? Do you have it?
(Please don't keep twisting statements you made and the questions we ask if you are unable to answer them).

The proof points are yet to be shown by you.

I accept all your benchmark facts as true, can't deny the black & white of the results and am not even debating it with you at this point. But I have a few questions -
a. Do you think that when customers buy servers they consider the total cost of ownership? I think they do. While Sun likes to say the number of cores doesn't matter, and tries to hide this fact by calling the chip a processor, customers who have to pay for software say it does; so what would be the total cost of acquisition if the cost of the software (say Oracle App server, Oracle DB Enterprise) is also considered? Care to define the difference in Oracle license costs for each of the comparisons you made? (Ohh, I know SAP is not licensed based on processor, but some of the other applications on those servers, like backup software, enterprise management software, etc., are.)

PS. You did not surprise me... not a straight answer to a single question.

Venki

Posted by Venki on October 11, 2008 at 03:30 AM PDT #

Things are getting hotter here. I understand this blog belongs to SUN because it is hosted in the sun.com domain. Of course, BMSeer, you have the right to say whatever you want to say, whether it is true or not. But you will lose credibility (fortunately not SUN) if you can't prove things right. Here are some examples:

1. You always claim the Niagara machine is the best throughput machine in the world, with specint2006_rate as an example benchmark. For the T5120 (1 chip, 8 cores and 64 threads) I get a specint2006_rate of 83.9 with gccfs. Since the T5120 has 8 cores and 64 threads, the test is copied 63 times across the threads. So per-core performance is 83.9/8 = 10.5. I checked SUN's X4150 (2 chips, 8 cores and 8 threads), which has 133. So per-core performance for Intel is 133/8 = 16.625. How about the M4000 (4 chips, 16 cores and 32 threads)? M4000 performance is 135. So per-core performance is 135/16 = 8.4. You already know the performance of the p570 (8 chips, 16 cores, 32 threads) is 478. So Power6 per-core performance is 478/16 = 29.9. This per-core performance already explains the performance problem of the Niagara CPU. The Niagara CPU can't even beat Intel and is too far behind to be compared with Power6. In terms of total throughput alone, the T5120 can't even beat the X4150. So I can't understand what you, or probably SUN through the information on their website, are trying to claim.

2. If you think SUN has the most superior machine in the world, especially as a database server, where is SUN's TPC-C number? I know you will get defensive and say TPC-C is not relevant anymore, but please don't attack anybody when you can't even prove you are better than them.
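
The per-core arithmetic in point 1 can be checked with a few lines of Python. This is only a sketch: the scores and core counts are the ones quoted in the comment, and the dictionary layout and the helper name `per_core` are made up for illustration.

```python
# SPECint_rate2006 scores and core counts exactly as quoted in the comment.
# Format: system name -> (score, cores)
systems = {
    "T5120": (83.9, 8),
    "X4150": (133.0, 8),
    "M4000": (135.0, 16),
    "p570": (478.0, 16),
}


def per_core(score, cores):
    """Throughput score divided evenly across cores (hypothetical helper)."""
    return score / cores


per_core_scores = {name: per_core(s, c) for name, (s, c) in systems.items()}

# Print the systems from highest to lowest per-core score.
for name, value in sorted(per_core_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f} per core")
```

Dividing a rate score by core count says nothing about price or power per unit of work, which is the other half of the argument in this thread.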

I guess you are trying to distort your SUN's performance number with your "technical marketing" information. I understand you may try to defend and attack as much as you can, but you have to prove whatever you say is justifiable.

Posted by dono on October 11, 2008 at 04:11 AM PDT #

Venki writes: "WHERE IS YOUR PROOF TO YOUR FUD CONTENTION THAT CUSTOMERS ARE SWITCHING FROM IBM p570s TO SUN CMT SERVERS? Do you have it?"

I have dealt with dozens of customers who have SWITCHED because of performance issues; I can't mention names. So yes, you will deny it and point to vague statements of IBM market growth.

Venki writes: "I accept all your benchmark facts as true, can't deny the black & white of the results and am not even debating it with you at this point."

Yes that compelling performance data is part of the reason people switch.
Note these benchmarks support performance, power-performance, $/performance claims.

Venki writes: "a. Do you think that when customers buy servers they consider the total cost of ownership? cost of the software (say Oracle App server, Oracle DB Enterprise) is also considered? Care to define the difference in Oracle license costs for each of comparisons you made."

Yes, this is a common IBM marketing attack point. Yes, let's talk TCO and mention hardware costs. The IBM systems beaten in the benchmarks mentioned previously cost $500,000 for 8 cores with the same memory used in the benchmarks; CMT costs 4x less (2RU server). Now let's talk watts/performance, since we're all concerned about energy costs: why do those same IBM servers waste 4 times the watts for every unit of work? Calling it TCO and just pointing to the fact that some software is licensed per core is FUD that ignores all other cost factors. Add it up and Sun wins.

Venki, you did not surprise me... not an original question, just the same ones that IBM marketing continues to shout. You guys lose lots of credibility that way.

Dono: system performance is what customers care about. What is your next argument, performance/transistor? TPC-C has problems (see tag cloud), and many of the benchmarks run on CMT include a database as part of the workload. Most importantly, many customers are using CMT as a database server right now.

Heatplux: CMT proved long ago, in 2005, that it beats the competition on performance for Web, application, and database tiers across a huge number of different workloads & benchmarks.

Posted by BM Seer on October 12, 2008 at 09:20 AM PDT #

So, confidentiality stops you. As for your assurances that dozens of customers have moved off IBM, I'm sure IBM would say the same thing about Sun servers (those would be lies concocted by IBM, according to you). The market data shows no drop in IBM AIX sales and no corresponding increase in Sun's market share (sorry, according to Gartner customers seem to actually be moving off Sun: a drop in revenue and a drop to the #2 UNIX vendor).

There is a local saying (roughly translated) "The quality of the cake depends on how much you paid for it". Intel x64 servers are relatively cheap, mainframes are expensive but I don't see people trying to run mainframe applications on an Intel x64 server.
PS the TCO of a mainframe vis-a-vis an x64 server is lower when considering the cost per transaction, but a mainframe is very expensive when compared on a single-system to single-system basis.

So, coming back to your point about looking at the cost of power & cooling:
a. The IBM p570 draws more power and delivers more work than a CMT server
without having to rework standard applications to support huge
numbers of mutually exclusive threads.
Niche workloads such as web serving and Java app serving (where these
parameters can be set on JAS) are best served by horizontal scaling,
so IBM put out their Power blades. Your comparison should be with
these blades (but that would not show huge cost savings or meet your
"marketing" objective, so you compare a CMT with an enterprise p570.
Do you honestly think IBM would use a p570 for this workload if
consolidation/virtualization was not a requirement?)
b. You conveniently ignored the software cost (this can in some cases be
more expensive than the cost of the hardware or that of power and
cooling).
SO WHAT IS THE DIFFERENCE IN THE COST OF SOFTWARE LICENSES (ORACLE DB
ENTERPRISE EDITION - US$47,000/- per license (0.75 license per core),
ORACLE APPLICATION SERVERS (do you want these costs also))?
c. YOU ALSO HAVEN'T ANSWERED WHETHER CMT SERVERS HAVE THE CAPABILITY TO
THROTTLE THE POWER CONSUMED. (I don't think they can; is that why
you are silent on this?)
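
The licensing question in point b comes down to simple multiplication. Here is a rough sketch, assuming the US$47,000 list price and the 0.75 licenses-per-core factor quoted above apply uniformly; the function name and defaults are hypothetical, and real Oracle factors and discounts vary by processor and contract.

```python
def oracle_db_license_cost(cores, licenses_per_core=0.75, price_per_license=47_000):
    """Estimated license cost in US$ for a server with `cores` cores.

    Assumption: the per-core factor and list price quoted in the comment
    are applied to every core, with no rounding or discounting.
    """
    return cores * licenses_per_core * price_per_license


# Compare an 8-core and a 16-core server under the same assumptions.
for cores in (8, 16):
    print(f"{cores} cores: US${oracle_db_license_cost(cores):,.0f}")
```

Under these assumptions every extra core adds US$35,250 in database licensing, which is why per-core performance matters to anyone paying per core.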

Hey, about IBM losing credibility: the preliminary results for this quarter are beating market expectations, and their growth in UNIX server market share shows that their credibility is not at risk.

Sun's drop in UNIX server market share, its continued delays and cancellations in delivering promised products (US-V, Rock, etc.), and now Sun's need to OEM from Fujitsu to have a credible product line are exactly what lead to a loss of credibility.

Maybe... what IBM harps on about is what customers are really concerned about; maybe that's why customers are buying their servers. Maybe Sun should listen.

Venki

Posted by Venki on October 12, 2008 at 01:39 PM PDT #

I understand that a single Sparc64 VII (quad core) does about 40 Gigaflops. I guess that's why it's going into the Sun M3000 you're announcing tomorrow. What does a single Power6 (quad core) achieve? I searched all over the web and couldn't find any specs for Power6. Weird huh?

The T2 has been around for a while now and Intel is catching up fast. Is there a T3 on the way? If not, what's going to succeed the T2 and T2+ processors?

Posted by Kevin Hutchinson on October 12, 2008 at 04:23 PM PDT #

Hi Kevin

The peak FLOPs calculation is:
Flops = Clock speed x FP operations per cycle

For a Power6 @4.7GHz processor (IBM counts each core of the chip as 1 processor)
Flops per core = 4.7GHz x 4 = 18.8 Gigaflops
Flops per chip (2 cores) = 37.6 Gigaflops

Power6 Quadcore @ 3.6GHz would be
FLOPS per core = 3.6GHz x 4 = 14.4 GigaFLOPS
FLOPS per chip (4 cores) = 14.4 GFLOPS x 4 = 57.6 GigaFLOPS
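
The peak calculation above fits in one small function. This is a sketch only: the 4 floating-point operations per cycle is the figure used in this comment, and the function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(clock_ghz, flops_per_cycle=4, cores=1):
    """Theoretical peak GFLOPS: clock (GHz) x FP operations per cycle x cores."""
    return clock_ghz * flops_per_cycle * cores


# Dual-core Power6 at 4.7GHz, as worked through in the comment.
per_core = peak_gflops(4.7)
per_chip = peak_gflops(4.7, cores=2)
print(f"per core: {per_core:.1f} GFLOPS, per chip: {per_chip:.1f} GFLOPS")
```

The same call with `cores=4` and the quad-core part's clock gives its per-chip figure. These are theoretical peaks, not delivered performance.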

PS I also couldn't find any details on the web on Power6 Quadcore except old news reports speculating where Power6 is going next.

Hope this helps
Venki

Posted by Venki on October 12, 2008 at 10:15 PM PDT #

Z hit the nail on the head.

AIX is appalling. Edit inittab and the whole thing tumbles down. On Solaris you can even delete it!

Power is getting to be exactly what it says on the tin - and uses it by the nuclear powerstation load. They are ferocious processors but the overall design and the crap IBM software stack negates any performance benefit they might get.

Some of the hot air blown out the back of the Power6 P570's should be turned into useful work...

Give me two Sun boxes to run active:active vs two AIX jobbies, I might not have the same power, but it'll be reliable, cheaper to run, easier to use (ever used DLPAR - I've seen it bring down other lpars!), and if I do run out of steam I can buy another two Sun servers and still have change left over from buying big blue.

Posted by Mr_V on October 13, 2008 at 01:12 AM PDT #

Let's look at the state of the processor industry. It is fair to use cores per chip as a metric to evaluate the sophistication of a particular processor vendor. Today on x86, AMD is at 4 cores per chip on Barcelona, Intel is at 6 cores per chip on Dunnington. On EPIC, Intel is at 2 cores per chip on Itanium. In RISC, SPARC is at 4 cores per chip on SPARC64 and 8 cores per chip on Niagara. POWER is at 2 cores per chip. Next year, Intel goes to 8 cores per chip on x86 with Nehalem, and 4 cores per chip on EPIC with Tukwila.

Like it or not, more cores per chip, not more MHz per core, is the trend in the industry. The fact is, IBM could have done more cores at lower frequency if it wanted to. Heck, it did on the Z6 mainframe version of POWER6. I guess IBM cannot get the yield on the quad-core POWER processors, so it had to do the QCM thing again with POWER6.

As for anyone saying the T5440 is not comparable to the POWER 570 or POWER 560 based on enterprise features, that is just bunk. That was a viable complaint against the T1000, but not the larger CMT systems.

From a price and market-segment standpoint, the POWER 560 and T5440 seem to have very similar target markets.

Posted by Mark on October 13, 2008 at 07:07 AM PDT #

Venki writes: "Hope this helps"

Actually it doesn't help :) Peaks are worthless. IBM really needs to show delivered performance, delivered watts, ...

Venki writes: "The peak FLOPs calculation is: Flops = Clock speed x FP"
"For a Power6 @4.7GHz processor (IBM counts each core of the chip as 1 processor) Flops per core = 4.7GHz x 4 = 18.8 Gigaflops, Flops per chip (2 cores) = 37.6 Gigaflops"

OK, let's look at delivered perf. Here is a comparison:
The Sun SPARC Enterprise M9000 server running 2.52GHz SPARC64 VII delivered 2.023 TFLOPS on the Linpack HPC benchmark, which is TWO TIMES the 1028.0 GFLOPS of the IBM Power 595. The Power 595 is the largest system that IBM makes among its 5GHz Power6-based servers.

Also notice how arbitrary core, processor, and thread are as labels. IBM just decided to call a core a processor; early engineering docs actually called a core a thread. Very arbitrary. That is why we look at system performance.

Disclosure Statement:

Linpack HPC, results from http://www.netlib.org/benchmark/index.html as of 07/01/08. Sun SPARC Enterprise M9000 (SPARC64 VII @2.52, 64 chips, 256 cores), 2.023 TFLOPS. IBM Power 595 (POWER6 5.0GHz, 32 chips, 64 cores) 1028.0 GFLOPS. HP Superdome (Itanium 2 1.6GHz/24MB, 64 chips, 128 cores) 745.5 GFLOPS.

Linpack HPC, results from http://www.netlib.org/benchmark/index.html as of 04/13/07. Sun SPARC Enterprise M9000 (SPARC64 VI @2.4, 64 chips, 128 cores), 1.032 TFLOPS. IBM p5 595 (POWER5 1.9GHz, 32 chips, 64 cores) 418.0 GFLOPS. HP Superdome (Itanium 2 1.6GHz/24MB, 64 chips, 128 cores) 745.5 GFLOPS.
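
The 07/01/08 Linpack figures in the disclosure can be compared both per system and per core. This is a sketch using only the numbers quoted there; the dictionary layout and variable names are illustrative.

```python
# Linpack HPC results as quoted in the 07/01/08 disclosure statement.
# Format: system name -> (GFLOPS delivered, cores)
linpack = {
    "Sun M9000 (SPARC64 VII)": (2023.0, 256),
    "IBM Power 595 (POWER6)": (1028.0, 64),
    "HP Superdome (Itanium 2)": (745.5, 128),
}

# System-level ratio behind the "two times" claim.
system_ratio = linpack["Sun M9000 (SPARC64 VII)"][0] / linpack["IBM Power 595 (POWER6)"][0]
print(f"M9000 vs Power 595, whole system: {system_ratio:.2f}x")

# Delivered GFLOPS per core for each system.
gflops_per_core = {name: gflops / cores for name, (gflops, cores) in linpack.items()}
for name, value in gflops_per_core.items():
    print(f"{name}: {value:.2f} GFLOPS per core")
```

Which of the two views matters depends on whether you are buying by the system or paying by the core, which is the disagreement running through this thread.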

See, here is where IBM is arbitrary: IBM calls each core a processor, and IBM used to call what it now calls cores "a thread" (early docs on the web). So we need to look at the system, not arbitrary labels inside a chip.

Posted by BM Seer on October 13, 2008 at 08:07 AM PDT #

...cores per chip as a metric to evaluate the sophistication of a particular processor vendor..

Simply putting more simple cores on a chip does not prove the sophistication of the vendor; it is technically more difficult to build a speed-racer chip than to populate more cores. When does core population become sophisticated? When the vendor is able to have cores share functional units to get higher utilization of clock cycles, which Sun has failed to do in the UltraSPARC T2 & T2+ processors.

Consider that SPARC64, as it evolved from dual core to quad core, actually dropped performance per core while increasing clock speed. Check the per core performance on SPARC64 VI & VII, say in the SAP 2D2T benchmarks, and you will notice a dramatic drop in the per core performance. Do you consider this a good design?

IBM's quad core delivered less performance than the dual-core versions but at a lower clock speed.

While the increase in cores per chip (without equivalent performance per core) improves the "per socket" performance, it also increases your software licensing costs.

Then if the CMT servers are a viable choice for all workloads with the sophisticated RAS features, why does Sun even bother to resell the Fujitsu boxes? It should have just gone to market with the CMT servers alone.
The answer lies in the fact that for certain workloads the CMT's dependence on a profusion of light threads does not work, for example HPC workloads, data warehousing, etc. (PS I have not seen any benchmarks from Sun for those workloads).

I agree the IBM entry servers & blades are the right models to compare with the CMT servers. BMSeer insists on comparing CMT with p570 and then does a cost comparison; that's like IBM marketing comparing a p550 with an M9000 and claiming that the p550 has better price performance than the M9000.

Venki

Posted by Venki on October 13, 2008 at 08:07 AM PDT #

bmseer
- Please show how scientifically Sun came to call a chip a processor?
- It is no less arbitrary than IBM calling a core a processor.
- IBM has maintained calling a core a processor since 2001 when the first commercial multicore chip (POWER 4) came out.

Sorry, bm, I was asking Kevin if it helps him to understand; for you nothing helps as you just won't listen.

Still no answers on
a. Licensing,
b. On energy throttling, or
c. Proof of customers switching en masse (from IBM to Sun).

Venki

Posted by Venki on October 13, 2008 at 08:22 AM PDT #

Venki: let's cover one thing at a time... energy throttling.

OK, so let's throttle the p570 to 770 watts and see the performance... at full power the IBM has 4 times worse power-performance. So how much slower would power throttling make it?

One Sun SPARC Enterprise T5240 server (two 1.4 GHz UltraSPARC T2 Plus chips) in the application tier demonstrated 2.8X better performance over the IBM Power6 570 result of 1197.51 JOPS@Standard which used two 4.7 Ghz IBM POWER6 chips. The Sun SPARC Enterprise T5240 server has 3.9X better power-performance than the IBM Power6 570.

SPECjAppServer2004
Sun SPARC Enterprise T5240 (16 cores, 2 chip) 3331.31 SPECjAppServer2004 JOPS@Standard.
IBM p570 (4 cores, 2 chips) 1197.51 SPECjAppServer2004 JOPS@Standard.
IBM p550 (4 cores, 2 chips) 1197.51 SPECjAppServer2004 JOPS@Standard.
SPEC, SPECjAppServer reg tm of Standard Performance Evaluation Corporation.
Results from www.spec.org as of 04/09/2008.
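
The ratios claimed for these results can be reproduced from the quoted scores. This is a sketch: the JOPS values and the 3.9X power-performance figure come from this comment, while the implied power ratio is derived from them on the assumption that power-performance means JOPS per watt. No wattage is stated for the T5240 here, so none is used.

```python
# SPECjAppServer2004 JOPS@Standard as quoted above.
t5240_jops, t5240_cores = 3331.31, 16
p570_jops, p570_cores = 1197.51, 4

# Whole-system throughput ratio (the "2.8X" claim).
throughput_ratio = t5240_jops / p570_jops
print(f"T5240 vs p570 throughput: {throughput_ratio:.2f}x")

# Per-core view of the same two results.
t5240_per_core = t5240_jops / t5240_cores
p570_per_core = p570_jops / p570_cores
print(f"JOPS per core: T5240 {t5240_per_core:.1f}, p570 {p570_per_core:.1f}")

# Assumption: power-performance = JOPS per watt, so the claimed 3.9X
# advantage implies this ratio of T5240 watts to p570 watts.
claimed_power_perf_advantage = 3.9
implied_power_ratio = throughput_ratio / claimed_power_perf_advantage
print(f"Implied T5240/p570 power draw: {implied_power_ratio:.2f}")
```

The system-level and per-core views point in opposite directions, which is why both sides of this thread can cite the same benchmark.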

Power References:

IBM p6 570 power specifications calculated as 80% of the maximum reported power consumption, published 06/07/07, posted at
ftp://ftp.software.ibm.com/common/ssi/rep_sp/n/PSB01628USEN/PSB01628USEN.PDF
IBM p5 power specifications calculated by applying 70% of the power numbers published in the "Facts and Features Report", 3/10/06, posted at
http://www-03.ibm.com/servers/eserver/pseries/hardware/factsfeatures.html

Posted by BM Seer on October 13, 2008 at 08:41 AM PDT #

1. Again, for such workloads IBM would propose their blades - please compare power consumption with such, not a p570.
2. Simple YES or NO answer - IF I WANTED TO THROTTLE POWER ON CMT SERVER CAN I DO IT?

Venki

Posted by Venki on October 13, 2008 at 09:24 AM PDT #

LOL, great read ! BM Seer you have surpassed yourself this time ! :)

As for this

Mr V wrote

"AIX is appalling. Edit inittab and the whole thing tumbles down. On Solaris you can even delete it!"

Don't be a complete tool. Learn how to use AIX properly before you start messing with it. As a long-term Solaris and AIX sysadmin, I can say without a shadow of a doubt that AIX is in a different league to Solaris. Far more stable, manageable, graceful....etc etc etc etc. As for DLPAR, I suggest that, again, it is most probably YOU at fault. It works just fine for me, and always has.

Posted by Alex on October 13, 2008 at 09:37 AM PDT #

You IBM guys are slick & tricky (changing the topic... nah let's look at slower blades) :) OK pick your workload: Web, application serving, or database. Don't bait and switch to slower blades when a comparison gets uncomfortable.

Venki, help me understand your power throttling, so I can answer the question. Suppose I had an IBM 4-chip (dual-core), 8-core 4.7GHz power6 that draws 1080watts as measured (max 1400W).

Venki, please tell me the performance level if you throttled that 4-chip dual-core p570 4.7GHz with the fast memory to 500watts?

I need to understand how useful that feature is to answer the question. Published data is preferred, but likely not available, I fear.

Posted by BM Seer on October 13, 2008 at 09:54 AM PDT #

Mistake: I severely underestimated the watts:
2160watts for an IBM 8-core 4.7GHz p570 (max 2800watts)
1080watts for an IBM 4-core 4.7GHz p570 (Max 1400watts)

You IBM guys are slick & tricky (changing the topic... nah let's look at slower blades) :) OK pick your workload: Web, application serving, or database. Don't bait and switch to slower blades when a comparison gets uncomfortable.

CORRECTED: Venki, help me understand your power throttling, so I can answer the question. Suppose I had an IBM 4-chip (dual-core), 8-core 4.7GHz power6 that draws 2160watts as measured (max 2800W).

Venki, please tell me the performance level if you throttled that 4-chip dual-core (8-core) p570 4.7GHz with the fast memory to 500watts? I guess you only get 1/4 the performance. Yes or no?

I need to understand how useful that feature is to answer the question. Published data is preferred, but likely not available, I fear.
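
The "1/4 the performance" guess matches a purely linear model in which throughput scales with the watts allowed. The sketch below encodes only that assumption. The function name is hypothetical and nothing here comes from published IBM throttling data; real frequency and voltage scaling is generally not linear.

```python
def throttled_fraction(cap_watts, measured_watts):
    """Fraction of full performance left under a power cap.

    Assumption: performance scales linearly with power draw, and a cap
    above the measured draw changes nothing.
    """
    return min(1.0, cap_watts / measured_watts)


# 8-core 4.7GHz p570 measured at 2160 W, capped at 500 W.
fraction = throttled_fraction(500, 2160)
print(f"Estimated performance under a 500 W cap: {fraction:.0%} of full")
```

Under this model the capped p570 lands a little under one quarter of its full throughput, which is where the 1/4 guess comes from.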

Posted by BM Seer on October 13, 2008 at 10:07 AM PDT #

Venki, some numbers on CMT growth only a small 60% growth rate... :)

"Servers based on the T2 processor family, which debuted last October and is also known by the code-name Niagara 2, have become a $1 billion business that currently is growing at an annual rate of about 60%, according to Jonathan Schwartz, Sun's CEO and president. That makes it "arguably the fastest-growing business that we have ever built at Sun," he said."
http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9116998&intsrc=news_ts_head

So that growth rate is pretty good for $1B business.

The customers I talk to who have replaced power6 with CMT are contributing to these numbers.

Posted by BM Seer on October 14, 2008 at 03:32 PM PDT #

BM Seer, do you really expect us to believe that your "customers", who have spent "millions" investing in Power 6 in the past 12 months, are now throwing away that investment and moving to Niagara based servers?

I put it to you sir, that you are talking nonsense !

Posted by Alex on October 15, 2008 at 12:24 AM PDT #

From Sun's press release: "Sun reported 83 percent year-over-year billings growth in its Solaris-based Chip Multi-Threading systems as customers continued to demand the nearly 10,000 applications available for Solaris 10, while enjoying integrated virtualization and exceptional power efficiency."

If you look at the IBM numbers, below are some scary facts. Notice that IBM doesn't have clear reporting, and IBM changes its "accounting" for p-series and other servers to cloud the numbers. But if you take the time to investigate...

Power Series +7%
iSeries down -82%

IBM has merged the old p-series and i-series into the new p-series. So, this way moving forward, the new p-series will show improvements over last year (due to the addition of i-series business).

It is safe to say that IBM's p-series was likely flat to lower vs. last year. And x-series (x86) business was way down (-15%), including a drop in their blades business.

Alex gives sweeping generalizations; I show the numbers. I also talk to customers that are switching. In the end IBM can probably find customers that are switching the other way in this big world.

But let's get back to the benchmarks: very expensive IBM hardware is getting beaten in performance, $/performance, and certainly in power/performance.

Sun's 4RU CMT Sun Enterprise T5440 beats performance of IBM's $1M 16-core 16RU system, etc... see the next postings.

Posted by BM Seer on November 04, 2008 at 08:46 AM PST #
