Tuesday Sep 18, 2007
Intel non-default BIOS setting changes results by 25%? Sure, turning off
prefetch is a technique, but if you don't know a priori whether you
should, then should you use it to judge performance?
Always interesting when you have more information. I guess our
friends at AMD wanted everyone to see what our friends at Intel
were doing so they submitted two SPEC results for them.
Case in point: on Clovertown there are two AMD-submitted results on the same hardware that differ by
25%.
Point: Normal mode = prefetch on
Gives 163,080 SPECjbb2005 bops
www.spec.org/jbb2005/results/res2007q2/jbb2005-20070326-00276.txt
Counter-point: Disable HW prefetcher in BIOS for benchmark improvement
Gives 203,754 SPECjbb2005 bops
www.spec.org/jbb2005/results/res2007q2/jbb2005-20070326-00275.txt
...both on the same hardware:
same: 2-socket SuperMicro X7DBE (Intel 2.66GHz Xeon quad-core X5355), 16 GB
Disclosure statement
SPECjbb2005 SuperMicro X7DBE (2 chips, 8 cores, 2.66 GHz) SPECjbb2005 bops=163080, SPECjbb2005 bops/JVM=81540 submitted by AMD;
SuperMicro X7DBE (2 chips, 8 cores, 2.66 GHz) SPECjbb2005 bops=203754, SPECjbb2005 bops/JVM=101877 submitted by AMD; SPEC, SPECjbb are registered trademarks of Standard Performance Evaluation Corporation. Results 3/7/07 on www.spec.org.
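The 25% figure can be checked directly from the two scores quoted above. This is a minimal sketch; the function and variable names are just illustrative, only the two bops values come from the SPEC submissions.

```python
# Scores copied from the two SPECjbb2005 submissions quoted in this post.
PREFETCH_ON_BOPS = 163_080   # default BIOS, hardware prefetcher enabled
PREFETCH_OFF_BOPS = 203_754  # hardware prefetcher disabled in BIOS


def relative_gain(baseline: float, tuned: float) -> float:
    """Return how much higher `tuned` is than `baseline`, in percent."""
    return (tuned / baseline - 1.0) * 100.0


gain = relative_gain(PREFETCH_ON_BOPS, PREFETCH_OFF_BOPS)
print(f"Prefetch off vs. prefetch on: {gain:.1f}% more bops")
```

Same hardware, same benchmark, one BIOS switch: the gap works out to just under 25%.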
Friday Sep 14, 2007
I know AMD & Intel love to focus on the power used by CPUs
in their high-stakes battle to gain server chip dominance. Both
started talking TDP (Thermal Design Power) and getting people to judge
systems based on TDP.
...wait a minute, buying a system based on the power of a CPU is
a bit of misdirection unless the CPU is most of the power. This isn't
the case any more. So CPU TDP should be ignored, unless you are
designing your own product and are just buying CPUs.
First of all, system power is what datacenters care about, so Intel and
AMD should be talking about system power in realistic memory configurations.
Memory draws lots of power these days.
Second, TDP was created so that server manufacturers
would know how much power the chip consumes in worst-case, maximum-power conditions
so they could design power supplies, cooling, etc. That just isn't
useful to datacenter managers.
AMD's marketing only slightly improved the situation by telling
customers of SYSTEMS to look at the processor's ACP (Average CPU
Power).
Two problems:
- Focusing on CPU power avoids talking about system power, but average-case system power is
what one needs to know to estimate electrical bills.
- Focusing on average CPU power ignores the server maximums that datacenters need
to design cooling around (see: http://blogs.sun.com/bmseer/entry/watts_a_matter_with_their).
ACP of a CPU ...hmmm, do you know what a pain it is to measure just a CPU? AMD,
in their whitepaper, has to isolate the power consumed by the processor
from that consumed by the motherboard -- this requires motherboard modifications and specially instrumented server platforms!
Way too much work to get a marketing message. All we want is server power on a variety of memory configurations, with full-speed CPUs, actually running at good datacenter utilizations!
WARNING: Everyone loves to talk about the performance of high-GHz CPUs and the low power of low-GHz CPUs, so watch for confusing marketing and, much worse, "bait & switch." Also, watts per core is useless marketing; it is the watt/perf for a system that counts. Any vendor trying to sell power efficiency on high-performance
systems should report watt/performance along with their world-record performance on that system.
Thursday Sep 13, 2007
The Register really needs to start asking tougher questions about power.
Case in point: a new article, "Researchers: AMD less power-hungry than Intel"
by Austin Modine.
This article just takes the results and ignores the really tough
questions facing datacenters and power. I'll help here...
- What is the power draw of large memory configurations: 16GB, 32GB, 64GB?
The 2GB to 8GB used in this report are tiny for those making decisions in many datacenters.
- If you have racks and racks of idle servers (even servers at
30% utilisation) you have major problems and should be counting on
new servers to save power!!!
- It is Watt/Performance (like $/perf), not perf/watt, as this highlights
efficiency.
- CPU utilisation needs to be reported; if you are measuring at
less than 60% you are wasting 2 TIMES to 5 TIMES more power per unit of work!
- Everyone needs to realize the variability of power measurements and not report differences as ##.#%. What? One simply doesn't have one-tenth of
1% reproducibility in power measurements of servers.
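To see why utilisation has to be reported alongside power, here is a rough sketch of watts per unit of work at different load levels. It assumes a simple linear power model and an idle draw of 70% of peak; both are illustrative assumptions of mine, not figures from the report or from any measurement, so the exact multipliers will differ on real servers.

```python
def watts_per_unit_work(utilisation: float, idle_fraction: float = 0.7) -> float:
    """Power per unit of work, normalised so peak power = 1 and peak work = 1.

    Assumed (hypothetical) model: power rises linearly from
    `idle_fraction` of peak at 0% load to full peak at 100% load,
    and work done is proportional to utilisation.
    """
    if not 0.0 < utilisation <= 1.0:
        raise ValueError("utilisation must be in (0, 1]")
    power = idle_fraction + (1.0 - idle_fraction) * utilisation
    work = utilisation
    return power / work


def waste_factor(utilisation: float, idle_fraction: float = 0.7) -> float:
    """How many times more power per unit of work than at 100% utilisation."""
    return watts_per_unit_work(utilisation, idle_fraction) / watts_per_unit_work(
        1.0, idle_fraction
    )


for load in (1.0, 0.6, 0.3, 0.15):
    print(f"{load:>4.0%} utilised: {waste_factor(load):.2f}x power per unit of work")
```

Under these assumptions the penalty grows quickly as utilisation drops, which is the point: a watt figure without the utilisation it was measured at tells you very little.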
Focusing on small memory configurations puts the focus on CPU power, which is fine for chip vendors, but it is frankly not what the customers I talk to care about.
Customers care about the full system ("car,
not piston, performance").
The results show that under certain configurations and load levels, the Intel server was 2% to 12% more power efficient. But in a majority of cases, the AMD server was 9% to 23% more efficient.
These are tiny differences. Not enough testing was done to show whether these
results are consistent (power measurement has big run-to-run variations due to
the complexity of most systems).
AMD server was 30% to 53% more power efficient. If accurate,
it's a noteworthy figure, considering many servers spend the most of their time waiting for work.
WHAT?!?!? Datacenters idle? This is insane, except maybe in extremely poorly run datacenters.
Actually, if this is the case, the people running those datacenters should find other jobs and
not run datacenters.
On the whole, NN&A's tests showed that Intel's power efficiency decreases as memory size increases. Conversely, AMD's power efficiency increases as the memory is upped.
Intel, of course, disputes the results.
"The report doesn't measure our latest Xeons, or quad cores," said Intel rep Nick Knupffer in an email. "We have 2 GHz quad cores in the market at 50 watts, 12.5 per core!"
I'll agree with Intel's criticism about the latest CPUs and QCs, but even in
the Intel statement they quickly talk about the 2GHz QC and not the full-GHz
ones. Everyone loves to talk about the performance of high-GHz CPUs and the low power of low-GHz CPUs, so watch for confusing marketing and, much worse, "bait & switch." Also, watts per core is
useless marketing; it is the watt/perf for a system that counts.
NN&A's white paper http://www.worlds-fastest.com/d.pdf/wfw991.pdf
Also take a look at: http://blogs.sun.com/bmseer/entry/saving_the_planet_one_datacenter
If you want to see big power savings (not the small percentages talked about above!) take a look at:
http://blogs.sun.com/ValdisFilks/entry/another_win_for_ecological_computing
...it is far too late, enough writing for today.
Tuesday Aug 07, 2007
Sun UltraSPARC T2 is an amazing chip and very fast! The UltraSPARC T2 features several industry firsts:
- Eight cores and 64 threads
- Integrated 10 GbE networking and I/O
- Dedicated cryptographic and floating-point units per core
- 10 cryptographic functions supported in hardware
- Open-source design: www.opensparc.net
Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz
beat all single-chip scores, showing 78.3 est. SPECint_rate2006.
How do these preliminary runs (we must use the term "estimated" by
SPEC rules) compare to SPECint_rate2006 results?
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
IBM POWER6 4.7GHz processor published result by 29%.
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
estimated scores of the AMD Barcelona by 23%.
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
published scores of the 2.66GHz Intel X5355 (Clovertown) by 48%.
Based upon preliminary runs, the Sun UltraSPARC T2 processor at 1.4 GHz
beat all single-chip scores, showing 62.3 est. SPECfp_rate2006.
How do these preliminary runs (we must use the term "estimated" by
SPEC rules) compare to SPECfp_rate2006 results?
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best
published single-chip IBM POWER6 4.7GHz processor result by 7%.
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip estimated scores of the AMD Barcelona by 11%.
- These Sun UltraSPARC T2 1.4GHz processor scores beat the best single-chip
published scores of the 2.66GHz Intel X5355 (Clovertown) by 66%.
Performance per core doesn't matter and GHz doesn't matter; what matters
is the number of cores, efficiency, and the design of the chip! Competitors
are saying that UltraSPARC T2 is proprietary... this makes no sense.
Both UltraSPARC T1 and UltraSPARC T2 are open-source designs (www.opensparc.net). You do not find the
latest designs from Intel, AMD, or IBM as open source.
Disclosure Statement:
All Sun UltraSPARC T2 SPEC CPU metrics quoted are from full “reportable”
runs, but are nevertheless designated as “estimates” because they use
preproduction systems. SPEC, SPECint, SPECfp are registered trademarks of
Standard Performance Evaluation Corporation. Sun UltraSPARC T2
1.4GHz (1 chip, 8 cores, 64 threads) 78.3 est. SPECint_rate2006,
62.3 est. SPECfp_rate2006.
Competitive results from www.spec.org as of August 6, 2007.
IBM POWER6 4.7GHz (1 chip, 2 cores, 4 threads) 60.9 SPECint_rate2006,
58.0 SPECfp_rate2006.
AMD Barcelona 2.6 GHz (1 chip, 4 cores, 4 threads) 63.9 est. SPECint_rate2006,
56.3 est. SPECfp_rate2006. Barcelona estimates based upon "The Register"
article stating 2.6GHz quad is 21% and 50% faster than Intel 2.66 system.
Fujitsu RX300 Intel X5355 2.66 GHz (1 chip, 4 cores, 4 threads) 52.8 SPECint_rate2006, 37.5 SPECfp_rate2006.
Reminder: The Niagara 2 score was obtained from a full "reportable" SPEC
run, but is designated as an "estimate" because a pre-production system
was used.
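For anyone who wants to check the SPECint_rate2006 percentages in this post, they follow directly from the scores in the disclosure statement. A minimal sketch (the dictionary and function names are just illustrative; the scores are the ones quoted above):

```python
# SPECint_rate2006 single-chip scores from the disclosure statement.
T2_INT_RATE = 78.3  # Sun UltraSPARC T2 1.4GHz, estimated
COMPETITOR_INT_RATE = {
    "IBM POWER6 4.7GHz": 60.9,
    "AMD Barcelona 2.6GHz (est.)": 63.9,
    "Intel X5355 2.66GHz": 52.8,
}


def percent_faster(score: float, baseline: float) -> int:
    """Return how much higher `score` is than `baseline`, in whole percent."""
    return round((score / baseline - 1.0) * 100.0)


leads = {
    name: percent_faster(T2_INT_RATE, score)
    for name, score in COMPETITOR_INT_RATE.items()
}
for name, lead in leads.items():
    print(f"UltraSPARC T2 vs {name}: {lead}% higher")

# The Barcelona integer estimate itself: "21% faster" than the Intel 2.66GHz score.
barcelona_int_est = round(COMPETITOR_INT_RATE["Intel X5355 2.66GHz"] * 1.21, 1)
print(f"Barcelona est. SPECint_rate2006: {barcelona_int_est}")
```

The same ratio calculation applies to the SPECfp_rate2006 scores.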
...more information on the UltraSPARC T2 later today.
Friday Dec 08, 2006
Here is a processor wattage chart, but notice that Intel has the memory controller off chip, so you need to add 30-35 watts to these figures
(Opteron includes this on chip):
http://www.intel.com/products/processor_number/chart/xeon.htm
For more details on the power budget breakdown, I got pointed to these
two pages for an AMD comparison (they look like part of a bigger presentation, with no confidentiality statement):
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/opt_vs_wc_8_dimms.pps
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/4_opt_vs_4_pax.pps