Power outages were reported in both Los Angeles and Buffalo this week. While the Los Angeles outage was simply a result of a faulty sensor in a power transfer facility, the problems in Buffalo appear to be caused by a new $2.3 million Dell computer cluster at the University of Buffalo. According to Supercomputing Online, the university has only been able to run their Dell cluster at 60% of capacity because of "an initial underestimation of the power needed to run the system". I've been to Buffalo, albeit only in the winter, and the university is a good Sun customer, so I thought I would do some research to see if I could help them.

Someone at Buffalo clearly cares about power, because their web site has some nice pictures of the power cables being laid out in the computer room. According to Buffalo's aptly named hotpages, their cluster has a total of 800 Dell SC1425 compute nodes, each with two 3.2 GHz Xeon processors. For whatever reason, Intel makes it rather difficult to find information on CPU power usage on their web site, but Dell has this nice Power Calculator on their site which shows the SC1425 uses 437 Watts, which works out to about 350 kW for the whole lot of 800. That doesn't count the Myrinet, Fibre Channel, or Gigabit Ethernet switches used in the cluster, or the power needed to cool the system, but let's ignore that for the time being. If Buffalo only has enough power to run the cluster at 60% of capacity, let's assume they have 210 kW available for compute nodes.
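For anyone who wants to check the arithmetic, here is a rough sketch in Python. The names are my own, the 437 Watt figure is the Dell Power Calculator number quoted above, and, as in the text, switches and cooling are left out. The 60% figure is applied to power directly, which is an assumption of this estimate rather than something Buffalo has published.

```python
# Back-of-the-envelope power estimate for the Dell compute nodes.
# All figures are taken from the post; the names are illustrative only.
NODES = 800
WATTS_PER_NODE = 437      # Dell Power Calculator figure for the SC1425
USABLE_FRACTION = 0.60    # assumption: 60% of capacity means 60% of power


def cluster_power_kw(nodes, watts_per_node):
    """Total draw of the compute nodes in kilowatts (no switches, no cooling)."""
    return nodes * watts_per_node / 1000.0


total_kw = cluster_power_kw(NODES, WATTS_PER_NODE)
available_kw = total_kw * USABLE_FRACTION

print(f"Full cluster:      {total_kw:.1f} kW")
print(f"Assumed available: {available_kw:.1f} kW")
```

The unrounded results land just under the 350 and 210 figures used in the rest of the post.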

Usually, when someone buys a 1600 CPU cluster, one of their goals is to get on the Top500 list. To qualify for the Top500 list, you need to run a benchmark called Linpack. There are two figures reported in the Top500 list. The first is a simple calculation called Rpeak, which is the maximum theoretical number of floating point operations per second. For Dell's SC1425 server, the figure is calculated as 3.2 GHz * 2 CPUs/server * 2 floating point units/CPU = 12.8 GFlops. For 800 servers you get an Rpeak of 10.24 TFlops. Now let's look at Sun's V20z with dual-core AMD Opteron CPUs. A single Sun Fire V20z has an Rpeak of 2.2 GHz * 2 CPUs * 2 cores/CPU * 2 floating point units/core = 17.6 GFlops. The same Rpeak value as the 800 node Dell cluster could thus be obtained by 582 V20z servers.
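The Rpeak arithmetic is simple enough to sketch in Python. The function and variable names below are my own, and treating the Xeon as one core per CPU is my reading of the Dell configuration rather than something spelled out on the hotpages.

```python
import math


def rpeak_gflops(ghz, cpus, cores_per_cpu, fp_units_per_core):
    """Theoretical peak GFlops for one server: clock * CPUs * cores * FP units."""
    return ghz * cpus * cores_per_cpu * fp_units_per_core


# Dell SC1425: two single-core 3.2 GHz Xeons (single core is an assumption here)
dell_node_gflops = rpeak_gflops(3.2, cpus=2, cores_per_cpu=1, fp_units_per_core=2)

# Sun Fire V20z: two dual-core 2.2 GHz Opterons
sun_node_gflops = rpeak_gflops(2.2, cpus=2, cores_per_cpu=2, fp_units_per_core=2)

dell_cluster_tflops = 800 * dell_node_gflops / 1000.0

# Smallest whole number of V20z servers that matches the Dell cluster's Rpeak
sun_nodes_needed = math.ceil(800 * dell_node_gflops / sun_node_gflops)

print(f"Dell node:  {dell_node_gflops:.1f} GFlops")
print(f"Sun node:   {sun_node_gflops:.1f} GFlops")
print(f"Dell Rpeak: {dell_cluster_tflops:.2f} TFlops")
print(f"V20z nodes needed: {sun_nodes_needed}")
```

The node count is rounded up because you can't buy a fraction of a server.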

Of course Rpeak is only a theoretical maximum, so the Top500 rankings are actually based on a second number, Rmax, which is the measured throughput using the Linpack benchmark. As can be seen by browsing the Top500 list, Rmax varies widely, with most systems achieving an Rmax of between 50% and 70% of Rpeak. Since Linpack codes have been tuned for many years, the Rmax efficiency is often higher than actual user codes would achieve. Many customer codes, when first run at our High Performance Computing Center in Hillsboro, Oregon, start out achieving 20% or less of Rpeak. Thus, while Top500 is an interesting list, Sun recommends that customers with specific processing requirements either benchmark their actual code or at least use multiple industry standard benchmarks, like those from the Standard Performance Evaluation Corporation, commonly referred to as SPEC. In published results, AMD's dual-core x64 processors typically show 2x the floating point performance of comparable Intel x64 CPUs while using less power. No wonder Intel has not responded to AMD's recent challenge to a duel. In addition, unlike Intel, AMD makes it very simple to find the max power usage of any of their CPUs.

So back to Buffalo's problem. How are they going to run their Top500 benchmark if they can't turn on all their systems? Assuming they aren't able to afford a new power transformer (I doubt that was figured in the $2.3M purchase price of the Dell cluster), what can they do? They could wait for winter and save power by turning off some of their air conditioners. That might work in December, but the Top500 benchmark is due October 1. Even Buffalo doesn't get that cold in October. Would they have enough power if they replaced their 800 Dell servers with 582 Sun Fire V20z servers? Each of the V20z's dual-core CPUs uses 95 watts max. Add in power for the server's memory, disk, and other components and a more typical system power consumption is about 325 watts. So 582 nodes * 325 watts = 190 kW, comfortably under the 210 kW calculated above as being available! Fewer Myrinet, Fibre Channel, and Gigabit Ethernet switches would also be needed, making further power available. If Buffalo would like us to size the system based on actual application performance, we would be happy to benchmark their code, and I expect we would be able to get the same performance as the 800 Dell systems with fewer than 582 nodes, saving even more power.
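Here is the same budget check as a small Python sketch. The names are illustrative, the 325 watt figure is the typical system draw estimated above rather than a measured number, and the 210 figure is the assumed power available for compute nodes.

```python
# Does a 582-node V20z cluster fit inside the power Buffalo appears to have?
SUN_NODES = 582
SUN_WATTS_PER_NODE = 325   # estimated typical draw per V20z, from the post
AVAILABLE_KW = 210         # assumed budget: 60% of the Dell cluster's draw

sun_kw = SUN_NODES * SUN_WATTS_PER_NODE / 1000.0
headroom_kw = AVAILABLE_KW - sun_kw
fits = sun_kw <= AVAILABLE_KW

print(f"V20z cluster: {sun_kw:.2f} kW")
print(f"Headroom:     {headroom_kw:.2f} kW")
print("Fits in budget" if fits else "Over budget")
```

The exact product is a little below the rounded figure quoted in the text, so the headroom is, if anything, slightly larger.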

Now let's look at what the total cost of ownership (TCO) difference would be had Buffalo gone with 582 Sun Fire V20z compute nodes instead of the 800 Dell boxes. Buffalo didn't break down the $2.3M price, but let's just call the acquisition costs for the servers equal. Let's look at the other components needed by the 218 extra Dell servers:

7 extra racks @ average $5K = $35K

218 extra Myrinet cards + switch ports @ average $1K = $218K

For simplicity, let's ignore the other extra components except for the power cost. The Dell system would require 24 hours * 365 days * 437 Watts * 800 nodes = 3062 MWh/year.

The Sun system would require 24 hours * 365 days * 325 Watts * 582 nodes = 1657 MWh/year.

I'm not sure what University of Buffalo pays for power today, but I found this 2002 article explaining how the university was going to save $70,000 a year by self-generating 2000 MWh/year, which works out to $35 per MWh. I expect electricity prices have gone up since the 2002 date of the article, but using those figures the three-year savings of 4215 MWh (3 years * (3062 - 1657) MWh/year) would be at least an additional $147,525.
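The energy and cost figures can be reproduced with a short Python sketch. The names are my own, and the price per megawatt-hour is derived from the 2002 article's numbers, so it is almost certainly lower than what the university pays today. Because the script does not round the yearly totals before subtracting, its results differ from the figures in the text by a fraction of a percent.

```python
HOURS_PER_YEAR = 24 * 365


def annual_mwh(watts_per_node, nodes):
    """Energy used in one year, in megawatt-hours, running flat out 24x7."""
    return watts_per_node * nodes * HOURS_PER_YEAR / 1_000_000


dell_mwh = annual_mwh(437, 800)
sun_mwh = annual_mwh(325, 582)

# Price implied by the 2002 article: $70,000 saved on 2000 MWh per year
usd_per_mwh = 70_000 / 2000

YEARS = 3
saved_mwh = (dell_mwh - sun_mwh) * YEARS
saved_usd = saved_mwh * usd_per_mwh

print(f"Dell: {dell_mwh:.0f} MWh/year")
print(f"Sun:  {sun_mwh:.0f} MWh/year")
print(f"Three-year savings: {saved_mwh:.0f} MWh, about ${saved_usd:,.0f}")
```

Running every node at full power all year is a worst case for both systems, so treat the result as an upper bound on the energy and a rough guide to the difference.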

Since the university is a good Sun customer, I'll spare stating the obvious. However, this story is a great illustration that performance/watt is becoming increasingly important and you can no longer calculate simple acquisition price/performance without looking at your total cost of ownership. Our high performance computing group at Sun has architected 800 node and larger clusters for many academic, research, and commercial customers, including several universities and financial institutions right in New York. It is a shame the university can't use all their new Dell systems because of a lack of power. However, if the university doesn't want to wait until winter to turn on all those Dell space heaters, and doesn't want to blow next year's faculty salary increase budget on a new power generator, they might want to check out our Dell trade-in allowance and attend next month's Network Computing Launch, where we will announce even more new x64-based systems.

Comments:

Great analysis!

Posted by Manish on August 29, 2005 at 09:01 PM PDT #

Here via your mention on Jonathan's blog. Great work - too bad U of B didn't contact someone at Sun before inking a deal with Dell. I'll be interested to hear further developments and be sure to let us know if you hear from U of B. btw, just subscribed. Looking forward to reading what you have to say.

Posted by Gary Potter on August 30, 2005 at 07:00 AM PDT #

[Trackback] I’m a regular reader of Jonathan Schwartz’s web log. I don’t always agree with what he has to say and I don’t always understand it either. But, I did agree with and understand yesterday’s post. It was all about the

Posted by A Sabre Geek on August 30, 2005 at 07:23 AM PDT #

Marc, great read. Next step is convince them to consider Sun Ray thin clients. Care to guess how many desktops (energy star or not) they have? 17 watts per desktop will make even more of a difference.

Posted by John Clingan on August 31, 2005 at 09:29 AM PDT #

Very cute and well written. Yes, comparing a Sun Opteron dual core to a Dell Xeon single core per your math yields a very similar result - with favor towards Sun based on power and peripherals. Agreed. But I wanted to do more of an apples to apples (single core versus single core and dual core versus dual core) to look at the actual situation without accounting for AMD's approximate 1 quarter lead in delivering DP dual core. The results (again, using your math plus retail prices per the web plus power calculators where available): Sun V20z (2 Opteron 252's) = $557.21 per GFlops, Dell SC1425 = $367.34 per GFlops. Plus the Dell unit (as configured - per Dell's calculator) was 352 Watts versus Sun's (my guess) 330 Watts. Ergo, the overall power necessary to reach 10.24 TFlops is 325 kilowatts for Sun and 282 Kilowatts for Dell. While it's hard to project exactly how Xeon dual core will compare with the V20z dual core it looks to me like it will be an overwhelming Dell advantage per your numbers. I'll look forward to running those numbers. In the meantime, your simplified theoretical GFlops math makes it easy to see why SPARC is such a difficult proposition in today's market: a V210 with 2 SPARC 1.35 GHz CPUs costs $931 per gigaflop and would consume a whopping 463 kilowatts of power to achieve 10.24 TFlops. By the way, if you truly want an apples to apples comparison, compare the HP DL145 with your V20z (you won't like the results).

Posted by Drew Engstrom on August 31, 2005 at 02:46 PM PDT #

Well, it didn't take much guessing. While Drew vaguely disguised his employment with Dell by listing his personal rather than his work email address, a quick visit to http://engstromweb.com appears to indicate he works for Dell. I guess if I worked for a company that had decided to exclusively work with Intel as a CPU supplier, I would make similar arguments. For the record, AMD started shipping their dual-core Opteron server processors in April and Intel has yet to do so, so I believe AMD's lead is more than one quarter, probably at least six months. Historically, the Top500 performance has outpaced Moore's law, growing by 60% every SIX months. In today's world, six months is a long time. As for SPARC performance/watt, hang on until early next year; Sun will be changing the game again.

Posted by Marc Hamilton on August 31, 2005 at 05:14 PM PDT #

Whether or not he works for Dell, Drew made a point that you failed to address. You were, in fact, comparing dual-core with single-core processors. But the question is, why shouldn't you? The fact that Dell doesn't have dual-core machines is a technological disadvantage of the choices they made. If that means they are lagging behind in performance, that's their problem and they have only themselves to blame. (PS. I work for a Sun Reseller)

Posted by Jaime Cardoso on September 01, 2005 at 04:08 AM PDT #

Thanks for the dialog/conversation. Marc - my only nit with your original post was that it was presented in such a way that it implied some level of scientific rigor when, in my opinion, it was executed in more of a "bench-marketing" style with a very specific Sun point of view (nothing wrong with that mind you - that's your job). Comparing an Opteron dual core using optimistic power calculations against a Xeon single core using pessimistic power calculations resulted in showing a V20z in the best possible light. Should that surprise us? Is dual core more efficient based on your calculations than single core? Of course. Do Xeons run hotter than their AMD counterparts today? Yes. Are either of these facts long-term sustainable? I doubt it. And to use such data to label Dell as being "environmentally irresponsible" as one of your colleagues did is, in my opinion, a bit sensationalistic. Is Sun environmentally insensitive for the SPARC performance/watt numbers cited above? Of course not. And, yes, I do work for Dell (software marketing) and had I wanted to remain anonymous I wouldn't have left my email address. I happen to send my RSS feeds and Blogmail to my personal address to avoid clogging my work email (work email is drew_engstrom@dell.com). I did work for Sun for 5 years and consider myself fortunate to have worked with such great people. Keep up the good work - it will be interesting to see how the market "votes" on this one. Jaime - you are correct, Dell has made decisions (based mostly on the discipline of keeping opex numbers at a point where we can deliver consistent profitability and growth to our shareholders) that have resulted in our being dual "core-less" for the time being. I can't change that reality. I'm just interested to know what the long-term viability and economic realities of these platforms are - and Marc's analysis helped me derive a conclusion.

Posted by Drew Engstrom on September 01, 2005 at 10:30 AM PDT #

Some good dialog here. Drew, I wonder if you have any comments on this article http://www.microscope.co.uk/Article138835.htm from the UK, where Dell responded to an RFP for 550 servers with 1000 servers and as a result is being accused of "price-dumping". From the article, it is hard to see how this is leading to consistent profitability.

Posted by Marc Hamilton on September 01, 2005 at 10:47 AM PDT #

Marc- I may have to invoke a corollary of Godwin's Law here and declare victory based on your diversionary tactic of linking to a 2 month old story that I can't comment on ;) Seriously - I didn't intend the "profitability" comment as a snarky/smarta** reference to Sun's business and I'm sorry if it read that way. I have too much respect for Sun to get into a petty war of words. I was merely trying (perhaps unsuccessfully) to articulate the fact that Dell tends to operate under different constraints and priorities than other companies - including Sun. Sun is one of the brightest success stories in the history of technology and the fact that both Dell and Sun are successful is a great reminder that there is no "cookie-cutter" approach to being successful in business. Gotta go, but I appreciate the dialog.

Posted by Drew Engstrom on September 01, 2005 at 03:27 PM PDT #

Drew - I'm not sure declaring a victory is the appropriate language, but how about a truce? For the record, I didn't intend the link to Manchester as diversionary. While I didn't look into the specifics of the deal mentioned in the article, if I only had single-core CPUs to offer and thought the competition was going to offer dual core, I might have responded with 2x the number of servers as well, and thus the topic seemed related to the thread. However, how about dropping the whole power thread and focusing on an area of potential mutual interest? I noticed you have a software marketing role at Dell. I also noticed that the Dell 1425 (and quite a few other Dell systems) are on the Solaris 10 HCL list, http://www.sun.com/bigadmin/hcl/data/sol/systems/details/900.html. However, I couldn't find any way to order Solaris 10 as an option on Dell's web page. Maybe we could work on an OEM agreement. Please feel free to contact me offline if this is of interest. Might even be a way for Buffalo to save some money when their Linux support agreement comes up for renewal. I am not sure of our OEM pricing for Solaris, but I bet Sun would be willing to make Dell a deal that was more profitable than your Linux OEM agreements. Just a thought.

Posted by Marc Hamilton on September 01, 2005 at 03:49 PM PDT #

Marc - truce accepted. After watching several hours of CNN last night it's pretty hard for me (and anyone else who's following the Katrina story) to focus enthusiastically on performance/watt conversations. I appreciate Sun's homepage Red Cross link and encourage folks to donate generously. As far as Solaris on Dell - I will indeed take your comments seriously and make an inquiry with the appropriate team at Dell. In the meantime I do have an honest technology question that I'm pretty sure you can effectively address: When I was at Sun (in J2EE land), the systems group often spoke of the importance of "balance" in systems and that GHz alone was rarely an accurate metric to predict real world performance. My question for you, as the industry enters the multi-core era, is this: How will multi-core impact overall, real world systems performance in terms of balancing processing power with memory access and other I/O path operations? And, secondly, will this new "rebalancing" further drive the requirement for high-speed, low-latency interconnects such as Infiniband or 10GigE/RDMA/TOE? I haven't looked hard - but I'm guessing that someone at Sun has developed a public whitepaper on the topic... Appreciate your thoughts.

Posted by Drew Engstrom on September 02, 2005 at 07:03 AM PDT #

Drew, I have some thoughts to share on both the Katrina disaster and balance in systems, I think I'll post as separate entries rather than continue the comment thread. Stay tuned.

Posted by Marc Hamilton on September 02, 2005 at 10:20 AM PDT #


This blog copyright 2010 by marchamilton