From chaos comes order

data center design stuff
Thursday Apr 03, 2008

Data Center book now free for downloading...

After much work, a PDF version of my data center design book is now freely available. It can be found here.

Enterprise Data Center Design and Methodology book 

 


 
 

Thursday Feb 08, 2007

Put down the flaming torches and pitchforks...

I have been thinking quite a bit about data center environments lately. Seems
only fitting, as I am one of the people working on the buildouts for our
Santa Clara, CA campus to hold the systems (current and next
generation) coming from our now-sold Newark, CA campus. I keep having
the same thought occur over and over in my head. The real nightmarish
part is the thought that follows it. A vast throng of data center IT
folks chasing me with flaming torches and pitchforks, out to kill the
demon they didn't even want to know existed. (OK...so I was a pretty big
fan of Boris Karloff movies growing up. ;-) )


Before I go on, I need you to put the pitchforks down and extinguish the torches....no seriously...put the pitchforks down.


OK. Everybody take a deep breath.


Raised floor is dead!!


Yep, I said it. Raised floor is past its prime. A dinosaur, whose only
future is to become a fossil. It is a 14.4 modem in the land of 3G and
802.11n. Raised floor is dead!!


So why is raised floor dead?


It comes down to higher power density systems, airflow and the efficient
use of space. Did you know that a rack (42U) full of 1U x64 servers
(pick your vendor of choice) will draw in the range of 20KW?


Truth be told, you are hard pressed to cool a 10KW rack just with raised
floor air. It can be done. But, 10KW per rack is about the realistic max
for a row full of racks. Now you could more easily cool 10KW a rack if you
don't want rows of racks. Take a 10KW rack and put 10-15 feet of "white
space" all around it, and it becomes easier to cool. But, guess what? That
"easier cooling" doesn't have much to do with where the air is
coming from. It has to do with having a bunch of space for that hot air
to dissipate.


OK, if raised floor is dead, how are we gonna cool racks? The answer is
point cooling. A device that delivers the volume of cold air a
machine needs and extracts the hot air that the machine generates.
Virtually all of these point cooling systems do this by either being
suspended above the rack or attaching to the side of the
rack. And, in the case of phase change systems, to the back of the rack.


There is also another consideration, and it is probably the most
important. It's cheaper. Raised floors are expensive, for both parts and
installation. The installed cost of raised floor is in the range of $50 to
$75 a sqft. And point systems are usually cheaper per KW of cooling. Some
ball park numbers, from a source I cannot name, put the installed
cost of a traditional raised floor CRAC unit cooling system in the
range of $1500 per KW of cooling. A point cooling system works out to
about $1000 per KW of cooling.


So let's do a little math. Say you have a 2500 sqft data center and you are
going to put 80 20KW racks in it.


The raised floor is going to cost about $60 a sqft or $150K  (60 * 2500
= 150000)
CRAC HVAC is $1500 per KW or $2.4 million  (80 * 20 * 1500 = 2400000)


Point cooling is $1000 per KW or $1.6 million (80 * 20 * 1000 = 1600000)


So the raised floor option is approx $2.55 million.


$2.55 million - $1.6 million = $950K


Between a much more effective and efficient cooling model and a savings
of $950K worth of cash, I have to say "raised floor is dead!!"


I guess I shouldn't be that concerned with the flaming torches and
pitchforks, after all, I can buy some really good body armor and a bunch of
fire extinguishers with $950K.
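
For anyone who wants to rerun the comparison with their own floor size, rack count, or per-KW costs, here is a minimal Python sketch of the same arithmetic. The inputs are the figures quoted in this entry; the function and variable names are made up for illustration, and the $60 a sqft figure is just one point inside the $50 to $75 range.

```python
# Inputs quoted in this entry; all names below are illustrative only.
FLOOR_AREA_SQFT = 2500
RACKS = 80
KW_PER_RACK = 20
RAISED_FLOOR_COST_PER_SQFT = 60   # one point inside the $50-$75 range
CRAC_COST_PER_KW = 1500
POINT_COOLING_COST_PER_KW = 1000


def raised_floor_total(area_sqft, racks, kw_per_rack,
                       floor_cost_per_sqft, crac_cost_per_kw):
    """Installed cost of the floor itself plus CRAC cooling for the full load."""
    floor = area_sqft * floor_cost_per_sqft
    cooling = racks * kw_per_rack * crac_cost_per_kw
    return floor + cooling


def point_cooling_total(racks, kw_per_rack, point_cost_per_kw):
    """Installed cost of point cooling for the full load (no raised floor)."""
    return racks * kw_per_rack * point_cost_per_kw


raised = raised_floor_total(FLOOR_AREA_SQFT, RACKS, KW_PER_RACK,
                            RAISED_FLOOR_COST_PER_SQFT, CRAC_COST_PER_KW)
point = point_cooling_total(RACKS, KW_PER_RACK, POINT_COOLING_COST_PER_KW)
savings = raised - point

print(f"Raised floor + CRAC: ${raised:,}")
print(f"Point cooling:       ${point:,}")
print(f"Difference:          ${savings:,}")
```

Change the constants at the top to match your own room and the totals follow.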

Monday Jul 17, 2006

Study and promotion of energy efficient servers passes US House of Representatives

Did you know that the US House of Representatives of the 109th Congress has passed a bill to study and promote energy efficient servers? Well, it has. Go to http://thomas.loc.gov, enter "HR5646", select Bill Number and hit the search button. You will find the text of the bill and the fact that it was passed by the House on July 12th. It is now on its way to the Senate for approval. The bill passed in the House by 417 (yeas), 4 (nays), & 11 (not voting). Granted, it ain't much, but, it is a start. As for myself, here is a personal round of applause for the 417 members of the House who voted yea on the bill.

Friday Jun 23, 2006

Power Benchmarks...it ain't really about the chip...

In a recent News.com article
http://news.com.com/Chipmakers+admit+Your+power+may+vary/2100-1006_3-6082352.html
there was talk of a power benchmarking metric of performance per watt. This type of benchmarking is not a bad idea. (Frankly it is about time.)

However, there are a couple of things about this that should be considered.

The first is that any benchmarking of power is not all about the processor (single core or multi-core). The processor is only part of the equation. Today most people think power consumption is only about the processor. It is not.

Does anybody just use the processor? Not really. After all, what good is a processor without some RAM to hold the code in? (Answer: Not much.) You can have a great chip, but without an ethernet controller chip and physical interface to serve that data to the (grid, enterprise, internet, etc.) the work that the chip is doing isn't really of that much use. So when we talk about power benchmarking (or any other form of benchmarking for that matter) it is the entire system that matters. Today, the processor is the main culprit in heat generation. However, that was not always the case. Back in the late 80s disk drives used more power and generated more heat than the CPU ever did. It is not a far reaching leap to think that some number of years from now, when machines have petabytes of memory (no...petabytes is not a typo), the memory system will generate more heat than the CPU, and if the industry does not look at reducing power consumption of memory (and other ASICs as well) we won't really substantially "fix" the problem. So while Intel and AMD are locked in combat on the CPU front, I hope the other chip folks are not just in the stands watching the match.

Also, there is more to life in the real data center world than just performance per watt of the CPU. Performance per watt per amount of work done over a certain threshold for the entire system is what is really important. Why? I am glad you asked. A system has a given set of work to do over a certain time period. For example: Serve 10,000 web pages per minute. Let us pretend that vendor A makes a machine that serves 100 web pages per minute and uses 100 watts of CPU power, and vendor B makes a machine that serves 10,000 web pages per minute and uses 10,000 watts of CPU power. They are both providing the same performance per watt of the CPU. But, you will need 100 of vendor A's machines to do the same work as 1 vendor B machine. That means vendor A will need 99 more network interfaces, needing 99 more interfaces on a switch (a bigger switch to handle the ports means more power used by the switch). Vendor A also needs 99 more disk drives (one per machine, or double that if they are mirrored). All of that "other stuff" uses power and generates heat. However, if you only look at performance per watt of the CPU you will miss all that other power and heat.
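
To put rough numbers on the vendor A versus vendor B example, here is a small Python sketch. The pages-per-minute and CPU watt figures are the ones from the paragraph above. The per-machine overhead watts for the network interface, switch port, and disk drive are invented placeholders (no real hardware was measured), so treat the output as an illustration of the shape of the problem and not as data.

```python
import math
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    pages_per_minute: int
    cpu_watts: float


# HYPOTHETICAL per-machine overheads in watts; placeholders, not measurements.
OVERHEAD_WATTS = {
    "network_interface": 5.0,
    "switch_port": 3.0,
    "disk_drive": 10.0,
}


def fleet_power(machine, target_pages_per_minute):
    """Return (machine count, total CPU watts, total overhead watts)."""
    count = math.ceil(target_pages_per_minute / machine.pages_per_minute)
    cpu_total = count * machine.cpu_watts
    overhead_total = count * sum(OVERHEAD_WATTS.values())
    return count, cpu_total, overhead_total


vendor_a = Machine("vendor A", pages_per_minute=100, cpu_watts=100)
vendor_b = Machine("vendor B", pages_per_minute=10_000, cpu_watts=10_000)

for m in (vendor_a, vendor_b):
    count, cpu, overhead = fleet_power(m, target_pages_per_minute=10_000)
    print(f"{m.name}: {count} machine(s), CPU {cpu:.0f} W, "
          f"other stuff {overhead:.0f} W, total {cpu + overhead:.0f} W")
```

The CPU totals come out identical, which is exactly why CPU-only performance per watt hides the difference; everything interesting is in the "other stuff" column.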

The good news is that power benchmarking is being developed by more than just the chip makers, and the issues around trying to reduce power and cooling in the datacenter are being addressed through the Green Grid.

http://www.thegreengrid.org/

Thursday Mar 16, 2006

Some new tools...power calculators and Sun Sim Datacenter

In case you didn't know, there are a couple of new tools available for looking at some of the "eco-impact" of systems.

The first is a set of power calculators for a few of Sun's newest systems: the Opteron x64 based X4100 and X4200, and the CoolThreads Sun Fire T2000 server. The links are below.
X4100 calc
X4200 calc
T2000 calc

You can select from a range of options for each server including # of CPUS, # of DIMMS & what density 512MB, 1GB, 2GB, # of internal disk drives, PCI cards, and # of power supplies.

It will give you watts used & BTU/HR generated. (The x64 also gives you HVAC tons, but, I wouldn't use that as a hard and fast rule as it does not take into account the HVAC efficiency of your data center.)
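
If you want to sanity check the calculator output by hand, the conversions are simple. The sketch below assumes the commonly used factors of 3.412 BTU/hr per watt and 12,000 BTU/hr per HVAC ton. The `hvac_efficiency` parameter is a hypothetical derating knob added here to show why the raw tons figure is not a hard and fast rule; it is not something the Sun calculators expose.

```python
BTU_PER_HR_PER_WATT = 3.412   # commonly used conversion factor
BTU_PER_HR_PER_TON = 12_000   # one HVAC ton of cooling


def watts_to_btu_per_hr(watts):
    """Heat output in BTU/hr for a given electrical load in watts."""
    return watts * BTU_PER_HR_PER_WATT


def btu_per_hr_to_tons(btu_per_hr):
    """Ideal HVAC tons needed to remove the given heat load."""
    return btu_per_hr / BTU_PER_HR_PER_TON


def cooling_tons_required(watts, hvac_efficiency=1.0):
    """Tons needed after derating by a HYPOTHETICAL efficiency in (0, 1]."""
    if not 0 < hvac_efficiency <= 1:
        raise ValueError("hvac_efficiency must be in (0, 1]")
    return btu_per_hr_to_tons(watts_to_btu_per_hr(watts)) / hvac_efficiency


server_watts = 500  # hypothetical configuration, not a calculator result
print(f"{server_watts} W -> {watts_to_btu_per_hr(server_watts):.0f} BTU/hr")
print(f"ideal tons:      {cooling_tons_required(server_watts):.3f}")
print(f"at 80% HVAC eff: {cooling_tons_required(server_watts, 0.8):.3f}")
```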

I found these calculators fast and easy to use, and they helped out with a bit of power/HVAC capacity planning I was doing this week. (And as I mentioned in a previous entry, one of the first keys to better data center efficiency is getting useful data.)

Now the other tool that was recently released was Sun Sim Datacenter.
More info and download of Sun Sim Datacenter here

It's not full CFD (computational fluid dynamics modeling) ala Flovent (http://www.flomerics.com/flovent/ ) but, it gives you the ability to build different racks and look at what the power/space/cooling requirements are and how you could arrange them in your data center. (If they had included system weight, it would be a perfect graphical view tool of the RLU methodology found in my data center design book, Enterprise Data Center Design and Methodology.) I am gonna try to get them to add a weight field to the sim.

The other big advantage to Sim Datacenter is you can build your own racks full of stuff, so you can customize it to your particular requirements and equipment.

None of them is the be all and end all, but, they are pretty darn useful and the price is right...FREE!!!!!!!!!

Check them out, you may find them quite useful.

Tuesday Mar 07, 2006

Debunking the MPG myth...

If you saw the latest entry in the eco community blog  (Eco Community Blog) you will get a good, but, not quite accurate, example of trying to define a standard metric. Here is the relevant snippet.
"A simple example: we don't concern ourselves with the efficiency of individual components of cars.  Toyota doesn't market the Camry as having 80% efficient fuel injectors, or .12 coefficient air friction body styling (those numbers are made up).  They just say 34 miles per gallon highway, 24mpg city (not made up).  It's a measure of a given amount of work for a given consumption of energy, and flawed though those calculations may often be, they are at least comparable from one model to another, one manufacturer to another."

The basic idea is sound. However, as a self-proclaimed "car freak", I have to say it is not quite accurate, and that has implications.

Let us look at the slightly flawed reasoning. MPG is not a guarantee of the actual mileage you will get. It's not a bad metric, but, an accurate representation in the real world, it ain't.


Ask yourself this question. Do you get the same MPG as your car is rated for?

Chances are probably not. It has a lot to do with your driving style, what you have in the car (weight), the lubricants used in the motor and transmission (friction), the air pressure in your tires (rolling resistance), the exact terrain and weather you are driving in (do you have a head wind or a tail wind), the outside air temp (the colder the intake charge, the better), and the octane of the gas you are using. (There is a reason why there are different octane levels, not to mention race gas.)

So your exact MPG will be different, depending on all these factors. So MPG is NOT a measure of a given amount of work for a given consumption of energy. It IS a measure of a given amount of work for a given consumption of energy based on a number of variables that they ain't telling you about.

Think of MPG as slightly less of a lie, or a slightly less half-truth.

So back to energy usage for a system in a data center. (And don't overlook that word "system", it is the real crux of the matter.) It is the performance of the entire system (computer, storage, network, etc) that needs to be given. Saying "machine X uses 200 watts and machine Y uses 220 watts" doesn't mean anything unless the entire workload you are comparing on machines X & Y is totally contained on those single machines. No external storage, no network gear, no other systems operating as a different tier layer. If any of those do exist, then the only accurate representation of the system MUST include all of those things.

Did you know that for every SPEC and TPC benchmark, a full disclosure of exactly what the HW and SW configuration was for that benchmark is required? (Here are a couple chosen at random.)
Specweb2005 results page

TPC-H results

Check the disclosure reports, there is a full spec on exactly what the specific system config was for each result.

So perhaps that is the answer. Can we get the benchmark standards groups to make power reporting (how much electricity is used on all the parts of a given config for a given benchmark) part of the reporting rules? Hook up a few power monitors to the configs and get the data and force the companies reporting results to publish that info as well.

I grant you, it wouldn't be perfect, but, it would be a pretty good starting point. It would tell a customer what the power usage was for a specific configuration, running a specific workload, what the result of that workload was, and how much power it took to do that work. Now that is a REAL metric, based on a fully disclosed set of parameters.

Wednesday Mar 01, 2006

Whence the data center Rosetta Stone

It is hard to argue against the idea that the understanding of language (written and/or verbal) is a foundation stone of civilization. Take ancient Egypt. The hieroglyphs that made up that written language were not understood for centuries. In 1799, near the town of el-Rashid, the Rosetta Stone was found, and there was finally a way to understand what some of those hieroglyphs meant. More info on the Rosetta Stone can be found at The British Museum.

So what does a stone tablet made in 196 BCE have to do with data centers? I am glad you asked. There is no common language that everyone agrees on to define what goes on in a data center from an electrical and/or thermal standpoint. Talk to many HVAC folks, and they talk in HVAC tons. Talk to rack folks and upstream electrical folks, and they talk in kVA. Another group talks in CFM, and another in Watts, and another in BTUs (but that is really BTU/hr). The problem is that many of these groups assume that YOU know all the unmentioned details about what they are talking about.

What we need is a common set of equations that everybody agrees to use. Define the watt-to-BTU/hr conversion factor as 3.412, or 3.42, or 3.416. And decide on what the V in kVA really is. If people want to use 200 or 208 or 220, that is fine. But, let us all know what the number is.

So I ask again. Whence the data center Rosetta Stone? If anybody knows of a data center in el-Rashid, Egypt, can you check under the raised floor? Maybe it is there. But, wherever it is, the sooner we find one (or make one) the easier and more efficient all of us are gonna be.
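
As a small illustration of why the unmentioned details matter, here is a Python sketch where every conversion has to be handed its conventions explicitly. The `Conventions` class, the two "teams", and the 10KW / 30 amp inputs are all made up for the example; the kVA formula is the plain single-phase volts times amps divided by 1000.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conventions:
    """The 'unmentioned details' a group assumes when it quotes a number."""
    btu_per_hr_per_watt: float
    volts: float


def heat_btu_per_hr(watts, conv):
    """Heat load in BTU/hr under the given conversion factor."""
    return watts * conv.btu_per_hr_per_watt


def apparent_power_kva(amps, conv):
    """Single-phase apparent power: volts * amps / 1000."""
    return conv.volts * amps / 1000.0


# Two hypothetical groups talking about the same rack and the same circuit.
team_a = Conventions(btu_per_hr_per_watt=3.412, volts=208)
team_b = Conventions(btu_per_hr_per_watt=3.42, volts=220)

RACK_WATTS = 10_000   # hypothetical load
CIRCUIT_AMPS = 30     # hypothetical circuit

for name, conv in (("team A", team_a), ("team B", team_b)):
    print(f"{name}: {heat_btu_per_hr(RACK_WATTS, conv):,.0f} BTU/hr, "
          f"{apparent_power_kva(CIRCUIT_AMPS, conv):.2f} kVA")
```

Same rack, same circuit, two different sets of numbers. Neither is wrong as long as the conventions travel with the result.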

Wednesday Feb 22, 2006

Begin at the beginning

"From chaos comes order" - Friedrich Nietzsche

The more I look at data centers, and how to make them more efficient and environmentally responsible, the more I am convinced that there is something to that. So how do we get from chaos to order? Well, like any good design process, the first step is to ask the right questions. That is the starting point of this blog. Trying to figure out what some of those "right questions" are. I hope those questions will spur comments, discussion, participation. (After all, this is the participation age.)

Who am I?

Rob Snevely

My background is a bit "different", including theater, art history, and tweaking on cars. I bring this up because some of the references you might see in this blog may use ideas from other (read: non-computer) areas. When I do, I will try to remember to provide a bit of background and/or a link or two. For example, here is a link to what wikipedia.org has on Nietzsche. If you just can't wait for a bigger chunk of info on my general thinking about data center design, you can find the foundations of it in this book.

What is next for my blog?

Whence the data center Rosetta Stone.

