Tuesday Mar 17, 2009

Over the past few months I had heard exclamations of amazement regarding a storied new data center in the Nevada desert called SuperNAP. I was a bit skeptical of the superlatives about scale and efficiency that embellished these stories. My skepticism turned to exuberance last week when I joined a group of architects from Sun for a tour. The goal of our tour of this Mega Data Center was to see firsthand the state of the art as implemented by Switch Communications, where Sun operates its cloud computing business.

Switch and its customers, which include several operating units within Sun, are beneficiaries of the collapse of Enron. The former utility giant had designs on trading network bandwidth using models similar to its energy trading systems. When Enron's flimsy financial structure gave way, its financial backers and the U.S. government stepped in to auction these assets. Switch CEO Rob Roy was the only one who showed up at the auction block. In an uncanny twist of fate, he managed to sidestep what could have been a formidable bidding war to control this hub of communication that is unparalleled in North America.

Here are the vital stats that only begin to describe the phenomenal facility that Switch has managed to assemble:
  • 407K square feet of data center floor space
  • 100 MW of power provisioned from two separate power grids
  • Fully redundant power to every rack, backed by N+2 power distribution across the facility
  • Enough cooling and power density to run at 1,500 watts per square foot (that's 10x the industry average of 150 watts)
  • 27 national network carriers
This describes the capacity of the SuperNAP, which is just one of the eight facilities operated by Switch within a 6-7 mile radius in a no-fly zone south of Las Vegas.

Sun to Reveal Cloud Plans Tomorrow

Some details of Sun's Cloud Computing business will be revealed tomorrow (March 18) in New York at the CommunityOne East event.



Thursday Jan 10, 2008

The Demand Response Research Center (DRRC) at LBNL provides a system that enables electric utility customers to automate energy load shedding during peak demand periods. It's called the Demand Response Automation Server (DRAS). Basically, it takes a feed from the utility whose payload includes data for Event Pending, Price Levels, and Price Schedule. The system can interface with environmental management systems to turn off lights and raise set points when curtailment events and price jumps occur. It could also be used in combination with systems management software to automate data center load shedding, but has yet to be adopted for this purpose. With advanced virtualization and automation technologies, there is ultimately no reason that workloads could not be migrated according to a demand / cost equation.
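To make the idea concrete, here is a minimal Python sketch of how a feed like this might be turned into data center actions. The field names (event_pending, price_level, price_schedule), the level names, and the action strings are all assumptions made up for illustration; they are not the actual DRAS message schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical representation of one DRAS signal. Field names are
# illustrative only and are not taken from the real DRAS schema.
@dataclass
class DrasSignal:
    event_pending: bool   # a curtailment event has been announced
    price_level: str      # assumed levels: "normal", "moderate", "high"
    price_schedule: Dict[int, float] = field(default_factory=dict)  # hour -> $/kWh


def plan_actions(signal: DrasSignal) -> List[str]:
    """Map a signal to a list of (hypothetical) load-shedding actions."""
    actions: List[str] = []
    if signal.event_pending:
        actions.append("notify operators")
    if signal.price_level in ("moderate", "high"):
        actions.append("raise cooling set points")
    if signal.price_level == "high":
        actions.append("migrate or pause deferrable workloads")
    return actions
```

A real integration would hand these actions to whatever systems management software is in place; the mapping from price level to action is a policy choice, not something the feed dictates.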

The barrier to DRAS adoption in the data center has been a mix of scant awareness, legacy perceptions, and a shortage of creative thinking. But all that is changing. DRRC is on a campaign now to build awareness among all eligible facilities and IT managers about the potential for DRAS. And the old saw that says power cycling a computer reduces its MTBF is just that: old. Today's systems, for the most part, are engineered to withstand daily power cycles well beyond the typical useful life of a computer. With the benefit of this knowledge and the findings of a Harris Interactive Poll commissioned by Sun, IT and facilities managers are inducing employees to turn off their computers when not in use. But we still need more creative thinking.

While the traditional facilities folks appear to be abundantly creative with power saving measures (in a DRAS webinar last month, Aimee McKane from LBNL cited one example where a bakery participating in DRAS bought more bread pans to avoid running the dishwasher during peak demand periods), creative applications of DRAS in the data center are in short supply.

Still, the conversations are happening, as Walter Bays demonstrated on his blog today. If a courageous (and creative) company were to combine Dynamic Infrastructure technology from Sun and DRAS from DRRC, they could begin to realize the savings possible in Walter's energy utopia. Services from Sun and others specializing in Demand Response systems, like EnerNOC, can help these first-mover companies develop strategies for capitalizing on these huge energy saving opportunities.

Wednesday Oct 31, 2007

This question was the underlying theme of many sessions at the 2007 Uptime Institute's Design Charrette, which I attended this week. But it's the wrong question. As I wrote in a follow-up to the EPA Energy Star report on data center efficiency, the bigger question is: How can IT create value, in the broader economy, that replaces other less efficient modes of commerce and interaction? In that context, any goal to reduce data center energy use is probably unattainable.

Data center energy consumption is projected to be 2.5% of total U.S. electricity demand by 2011, and it's tracking to double every 5 years.   Should IT managers be focused on driving that ratio and rate down?   That's another wrong question.
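For a sense of what "doubling every 5 years" means year over year, here is a small Python sketch. The function names are mine, and the share projection assumes total U.S. electricity demand stays flat, which is a simplifying assumption rather than anything claimed by the projection itself.

```python
def annual_growth_rate(doubling_years: float) -> float:
    """Compound annual growth rate implied by a given doubling time."""
    return 2 ** (1 / doubling_years) - 1


def projected_share(start_share_pct: float, years: float,
                    doubling_years: float = 5.0) -> float:
    """Share of total demand after `years`, assuming total demand is flat."""
    return start_share_pct * 2 ** (years / doubling_years)


if __name__ == "__main__":
    rate = annual_growth_rate(5)
    print(f"Implied annual growth: {rate:.1%}")
    print(f"Share five years after 2011: {projected_share(2.5, 5):.1f}%")
```

Doubling every five years works out to a little under 15% growth per year.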

My design charrette team was focused on Data Center Management & Metrics. Green Grid contributor Ken Uhlman from Eaton was on the team. He posits: "Managers get things done right. Leaders get the right things done." Accepting that axiom, it becomes clear that efficiency potential can only be maximized if we have both managers and leaders focus on the challenge. Data Center managers need to reduce the marginal energy use per unit of work executed in the data center. Leaders need to find ways to deliver economic value over the network that are more efficient than current business and social practices.
For instance, how much energy can be saved by services like those offered by MillionsOfUs.com?  Every "test drive" of an automobile in these virtual worlds uses some amount of electricity and causes a puff of CO2 to be emitted from a power plant, but the watts/joules/calories and GHG emissions involved are infinitesimal compared to that of a trip in the combustion powered vehicle down to dealer row to try out cars.

Clearly, IT driven efficiency has been at work for a long time.  Over the last 40 years, global economic productivity gains have been driven largely by IT, and much of this gain has arguably resulted in a net reduction in energy use (modulo the indirect demand for energy driven by IT).  But how much?  And what is the size of the opportunity ahead to do even more?   Studying these effects was a clear call to action in the EPA Energy Star report, but no such action appears to be underway.

While it is critical for managers to get a handle on efficiency within the data center envelope - and the potential here is huge - real leadership in energy efficiency will come in the form of value creation over the network that displaces less efficient value creation.

Sunday Oct 14, 2007

I started tracking news about green datacenters a little more than a year ago using Google Alerts. To the extent this statistically flawed analysis represents real media attention on the subject, news of datacenters' role in the green revolution is spreading. Online news matching green+datacenter has been on the rise over the last thirteen months and jumped after the initial six-month period.

 Green+Datacenter news chart

Sun's share of the news has increased in the latter half of the period too.  Of the 172 individual news items since March 2007, Sun (represented by the blue line in the chart) was mentioned more frequently and in some periods dominated the news wire.  This is probably a reflection of PR dollars spent as much as it represents newsworthiness of Sun's efforts in this area.  Still, it's nice to see word getting out that Sun is at the center of this conversation. 

Tuesday Aug 14, 2007

The U.S. EPA Energy Star Program released a report on server and data center efficiency (PDF) this month. The study was in response to Public Law 109–431 (PDF), which required EPA to analyze "the rapid growth and energy consumption of computer data centers by the Federal Government and private enterprise". The Law goes on to say, "It is the sense of Congress that it is in the best interest of the U.S. for purchasers of computer servers to give high priority to energy efficiency as a factor in determining best value and performance for purchases of computer servers."

The increased federal attention to server efficiency is good news for Sun, particularly in light of the recently announced UltraSPARC T2 processor, a.k.a. Niagara 2, which, at two watts per thread, will clobber the competition in most commercial SWaP comparisons.

When it comes to energy demand, the EPA report keeps the big picture in focus, noting that IT is not only part of the problem, but also part of the solution:

"The data processing and communication services provided by data centers can also lead to indirect reductions in energy use in the broader economy, which can exceed the incremental data center energy expenditures in some cases. For instance, e-commerce and telecommuting can reduce both freight and passenger transportation energy use."

The authors recommend quantifying this indirect reduction through IT services in future research.  This is a largely untapped source of energy conservation for which a range of alternatives exist.  For example, using technology such as the 4 watt SunRay thin client, businesses could shift employee desktop computing tasks to run on optimally efficient servers in the data center, rather than the mostly idle 200+ watt computer at every desk scenario that dominates corporate work environments.  Companies could also employ work at home programs like Sun's Open Work, and make much greater use of video conferencing and web meeting software.  These conservation efforts inevitably increase energy demand in data centers, but clearly offset much larger energy demand by providing reasonable alternatives to some very energy intensive practices that dominate business culture today.

Separate from any empirical consideration of such indirect energy reductions, the report estimates that by 2011 U.S. businesses could shave off $4.1B in data center electricity costs annually just by following the best practices outlined in the report. Considering that total U.S. data center electricity costs in 2006 were $4.5B, that's a lot of efficiency gain by 2011.

Interestingly, the $4.1B potential data center savings is mirrored by the potential savings from conservation in the office by workers, as determined by a Harris Interactive poll commissioned by Sun. The results of the poll, released August 1, indicate that energy-conscious behaviors of U.S. office workers can save $4.3B in energy costs per year. With a flip of two switches (lights off, computer off), workers can make a huge collective difference, equivalent to taking 6.1M cars' CO2 emissions out of the atmosphere.

So we're looking at potential savings of $8.4B, just by doing what we already know how to do, with no compromise to services or productivity.  Add in whatever additional energy can be saved by replacing energy intensive business practices with services over the network and you've got a really good economic case for aggressively pursuing energy efficiency in the data center and the workplace.
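The arithmetic behind that total is simple enough to check in a few lines of Python. The constants come straight from the figures quoted above; the variable and function names are mine. Note that the ratio compares a 2011 savings projection against a 2006 cost baseline, so treat it as a sense of scale only.

```python
# Figures quoted above, in billions of U.S. dollars per year.
DATA_CENTER_SAVINGS_B = 4.1    # EPA best-practice estimate for 2011
OFFICE_SAVINGS_B = 4.3         # Harris Interactive poll estimate
DATA_CENTER_COST_2006_B = 4.5  # total U.S. data center electricity cost, 2006


def combined_savings_b() -> float:
    """Sum of the data center and office savings estimates."""
    return DATA_CENTER_SAVINGS_B + OFFICE_SAVINGS_B


def savings_vs_2006_cost() -> float:
    """Data center savings as a fraction of the 2006 electricity bill."""
    return DATA_CENTER_SAVINGS_B / DATA_CENTER_COST_2006_B


if __name__ == "__main__":
    print(f"Combined savings: ${combined_savings_b():.1f}B")
    print(f"Savings vs. 2006 cost: {savings_vs_2006_cost():.0%}")
```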



Monday Jun 04, 2007

Sun hosted a well attended Silicon Valley Leadership Group (SVLG) event focused on Energy Efficient Datacenters last Thursday at Sun's Santa Clara campus.  The agenda consisted of case studies, an energy demand management program and a facilities tour.  I'll share some of the highlights here.

Small Retrofit Investment Yields Big Savings 

NetApp Facilities Director Dan Hoffman presented the results of some very targeted energy efficiency measures in a mid-sized (6,764 sq. ft.) data center. The net results of a $146k investment were impressive:

  • Estimated Energy Savings: 1,042,000 kWh/year
  • Cost Savings: $125,000-$145,000/year
  • GHG emission reduction:  3.6 million lbs./year (equal to removing 150 cars from highways annually)

And under a PG&E rebate program they were able to recoup their entire initial investment, so the project had a zero year payback.
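Here is a quick Python sketch of the payback math using the numbers above. The function names are mine, and "simple payback" (net investment divided by annual savings, ignoring discounting) is my choice of metric, not necessarily the one used in the presentation.

```python
INVESTMENT = 146_000               # dollars
ENERGY_SAVED_KWH = 1_042_000       # kWh per year
COST_SAVINGS = (125_000, 145_000)  # dollars per year, low and high estimates


def simple_payback_years(investment: float, annual_savings: float,
                         rebate: float = 0.0) -> float:
    """Years to recover the investment, net of any rebate, without discounting."""
    return max(investment - rebate, 0.0) / annual_savings


def implied_rate_per_kwh(annual_savings: float, kwh_saved: float) -> float:
    """Electricity price implied by the quoted cost and energy savings."""
    return annual_savings / kwh_saved


if __name__ == "__main__":
    for savings in COST_SAVINGS:
        no_rebate = simple_payback_years(INVESTMENT, savings)
        with_rebate = simple_payback_years(INVESTMENT, savings, rebate=INVESTMENT)
        rate = implied_rate_per_kwh(savings, ENERGY_SAVED_KWH)
        print(f"${savings:,}/yr: payback {no_rebate:.2f} yr without rebate, "
              f"{with_rebate:.0f} yr with full rebate, implied ${rate:.3f}/kWh")
```

Even without the rebate, the project would have paid for itself in a little over a year.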

The key measures, developed with the assistance of Lawrence Berkeley National Labs (LBNL), that made these savings possible included:

  • Supply right amount of air at correct temperature where it is needed, i.e., to the inlet of IT equipment
  • Install curtains between hot and cold aisles to minimize Supply Air (SA) and Return Air (RA) mixing
  • Replace SA registers that direct air to top of racks with grills that direct air downward to reduce stratification & hot spots
  • Install array of wireless temperature sensors strategically placed on cold aisle side of racks to enable raising SA temperature (i.e., raise the set point on the thermostat by 2-4 degrees)
  • Allow RA to rise to 80-90°F to enable a substantial reduction in RA/SA fan speed
  • Optimize “free-cooling” effect of economizer by raising SA temperature and allowing dampers to modulate longer to reduce the load on the cooling system

I always hear the question, "Why don't we build data centers in northern climates where we can use the outside air for cooling?" So it was of key interest to me to hear Dan's experience with "free cooling" and the economizers' effect on airborne particle concentrations. When they increased "free cooling" to use 85% outside air, the particle concentrations jumped to 11 micrograms/m³, nearly double their normal level. This is well below ASHRAE standards, but still of potential concern to data center managers worried about contamination. The environment used 40% filters, which is typical of modern data centers. It was estimated that the particle count and size could be reduced significantly (over 50%) by replacing the 40% filters with 85% filters. The energy required to force air through these finer filters is higher, of course, so the change may not be justified in terms of net energy use.

Demand Response Coming to a Data Center Near You

LBNL Computer Systems Engineer Girish Ghatikar shared details of the Demand Response (DR) programs being developed in conjunction with utility companies, and gave some compelling data on why voluntary participation in PG&E's AutoDR plan could be beneficial to medium and large IT equipment users.

The current PG&E DR Program provides technical assistance and requisite equipment to help facilities managers shed electrical load during critical peak usage periods. Savings through discounts on electricity can be significant (up to $50 per kW saved) under the program if the facility can reduce usage during "events" (projected to number between 12 and 15) during the summer (May 1 - October 31). Participants are notified the day before or the day of an event, and they control the level of curtailment during the period.

For most data center managers, the idea of switching off IT equipment during business hours and on short notice, is akin to fingernails scratching a chalkboard.  The mere mention of cycling the power on a server gets a visible shiver from the average system administrator.  But many of the concerns are becoming manageable, and the notion that cycling the power reduces MTBF is dated - most modern IT equipment does not suffer any significant reduction in reliability due to moderate frequency of power cycles. 

In response to the management problem - how to efficiently and gracefully respond to peak event notification - the LBNL folks have developed a systems architecture that can be integrated into existing environments. The infrastructure includes a Demand Response Automation Server and Energy Management Control System (EMCS), which transmit event notifications and trigger load shifting actions. The architecture leverages a new SOA standard, OpenADR, based on SOAP, that enables a publish and subscribe model. The payload of the OpenADR messages contains Event Pending, Price Levels, and Price Schedule. This architecture, while promoted through the AutoDR program, can be used for finer grained control and can be integrated with event notifications other than PG&E's critical peak days. For example: if the price exceeds $0.20/kWh, initiate a load shed activity.
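The price rule in that last example can be sketched in a few lines of Python. This assumes the price schedule has already been parsed out of the message into a plain hour-to-price mapping; the function name and the data layout are hypothetical and say nothing about the real OpenADR payload format.

```python
from typing import Dict, List

PRICE_THRESHOLD = 0.20  # $/kWh, the threshold from the example above


def hours_to_shed(price_schedule: Dict[int, float],
                  threshold: float = PRICE_THRESHOLD) -> List[int]:
    """Return the hours whose price strictly exceeds the threshold."""
    return sorted(hour for hour, price in price_schedule.items()
                  if price > threshold)


if __name__ == "__main__":
    # Hypothetical afternoon schedule: hour of day -> $/kWh.
    schedule = {13: 0.18, 14: 0.22, 15: 0.31, 16: 0.20}
    print(hours_to_shed(schedule))
```

With the sample schedule, hours 14 and 15 qualify; hour 16 sits exactly at the threshold and does not "exceed" it.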

State of The Art - Sun's New Santa Clara Labs

Mechanical Plant

When I first met the distinguished LBNL Engineer Bill Tschudi at this meeting, he said, "You guys are doing some great stuff, but why aren't you using free cooling in your new rooms?"  Good question, Bill.  I don't know, but I suspect it has something to do with the somewhat unique mixed use environments that are characteristic of the new labs being built in Building 12 on the Santa Clara campus. 

As Dean Nelson, Sun's Shared Lab Services Director, pointed out on our tour of these nearly completed rooms, engineers will be working in these environments as part of their normal day to day routine.  For the most part, these are not lights out environments.  Systems engineers and Services engineers will be using these large labs to do everything from prototype component testing to troubleshooting customer problems.  They're designed with ergonomics and frequent physical movement of equipment in mind.  A lot of the space in these labs is dedicated to benches where gear can be disassembled and tested easily.

Two huge cooling towers are the centerpiece of a mechanical plant (pictured here) that sits behind an office building facade just outside the labs.  These 1000 ton chillers use adaptive frequency drives that change the frequency and voltage to optimize the load to what is sensed.  Dynamic cooling within the data centers takes the cooling intelligence down to the rack level, and spot cooling minimizes unwanted air mixture.

One of the highlights of the tour was a look at the Hot Aisle Containment solutions by Liebert and APC. There were several installations of each, and Dean expected huge savings in cooling costs due to their efficient circulation of chilled air. The Liebert solution has a higher total CFM, so it was chosen for the denser lab needs, while the APC solution, with some sensing and adjusting capability and a lower maximum CFM, was chosen for the less dense or not-always-on environments. Dean's team will be generating benchmark data that compares both solutions once they get populated and running.

Watch this space for more on the magnitude of savings and reduction in greenhouse gas emissions as a result of the new Santa Clara labs project.



Tuesday Mar 13, 2007

The Open Architecture Network has posted a "project" chronicling in photos the process of bringing this community online. Except for one particularly unflattering photo of me, there's a good sequence of photos of the Sun Fire X2200 M2 servers and StorageTek 3511 storage array racked in AMD's data center.

With all the available space in that rack, why did we stick the shiny new gear at the bottom of the rack? In densely populated racks, servers mounted in the top half of the rack have as much as a 50% lower MTBF than servers in the bottom half. The working rule of thumb is that for every 10°F above 68°F the failure rate doubles. The gear that is typically most sensitive to high temperature is the power supply, hard drive, and fan. Good thing we've got two of each in these boxes, but I wouldn't expect a heat problem anyway - the cold air blowing on my head whilst working on these machines reminded me of winter in Duluth.
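That rule of thumb is easy to express in Python. Two assumptions are mine: that temperatures at or below 68°F are treated as the baseline (the rule only speaks to temperatures above it), and that MTBF is simply the inverse of the failure rate.

```python
def relative_failure_rate(temp_f: float, baseline_f: float = 68.0,
                          doubling_step_f: float = 10.0) -> float:
    """Failure rate relative to baseline, doubling every 10°F above 68°F."""
    if temp_f <= baseline_f:
        return 1.0  # assumption: no benefit modeled below the baseline
    return 2 ** ((temp_f - baseline_f) / doubling_step_f)


def relative_mtbf(temp_f: float) -> float:
    """MTBF relative to baseline, assuming MTBF = 1 / failure rate."""
    return 1.0 / relative_failure_rate(temp_f)


if __name__ == "__main__":
    for temp in (68, 78, 88):
        print(f"{temp}°F: failure rate x{relative_failure_rate(temp):.1f}, "
              f"MTBF x{relative_mtbf(temp):.2f}")
```

Under this model, a 50% lower MTBF at the top of the rack corresponds to inlet air about 10°F warmer than at the bottom.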



Wednesday Feb 28, 2007

The Ziff Davis eSeminar today on Managing the Exploding Energy Costs in the Data Center was billed as an opportunity to learn "How to build the energy-smart datacenter". If you run an overheated, power hungry Dell and Intel datacenter this was an hour for you. The tools, technologies and services presented were specific to these platforms. The seminar was really a commercial for these brands.

It's great to see other industry leaders focused on this problem. It's not great to see Sun's truly game-changing technologies for energy efficiency so often absent from the discussion.

Innovation coming out of Dell for policy based shut down of systems at night was featured in the presentation. This is a valuable tool for enterprises that have predictable workloads involving periods of non-use. Unfortunately, the businesses with the highest demand for compute resources are global operations that deliver their value online, so they don't have the option of shutting down systems at night. (Desktops are generally good candidates for this efficiency measure, but the period of non-use for a 200 watt desktop would have to be more than 23 hours every day in order to use the same amount of energy as a 4 watt SunRay 2. Sign me up for the 5 hour work week.)
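The break-even behind that parenthetical can be checked with a short Python sketch. It assumes the desktop draws nothing while shut down and the thin client runs around the clock, and it ignores the server-side energy needed to back the thin client; the function names are mine.

```python
def breakeven_on_hours(desktop_watts: float = 200.0,
                       thin_client_watts: float = 4.0,
                       hours_per_day: float = 24.0) -> float:
    """Daily hours a desktop can run and still match an always-on thin client."""
    thin_client_wh_per_day = thin_client_watts * hours_per_day
    return thin_client_wh_per_day / desktop_watts


def breakeven_off_hours(desktop_watts: float = 200.0,
                        thin_client_watts: float = 4.0,
                        hours_per_day: float = 24.0) -> float:
    """Daily hours the desktop must stay off to break even."""
    return hours_per_day - breakeven_on_hours(
        desktop_watts, thin_client_watts, hours_per_day)


if __name__ == "__main__":
    print(f"Desktop may run {breakeven_on_hours():.2f} h/day")
    print(f"Desktop must be off {breakeven_off_hours():.2f} h/day")
```

The desktop gets roughly half an hour of use per day before it burns more energy than the thin client does in 24 hours.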

Dell made the requisite claim that their servers are optimized for power and deliver 25% greater performance per watt over similarly configured servers. They went on to compare with the HP DL 380 and the IBM x3650.

Intel discussed a lot of the long range promise for improving processor efficiency through advances like the 45nm process, and the potential for a 10X reduction in power consumption for equivalent or better performance through the use of Indium Antimonide transistors. Their material included an interesting axiom that smaller silicon means lower power consumption. They didn't answer my question about how to reconcile that with the historical evidence to the contrary - as processors have gotten smaller their power consumption has risen.

Intel also took the opportunity to promote their Ultra-dense processor line, the Xeon L53x0, which comprises their 40W Dual-Core and 50W Quad-Core low voltage processors. The Quad-Core is due to ship in the next few weeks. It is basically a dual-die design that combines two of their dual-core dies on one chip. The wattage spec for these is interesting. Like other vendor specs, it does not include the power draw of the supporting chip set, but in the L5300 series, where they've moved the memory controller off of the processor, this means there's at least 30 watts not accounted for in the spec if you were looking for an apples to apples comparison.

The most valuable insight offered in the talk was a cost comparison of liquid vs. air cooling. David Moss from Dell said that in high density environments where air cooling has the worst efficiency, liquid is cheaper, at least in theory, than air.

Someone asked a great question about the state of the art in using waste heat from the datacenter to heat office space, and using free cooling from the ambient air outside to cool a datacenter. The answer, basically, was that people are talking about it, and it's not very practical in a place like Austin. It was disappointing that these experts really aren't thinking about the potential efficiency gains from such an industrial ecology approach. There are some practical applications of this already in use in Scandinavia. Other than Project Blackbox I'm not sure what Sun is doing in this area, but it seems like there is a big market opportunity here.

You can get the audio and slides from this and other eSeminars at www.eseminarslive.com

This blog copyright 2009 by downstream