Tuesday October 17, 2006

THE INDUSTRIAL REVOLUTION, FINALLY

I've commented frequently upon a central paradox of IT: software and hardware components are the products of fierce, high-volume competition, yet their final assembly by IT organizations is one-of-a-kind artisanship. To quote Scott McNealy, I've never toured a datacenter with the reaction "Wow, this looks just like the one I visited yesterday!"

We ought to ask why this is so, because it is supremely inefficient. Practically all IT organizations speak of the commoditization of computers, but seldom of computing. Partly, this is because computers and storage are simple to understand and quantify compared to the enormous complexity of their assembly into systems that deliver some (with hope, predictable) level of business service. This complexity not only is expensive, it's viscous. Business innovation, the central goal of IT, suffers.

There is certainly a school of thought that this complexity is inherent and the proper (read: profitable) thing for a vendor to do is insulate the IT customer from it with "services and solutions". From our vantage point, this is a punt. It's far better to attack the composition of systems to provide useful service as an engineering problem, not as an Exercise Left to the Reader.

And it is precisely in this spirit that Project Blackbox was born. We went back to engineering first principles: how do you transport, physically assemble, power, cool and ultimately recycle computing infrastructure? Take the joules-in, BTUs-out problem as one of engineering co-design. Something that can be quantitative, efficient, and manufacturable in volume.

Many unquestioned assumptions were put on the table. "Why do we build datacenters?" (Because of latency and administrative scale issues.) "Why do we build machine rooms?" (To let people and machines cohabitate, you know, to mount tapes, clean out chad, and punch buttons...) "Why do we have hot-swap fine-grained FRUs?" (To give the cohabitants something to do?)

Where we ended up with Project Blackbox is admittedly not for everyone. It is designed for ferocious scale, complete lights-out, fail-in-place, virtualization, uber-fast provisioning, and brutal efficiency. And I'd like to emphasize that we expect that the most efficient way to deliver computing and storage services is with containers. Full stop.

While we've tried to keep the project as stealth as possible, we have disclosed aspects of it during its development to select sets of potential customers and analysts. Feedback has been categorically positive, from "I need ten of these tomorrow. No really, I'm not kidding." to a giddy "This is classic Sun! Why didn't someone do this before?"

Yeah, this seems obvious, so why don't we build datacenters this way? It's the same kind of reaction one had to luggage with wheels, in-line skates, or parabolic skis. Obvious in hindsight, so why did it take so long? Well, obvious at one level, but most definitely dependent upon basic technological progress (in these cases, advances in bearings, plastics and laminates).

For containerized computing, the underlying enabler is the confluence of power density, lights-out management, horizontal scale and virtualization.

Let's look at power density. Half-a-dozen years ago, we were indeed building mondo datacenters, but at quite approachable power densities: typically under 100 watts/ft2. But as we continued to compress physical dimensions (the 1RU server and, now, blades) while simultaneously running hotter chips with more DRAM and disk, watts-per-rack skyrocketed.

Today, 10 kilowatts/rack is standard fare, and many folks are facing 15, 20 and even 25 kw. A standard rack fits nicely over a 2ft x 2ft floor tile. Thus, a 20 kw rack "projects" 5 kw/ft2. If my datacenter is 100 w/ft2 then I can only put one such 20 kw rack every fifty floor tiles! Even a completely modern, leading edge datacenter at 500 w/ft2 spaces our 20 kw rack one every tenth tile.

(It's the square root, natch: for the 500 w/ft2 facility it's a rack, two empty tiles, then a rack, in both x and y. For the 100 w/ft2 case, it's a rack, six empty tiles, ...)
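To make the tile arithmetic concrete, here is a minimal Python sketch. The function names are mine and purely illustrative; the only inputs are the figures quoted above (a 2ft x 2ft tile, a 20 kw rack, and facilities rated at 100 and 500 w/ft2). The grid spacing rounds the square root down, which is the same rounding behind "a rack, two empty tiles".

```python
import math

TILE_AREA_FT2 = 2 * 2  # one 2ft x 2ft floor tile, as quoted in the post


def projected_density(rack_watts: float) -> float:
    """Watts per square foot over the single tile a rack sits on."""
    return rack_watts / TILE_AREA_FT2


def tiles_per_rack(rack_watts: float, facility_w_per_ft2: float) -> float:
    """Floor tiles the facility must set aside per rack to stay in budget."""
    return projected_density(rack_watts) / facility_w_per_ft2


def empty_tiles_between(rack_watts: float, facility_w_per_ft2: float) -> int:
    """Empty tiles between neighbouring racks along one axis of a square grid.

    The pitch is the square root of tiles-per-rack, rounded down.
    """
    pitch = math.floor(math.sqrt(tiles_per_rack(rack_watts, facility_w_per_ft2)))
    return max(pitch - 1, 0)


if __name__ == "__main__":
    for facility in (100, 500):
        print(
            f"{facility} w/ft2: one 20 kw rack per "
            f"{tiles_per_rack(20_000, facility):.0f} tiles, "
            f"{empty_tiles_between(20_000, facility)} empty tiles between racks"
        )
```

Because the pitch is rounded down, the real spacing would need to be slightly more generous than this to stay strictly under the facility's rating.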

No wonder that people are out of space, power or cooling (they are all inter-related). And no wonder I get people jumping out of their chairs wanting ten Blackboxes "tomorrow"!

[Aside: don't confuse power density (watts/unit volume) with power efficiency (watts/unit performance). Even super power-efficient designs such as the eight-core UltraSPARC T1 can lead to high power densities, for the simple reason that cramming processors closer together allows them to be more cheaply and effectively interconnected. Low power processors do not necessarily imply low power density systems. But because you use fewer of them overall, they most definitely can cut the power costs of delivering a certain throughput or level of service.]

Actually, the higher the power density, the more desirable containerized computing becomes. A standard TEU-sized (8ft x 20ft) container readily can handle eight 25 kw racks. That's a power density of 1,250 watts/ft2. And we really aren't breaking a sweat at these levels; they could easily be doubled owing to the dedicated heat exchanger for each rack position in the cooling loop.
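The container figure is just total rack power divided by footprint. A quick Python check, using only the dimensions and rack counts quoted above (the function and constant names are illustrative, not anything from the project itself):

```python
CONTAINER_WIDTH_FT = 8    # TEU-sized container, as quoted in the post
CONTAINER_LENGTH_FT = 20
RACKS = 8
RACK_KW = 25


def container_density_w_per_ft2(racks: int = RACKS, rack_kw: float = RACK_KW) -> float:
    """Total rack power in watts divided by the container's floor area."""
    total_watts = racks * rack_kw * 1000
    footprint_ft2 = CONTAINER_WIDTH_FT * CONTAINER_LENGTH_FT
    return total_watts / footprint_ft2


if __name__ == "__main__":
    print(container_density_w_per_ft2())            # eight 25 kw racks
    print(container_density_w_per_ft2(rack_kw=50))  # the "doubled" case
```

Eight 25 kw racks come to 200 kw over 160 ft2; doubling the per-rack load doubles the density.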

Lights out management (LOM) is another technology enabler. Simply put, we've had a lot of pressure from our customers to make sure that no one is required to interact with a functioning server or storage system. Again, this is a long way from the implicit assumption left over from the mainframe era that there are "operators" for computers.

[Another aside: we are constantly reminded that if you want to build very reliable systems, the best thing you can do is keep people's fingers away from them. There are significantly non-zero probabilities that an operator coming in physical contact with a system, despite all best intents and training, will break something; not infrequently, by disconnecting a wrong cable or wrong disk drive.]

When we mix in virtualization and/or horizontal scale, we finally get to the place where a bit of code doesn't have to run on a particular computer, it only has to run on some computer. Thus, we can use mature techniques such as load balancing, along with emerging ones such as O/S paravirtualization and dynamic relocation, to abstract applications from computers. And that leads to service strategies such as fail-in-place, and a wholesale re-evaluation of things like hot swap and redundant power supplies.
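As a toy illustration of "some computer, not a particular computer", here is a hedged Python sketch of round-robin dispatch over a pool in which failed nodes are simply marked and left where they are. The Pool class and node names are hypothetical; this is not how Project Blackbox or any Sun product is implemented, just the general idea of load balancing combined with fail-in-place.

```python
class Pool:
    """A pool of interchangeable nodes; failed ones stay in place, unused."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.failed = set()
        self._next = 0  # round-robin cursor

    def mark_failed(self, node):
        """Record a failure. Nothing is swapped out; the node is just skipped."""
        self.failed.add(node)

    def pick(self):
        """Return the next healthy node in round-robin order."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            if node not in self.failed:
                return node
        raise RuntimeError("no healthy nodes left in the pool")


if __name__ == "__main__":
    pool = Pool(["node-1", "node-2", "node-3"])
    print([pool.pick() for _ in range(3)])
    pool.mark_failed("node-2")  # fail in place: no one touches the hardware
    print([pool.pick() for _ in range(4)])
```

The caller never names a machine, so losing one only shrinks capacity. That is what makes per-node hot swap and redundant power supplies worth re-evaluating.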

Clearly, this level of physical engineering attacks only a focused part of the complexity-at-scale problem, which is manifold. Given this qualification, Project Blackbox is a real, tangible step towards the purposeful engineering and mass production of modular infrastructure. The cobbler's children no longer have to go barefoot, and the industrial revolution can finally arrive for scalable computing.

However the market develops, I know my wife, Laurie, is relieved that Project Blackbox is finally, well, out of the box. For the past two years, whenever seeing a container anywhere, on the road, on a train, stacked aboard a ship, or sitting motionless at some job site, I'd predictably mumble "that could be one of ours...". And that would lead to my pleading for a commercial driver's license so I could haul them around on an 18-wheeler to different events. "Have you seen the way that you drive?" is the inevitable reply. Of course, I know she's right (and she's a far better driver than I, for the record).

But, Laurie, please, I'll only drive it on the weekends, and just around the block!

( Oct 17 2006, 12:00:00 AM PDT )

Comments:

This is really classical stuff.

Kudos to Sun.

BTW, Jonathan mentioned that a few of your customers requested enclosure detonation in case of a break-in. Are there provisions in place for that?

Just kidding... ;-)

Posted by Mayuresh Kathe on October 18, 2006 at 05:57 AM PDT #

Great work Greg.

What you've done is "Insanely Great"...

Posted by Steve Jobs on October 18, 2006 at 06:06 AM PDT #

Has some sort of option for a heavy-duty dust filtering system been engineered in so the military can use them in the desert? I can see them buying a ton of these.

Posted by 192.18.126.88 on October 18, 2006 at 12:57 PM PDT #

What do you mean "nobody has done it before"? I heard rumors of Google doing something similar, and APC actually launched a mobile datacenter last year: http://www.apc.com/resource/include/techspec_index.cfm?base_sku=ISXT440MD12RMBL APC's offering is incomplete since it lacks servers. Sun's offering is incomplete since it lacks UPS/generators. I've seen similar ideas for the military too, but it seems the US military prefers to do things inefficiently/incompetently. No matter, it's definitely a good idea. But you guys better hurry before Dell thinks it's a good idea too :p. Then Data Warehousing could take a new meaning.

Posted by Nobody on October 18, 2006 at 08:49 PM PDT #

Greg, One question about this concept - with redundant high-speed access to the computing farm, what is the primary advantage of making these mobile? I understand the value of engineering the environment (building, container) for the power, cooling and access needs of the datacenter, but why put it in a box? Is this left over from the desire to own the data on site for security reasons? An alternative would be to locate datacenters in areas that are cheap to cool (avoid hot climates) and have lower energy costs. String redundant fiber for communication, and move the data, not the datacenter. In any case, good example of in-the-box out of the box thinking!

Posted by Steve on October 19, 2006 at 09:29 AM PDT #

Greg, Another 'novel' idea from SUN. An obvious use for 'remote' business, education, military and/or natural disaster instances. Wish SUN engineers were creating that 'next BIG thing' within technology. Sadly, mobile shipping container enclosure 'datacenters' likely does not represent such. However, NEVER stop trying!

Posted by William Walling on October 19, 2006 at 03:57 PM PDT #

Greg, Use of 'commodity' 1/2U parts may be the fastest way out but surely it is not the best way to optimize power and volume ... the rest is very interesting. You did not start with a 'clean sheet' of paper on this.

Posted by Richard on October 21, 2006 at 03:42 AM PDT #

[Trackback] The geek news of the day is the unveiling of Sun Blackbox, a more than interesting gadget. The folks at Sun have started to think inside the box. While all the manufacturers try to miniaturize components, they...

Posted by think in blog on October 21, 2006 at 06:26 AM PDT #

Interesting stuff Greg. The choice of a standard in packaging, the 20 foot container, immediately provides you with an infrastructure for transport on rail, roads and the high seas. It gives a new meaning to infrastructure mobility through a drop and pick-up paradigm. I also find the case for hands-off operation implying higher reliability quite convincing and the Lego-like assembly of containers for scalability convincing as well. And finally, in terms of synergy, it capitalizes well on new directions that Sun is taking in heat-efficient chips and virtualization techniques. However it seems to me like you’re not considering international deployment. In that unsafe world at large the “standard box” model requires one of its least noticed features: anonymity. Containers are anonymous, you can never tell what’s inside by just looking at them. Your containers, proudly carrying the Sun logo on a black background, could be spotted a mile away by pirates and parasites that live on trying to know what is packaged inside a container, even as it travels on the high seas in Maersk ships. On the other hand, if applied to the international scale, the whole point would be flexibility and mobility, and in that scenario no one would want to wait for weeks while his container travels the planet on ship. So obviously, you must have considered that if you should go international with the box, local assembly points would be a must. It’s too bad that the standard 20 foot container is not a standard for air travel. I wonder whether it is worth considering a lighter, different, packaging that could use the air freight option. My final point is the following: we’ve always used the black box image to represent computer systems. That big mainframe over there is a black box with CPUs and storage and everything all packaged together and standing in the corner.
What you need to stress in your black box is that it goes beyond the previous concept, and is basically an atomic datacenter carrying its environment with it; and as such it can literally be dropped anywhere.

Posted by George Zakharia on October 21, 2006 at 07:12 AM PDT #
