sungrid link
Tuesday May 31, 2005
Too Risky, even for Lloyds

Gotta love the Register with its vulture logo. But I came across this headline, “Itanic sinks at Lloyds Register”, and just had to laugh...

As most of you know, we have been building out Sun Grid with a focus on Opteron-based processors. What is becoming increasingly clear is that there is a cost of power - namely, that power efficiency will be a critical element of any utility offering. We really notice this when we try to power cycle quite a few racks (32 nodes x ~350W = ~11kW/rack) at the same time and recognize that instantaneous power-on spikes (booting every node at once) could cause us to blow circuits. In fact, where we used to pay “per circuit” for power, many of our co-lo partners are now moving to metered power, just like we have at home. This really makes some sense because power in = heat out: though floor space is expensive, power density is a very large contributing factor to overall cost.
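To make the rack math concrete, here is a rough Python sketch. The node count and per-node wattage come from the numbers above; the 12 kW circuit budget and the 2x power-on inrush factor are purely illustrative assumptions, not measurements from our racks, and `boot_batches` is a hypothetical helper, not a real tool.

```python
NODES_PER_RACK = 32
WATTS_PER_NODE = 350  # approximate steady-state draw per node


def rack_power_watts(nodes=NODES_PER_RACK, watts_per_node=WATTS_PER_NODE):
    """Steady-state power draw for one rack."""
    return nodes * watts_per_node


def boot_batches(nodes, watts_per_node, budget_watts, inrush_factor):
    """Split a rack into power-on batches that stay under a power budget.

    Nodes that are already up draw watts_per_node; nodes that are booting
    are assumed to draw watts_per_node * inrush_factor (an assumption).
    """
    batches = []
    running = 0
    while running < nodes:
        headroom = budget_watts - running * watts_per_node
        batch = int(headroom // (watts_per_node * inrush_factor))
        batch = min(batch, nodes - running)
        if batch < 1:
            raise ValueError("budget too small to boot another node")
        batches.append(batch)
        running += batch
    return batches


print(rack_power_watts())                   # steady-state watts per rack
print(boot_batches(32, 350, 12_000, 2.0))   # nodes to power on per step
```

Under those assumed numbers the rack cannot be booted all at once, and each successive batch shrinks as more of the budget is consumed by nodes that are already running.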

This makes me doubly excited about our Niagara-based processors, which are expected to provide 8 cores at < 60W:

[Image: processor throughput gap]

“Imagine the potential impact to IT operations: a single blade shelf designed to do the work of 32 of today's 4-way servers; eight rack units instead of 160; less than 3 kilowatts of power versus 38; one blade system to manage instead of 32 servers.”

And I haven't even started to talk about the relative balance in performance that Chip Multi-Threading (CMT) provides in beginning to address the growing gap between CPU clock rates and memory latency, where traditional processor designs frequently sit waiting for the data they need to perform useful work.


Monday May 23, 2005
Telematics Update, Detroit

Sorry for my week-long hiatus.

Last week I attended a telematics conference in Detroit, MI (Telematics Update). One of the things that I realized once there is that telematics, something that I have been working on since 2000, still hasn't reached mainstream status; in fact, it is far from it. Furthest along is OnStar, but even they haven't been able to do something that I think is critical: develop a community of value around a technology platform.

It seems to me that in everyone's attempt to “own” the customer, and by doing so justify the expense of the infotronics gateway for telematics delivery through directed revenue, people have largely ignored the business models of very successful companies like DoCoMo, where a bet was made that the platform would take off once a suitably rich community existed.

I talked briefly with Scott McCormick, President of the Connected Vehicle Trade Association, and he agrees that we need a cross-industry community, which he is trying to form. But, to me at least, forming this community seems a chicken-or-egg situation. One of the big problems is obviously that the auto makers want to enforce a standard look and feel, and in doing so ensure compliance with “distracted driver” rules (who can blame them?), but they are also concerned that they could shoulder the $100-$300/car burden for the inter-vehicular gateway and not be able to properly capitalize on their investment. Gee, this is starting to sound a good bit like the challenges that we had with the J2ME handset profiles ;)

Anyways, after reading Ron Goldman and Richard Gabriel's excellent book, Innovation Happens Elsewhere, which I HIGHLY recommend, it became fantastically clear, to me at least: the automotive community must establish a community effort to standardize an appropriate interface for developers to build against, in a suitable community like the JCP, or risk having a consumer platform become the de facto candidate, wresting all control from this community (auto-TiVO(sm) anyone?)

Anyways, I'm pretty passionate about telematics because I think that this is one remaining untapped market that can drive business productivity, drive mobility interfaces (identity and contexts), and lastly could serve to improve my life balance...


Sunday May 15, 2005
Q: SunGrid cheaper than Beowulf?

I recently came across this analysis from Thom Hickey, with whom I have absolutely no relationship, so his analysis is totally his own [though I have taken the liberty of restructuring it].

So, let's take some of Thom's #'s:

Operating Cost (C) = $100,000/yr
Operating Cost/h (Ch) = (C) / 8760 hr/yr = $11.42/hr
CPUs (P) = 48
Hourly Operating Cost/P = $0.24/cpu-hr (which is cheap at this small scale)
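
The same arithmetic as a small Python sketch, for anyone who wants to plug in their own cluster. The function name is mine, and the only inputs are the annual operating cost and CPU count quoted above.

```python
HOURS_PER_YEAR = 8760


def hourly_cost(annual_operating_cost, cpus):
    """Return (cluster cost per hour, cost per cpu-hr) for an always-on cluster."""
    cluster_per_hour = annual_operating_cost / HOURS_PER_YEAR
    return cluster_per_hour, cluster_per_hour / cpus


cluster_per_hour, per_cpu_hour = hourly_cost(100_000, 48)
print(f"${cluster_per_hour:.2f}/hr for the cluster, ${per_cpu_hour:.2f}/cpu-hr")
```

Note that this per-cpu-hr figure assumes every CPU is busy every hour of the year, which is exactly the assumption that breaks down once you look at real usage.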

Now this doesn't look that high (though I'd suggest that these economics are akin to measuring the cost to produce energy using a home generator vs. an efficient commercial gas turbine), but when we really dig into how he is using his Grid, we can begin to see the real benefit of the utility business model:

utilization (U%): 48 cpu-hr/day / 1152 available cpu-hr/day = 4.2% of available resources
annual consumption: 48 cpu-hr/day * 250 workdays/year = 12,000 cpu-hrs, or $12k for the compute elements at $1/cpu-hr.

which isn't an efficient use of capital by any measure.

As Thom states: “Even if you throw in the occasional run-away process that burns up cpu time for a weekend, out-of-pocket costs should be under $30,000/year.”

In fact, even at roughly 4x the apparent per-cpu-hr cost, and allowing for testing, error-prone jobs, and even scaling up the jobs, we're still well under 50% of the cost of maintaining your own cluster.

Furthermore, we still haven't looked at the networking costs, the job management/operational costs, and other burdens. Sun Grid, for example, has an Infiniband option which provides a 4xIB non-blocking fabric. For those of you who haven't been watching Infiniband, it is quite different from bussed fabrics like Ethernet, and has performance that increases as nodes are added vs. the reverse for Ethernet. This provides a very interesting extra-chassis, backplane-like technology and is roughly 1/2 the cost of a 10GbE equivalent (per-port cost). Ask other competitors how their grids are architected; I'll bet that you'll see that not only is our business model unique, but so is our architecture.

A: it's not that SunGrid is necessarily cheaper, per se, but that multi-tenancy and shared expenses can make Sun's $1/cpu-hr a bargain, depending on your utilization. The value is only further bolstered by a substantial investment in high-performance infrastructure like the IB network.
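
To put a number on “depending on your utilization”, here is a hedged Python sketch using only the figures quoted in this post ($100,000/yr, 48 CPUs, 48 cpu-hr/day over 250 workdays, $1/cpu-hr). The function names are mine, and the break-even is a simplification: it ignores the networking and operational burdens mentioned above and treats the $100,000 as the whole cost of ownership.

```python
HOURS_PER_YEAR = 8760


def daily_utilization(used_cpu_hours_per_day, cpus):
    """Fraction of a day's available cpu-hours actually consumed."""
    return used_cpu_hours_per_day / (cpus * 24)


def annual_utility_cost(used_cpu_hours_per_day, workdays, price_per_cpu_hour):
    """What the same workload would cost when bought by the cpu-hour."""
    return used_cpu_hours_per_day * workdays * price_per_cpu_hour


def break_even_utilization(annual_cluster_cost, cpus, price_per_cpu_hour):
    """Year-round utilization at which owning costs the same as the utility."""
    available_cpu_hours = cpus * HOURS_PER_YEAR
    return (annual_cluster_cost / price_per_cpu_hour) / available_cpu_hours


print(f"utilization: {daily_utilization(48, 48):.1%}")
print(f"utility cost: ${annual_utility_cost(48, 250, 1.0):,.0f}/yr")
print(f"break-even: {break_even_utilization(100_000, 48, 1.0):.1%}")
```

Under these assumptions, a cluster like Thom's would need to be kept busy for roughly a quarter of all available cpu-hours, year-round, before owning it beats paying $1/cpu-hr.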


Monday May 09, 2005
Requirements for Architecture.next()

At Sun, we have long predicted the impending doom of systems that are often termed Generation 1 (G1), or as I call them, “linear architectures”: systems that are wholly built from a traditional 2/3-tier architectural approach, in which all elements of the presentation, business, and data management layers are constructed solely for a single application, not built for “sharing”.

For many reasons, G1 systems and approaches have had real longevity, so why won't they continue? After all, we have been using the same system paradigm since the mid-70's, and a continual drive towards commoditization and manufacturing has thus far survived major disruptions.

The sheer speed and power provided by the network of connected computing resources is driving a shift in the paradigm, from the network as a data transfer medium to the network as a fabric for the execution of complex distributed services. This is arguably one of the challenges facing our large-scale SMP product line (but we're not alone in this challenge - IBM's Mainframe line is really having some challenges).

The problem that is becoming more apparent is the exponential complexity introduced by a complex Service Oriented Architecture (loosely coupled, coarse/medium-grain, just-in-time compositional systems) when architected using the mechanisms traditional to G1 systems.

An example of this fractal complexity:

Company A includes a service from Company B in its SOA. Company B then compositionally includes services from C to complete its work, perhaps w/o A's knowledge. What happens if C uses the same service from A...

At best, this means that identity and privacy need to be further protected. At worst, we may have a re-entrant/recursive execution problem (yeah, single-box threading can be hard, but network-threaded apps where you don't have control of the components or source code?). We may also have “ilities” problems, as the G1 paradigm typically does not find the need to elaborate security, scalability, availability, and manageability models as part of the core design, since each “tier” is typically fully characterized only for use by the preceding component: “the EJB's only get called through OUR Servlets” or “the DB is only accessed through OUR EJB's”.
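
The A/B/C example can be sketched in a few lines of Python. The call map below is hypothetical and assumes someone can actually see every company's outbound dependencies, which in a real cross-company SOA nobody can; that is rather the point. With that caveat, a depth-first walk is enough to show the re-entrant loop:

```python
def find_call_cycle(calls, start):
    """Return the services forming a call cycle reachable from start, or None.

    calls maps each service name to the services it invokes.
    """
    path = []          # current chain of calls, in order
    on_path = set()    # same services, for fast membership checks
    finished = set()   # services already explored with no cycle found

    def visit(service):
        path.append(service)
        on_path.add(service)
        for callee in calls.get(service, ()):
            if callee in on_path:
                # The chain has come back to a service that is still executing.
                return path[path.index(callee):] + [callee]
            if callee not in finished:
                found = visit(callee)
                if found:
                    return found
        on_path.discard(service)
        finished.add(service)
        path.pop()
        return None

    return visit(start)


# Hypothetical composition: A calls B, B calls C, C calls back into A.
calls = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(find_call_cycle(calls, "A"))
```

In practice each party only sees its own edges of this graph, so the check would more likely have to travel with the request (for example, a call-chain identifier passed along with each invocation) than run over a global map like this one.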

With the movement toward component compositional models, SOAs, and Grid computing, we begin to realize that a new paradigm is afoot, with both its challenges and its advantages:

What are G.next's core challenges?

  • fallacies of networking
  • network “enforced” isolation through distribution
  • contiguous system memory vs. distributed memory
  • service operation model & recognition of partial failure
  • federation of core “identity services” including identity, context/role, entitlement

But it carries substantial advantages:

  • differentially granular and dynamically scaled systems
  • ability to take advantage of locality / proximity in execution
  • dynamic parallelization of workflow
  • ability to version/add functionality (carefully) in-vivo
  • declarative security levels and well known enforcement models
  • enforce isolation rules / best practices through interface “encapsulation”, resource management and controlled access

What are some of the major trends that I foresee being needed in the construction of G.next-based systems?

  1. Majority of time in planning and assembly vs. iterative construction
  2. Improvements in model annotation to capture systemic qualities and referential patterns (micro-architectures) so that non-functional behaviors can be better understood
  3. Declarative models vs. code (though pseudo code could be used for declaration for rich syntax)
  4. Federation of core services (a core tenet of SOA that many forget!)
  5. Ability to deal with distributed & distributable data
  6. Append only / constant query data models (fail in place, recover from clone)
  7. Omnipotent debugging, AOP and debugging languages like “D”
  8. Systemic (compositional) SLA management ... what is possible & what is desired
  9. Micropayment environment to allow for “for fee” SOA services
  10. Fine grained identity & entitlements to allow for security level agreements w/ tooling

More later!


Thursday May 05, 2005
The Old IT Is Dead. Long Live the New

I've been sitting on an article for a little while (4/18). I just found it refreshing that a journalist finally said something that made sense in explaining today's corporate buying & investor strategies.

“the industry is on the cusp of a sweeping change to new information technologies such as true mainframes-on-a-chip, Web services, and open-source software... Everything we talked about in the '70s, '80s, and '90s -- putting together clusters of PCs to replace big machines -- is finally happening.”

And finally a statement that I fully support...

“The fact that the next generation is less expensive does not mean that growth disappears. If you wind up uncovering significant new ranges of applications and you end up deploying them far more widely, you're going to dramatically expand digital services.”

I think that most everyone recognizes the growth of stored data that everything from compliance to RFID is driving to record levels, and it can only be expected that people are going to want to capitalize on the expense they are incurring through programs such as statistical analysis for business process optimization, product quality / time-dependent reliability, or financial planning. What was holding us back? I can only suggest that it had something to do with cost, something with scale, and something with complexity. Only by looking systematically across these factors can we finally realize a solution.

Sunday May 01, 2005
Disruptions and Discontinuities

A large number of people have been telling me over the past weeks that they cannot see their workloads shifting outside the “protected” four walls of a corporate data center. To that I respond: are your four walls really that protected? Most corporate intranets are only as secure as their resistance to the social engineering that continually compromises them; take the ChoicePoint incident, for example, which had no hacking involved.

The question I really ask: are you so sure that you cannot better manage corporate/state/local/federal policy through contracts with companies that make isolation and enforcement a priority because of their multi-tenant nature? Take, for example, an apartment building, where the common areas are secured to protect the dwellers despite the fact that some dwellers may not lock their own doors. For tenants who do lock their doors, the exterior doors add an additional layer of protection, under specific contract with the condo association. - Just a thought!

I then moved on to a review of Tom Friedman's new book, “'The World Is Flat': The Wealth of Yet More Nations”. In this review, Zakaria relates a section of the book that really typifies why I think that multi-tenant utilities like Sun Grid will inevitably have the loads to make them work:

Jerry Rao [an Indian entrepreneur], explained to [Tom] Friedman why his accounting firm in Bangalore was able to prepare tax returns for Americans. (In 2005, an estimated 400,000 American I.R.S. returns were prepared in India.) “Any activity where we can digitize and decompose the value chain, and move the work around, will get moved around.”

Specifically, where there is a need which cannot be met with existing resources (financial, staff, or other), innovation will naturally fill the demand. With these disruptions/discontinuities, where new processes so fundamentally shift the economics vs. the old businesses, it becomes easier for people to recognize that changing, and in some cases standardizing, is worth it!

My copy is on order, can't wait.


Disclaimer: These are the express views of Dan Hushon, and in no way are indicative of the views, strategies nor plans of Sun Microsystems, Inc.
All content on this website (including text, photographs, audio files, and any other original works), unless otherwise noted, is licensed under a Creative Commons License.