The Sun Blade Blog

Friday Apr 04, 2008

Clear Up Sun Blade I/O FUD*

I've seen and heard a lot about our blade I/O strategy.  The most popular criticisms have been:

1) Why doesn't Sun get it? We [the customer] need switches to reduce cables, lower cross charges, etc.

2) If you believe in no in-chassis switching, why do you offer a 'Switched' InfiniBand NEM?

Let's take a moment to open (1) and (2) to discussion.  I'd like to hear customer feedback and take the opportunity to clear up some FUD / myths.  Wish me luck!

(1) We've all seen the IBM commercial with a ball of cables growing infinitely in all directions.  It's clever.  But let's remember that cabling is a one-time experience.  Re-cabling times have been reduced with virtualization and re-provisioning software from Sun and third-party vendors like Scalent.

Cable counts do drop with in-chassis switches, but customers are encouraged to do a total cost of acquisition analysis.  If you plan to deploy three or more ports per blade, you'll see increased hardware costs and increased switch deployment costs with in-chassis switching.  Basically, the cost per port skyrockets.  There are also the intangibles:

    - What are the costs associated with learning a new network topology and proprietary software from vendors? Is it worth the additional lock-in?

    - What are the required security policy changes between the network and server groups at your company?

    - The points of failure inside the chassis increase. How does this affect your SLA?

    - Are the in-chassis switches as capable as traditional switches from Cisco or Brocade?
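
For readers who want to run the total cost of acquisition comparison themselves, here is a minimal sketch in Python. It is not a Sun tool. The function name, the parameters, and every price and count in the example are hypothetical placeholders, so substitute your own quotes. It assumes each I/O module (pass-through or switch) serves a fixed number of blade ports, and that every cable leaving the chassis uses one port on an external switch.

```python
from math import ceil


def acquisition_cost(blades, ports_per_blade, module_cost, ports_per_module,
                     cables_per_module, cable_cost, external_port_cost):
    """Return (total cost, cost per blade port) for one chassis.

    All inputs are assumptions supplied by the caller; nothing here
    comes from a vendor price list.
    """
    blade_ports = blades * ports_per_blade
    # Number of I/O modules needed to serve every blade port.
    modules = ceil(blade_ports / ports_per_module)
    # Each cable leaving the chassis lands on one external switch port.
    cables = modules * cables_per_module
    total = modules * module_cost + cables * (cable_cost + external_port_cost)
    return total, total / blade_ports


# Placeholder figures for a hypothetical 10-blade chassis, 3 ports per blade.
pass_through = acquisition_cost(
    blades=10, ports_per_blade=3,
    module_cost=1000, ports_per_module=10,
    cables_per_module=10,           # one cable per blade port
    cable_cost=10, external_port_cost=100)

in_chassis = acquisition_cost(
    blades=10, ports_per_blade=3,
    module_cost=8000, ports_per_module=10,
    cables_per_module=2,            # uplinks only
    cable_cost=10, external_port_cost=100)

print("pass-through: total=%d, per port=%.2f" % pass_through)
print("in-chassis:   total=%d, per port=%.2f" % in_chassis)
```

Vary ports_per_blade and the module prices to see where the two options cross over for your own deployment. The intangibles listed above are not captured by this arithmetic.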


(2) Offering an IB switch for Sun Blade 6048 may seem hypocritical, but it's not.  This switch is designed to work with Sun Datacenter Switch 3456.  This product combination was specifically designed for the HPC markets (academia, large industry, research).  If viewed from an HPC perspective only, our offerings look very attractive.  Customers get high-speed, low-latency IB performance that is unmatched in the market.  HPC is all about very (x1000) fast number crunching and Sun does it best.

Independent I/O Is Fast, Real Fast

* FUD => Fear, Uncertainty, Doubt


Comments:

From looking at the NEM module diagram on SunSolve, it appears that the board is totally passive. Thus, even if Sun were to open the specs, a third party such as Brocade/Cisco/Foundry/etc. could not develop a blade to slot into the chassis for Sun's customers to use.

That said, our Windows team had a Dell blade system with a built-in switch. They hated it almost as much as the network team despised it. The thing caused more problems on the network than anything else. They always had issues with VLANs and knowing which ports went where.

In the end, simply having a pass-through is probably best. You can always reverse mount a 24-port GigE switch with a pair of 10G uplinks right above the chassis and use that. Same thing with a pair of Brocade 200Es (they want to be reverse mounted anyway, so it is rather easy).

Anyway, my 0.02 worth.

Posted by John on April 04, 2008 at 11:15 AM PDT #

We picked the X6000 solution almost entirely BECAUSE of the open I/O architecture. The only way to get 10Gbps Ethernet on other major vendor blades is with switch options. These are very expensive and present a single point of failure we've had the fortune to experience with our previous switch-in-a-chassis solution.

Our problems were not sustained. Rather, they were the really annoying, intermittent kind (Etherchannel issues). The kind that can only come from software. It was eventually fixed by a code update, but this very update caused yet another outage. Cables don't have software.

Oh, and it's not FUD if what you're saying is true. Internal switches DO reduce cabling. If reduced cabling is your design priority (as with your own HPC example), then that's the better choice. Reduced cabling wasn't our primary concern. But it might be for someone.

Posted by Charles Soto on April 04, 2008 at 08:03 PM PDT #

We have both Sun & Dell in our shop.

In 2006, the Dell blade chassis with integrated Cisco & Brocade was a very attractive story as it promised a lot of capacity and throughput, with localized complexity, to help us with the coming virtualization trends. The entire ingress/egress for each chassis of processing was limited to about 12 cables, which was a great story for our operations folks.

We bought in, but it was a trap! We got caught in the middle of Dell's product cycle. That generation of chassis is being replaced by the next generation, and there is no upgrade path for the integrated switching and fabric, which were quite expensive. So here we're literally stuck with 512 ports of Ethernet and FC that are trapped in these chassis frames, with no compute refresh for the frame on the horizon.

If we had gone with pass-throughs, or Sun's offering, we would have been able to replace the blades at the typical 3-4 year compute cadence, and carried on with the normal 6-8 year lifespan of switching and fabric separately.

Second trap! The Windows jockeys in our shop didn't see the value of virtualization or consolidation, as it took them out of their comfort zone. So they used the chassis resources as single-purpose server blades, which made the overall utilization of the resources pitiful and the cost per workload absolutely exorbitant.

Lessons learned:
- Decouple commodity switching and fabric from your compute purchases and manage their lifespans independently. Your finance department will thank you.

- Get a longer roadmap and contractual guarantees up front from your vendor if there's a possibility of a refresh trap.

- Don't consider costly dense technology unless your team is forcibly mandated to use the technology to its maximum utility.

Life wouldn't be so interesting if we didn't make these mistakes.

MK

Posted by Michael Kennedy on April 05, 2008 at 04:25 AM PDT #

As you mentioned IBM blades and their built-in switches: we have several of these blade centers, and our networking people really hate the switches. They have limited capabilities and cause problems with the rest of the network infrastructure.

Posted by Christian on April 07, 2008 at 12:50 AM PDT #
