I went and saw a neat presentation by Scott McCord at the Oklahoma City OpenSolaris User Group meeting last night. He mainly talked about our Unified Storage 7000 products, building up the rationale for why a customer would want to use them.
The timing was awesome as it dovetails into my exploration of the systems. One of the key takeaways for me is that the morphing of OpenSolaris offerings into Unified Storage was not a small step. Yes, it is built on the shoulders of everyone who has ever worked on Solaris, but there are a lot of small and subtle changes which really transform a collection of hardware into a powerful appliance.
I'm going to highlight what I learned, which may or may not dovetail into what Scott presented.
I had kept wondering why Sun wasn't exploring NVRAM, and the answer is Write Optimized SSD. Sweet! And then Scott showed a slide of a clustered 7410, and the Write Optimized SSD wasn't in the head; it was with the disk shelves. The implications were immediate:
This went over well in the room and I think might actually be one of the hidden definitions of OpenStorage. :-> It is simple, you buy one of our appliances and everything that is supported by the hardware is immediately open for you to start deploying. You don't have to contact a sales rep, you don't have to rush through a PO because your management chain decided to start using your NFS appliance as a CIFS appliance as well.
And when you download the Sun Unified Storage Simulator, you can play with all of the features right away. You don't have to ask for permission.
This is basically a fancy way of saying that we have a graphical frontend to DTrace. And DTrace is the coolest customer support tool ever. It sounds weird, but let me explain.
Before DTrace, if a customer was hitting an issue that didn't present a core dump and static trace points weren't capturing the data, the only solution was to prepare a custom kernel to ship to the customer. This would entail the developer guessing at what data they wanted to collect, unit testing, and passing the resultant kernel off to QA. They would then do unit testing, perhaps try to recreate the customer environment (and a shout-out goes to Bill Snider who was really good at doing this when I was at NetApp), and then do regression testing.
The kernel would then go to the customer. And they would then normally run some of their own regression testing. They would then put it on the production system. If you were lucky, you got the data you needed. If you were unlucky, then you either had to roll another version (with the customer losing faith in your abilities) or you got a core dump. And again, you might still get lucky and catch the problem in the core. But you really had an irate customer either way.
With dynamic tracing, you remove that whole cycle of verifying a custom build. And remember, that cycle could be several weeks long! Instead, you piggyback right off of a well tested product and the resulting quality. You focus right in on the issue at hand. And even better, there are some pretty sharp sysadmins out there who can run DTrace on their own!
Along these lines, I asked Scott to include the video from Brendan Gregg's blog on Unusual disk latency (video is here).
I'd heard out at Connectathon 2009 that our appliances would fall over if you shouted at them. I knew that our competitors were going to use this as FUD. So I was really happy to find Brendan's openness about the issue.
But don't lose sight of Brendan's real message here - Sun has the only tool dynamic enough to measure the impact of shouting at a disk. Any competitor's disks are going to react the exact same way in the face of a shout. But can they measure the impact without that cost of developing special static trace points?
My point to Scott is that the video is exciting to watch (thanks to Brendan's upbeat personality) and drives home the point of what DTrace and Analytics bring to the table.
Someone in the audience stated that they just got a Thor box on a Sun Try and Buy program and wanted to know if they could convert it directly to a Unified Storage product. The short answer is to turn that Sun Fire X4540 Server back in and get a Sun Storage 7210 Unified Storage System out for an evaluation.
Scott didn't blindly recommend this. He asked the audience member what they were doing, and the answer was that it was intended to be a CIFS home-directory server. If the answer had been that they wanted to run something on the box, then the recommendation would have been to stick with the Thor.
The point here is that the Unified Storage is optimized as an appliance. We aren't going to let you run a program on it. The focus is to leverage the reliability of OpenSolaris, the rock solidness of the underlying hardware, the experience we've had in configuring ZFS, tested hardware configurations, and the ease of use of the BUI.
We haven't moved away from our stance that we have great OpenStorage systems that allow you to run your application right on the server. Instead, we are focusing on taking that same hardware and software base and turning it into an appliance tailored to serving up data. Our value add is the way you interact with the appliance, the testing we put into both the software and hardware configurations, etc. I.e., we make sure the configurations work right out of the box. And in a stroke of genius, we make sure that the configuration of the box is simple.
Tom,
Awesome, that's a very nice, concise report. Hope to see you again at the next meeting!
BB
Posted by Bryan Boden on March 13, 2009 at 09:40 PM CDT #
Agree about the advantages of write-optimized SSDs as an alternative to NVRAM. Providing mirrored write cache has to be one of the biggest engineering challenges in designing storage systems. Moving the write cache to a dual-homed SSD external to the controllers is both brilliant and simple.
Posted by Mark on April 05, 2009 at 02:33 PM CDT #