Wednesday Sep 24, 2008

Saving a Fortune in Data Warehousing

UPDATE at bottom.

I just wanted to extend my congratulations to the team at Greenplum, and our joint customers at Fox Interactive Media - the folks behind MySpace, Photobucket, IGN, FOXSports.com, and a whole series of web properties that together represent one of the single largest audiences on the web.

All three of us announced today that Fox is running a massive production data warehouse built atop Greenplum's data warehousing software on Sun's Solaris/ZFS based OpenStorage platforms (a sea of Thumpers, to be specific). That is to say, open source software is at the core of one of the world's largest - and most affordable - data warehouses.

Fox joins a series of joint Sun/Greenplum customers, from LinkedIn to the New York Stock Exchange, in looking to open source databases and innovation as a vehicle to drive better insight, faster decisions and more efficiency.

Which is to say, customers that are tired of proprietary vendors with a knack for raising license fees during economic downturns have a clear set of remarkably affordable alternatives. Based on commodity economics everyone can understand.

Congratulations to all involved!

______________________________

UPDATE: I've gotten a fair number of inquiries from folks wanting to know how the Greenplum/Thumper data warehouse discussed above prices out against its competitors - given that one recently announced proprietary entrant has suggested $15,000 per terabyte is acceptable to customers. My view is that's a pre-bubble price, and roughly an order of magnitude too expensive in today's market - and unlikely to garner more than headlines. But that's obviously a biased view; I'd check with a few customers to find out what they want to pay.


Sunday Sep 14, 2008

Of Wine, Virtualization and xVM

A few years back, I remember sitting with a group of customers talking about wine, and virtualization (a natural pairing, if ever one existed). Wine, because we were at an event Sun was hosting in Napa Valley, the heart of California's wine country - virtualization, because the attendees were data center professionals who'd come to talk about the future.

The customers in attendance all ran very high scale, high value data centers, and would deservedly respond to the accusation that they "hugged" their servers with "and what of it?" They were the individuals who kept some of the world's most valuable systems running with exceptional reliability.

But they were all starting to see and worry about the same thing: running applications in "virtualized" grids of networked infrastructure ("cloud computing" wasn't yet in vogue, or I'm sure someone would've used the term).

Now, virtualization is a simple concept with a fancy name (abbreviated to "v12n" by the cognoscenti - by that method, I am "j14z"). It's simply slicing up physical computers into many smaller "virtual" computers, each of which can be outfitted with its own OS and application stack.

That is, not only does a virtualized computer take on the task of running multiple OS's (running atop a hypervisor, described below), but the OS's themselves might change over time, responding to load or schedule. The traditional view of "computer A runs OS/Application B" can now give way to a more responsive "these computers are available for high priority work," without regard to operating system or architecture. A spike in on-line shopping might reallocate more "virtual" machines to transaction processing during peak shopping hours, shifting to a different OS/app stack when the frenzy dies down. Capacity moves from fixed to fungible.
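For the technically inclined, here's a minimal sketch of what "fixed to fungible" might look like in code. It's purely illustrative: the `rebalance` function, the workload names and the slot counts are all hypothetical, and a real virtualization manager would weigh far more than raw demand. The idea is simply that a pool of virtual machine slots gets re-divided as demand shifts.

```python
def rebalance(total_slots, demand):
    """Split a pool of VM slots across workloads in proportion to demand.

    total_slots: number of virtual machines the pool can host (hypothetical).
    demand: maps a workload name (an OS/app stack) to a relative demand weight.
    """
    total_demand = sum(demand.values())
    if total_demand == 0:
        # Nothing is asking for capacity, so nothing is allocated.
        return {name: 0 for name in demand}

    # Give each workload its whole-number share of the pool first.
    shares = {name: total_slots * d // total_demand for name, d in demand.items()}

    # Rounding down can leave a few slots over; hand them to the busiest workloads.
    leftover = total_slots - sum(shares.values())
    for name in sorted(demand, key=demand.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    # Peak shopping hours: most of the pool goes to transaction processing.
    print(rebalance(100, {"transactions": 9, "reporting": 1}))
    # After the frenzy dies down, the same hardware shifts to a different stack.
    print(rebalance(100, {"transactions": 2, "reporting": 8}))
```

With the made-up numbers above, the same 100 slots run 90 transaction-processing machines at peak and 80 reporting machines off-peak; no hardware is bought, moved or rewired in between.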

Although desktop virtualization wasn't the focus of these customers, most live in a world with multiple desktop OS's, too - it's not that they all (like me) run five different desktop OS's, most don't - it's that they have multiple generations of Windows, or no longer have the source code to legacy applications, a condition that dictates you keep old OS's (and hardware) around. Desktop virtualization enables users to run multiple OS's side by side on a single desktop, and divorces software upgrades from hardware upgrades (an innovation keeping CIO's and developers smiling).

Back to the datacenter, virtualization can enable extreme infrastructure consolidation - decoupling applications from hardware drives more efficient capacity planning and system purchases. And as exciting as that was to everyone, if things went wrong, you could also tank the quarter, blow those savings and end your career. So, why all the anxiety?

If I could sum it up, these customers worried that virtualization would dissolve the control they'd carefully built to manage extreme reliability. In essence, they could hug a virtualized mainframe or an E25K (hugging is the act of paying exquisite attention to an individual machine), but it's far harder to hug a cloud. Nor can you ask a cloud why it's slow, irritable, or flaky, questions more easily answered with a single, big machine.

As the wine soothed their anxieties, a few of them began to draw out their vision of an ideal cloud environment (our laptops were open to take notes). Summarized, here's what they wanted:

Extreme diagnosability. Datacenter veterans know that things rarely run as planned, so assuming from the outset you're looking for problems, bottlenecks or optimization opportunities is a safer bet than assuming everything will go as expected. They all wanted ultimate security in responding to the question "what if something goes wrong?" - their jobs were on the line.

Second, they wanted extreme scalability - they all believed the move toward horizontally scaled grids (lots of little systems, 'scaled out') would give way (as it always does) to smaller numbers of bigger systems ('scaled up'). We're seeing that already, with the move toward multi-core CPUs creating 16, 32, 64, even 128 way systems in a single box, lashed together with very high performance networking.

But scalability applies to management overhead, as well - having 16,000 virtualized computers is terrific (like 16,000 puppies), until you have to manage and maintain them. Often the biggest challenge (and expense) in a high scale datacenter isn't the technology, it's the breadth of point products or people managing the technology. So seamless management had to be our highest priority, with extreme scale (internet scale) in mind.

They wanted a general purpose, hardware and OS independent approach. That is, they wanted a solution that ran on any hardware vendor they chose, not just on Sun's servers and storage, but Dell's, IBM's, HP's, too. And they wanted a solution that would support Microsoft Windows, Linux and not just Solaris. Ideally embraced and endorsed by Microsoft, Intel, AMD, and not just Sun.

And finally, they wanted open source. After years of moving toward and relying upon open source software, they didn't want to reintroduce proprietary software into the most foundational layer of their future datacenters. Some wanted the ability to "look at the code," to ensure security, others wanted the freedom to make modifications for unique workloads or requirements.

And with that feedback, the answer to the above seemed obvious to one attendee, "why can't you guys just use Solaris?" They all ran Solaris in mission critical deployment, all appreciated its performance, they loved the diagnosability (delivered via DTrace), and the capacity to scale to the largest systems on earth. It was the perfect answer until one of the customers asked, "do Windows customers want to run Solaris? I don't think so." The "Solaris" brand didn't convey OS neutrality - and that neutrality was core to what we were thinking. But we knew the underlying inventory of OpenSolaris innovations would certainly give us a fabulous headstart.

That's the rough backdrop to what drove our virtualization announcements last week - a desire to solve problems for developers and datacenter operators in multi-vendor environments. If you look to the core of our xVM offerings, you'll see exactly how we responded to the requirements outlined above: we integrated DTrace for extreme diagnosability. We leveraged the scale inherent in our kernel innovations to virtualize the largest systems on earth. We've built a clean, simple interface to manage clouds (called xVM Ops Center, click here for more details), to address management and provisioning for the smallest to the largest datacenters. And everything's available via open source (and free download), endorsed by our industry peers (watch these launch videos to see Microsoft and Intel endorse xVM - no, that's not a typo, Microsoft endorsed xVM). We even leveraged ZFS to get a head start on storage virtualization (the next frontier).

And why call it xVM? To make sure everyone knew we weren't simply targeting Solaris - xVM virtualizes Microsoft Windows, Linux (Ubuntu, RHEL, all other distros) alongside Solaris (8, 9 and 10). Customers can consolidate those operating systems, and similarly consolidate their hardware infrastructure - and use xVM Ops Center to manage and maintain the whole plant.

This week, we're unveiling a full line of desktop to datacenter virtualization offerings, covering desktop virtualization (xVM VirtualBox), datacenter virtualization (xVM Server), high scale management (xVM Ops Center), and Virtual Desktop support (xVM VDI and SunRay). All endorsed and supported by the industry, and all in use by some of the most powerful customers on earth.

And to that end, I'd like to offer my thanks to the customers who were present at that event a few years ago, and offer my sincere congratulations to the teams involved in bringing xVM to market, across Sun and our partner community.

With all the celebration around xVM, perhaps our next customer event should be held in Champagne...


Sunday Sep 07, 2008

Fanning the Winds of Change in Storage

It's been over a month (and three hurricanes in America) since I've posted a blog. More than a few of you've noticed - thanks for the prodding...

It's been a busy summer, on nearly every front. Customer activity hasn't slowed down, and the good news surrounding the (otherwise unfortunate) economic crisis embroiling many customers (especially those in the financial services industry, a heavy concentration for Sun) is that it's whipping up the winds of change. Customers facing spending pressure, or tiring of vendor price increases, have new options, and there's a new appetite to explore those options (nothing like mandates from the CEO to reduce spending by 50%).

One of my more interesting recent meetings wasn't with a customer, though, it was with an equity analyst from a global financial institution. Equity analysts publish research that feeds the investment community - their (free) research and financial analysis accompanies buy/hold/sell recommendations to investors (who hopefully generate trading fees for the analyst's employers).

This one analyst hadn't historically followed Sun, and was in the process of developing his first rating. He wanted to focus on our storage plans - more and more of the customers whom he interviewed were focused on storage, and many were talking up a specific open source software technology: ZFS. (Before meeting with me, he'd talked to colleagues in his own IT shop, and was impressed to find some who admitted to running ZFS at home - nothing like touching your customers where they live... if you'd like to have ZFS sent to you, click here or on the LiveCD shown at right.)

Granted, you can see an increasing focus on storage at Sun - the acquisition of MySQL is as much a storage acquisition, as an enhancement to Sun's developer offerings. Discussions of flash memory, the economics of archiving, the Lustre parallel file system, all point to an increasing focus on what Sun sees as an exceptional opportunity for customers (and thus, investors). Storage and computing are converging - and we're about to bring the trends that transformed the server industry a few years ago (mass engagement in open development communities, and scale achieved via clusters of commodity parts vs. proprietary technologies) to the historically closed and proprietary storage industry.

Now, the notion of "engaging customers in open development communities" doesn't sit well among some traditional storage analysts (or our competition) who believe "Storage is too mission critical to tolerate open source software." Although I appreciate that wisdom and experience, I think the market's more nuanced than that - mission critical environments don't tolerate unsupported software, true, which is why we offer 24x7 commercial support for ZFS (on Sun hardware, and Dell, even). But broad global adoption of key open source projects will continue to drive change deep into the world's datacenters. Gartner's prediction that 90% of the world's companies will run open source software didn't specify where they'd be running it - "everywhere" is the safest bet.

But back to the equity analyst - he patiently asked, "Great theory, but when will you see revenue results?"

"Last year," I responded. "You're seeing it accelerate."

As many folks know, we shipped our first ZFS based storage systems in 2007 - known as Thumpers. Thumpers finished up this last year generating around $100m in billings, up 80% year over year. From a capacity perspective, we delivered roughly 90 petabytes of Thumper storage in FY2008, to some of the most demanding storage installations on earth (up ~200% y/y). What's fueling the growth? Adoption of ZFS is a clear driver (this chart gives you a sense of where we're seeing adoption - thus revenue opportunity). But ultimately, customers are recognizing they can save money, space and power. Thumpers are roughly twice the capacity in half the space at half the cost of the competition - $1.20/Gigabyte. (They also run Windows and Linux with the same hardware economics).
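If you'd like to check the arithmetic behind those figures, here's a rough back-of-the-envelope sketch. It rests on a few assumptions of mine that aren't stated above: decimal units (1,000 gigabytes per terabyte), "up 80%" meaning this year is 1.8 times last year, and "up ~200%" meaning roughly 3 times. The function names are just for illustration.

```python
GB_PER_TB = 1000  # assumption: decimal units, the way storage is typically quoted


def price_per_tb(price_per_gb):
    """Convert a price per gigabyte into a price per terabyte."""
    return price_per_gb * GB_PER_TB


def prior_year_value(current, growth_pct):
    """Back out last year's figure from this year's figure and the y/y growth.

    Assumes "up N%" means current = prior * (1 + N/100).
    """
    return current / (1 + growth_pct / 100)


if __name__ == "__main__":
    # $1.20 per gigabyte, expressed per terabyte.
    print(f"Thumper price per TB: ${price_per_tb(1.20):,.0f}")

    # Around $100m in billings, up 80% year over year.
    print(f"Implied prior-year billings: ~${prior_year_value(100, 80):.1f}m")

    # Roughly 90 petabytes delivered, up ~200% year over year.
    print(f"Implied prior-year capacity: ~{prior_year_value(90, 200):.0f} PB")
```

On those assumptions, $1.20/Gigabyte works out to about $1,200 per terabyte, the prior year's billings to a bit under $56m, and the prior year's delivered capacity to roughly 30 petabytes. These are implied figures, not reported ones.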

Now, our view is "OpenStorage" (systems built from commodity parts and open source software) will grow far faster than the proprietary storage market. We plan on driving that growth, and over the next few months, you'll see a tremendous amount of storage innovation targeting the growing breadth of customers wanting better/faster/cheaper/smaller options. Expect to see flash, ZFS, DTrace, and good old fashioned systems engineering play a very prominent role in an aggressive push into the storage market.

And in case you missed our announcement last week, our progress was validated by industry analysis - IDC said customers are growing their disk storage business with Sun far faster than with any of our proprietary competition. And at three times the rate of the overall market's growth. A great place to start.

If you'd like to know more, and might be interested in taking a Thumper system for a free trial run, just click here and pick the country in which you're located. We supply most systems at Sun for free trials across the globe (yes, we even cover shipping to you). If you like the system, please buy it. If not, we'll take care of getting it returned to Sun, you owe us nothing. (That's the closest we can get to free hardware downloads...)

As I said to the analyst, you need only look to the results we're already delivering to see the linkage between open innovation and revenue growth. ZFS won't transform demand for our legacy products, but it'll certainly transform the opportunity and industry unfolding before us. But don't just take our word for it - the best folks to validate our approach aren't at Sun, they're among the storage buyers finally feeling the winds of change - at their backs.
