Taylor's Take on Sun Storage : Weblog

My storage team and I focus on three of the most important aspects in any industry: customers, competitors and market trends. There is insight to gain and share in this role, so here is our take on Sun and Storage - Taylor Allis


Tuesday Apr 29, 2008

Sun is on to something - Open Storage

I freely admit, when Sun announced its open source storage community a year ago I was a skeptic.  Sure, open source has its play in software and servers - but storage?

Well, after a year of watching Sun's open storage investments, industry-standard hardware used in enterprise storage and working within Sun on today's announcement - Sun Extends World's First Open Storage Platform - I think Sun is on to something...

Also read all about Sun Open Storage here.... 

I'll make a brief case for open storage and Sun's leadership in it here.   We have also developed a series of open storage White Papers that give more detail on the below info - I'll post them here.  

What is Open Storage? 

Here is a simple definition:  Open storage is the combination of open source software with industry-standard hardware to create enterprise-class storage systems

Open source software like Linux or OpenSolaris OS.  Open source applications like MySQL database software.  And Sun has been one of the first companies to break the barrier with higher-level open source storage applications which include:

Industry-standard hardware is typically available through multiple vendors and is very price-competitive.  Examples include x86 servers and standard FC/SATA/SAS disk drives.  One could also include LTO tape because it is an industry-standard tape technology - but I'll primarily focus on disk systems, as this is the market that will be most impacted by open storage. 

In an open storage architecture, the customer selects the best hardware and software for the job.  In contrast, almost all of today's disk arrays and NAS appliances are closed - customers are locked into using the vendor's disk drives, controllers and proprietary software. 

The irony is that a lot of closed systems are built from open source software and industry-standard hardware - helping vendor margins but not customer budgets.
    

The Evolution of Disk Architectures

Our brilliant disk analyst, Bruce Norikane, also points out that industry disk systems have been evolving to more open architectures over time and with each new market introduction.  A similar trend has also happened in the server market.  Consider the graphic below:   

 

Early disk systems were custom, proprietary engineering projects starting with IBM's SLED (Single Large Expensive Disk) in 1956 where everything was custom.  Then in the 1980's a high-volume disk market emerged thanks to PCs and servers; and in the 1990's Enterprise RAID was adopted.  Enterprise RAID incorporated a custom disk controller and these new market drives.  Modular storage then hit the market, consisting of a separate controller and disk enclosures that fit in a standard rack - more flexible and affordable.  Most recently we have seen the RAIN (Redundant Array of Inexpensive Nodes) architecture emerge - distributed storage based on server technologies offering better scalability at a better price point.  RAIN architectures are largely based on industry-standard servers, operating systems and networks.  However, while RAIN systems leverage open components, they tend to be built as closed systems with locked-in components from traditional vendors.

The next logical step is open storage - industry-standard hardware and open source software that drive down storage economics and spur greater innovation.  Again, this storage systems evolution is not unlike what happened in the server world - where servers were large, proprietary and expensive years ago. Smaller, industry-standard servers and open source software changed the economics in the server market - and they are doing the same thing to storage.

Why Open Storage?   

Four reasons:

1. Enterprise-class storage:  Systems that offer as much or more quality, reliability and data integrity as closed systems.  Sounds like a stretch?  Just see the InfoWorld review of Sun's x86, SATA, open source software-based archive solution.  It scored perfect 10s in reliability and scalability.  Let's also not forget that ZFS offers 19 9s of data integrity with predictive, self-healing features.          

2. An Open Storage Software Community:   This is important if you are a developer, a company that has developers or a company that is planning on hiring developers to differentiate through IT.  When we launched OpenSolaris Storage last year we had only a handful of open source projects - we now have over 30.  The OpenSolaris community has more than 96,000 registered members in all.  Why is this important?  Customers don't have to wait on a vendor for the features they need - they can find new innovations in the community or develop features themselves.  Innovation is not held back by vendor objectives or limited R&D budgets...  

3. Breakthrough Economics:  Probably the most compelling argument for open storage and why open storage is needed today.   The best way to understand how open storage can impact storage economics is through an overly-simplistic diagram of a closed system:

 

Now consider an open storage architecture:

 

In a nutshell, storage applications are free from licensing costs and open to developers.  Open storage users can choose the platform their IT staff is most familiar with.   An industry-standard server with ZFS (which again includes RAID, data management and data integrity features) can take the place of an expensive controller.   And affordable, market-priced disks can be deployed under the system - even fast and cheap JBOD if you leverage ZFS.

To see the real-world impact we compared some closed systems vs. open systems using Ideas International pricing:



In full disclosure, these are US list prices.  We did configure every system to be as close in capacity as possible, using affordable SATA drives in almost every configuration.  And finally, certain applications, features and environments simply must run higher-end arrays today - I am not implying that everyone throw out their closed storage and go with 100% open.  But at this price difference, users will be compelled to determine which applications and which data should migrate to more open storage - and we presume the data center mix of open vs. closed storage will change over time.

You can also read about the open storage impact in the VTL space here...

4. Dynamic Scalability: Lastly, the ability to dynamically and efficiently scale to meet today's huge data demand has become business critical, especially with emerging Web 2.0 applications.  Sun sells systems that scale from less than 10TB to greater than 100PB.  And OpenSolaris ZFS is a 128-bit file system that provides 16 billion billion times (yes, you read that right) the capacity of 64-bit file systems, and vastly more than that compared with 32-bit ones.
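
For the curious, here is a quick back-of-the-envelope check of that 16 billion billion figure.  This is my own arithmetic, not an official Sun calculation: it assumes "capacity" means the size of the address space, that the comparison is against a 64-bit file system, and that "billion billion" is read in binary units (2^60).

```python
# Ratio of a 128-bit address space to a 64-bit one.
ratio = 2**128 // 2**64            # equals 2**64

# Assumption: "billion billion" is taken as 2**60 (binary units).
binary_billion_billion = 2**60

print(ratio)                             # 18446744073709551616
print(ratio // binary_billion_billion)   # 16
print(f"{ratio:.3e}")                    # 1.845e+19 in decimal terms
```

Read in decimal terms the multiple is closer to 18 billion billion, so the marketing number is, if anything, slightly modest.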

Sun Open Storage 

Sun offers every component of open storage:  A tried-and-tested enterprise platform in OpenSolaris, the leading HPC file system Lustre, and the open source storage applications mentioned above.  Sun also offers a complete portfolio of innovative and efficient industry-standard servers and storage. 

Sun has, and will announce, storage systems built on an open storage platform as well - Sun Fire X4500 and Sun StorageTek 5800 are Sun's first products built on a truly open storage platform. 

Sun also offers open storage services and resources through its community :

Sun Open Storage Customers

More compelling than anything I can write is what our customers say about Sun and open storage:

DigiTar
DigiTar provides advanced messaging security and processing services over the Internet.  They are using OpenSolaris, Solaris ZFS and Sun Fire X4500s.  Jason Williams is DigiTar's COO/CTO and highlights his experience with Sun open storage in his blog post Democratizing Storage.  Even though the DigiTar team is self-described as “Linux zealots,” OpenSolaris was brought in because it made a superior storage platform.  Some of my favorite quotes are below:

“That’s the really amazing thing about OpenSolaris as a storage platform. It has all of the features of an expensive array and because it allows you to build reliable storage out of commodity components, you can build the storage architecture you need instead of being held hostage by the one you can afford.”
“When you’ve got rock-solid iSCSI, NFS, and I/O multipathing implementations, as well as a file system (ZFS) that loves cheap disks…and none of it requires licensing…you can suddenly do anything.  Need to handle 3600 non-cached IOPs for under $60K? No problem. Have an existing array but can’t justify $10K for snapshotting? No problem. How ‘bout serving line-rate iSCSI with commodity storage and CPUs? No problemo.” 

“By using X4500s, we get the same reliability and redundancy for about 85% less cost. That kind of savings means we can deploy 6.8x more storage for the same price footprint and do all sorts of cool things..."

Nexenta
Nexenta has built its NexentaOS and NexentaStor software appliance on Sun open storage products – OpenSolaris and ZFS.  This is significant, as the Nexenta team developed an iSCSI stack that was adopted by the Linux community.  Nexenta's team chose OpenSolaris for their storage platform to actually build a new NAS appliance.  Nexenta's NexentaStor offering is a software-based NAS and iSCSI solution - read about it here.  There is also an excellent blog on ZFS and Nexenta here.

TACC 
Open Storage also has a large play in HPC - consider one of the world's largest supercomputers, built from Sun's open storage, servers and traditional storage offerings.  TACC's Ranger system will be used in computational science & technology research.  Ranger runs 3,936 nodes and 62,976 processing cores; has 23TB of memory and 504TFlops at peak performance; and uses 1.73PB of shared disk and 31.4TB of local disk.  Ranger uses the Lustre file system running across 72 Sun Fire X4500 servers. For long-term data retention and archive, Ranger runs Sun StorageTek SAM software over six metadata servers - and deploys five Sun StorageTek SL8500 libraries with 48 StorageTek T10000 tape drives.  Ranger will scale to over 3.1PB of online storage and 200PB of near-line storage.

From a simple NAS appliance to one of the world's largest supercomputers - open storage scales!

You can read more user case studies below: 

  • The University of Oxford is storing 19th century works with the ST5800
  • Gracenote uses Sun Fire X4500's for its mobile music services
  • Web 2.0 SaaS provider Sapotek says, "The ZFS file system feature of the Solaris 10 OS is a marvel. It creates a common storage pool where all storage performs as fast as if it were local. Our administrators can grow, add or remove storage on the fly in a single step. Just 2 people administer 24 terabytes.”

What about Sun's other Storage offerings? 

I invariably get this question when we highlight one architecture or approach.  So, to be clear - Sun sells closed systems too...and we sell a lot of them.  We now sell both depending on customer needs.  But we see the need for open storage - and we are investing in it while other vendors are not.  We are also investing in our traditional storage products - our customers deploy a mix of storage architectures depending on their needs - so Sun sells both.  Lastly, you can't claim breakthrough economics without leveraging tape in your portfolio.  If you want to hear about Sun's tape commitments, read about my trip down to Imation.

But as far as open storage goes, I think Sun is on to something...

---- Updates ---- 

Other Open Storage Blogs: 

Friday Nov 30, 2007

Top 10 Storage Technology Trends

It’s getting to be the end of the year – and Sun, like every other vendor I assume, is looking at the new technology trends that have (or will) impact our industry.

My team and I were asked to evaluate some of them – to see where to prioritize future investments, evaluate current and future competitive threats, and so on – ensuring we stay ahead of the technology game.  So below is our very own top 10 list of emerging technologies (with some color on what Sun is doing in each area).

I’m sure we’re missing some.  The comments and claims are our own; they don’t represent any commitment or position from Sun Corporate (for official news on Sun go here…).

 Top 10 Storage Technology Trends:  

1. Open Storage Platform (aka general purpose storage, open source storage):  Trend #1 is a term we coined, so it may not sound familiar.  It is a combination of market trends as well as a direction Sun is taking with its newer products.  The concept of a common platform is not new – several vendors have tried to build one platform that can run multiple storage applications, saving users time and money.  “Open” is a relatively new concept for storage, but not for software or servers.  There are generally three components that make up an Open Storage Platform:

  • General Purpose Components: General-purpose servers, processors, storage and operating systems are now being deployed in enterprise-class storage devices. Previously, storage vendors had built proprietary operating systems, ASIC chips and other custom-built components. As commodity chips, components and software have matured, storage vendors are now using general purpose components in their systems. The cost savings are significant - however, most vendors' prices remain the same (contributing to vendor margin). Sun's philosophy is to pass these savings to the consumer. Enterprise-class systems based on general purpose components give customers higher-value systems at a fraction of the cost (see graphic below). They are also much more flexible, as they can be re-purposed for other uses.
  • OS/File System Storage Services: Traditional appliances charge customers extra in software licensing costs for data management services like administration, replication or volume management.  What if this functionality came already embedded in the storage system itself?  Sun's newest file system, ZFS, has started to incorporate these services at the File System level. ZFS deploys point-in-time-copy, volume management, administration and data integrity features like Copy-on-write and RAID. Storage services deployed at the OS-level have several benefits, including efficiency, performance, reliability and affordability (no more licenses for extra software needed).  And before we think this is a new concept, mainframe has been doing it for years. The new concept is doing this in the open systems space – and Sun Solaris ZFS is leading the charge.      
  • Open Source Storage Software: The ability to download software, test it and add features to it is critical to developers building new applications – we’ve seen this in the OS market.  But what about storage?   This has been an investment area of Sun’s - Sun actually offers one of the most complete open source storage software stacks (from protocols to drivers to data management software).  Developers can build their own storage solutions and sell them by leveraging Sun’s open source software (see the Nexenta Storage Appliance for a perfect example).  Open source has an added benefit to customers with in-house development resources - customers can deploy new software features by searching for it in the open source community or even developing it themselves (why wait for vendor roadmaps?). This is not possible with a proprietary appliance. 
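
To make the point-in-time-copy and Copy-on-write idea from the ZFS bullet above a little more concrete, here is a toy sketch.  It is purely illustrative: the class and method names are made up for this post, and it is not how ZFS is actually implemented.  It only shows why a snapshot can be nearly free when writes never overwrite shared state in place.

```python
class CowVolume:
    """Toy copy-on-write block store with point-in-time snapshots (hypothetical)."""

    def __init__(self):
        self._live = {}        # block number -> bytes (current view)
        self._snapshots = {}   # snapshot name -> frozen block map

    def write(self, block_no, data):
        # Never modify a shared map in place: build a new map so any
        # snapshot still pointing at the old one is unaffected.
        new_map = dict(self._live)
        new_map[block_no] = data
        self._live = new_map

    def read(self, block_no, snapshot=None):
        view = self._live if snapshot is None else self._snapshots[snapshot]
        return view.get(block_no)

    def snapshot(self, name):
        # Taking a snapshot just keeps a reference to the current map;
        # no block data is copied.
        self._snapshots[name] = self._live


vol = CowVolume()
vol.write(0, b"version 1")
vol.snapshot("monday")
vol.write(0, b"version 2")

print(vol.read(0))                     # b'version 2'
print(vol.read(0, snapshot="monday"))  # b'version 1'
```

The snapshot costs one reference at the moment it is taken, which is the intuition behind offering point-in-time copies without a separate license or appliance.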

The simplest way to show the impact the Open Storage Platform concept will have on the storage industry is a basic economic comparison.  We used IDC's Pricing Database to compare over 50 actual purchase orders of Enterprise Disk (Sun ST9900, EMC DMX3), Midrange Disk (Sun ST6140, EMC CX3) and a Sun product built on our Open Storage Platform (the SunFire X4500):

  • Enterprise Disk = $18.84/GB
  • Midrange Disk = $10.39/GB
  • Open Storage Platform = $1.50/GB
    (7x less than Midrange and 13x less than Enterprise)
NOTE:  There are certain applications and features that must run on higher-end disk.  But at $1.50/GB, customers will be compelled to find which applications should be running on the SunFire X4500. What is even more remarkable, the SunFire X4500 includes the server, OS, data services, storage and networking components - all for $1.50/GB!
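
If you want to check the multiples yourself, the arithmetic is simple.  The snippet below uses only the three $/GB figures listed above; the variable names are just for illustration.

```python
# $/GB figures from the comparison above.
prices_per_gb = {
    "Enterprise Disk": 18.84,
    "Midrange Disk": 10.39,
    "Open Storage Platform": 1.50,
}

baseline = prices_per_gb["Open Storage Platform"]

# How many times the open platform price each tier costs.
multiples = {tier: price / baseline for tier, price in prices_per_gb.items()}

for tier, multiple in multiples.items():
    print(f"{tier}: ${prices_per_gb[tier]:.2f}/GB = {multiple:.1f}x the open platform price")
```

Midrange works out to roughly 6.9x and Enterprise to roughly 12.6x, which round to the 7x and 13x quoted.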

Game-changing economics…

2.  Virtualization:  Where to begin?  The benefits are obvious – massive savings through consolidation and optimization.  The best definition of virtualization I have seen came from the 451 Group’s Virtualization Report, which defined virtualization as “A software abstraction layer that permits aggregation, emulation or partitioning.”  Let’s look at the different types of virtualization out there today:

  • Server Virtualization:  The ability to host Windows, Linux and Solaris operating systems on one platform has taken the market by storm, a huge boon to IT managers and developers alike.  VMware led the charge with some start-ups and open source initiatives in tow.  On October 5, 2007 Sun entered the race with Sun xVM.  (And if virtualization has hit the systems world, expect to see it hit the desktop market soon…)
  • File Virtualization (a.k.a Clustering, Global Namespace, Unified Namespace, NAS Virtualization):  A few years ago we were talking about Grid Storage – a bunch of storage nodes acting as one, single system.  File virtualization is shaping up to be the technology that will take us there.  In 2005 & 2006 we saw EMC buy Rainfinity, NetApp buy Spinnaker and Brocade buy NuView.  In recent years we have seen EMC announce Rainfinity, NetApp announce a rough start to Data OnTap GX (based on Spinnaker technology) and file virtualization start ups Ibrix, Isilon and LeftHand Networks continue to grow.  Expect this market to heat up, and consolidate as start ups are either bought, or simply run out of runway.
  • Virtual Tape: A disk system that emulates a tape library.  Virtual Tape has been in the mainframe market for years, and Sun is the market leader here.  Virtual Tape Libraries (VTL) in the open systems market are relatively new – disk prices have eroded to a point where backup administrators can now take advantage of disk’s access speeds without replacing their existing backup infrastructure.  IDC pegs this market as small, but growing at 16.2% (2006-2011 CAGR).
  • Virtual Disk (aka Storage Virtualization):  A disk system that aggregates 3rd-party disk and offers disk partitioning.  A necessity for anyone that needs to consolidate their environment and/or make data migration between independent systems easier.  Virtual Disk Systems include IBM’s SAN Volume Controller (SVC), FalconStor’s IPStor and Sun’s award-winning StorageTek 9000 arrays.

3. Thin Provisioning:  Better system utilization is the name of the game.  Most admins know that the utilization rates on their disk systems are not where they need to be.  Thin Provisioning lets admins provision more logical space to applications than is physically set aside, with physical capacity consumed only as data is actually written – making fuller use of their system’s capacity.  3PAR spearheaded open systems Thin Provisioning and NetApp offers it as a part of Data OnTap.  Sun announced Thin Provisioning on its StorageTek 9990V system in May – meaning consumers can have the world’s fastest enterprise array, Virtual Disk AND Thin Provisioning all on one platform.
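
Here is a minimal sketch of the idea, assuming a simple block-based pool.  The `ThinPool` and `ThinVolume` names are hypothetical and do not correspond to any vendor's implementation; the point is only that provisioning records a promise, while physical blocks are consumed on first write.

```python
class ThinPool:
    """Toy shared pool: volumes promise capacity, blocks are allocated on first write."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks
        self.used_blocks = 0
        self.provisioned_blocks = 0

    def create_volume(self, logical_blocks):
        # Provisioning only records a promise; nothing is allocated yet.
        self.provisioned_blocks += logical_blocks
        return ThinVolume(self, logical_blocks)


class ThinVolume:
    def __init__(self, pool, logical_blocks):
        self.pool = pool
        self.logical_blocks = logical_blocks
        self.blocks = {}  # logical block number -> data, filled lazily

    def write(self, block_no, data):
        if not 0 <= block_no < self.logical_blocks:
            raise IndexError("block outside the volume's logical size")
        if block_no not in self.blocks:
            if self.pool.used_blocks >= self.pool.physical_blocks:
                raise RuntimeError("pool is out of physical capacity")
            self.pool.used_blocks += 1  # allocate on first write only
        self.blocks[block_no] = data


pool = ThinPool(physical_blocks=100)
mail = pool.create_volume(80)
web = pool.create_volume(80)   # 160 blocks promised on 100 physical

mail.write(0, b"x")
mail.write(0, b"y")            # rewrite: no new allocation
web.write(5, b"z")

print(pool.provisioned_blocks, pool.used_blocks)  # 160 2
```

The flip side, visible in the `RuntimeError` branch, is that an over-committed pool has to be monitored so physical capacity is added before it runs out.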

Pretty cool…  

 
4. Data Deduplication (aka De-dup, Single-instance storage):  In a world where there is more data coming into a company than can possibly be managed – data compression ratios ranging from 10:1 to 50:1 sound pretty darn nice (See how De-dup works here).  Data Domain, Diligent, FalconStor and other upstarts get credit for bringing this new technology to market and larger vendors are quickly following suit.  De-dup is still emerging, can have performance issues and does not work perfectly for every application – but economics dictate it’s worth consumers investigating where it can work for them.

There are two emerging de-dup architectures:  “Inline” – where the de-dup magic happens in real-time, as data comes into the system, as found in Diligent's ProtecTIER appliance.  Or “Post-Processing” where the magic happens as a secondary process after the backup job, as found in FalconStor’s Single Instance Repository (SIR) software.  Both have their pros and cons, and deciding which approach to use depends on balancing your performance vs. complexity needs.  For the record, Sun sells both….         
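
For readers who want to see the basic mechanism, here is a small sketch of hash-based de-dup using fixed-size chunks.  It is an illustration under my own assumptions (tiny 4-byte chunks, SHA-256 fingerprints, a made-up `DedupStore` class) and not how ProtecTIER or SIR work internally.  As written it de-dups at ingest, which is the "Inline" style; a "Post-Processing" design would run the same fingerprinting as a later pass over data already landed on disk.

```python
import hashlib


class DedupStore:
    """Toy chunk store: identical chunks are kept once, keyed by their hash."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # sha256 hex digest -> chunk bytes, stored once

    def put(self, data):
        # Split into fixed-size chunks and return the list of fingerprints
        # (the "recipe") needed to rebuild the data later.
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        return b"".join(self.chunks[d] for d in recipe)


store = DedupStore(chunk_size=4)
backup1 = store.put(b"AAAABBBBCCCC")
backup2 = store.put(b"AAAABBBBDDDD")   # shares two chunks with backup1

logical_bytes = 24
physical_bytes = sum(len(c) for c in store.chunks.values())
print(physical_bytes, logical_bytes / physical_bytes)  # 16 1.5
```

Two nearly identical backups take 16 bytes instead of 24 here; ratios like 10:1 or more come from many backup generations repeating the same data.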

5. Data Encryption:  One need only read the horror stories of lost tape and disk drives to see the importance of data encryption.  While it has been around for a while – the need has never been greater.  Growing storage capacity has caused another problem – one can store a lot of personnel records on a single cartridge or drive.  In an age of identity theft, losing one storage device can put a company out of business.  The new trend is not how to encrypt, but where to encrypt: on the host server?  On an appliance in the network?  In the storage device itself?  Decru (since bought by NetApp) benefited from this trend with their encryption appliance.

I once worked with a brilliant engineer whose favorite saying was “never put a product where a feature should be.”  I’d say this was Sun’s philosophy when we delivered the Sun StorageTek T10000 tape drive.  Put simply, Sun put an encryption chip next to the compression chip on the drive – so data is encrypted as it is fed onto the tape.  Simple and affordable – no extra appliance needed.  Sun also offers the StorageTek Crypto Key Management Station to centrally authorize, secure and manage encryption keys.

6. Eco Storage (aka Green Storage/IT):  I freely admit that when I was first approached with “Green Storage” I was a skeptic.  I would have also never guessed Al Gore would win the Nobel Peace Prize!  But Eco also stands for Economics.  If you save power and footprint, and the world while you are at it – who can argue with that?  But the challenge for storage customers will be sorting through the vendors who make REAL Eco investments vs. the ones that just add “Eco” or “Green” to their marketing collateral.  Sun’s in the “real” category, investing heavily in Eco IT.  Sun’s Eco efforts can be seen here...        

7. Object Archive (aka CAS, Application Aware Archive): The dizzying array of regulations, compliance requirements and influx of data have made the archive market one of the fastest growing markets in IT and storage.  And customers must continually evaluate which archive approach will work best for them.  The trend here is to “build a better mousetrap archive.”  The challenge is this, an archive system must:

  • Store a lot of data affordably
  • Have WORM  functionality so documents show up un-altered in an audit or court of law
  • Be easily and quickly accessible
  • For more than 100 years…
Easier said than done.  While tape continues to be the old staple in archive, and faster and denser tape systems are coming out each year – a lot of new innovations are happening on the disk and software side.  Disk-based object-level archive systems include EMC’s Centera, HP’s RISS and Sun’s StorageTek 5800 “Honeycomb” pictured at right.  Sun’s ST5800 system uses advanced meta data features and processors close to each storage cluster for fast access to deep archives – great for digital library, Web 2.0 and HPC applications and environments. 

But do keep in mind that for deep archive, Sun’s StorageTek SL8500 Tape Library is tough to beat – just one library's max raw capacity is 56 Petabytes, and data sitting on tape consumes 0 kilowatts and generates 0 CO2 (see above trend #6).

8. New Interfaces, Protocols & Configurations: There is a lot of change happening in storage systems and how they are configured.  The three primary ways storage is attached are Direct Attached Storage (DAS), Storage Area Networks (SAN) and Network Attached Storage (NAS).  A disk system can also be configured in a couple of different ways.  RAID configurations stripe data across multiple drives and impact a system’s reliability and performance.  JBOD (Just a Bunch of Disks) is more affordable because it does not require a disk controller, but provides no data redundancy.  New interfaces and protocols will impact each of these markets significantly.

  • SAS:  In DAS, Serial Attached SCSI (SAS) is a newer serial communication protocol that will make data transfer speeds much faster and at a lower cost.  SAS drives have replaced parallel SCSI for internal storage, and SAS HBA external host interfaces are just starting to ship in volume.  SAS disk arrays started shipping this year and SAS JBOD are starting to ship.  To complement the SAS HBA's and SAS arrays, there are new network components called expanders which are basically switches.  Generally SAS disk will be cheaper than Fibre Channel (FC) with similar performance and capacity.  (Sun has SAS host I/F, disk I/F and SAS disk drives)
  • iSCSI:  In the SAN market, the iSCSI protocol is making a significant impact by taking the cost out of expensive Fibre Channel SANs.  iSCSI allows users to send SCSI commands over their existing IP networks. 
  • File Virtualization:  In the NAS market, file virtualization is making NAS farms much more scalable and manageable (See trend #2)
  • Clustered RAID:  A new innovation to watch in the RAID market is horizontally scalable RAID or clustered RAID for large applications.  Digi-Data is a small storage company pushing this innovation. 
  • JBOD:  Keep your eye on Sun in the JBOD market.  JBOD is more affordable than RAID, but does not have RAID’s redundancy and reliability features.  But what if you had an infinitely scalable file system with data integrity and RAID features running on JBOD, say, something like ZFS?
  • Unified Storage:  Another trend in this area is unified or hybrid devices – storage systems that can handle multiple protocols and interfaces, including iSCSI, Fibre Channel, and NAS – all in one unit.  Makes sense to users with a dizzying array of choices in the market.
  • 10GbE:  Lastly, 10 Gigabit Ethernet or 10GbE is the latest and fastest of the Ethernet standards that will re-shape data center networking – offering a fast, common and affordable network technology for IT and Web 2.0 applications from supercomputing to networked storage.

9. Solid State Disk (aka SSD, Flash, Memory):  We see the perfect storm happening around SSD.  SSD has no moving parts, making it one of the most reliable and fastest storage mediums in the world.  But it is incredibly expensive when you compare $/GB vs. Disk and Tape storage.  However, SSD has made a name for itself in the consumer market (digital cameras, phones, iPods, etc.).  The price pressure in the consumer arena is enormous – and this has been rapidly eroding the price of SSD.  As SSD prices drop, expect to see hybrid disk drives and storage systems that leverage more SSD for greater speed and IOPS.

 

10.  Storage as a Service: 

Storage as a service offered over the Internet has been talked about for years – but poor performance and implementations have cooled this trend.  However, Amazon has given Storage as a Service a power boost with its Amazon Simple Storage Service (Amazon S3).  By leveraging Amazon’s existing e-commerce and storage infrastructure, the company is offering customers storage capacity for $0.15 per GB-Month of storage used – possibly the cheapest $/GB on the planet.  And while this may have more play in the consumer market, Amazon could re-invigorate the storage as a service trend.  Also keep an eye on Sun’s Internet Service offerings over Network.com...             
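
To put that rate in perspective, here is a tiny storage-only estimate.  It assumes nothing beyond the $0.15 per GB-month figure quoted above; any transfer or request charges are deliberately left out, and the function name is just for illustration.  Amounts are kept in whole cents to avoid floating-point rounding.

```python
PRICE_CENTS_PER_GB_MONTH = 15  # the $0.15 per GB-month figure quoted above


def storage_cost_cents(gb_stored, months=1):
    """Storage-only charge in cents; transfer or request fees are not modeled."""
    return gb_stored * PRICE_CENTS_PER_GB_MONTH * months


monthly = storage_cost_cents(1000)
yearly = storage_cost_cents(1000, months=12)
print(f"1000 GB: ${monthly / 100:.2f} per month, ${yearly / 100:.2f} per year")
```

Unlike a $/GB purchase price, this is a recurring charge, so how long the data will be kept matters as much as the rate.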

--- Updates ---

Tuesday Jun 19, 2007

Lacrosse and the future of storage...

I like storage, strategy and sports - so I really liked Scott Tracy's "Telegraph" Blog.  I commented on it, but will elaborate further here.  He talks about Storage running on general-purpose Solaris, and shows a nice OpenSolaris Storage Platform diagram. I'll serve up my own sports analogy (and re-live the glory days while I am at it...)

I played Football and Lacrosse in high school and college  (yes, I was a UPS Logger) - but I'll stick with the Lacrosse theme. 

In high school my team competed in Florida's Cocoa Expo Lacrosse Tournament. We were quickly mocked by all the teams there - they represented the best of east coast lacrosse and we were from cow-town Colorado. We were at first intimidated by their "moves" - quick sticks, behind the back passes, etc. But our coach told us to stick to the basics - two hands on the stick, straight forward passes, etc.

What happened?

We won the tournament that year - first place, #1 (and went on to win State by the way). So what's the storage tie in?

There is an appliance or point-product for everything in storage today - virtualization, data movers, back up, CDP, encryption, etc.  A lot of fancy stuff that can solve individual problems, but adds to the overall complexity of IT storage. I like the OpenSolaris Storage Platform approach because it sticks to the basics - data volume management, data services, file systems - open and residing at the heart of any system or solution - its OS.

That's a winning strategy...

