\ˈō-pən\ \ˈär-ˌkīv\

What does it mean to keep data around for generations to come? It's an interesting problem. Were the Egyptians thinking about this when they created their documents on papyrus? When that data becomes digital, the challenges seem daunting. Changing technology can make the data we have created obsolete within a decade. Applications change the format of the documents that they create. If you had a VisiCalc spreadsheet with important information in it, how would you read it today without the VisiCalc application, the operating system and perhaps even the hardware that supports that OS? What if that spreadsheet was on a 5 1/4-inch floppy?

The idea of an Open Archive means that we can think about what it means to keep data around for generations and make it accessible by the technology that will exist 100 years from now. The elements that I think will be necessary to consider include:

  • An Open Document Format
  • Open Application Interfaces to the repositories
  • Open Export and Import of Data between repositories
  • Open architecture for data services that will keep data accessible
  • Open Source implementations of the above elements

It is the combination of all these elements that makes up an Open Archive in my mind. Many of these elements are already being addressed by industry efforts, and of course Sun is an active participant in these efforts.

Open Document Formats are being addressed by standardization efforts such as the OASIS Open Document Format for Office Applications standard. OpenOffice.org has an open source implementation of this standard that Sun integrates into its StarOffice suite. Of course, over time we need to go beyond just office documents.

Open Application Interfaces are an emerging area that is being addressed by the Storage Networking Industry Association (SNIA) with its eXtensible Access Method (XAM) standard. I have blogged about this in the past, and the standard is now nearing completion. What's new (announced here) is that Sun has donated source code to the SNIA to allow it to create a reference implementation of the XAM API. This will be software that implements the XAM standard and uses a standard filesystem as the repository. It will be released by SNIA as an SDK for XAM once the standard and the software work are completed.

Open Export and Import is needed in order to move your data and its associated metadata from storage system to storage system over the life of that data. The XAM standard also addresses this with an XML-based format for this purpose. This allows you to be agnostic not only to vendors, but also to the changing technology over the years.
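To make the idea of moving data together with its metadata a little more concrete, here is a minimal sketch in Python. It is not the XAM export format, whose details I am not reproducing here; the element names (`archive-object`, `metadata`, `field`, `content`) and the functions `export_object` and `import_object` are made up purely for illustration. The point is only that content and metadata travel in one self-describing package that any future system with an XML parser can read.

```python
import base64
import xml.etree.ElementTree as ET


def export_object(content: bytes, metadata: dict) -> str:
    """Pack content and its metadata into one XML document.

    The layout is hypothetical, chosen for illustration only.
    """
    root = ET.Element("archive-object")
    meta = ET.SubElement(root, "metadata")
    for name, value in sorted(metadata.items()):
        field = ET.SubElement(meta, "field", name=name)
        field.text = value
    # Binary content is base64-encoded so it survives as XML text.
    data = ET.SubElement(root, "content", encoding="base64")
    data.text = base64.b64encode(content).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def import_object(xml_text: str):
    """Unpack a document produced by export_object into (content, metadata)."""
    root = ET.fromstring(xml_text)
    metadata = {f.get("name"): f.text or "" for f in root.find("metadata")}
    content = base64.b64decode(root.find("content").text or "")
    return content, metadata
```

A round trip through `export_object` on one system and `import_object` on another should return the same bytes and the same metadata, which is the property you want when data outlives the storage system it started on.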

An Open Architecture for Data Services allows those data services to periodically read data in old formats, convert it to newer versions of standard formats, and then write it out via newer, standard interfaces. This is an essential element of an active archive that can keep data accessible without having to archive the machines, operating systems and application versions that you used to create the data. Sun has donated source from its StorageTek 5800 to a new project on Java.net called Royal Jelly. We are working here in an open community to create the architecture for the data services that will implement this vision of an active archive.
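The read, convert, write cycle described above can be sketched roughly as follows. This is an illustration of the general idea, not code from Royal Jelly or the StorageTek 5800; the format labels (`sheet-v1` and so on), the `StoredObject` type, and the `register` and `migrate` helpers are all assumptions of mine. Each converter knows how to move data one step forward, and the service chains them until no newer format is registered.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class StoredObject:
    fmt: str        # hypothetical format label, e.g. "sheet-v1"
    payload: bytes  # the stored bytes in that format


Converter = Callable[[bytes], bytes]

# Maps an old format to (newer format, function that converts the payload).
CONVERTERS: Dict[str, Tuple[str, Converter]] = {}


def register(old_fmt: str, new_fmt: str, convert: Converter) -> None:
    """Record how to move data one step forward from old_fmt to new_fmt."""
    CONVERTERS[old_fmt] = (new_fmt, convert)


def migrate(obj: StoredObject) -> StoredObject:
    """Apply registered converters until no newer format is known."""
    seen = {obj.fmt}
    while obj.fmt in CONVERTERS:
        new_fmt, convert = CONVERTERS[obj.fmt]
        if new_fmt in seen:
            raise ValueError("converter cycle detected at " + new_fmt)
        seen.add(new_fmt)
        obj = StoredObject(new_fmt, convert(obj.payload))
    return obj


# Toy converters standing in for real format upgrades.
register("sheet-v1", "sheet-v2", lambda b: b.replace(b";", b","))
register("sheet-v2", "sheet-v3", lambda b: b"# v3\n" + b)
```

Because each converter only has to understand one old format and its immediate successor, nobody ever has to write a direct translator from a decades-old format to whatever is current; the archive keeps walking data forward a step at a time.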

I have mentioned some Open Source Implementations above, but I also want to highlight Open Solaris itself as a community where this work is going on. We also posted the source to Project HoneyComb on its Project Page. We envision adding support for the standard XAM interface to Open Solaris using this code and the code being developed in the other two communities (SNIA and Royal Jelly) mentioned above. In addition to supporting application access to fixed content storage systems attached to Open Solaris based host machines, we expect Open Solaris as a Storage Platform to provide XAM capability natively.

If you care about the issue of keeping data around for generations to come, if you care about openness as the way to achieve this, you should be asking potential archive vendors some pointed questions. If you really care about the evolution of open archives, please consider participating in one or more of the communities mentioned here.


This blog copyright 2009 by mac