We had a very successful Sun Preservation and Archiving Special Interest Group (PASIG) meeting in Baltimore November 19-21. The event attracted about 150 participants, including about 20 Sun attendees. Michael Keller from Stanford U. did his usual exemplary job moderating and the speakers were all excellent. Special thanks go to the working group moderators (Carl Grant for Preservation, Michael Witt for Research Data Curation, Tom Cramer for Archiving and Preservation Repositories, and Raymond Clarke for Long-term Storage and Data Management), to Keith Rajecki for directing the object-oriented collaboration discussions, and to Cliff Lynch for an insightful meeting summary on Friday. Final presentations are or will be at www.sun-pasig.org. The attendee list is on the registration site. Not enough can be said about the Sun event manager, Peggy Taylor, who always seems to find the perfect venue and reception settings to foster collaboration and discussion.
It is interesting that as the conference and community evolve, the PASIG still has a unique focus and value: linking high-level preservation and archiving architectural goals to practical working experiences and foundation technologies. This is a community of practice and of the application of theory. Based on this, we can see tangible directions developing that will benefit both Sun and Sun customers globally. I feel gratified that the right type of Sun participants came: people with realistic vision who want to actually work collaboratively with customers on their institutional goals and projects.
Mike Keller and I did a short audiocast on the Sun PASIG and general Digital Library directions just before the conference began. This is archived at http://www.blogtalkradio.com/stations/sunradio/SunNews/2008/11/17/Global-Leaders-in-Education-Libraries-and-More-to-Meet-on-Digital-Archives
Based on discussions and survey results, here are some key developments, inputs, and my impressions from the meeting:
1. Upcoming Events
Logistically, attendees still strongly feel - as has been the response since the group's inception in early 2007 - that a standalone meeting is best vs piggybacking on an existing event. The next Sun PASIG event is targeted for June in Europe. Mike Keller and I will be looking at dates that won't conflict with other events too much.
Related to the June Sun PASIG event, Sun will also be a sponsor for:
4th International Digital Curation Centre Conference, Dec. 1-3, in Edinburgh - http://www.dcc.ac.uk/
Open Repositories 2009, May 18-21, in Atlanta - http://or09.library.gatech.edu/
2. Sun PASIG Webinars in 2009
Based on discussions at the PASIG and the successful webinar series we had leading up to the Baltimore event (archived at http://www.education-webevents.com/), we will look at joint webinars focused on solutions in early 2009. Expect cooperation with other organizations and projects like the DSpace Foundation, Fedora Commons, etc.
3. Technology Communities and Sun
Discussions were held with DSpace and Fedora community members. We are all unified in the belief that we need reference architectures and close technology cooperation to provide the larger repository, preservation, and archiving community with easy-to-use, sustainable, scalable, mix-and-match, open solutions. Keith Rajecki, Sun Education Solutions Architect, Ray Clarke, Sun Enterprise Storage Architect, and I will coordinate our interactions with the EPrints, iRODS, DSpace, and Fedora groups. Much of the previous cooperation with these groups had been focused on the EOLed Honeycomb ST5800 object-oriented cluster-in-a-box. It became evident that the organizations want Sun to share some of our thinking on: 1) cloud computing, 2) open storage, 3) Infinite Archive System (IAS), SAM-QFS, and ZFS, 4) managed services and modular datacenters, 5) virtualization, 6) MySQL-based scalable architectures, 7) web services design, and 8) basic reference systems for entry-level users.
How do we make things easier for the broader institutional community? How do we develop institutional open, flexible repository architectures? How do we move to managed services with viable service level capabilities?
4. New and Continuing Focus Areas
People feel that the Sun PASIG really ties theory to practice in the field of repositories, archiving, and preservation. Going forward, we will maintain the comprehensive backbone of Sun and key vendor IT technologies within the content structure. For instance, Chris Wood's talk on the future of storage technologies is still the event favorite and the webinars on Open Storage and Digital Asset Management (DAM) technologies have been keenly received. But we will focus increasingly on solutions built on Sun technology vs the raw technology itself. An example of this was the recent Ex Libris-Sun webinar on the Digital Preservation System (DPS) with Carl Grant and Yaniv Levi (http://www.education-webevents.com/).
Research data curation, sustainability, DAM, and new managed service offerings are key topics I will build into the next meeting. Additionally, people really want to hear about these topics within the context of different types and sizes of institutions. The surveys indicate that this breadth of practitioners is a real strong point of the PASIG. I definitely do not want to have a conference that is undifferentiated from the great meetings going on with DCC, iPres, Open Repositories, CNI, etc.
5. Sun Infinite Archive System (IAS)
There is a lot of interest in this packaged Sun product offering. Much of the interest is in the way it melds Storage Archive Manager (SAM), tape, disk storage, and software technologies together to fit the organizational, budgetary, IT, and content needs of the customer. While many of the larger Sun customers such as The Church of Latter Day Saints FamilySearch, the Library of Congress, the USC Shoah Foundation Institute (reference story at http://www.sun.com/customers/storage/usc_shoah.xml), or Sun's high performance computing (HPC) customers have done customized archival installations, many of the institutions involved in the PASIG could use a semi-packaged solution. The key issue is to get the right architect upfront to properly scope out your institution's data management needs and goals before purchasing anything. More information on the IAS is at http://www.sun.com/servers/cr/ias/
Additionally, the IAS product group attended the PASIG en masse and wanted me to convey their thanks for all the useful input they received! Based on your collaboration with them, this group also agreed to join Sun as a co-sponsor of Open Repositories 2009 in May.
6. Object-Oriented Storage Collaboration
Keith Rajecki, Peter Buckingham (Open Storage Evangelist), James Simon (US Systems CTO), and I met with some of the key development institutions that were working with the ST5800 Honeycomb object-oriented storage system prior to its EOLing in September. PASIG member input from previous conferences was that the community loved the use and future potential of object-oriented technology, but wanted this code available on more hardware options. Since the core software is already open, we agreed to foster a PASIG collaboration around the open Honeycomb code. The lead schools will be Southampton U. and Oxford. Stanford, Alberta Digital Library, Johns Hopkins, and several other institutions have also shown interest in participating.
This is a collaboration, so the energy and vision will be driven by the institutions, not Sun. Value-added developments and products coming from the collaborators can be made available to other PASIG members. Keith Rajecki has taken the lead on coordinating this group and will begin documenting the project scope, functional requirements, and resource needs. He has set up an open wiki at http://wikis.sun.com/display/ObjectOrientedStorage/Home.
7. Opportunities for Value-Added Services via the PASIG
On the www.sun-pasig.org website there is a Digital Library Consultants Portfolio. I will update this semi-annually. This is a great mechanism for organizations with specialized expertise and products to publicize themselves to the PASIG community. In addition, there was discussion at this last meeting about future opportunities for PASIG 'certified' training and possible peer group project auditing. Together with packaging of entry-level solution reference implementations, I think these would be fantastic future value-added services the PASIG members could offer to one another.
8. Cliff Lynch's Summary Thoughts
Cliff Lynch of the Coalition for Networked Information (www.cni.org) attended the entire meeting and gave some really valuable summation points on Friday before we adjourned. Some of the things he pointed out have got me thinking, and I feel they need to be reviewed in more depth by this group, including the Sun people! I apologize for my vague paraphrasing:
- As we move data around the globe, are we thinking about Provenance enough? There are a lot of conversion, migration, and sustainability issues. How do you address the "legal chain of custody" via signatures? Most of these architectures have "an amazing number of moving parts."
- What is the right level of trust between federated repositories?
- We are all looking at Cloud Computing. But what is the related cost of archiving over time? The problems are still "opaque". Do we understand how to set the right type of service-level agreement? "What if you fail?" "How do you monetize risk?"
- As new service and IT models come into play, people will need more of a guarantee than "trust us".