Art Pasquinelli's Library Community Preservation Archives Library

Friday Aug 28, 2009

Some of you have been asking for information on the PASIG presentations. Here is the up-to-date agenda and a chronological list of abstracts, covering about half of the presentations so far. I have asked all presenters and working group leaders to give me their information asap. This will be posted on the registration page. Registration can be accessed from the PASIG website at www.sun-pasig.ning.com. You can also view a list of attendees on the registration site. We have a combined list of 60+ confirmed speakers and registrants as of today.


Remember that the Early Bird Registration ends September 7.


Partners who are attending and want a table for literature should send me a note this next week. Additionally, anyone needing to arrange a 1-1 architectural or topical session with Sun technologists or a solution partner should tell me asap.


Best Regards,


Art 


--------------------


PASIG Agenda as of Aug. 28 (Rev. 6)


8:30am-9:00am Introduction to the PASIG - Art Pasquinelli, Education Market Strategist, Sun Microsystems

9:00am-9:15am Introduction to the Agenda - Michael Keller, University Librarian and Director of Academic Resources, Stanford U.

9:15am-9:35am DuraSpace: Open Technologies for Durable Digital Content - Sandy Payette, CEO, DuraSpace

9:35am-9:55am Islandora: Repository in a Box - Mark Leggott, University Librarian, University of Prince Edward Island

9:55am-10:15am Biodiversity Heritage Library Architecture with Fedora and DuraCloud - TBD


10:15am-10:35am Break


10:35am-10:55am VTLS Update on VITAL Institutional Repository - Vinod Chachra, President and CEO, VTLS Inc.

10:55am-11:15am Fez Update - Keith Webster, U. Queensland

11:15am-11:35am iRods: Policy Based Use of Cloud Storage - Reagan Moore, Director, DICE Center, U. North Carolina

11:35am-11:55am Ten Years of Digital Preservation with EPrints - David Tarrant, Research Fellow and Developer, U. Southampton

11:55am-12:15pm Trends Update - Lee Dirks and TBD, Microsoft


12:15pm-1:20pm Lunch


1:20pm-2:25pm Sun Technology Update - Keith Rajecki, Education Solutions Architect, Raymond Clarke, Enterprise Storage Specialist, Philippe Trautmann, Global HPC Business Development Manager, Sun Microsystems

2:25pm-2:45pm Internet Archive Technical Overview - Kris Carpenter, Director, Web Group, The Internet Archive


2:45pm-3:05pm Break


3:05pm-3:25pm Oxford U. Update - Neil Jefferies, Oxford U.

3:25pm-3:45pm Stanford Digital Repository - Tom Cramer, Stanford

3:45pm-4:05pm National Library of New Zealand; Obsolescence, Format Registries, and Preservation Risk Mgt. - TBD

4:05pm-4:25pm The French National Library (BNF) SPAR Architecture Developments - Thomas Ledoux, BNF

4:25pm-4:45pm From Ingest to Access: A Day in the Life of a HathiTrust Digital Object - Jeremy York, Project Librarian, U. Michigan

4:45pm-5:05pm Slovakia National Library Architecture - TBD

5:05pm-5:25pm SHAMAN - Sustaining Heritage Access through Multivalent Archiving - Ruben Riestra, SHAMAN Coordinator, Matthias Hemmje, Prof. Dr.-Ing.

5:25pm-5:30pm Summary - Michael Keller, Stanford


5:35pm-6:15pm Immersive Technologies: Project Wonderland Overview - Kevin Roebuck, Sun Microsystems (Optional)


6:30pm Reception at the Westin St. Francis (with vendor and project tables)


Thursday, October 8, 2009


8:45am-9:00am Recap - Michael Keller, Stanford

9:00am-9:20am Towards Physical and Digital Archive and Preservation at the U. of the Witwatersrand - Derek Keats, Deputy Vice Chancellor, Knowledge and Information Mgt., U. of the Witwatersrand

9:20am-9:45am Storage Futures: Planning for Long-Term Sustainability - Chris Wood, Consultant

9:45am-10:05am Family Search Overview, The Church of Jesus Christ of Latter-Day Saints - Gary Wright, Preservation Product Manager, Family Search

10:05am-10:25am USC Shoah Foundation Institute Architecture - Sam Gustman, USC


10:25am-10:45am Break


10:45am-11:05am Digital Submission System - Experiences from the Library of Congress - Carl Watts, PC Mall

11:05am-11:25am Next Generation Storage at Penn State - Mark Saussure, Director, Digital Library Infrastructure. Ben Grissinger, Storage Services Lead, Digital Library Technologies

11:25am-11:45am Permanent Objects, Evolving Services, and Disposable Systems: An Emergent Approach to Digital Curation Infrastructure - John Kunze, Preservation Technologist, California Digital Library

11:45am-12:05pm Improving Inter-Institutional Preservation - Tyler Walters, Associate Director, Technology & Resource Services, Georgia Institute of Technology, David Minor, UCSD


12:05pm-1:15pm Lunch


1:15pm-1:35pm Columbia U. Digital Library Architecture - Robert Cartolano, Director, Library IT Office, Columbia U.

1:35pm-1:55pm Southampton and Oxford Preservation Network Collaboration - David Tarrant, Southampton U., Neil Jefferies, Ben O'Steen, Oxford U.

1:55pm-2:15pm PASIG Repository Working Group Collaboration Directions - Tom Cramer, Stanford, Neil Jefferies, Oxford U.

2:15pm-2:35pm New Software for the Florida Digital Archive: DAITSS 2.0 Architectural Overview - Priscilla Caplan, Assistant Director for Digital Library Services, Florida Center for Library Automation

2:35pm-2:55pm French Public Information Library Federated Search Implementation - Terry Reese, Oregon State, Roger Essoh, Business Development Manager and Head of Innovation, ATOS Origin


2:55pm-3:15pm Break


3:15pm-3:35pm Australian National Data Service (ANDS) Update - Robin Stanton, Pro-Vice Chancellor, Australian National U.

3:35pm-3:55pm Data-Intensive Environmental Research: Re-envisioning Science, Cyberinfrastructure, and Institutions - TBD, California Digital Library

3:55pm-4:15pm Johns Hopkins U. eScience Directions - Sayeed Choudhury, Associate Dean of University Libraries, Johns Hopkins U.

4:15pm-4:55pm Public and Permanent Scientific Data - Professor Michael Lesk, Rutgers U.

4:55pm-5:00pm Summary - Michael Keller, University Librarian and Director of Academic Resources, Stanford U.


6:30pm Off-site Reception


Friday, October 9, 2009


8:45am-9:30am Where Things Are Headed - Michael Keller, University Librarian and Director of Academic Resources, Stanford U.

9:30am-9:45am Working Groups Introduction

9:45am-11:30am Working Groups:


Ex Libris Rosetta Overview and Discussion

PASIG Repository Group

Islandora In-depth Demo and Discussion

SAM/IAS Deep-Dive

Tessella SDB Discussion

Digital Curation Discussion


11:30am-12:00pm Working Group Summaries and Going Forward Discussion


12:00pm - Meeting end.


1:30pm-3:00pm Taking the PASIG Forward: Tangible Action Plans (Open to All Who Want to Influence the PASIG Directions)


----------------


DuraSpace: Open Technologies for Durable Digital Content

Sandy Payette, CEO, DuraSpace


DuraSpace is a not-for-profit organization that is home to two major open source repository platforms, DSpace and Fedora. The organization is also developing a new cloud-based service known as DuraCloud. This presentation will provide an update on the latest developments in the repository platforms, especially new integration possibilities. A preview of the Alpha version of DuraCloud will be provided, as well as a report on the pilot partners program funded by the National Digital Information Infrastructure and Preservation Program (NDIIPP) of the Library of Congress. Pilot partners will demonstrate the role of DuraCloud as a component within a broader digital preservation strategy.


Bio:


Sandy Payette is Chief Executive Officer of DuraSpace (http://duraspace.org), a not-for-profit organization that provides leadership and innovation in open source technologies that support scientific, scholarly, and cultural communities. Sandy collaborates nationally and internationally to further the mission of DuraSpace to enable the sharing, preservation, and archiving of digital information. Sandy was the co-creator of the Flexible Extensible Digital Object Repository Architecture (Fedora) at Cornell University’s Department of Computer Science (1998) and she later established the open source Fedora Repository Project (2001-present). Sandy was the Executive Director of the Fedora Commons not-for-profit organization, which joined with the DSpace Foundation in 2009 to form DuraSpace. Sandy was formerly a researcher in Computing and Information Science at Cornell where her work in digital libraries and digital preservation bridged research with practical applications as described in her various publications. Previously, Sandy spent ten years in industry leading information technology projects at Corning Incorporated, a Fortune 500 company. Her leadership led to early adoption of decision support systems, helping the company to forge new processes and techniques for strategic business analysis.


---------


Islandora: Repository in a Box

Mark Leggott, University Librarian, University of Prince Edward Island


The Islandora system combines the Drupal CMS with Fedora to provide one of the most modular and easy-to-install repositories available. The Islandora project will provide a host of "solution packs" that provide repository systems in such areas as institutional repositories, digital collections, document management and research projects. This session will provide a brief overview.


-------


VTLS Update on VITAL Institutional Repository

Vinod Chachra, President and CEO, VTLS Inc.


VITAL is a product based on Fedora. The present version of VITAL (Release 4.0) runs on Fedora 3.1. This presentation will provide an update on the new features of Release 4.0. In particular, the discussion will center on a very granular access control module that has been implemented. It allows restrictions to be placed on items and people. Access or denial of access is determined by the intersection of these restrictions.


In addition, VTLS has conducted extensive performance testing and scaling on Release 4.0. The findings from this testing will be discussed. Repository sizes and usage patterns will be discussed.
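
As a rough illustration of the intersection idea in the VITAL abstract above, here is a minimal Python sketch. It is not VITAL's implementation: the `Item` and `Person` classes, the group names, and the rule that an unrestricted item is open to everyone are all assumptions made for the example. It reads "intersection" as "a person may view an item when the groups attached to the person overlap the groups attached to the item."

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """Hypothetical repository item carrying the groups allowed to view it."""
    identifier: str
    allowed_groups: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Person:
    """Hypothetical user carrying the groups they belong to."""
    name: str
    groups: frozenset = field(default_factory=frozenset)


def can_access(person: Person, item: Item) -> bool:
    """Grant access when the person's groups intersect the item's allowed groups."""
    # Assumption: an item with no restrictions is open to everyone.
    if not item.allowed_groups:
        return True
    # Set intersection: non-empty means at least one shared group.
    return bool(person.groups & item.allowed_groups)


thesis = Item("thesis-1", frozenset({"faculty", "library-staff"}))
alice = Person("alice", frozenset({"faculty"}))
bob = Person("bob", frozenset({"student"}))

print(can_access(alice, thesis))
print(can_access(bob, thesis))
```

A real access control module would likely also handle explicit denials and time-based embargoes; the sketch only shows the set-intersection step.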


---------



iRods: Policy Based Use of Cloud Storage

Professor Reagan Moore, Director, DICE Center, U. North Carolina


The integrated Rule-Oriented Data System (iRODS) organizes distributed data into a sharable collection. The data may reside in cloud storage, in institutional repositories, in tape archives, or in laptop file systems. We will demonstrate the enforcement of management policies across the multiple storage locations, access mechanisms ranging from web browsers to Fedora to WebDAV to EnginFrame interfaces, and types of assertions that can be made on data in cloud storage.


Dr. Reagan W. Moore

Reagan Moore is a Professor in the School of Information and Library Science at the University of North Carolina at Chapel Hill, Chief Scientist for Data Intensive Cyber Environments at the Renaissance Computing Institute, Director of the Data Intensive Cyber Environments Center at UNC, and Principal Investigator on projects developing the integrated Rule-Oriented Data System. He coordinates research efforts in the development of data grids, digital libraries, and preservation environments. An ongoing research interest is the use of data grid technology to automate execution of management policies and validate trustworthiness of repositories.


--------- 


Ten Years of Digital Preservation with EPrints

David Tarrant, Research Fellow and Developer, U. Southampton


2009 represents the 10th birthday of EPrints and, more importantly, the start of the second decade of EPrints. Ten years of digital preservation is a substantial amount of time, and in this presentation we look back on some of the main successes as well as the key lessons learned during the development of the EPrints platform. Building on 10 years of experience allows us to move forward with confidence into new areas of research, adding to, enhancing and refining the EPrints platform. From the viewpoint of digital preservation we look at the key decisions made during the development of EPrints and how these are still affecting the development of the platform even today.


------


Sun Technology Update

Keith Rajecki, Education Solutions Architect, Raymond Clarke, Enterprise Storage Specialist, Philippe Trautmann, Global HPC Business Development Manager, Sun Microsystems


This combined session will focus on 1) Sun technologies overview including the Storage Archive Manager (SAM) and Infinite Archive System (IAS), 2) reference solution architectures, 3) Sun open storage directions, 4) industry standards work, and 5) infrastructures for large research dataset management.


------


From Ingest to Access: A Day in the Life of a HathiTrust Digital Object

Jeremy York, Project Librarian, U. Michigan


HathiTrust was launched in October 2008 by twenty-five institutional partners as a means of preserving and providing access to materials digitized in their large-scale digitization programs (information, including a list of current partners, is available at www.hathitrust.org). This talk will explore the technology behind HathiTrust and the processes digital objects pass through as they are ingested into the repository, preserved, and delivered to end users.


Bio:

Jeremy York is a project librarian for HathiTrust Digital Library. He received a B.A. in history from Emory University in 2001, and an M.S.I. from the University of Michigan in 2008. York joined the HathiTrust team in July 2008, and has supported efforts in communication and development since that time. He has more than ten years of experience in libraries, working in areas of course reserves, archives and special collections, and information technology.


------


SHAMAN - Sustaining Heritage Access through Multivalent Archiving

Ruben Riestra, SHAMAN Coordinator, Matthias Hemmje, Prof. Dr.-Ing.


The SHAMAN Integrated Project sets out a framework integrating advances in the data grid, digital library and persistent archives communities in order to create an innovative preservation environment which may be used to manage the storage, access, presentation, and manipulation of potentially any digital object over long periods of time.


To achieve this, SHAMAN will establish an Open Distributed Resource Management Infrastructure Framework enabling Grid-based Resource Integration, that is firmly grounded in a conceptual and technical reference architecture offering a more complete set of features supporting digital preservation than contemporary systems/approaches.


The talk will introduce the first instance of SHAMAN's Digital Preservation Reference Architecture and Framework as a next-generation Digital Preservation Framework. SHAMAN will provide and trial this framework in three prototype applications. These exemplary use cases will demonstrate the viability, advantages, and potential impacts of taking up the new technology derived from SHAMAN. The talk will therefore also give an overview of the three application trials.


-------




Towards Physical and Digital Archive and Preservation at the U. of the Witwatersrand

Derek Keats, Deputy Vice Chancellor, Knowledge and Information Mgt., U. of the Witwatersrand


The University of the Witwatersrand, Johannesburg (Wits) has 28,000 students, and is located at the centre of the industrial heartland of South Africa. The University has a rich collection of historical materials and is often the preferred site for materials that have previously been held in corporate and private collections. In early 2009, Wits created a deputy vice chancellor portfolio for Knowledge and Information Management (KIM), which includes the traditional ICT portfolio, Management Information, eLearning and the Library. Archiving and preservation is thus part of the KIM portfolio, and the inclusion of ICT and eLearning within the portfolio creates opportunities for technology synergy that can facilitate digital preservation and archiving. This paper documents the design of a private cloud in relation to creating synergy among various IT initiatives in order to deliver a cutting-edge archiving infrastructure based on Free and Open Source Software (FOSS). The paper also details the role of a FOSS-based software engineering collaboration in Africa in producing software innovations, and uses it to suggest that collaboration in digital repository space could produce similar innovations. An emerging physical archiving vision is also discussed.


------


Family Search Overview, The Church of Jesus Christ of Latter-Day Saints

Gary Wright, Preservation Product Manager, Family Search


While the ISO Reference Model for an Open Archival Information System (OAIS) provides an excellent architecture to address many aspects of digital preservation, it was developed from the perspective that consumer access is to be satisfied exclusively from archival storage. This approach works well for many applications, but is not practical for a large-scale operation delivering high-volume access through the Internet. This paper summarizes how FamilySearch has extended the OAIS Reference Model in order to provide viewable genealogical records and other online services to millions of Internet users.



Bio:


Gary Wright is a Senior Product Manager at FamilySearch, the world’s largest family history organization. Prior to joining FamilySearch, he had an extensive career in marketing, program management, and solution development with leading storage companies such as IBM, StorageTek, and HP.



------


Next Generation Storage at Penn State

Mark Saussure, Director, Digital Library Infrastructure. Ben Grissinger, Storage Services Lead, Digital Library Technologies


The exponential growth of data and the ever-increasing associated storage requirements have significantly changed the landscape of how higher education saves and manages digital assets. Regulatory requirements, research data explosions, and archiving of native digital data only compound the problem. Information Technology Services at Penn State will present our strategic storage service solutions, address the need to access data for as many as 72 years, and discuss the benefits of our solution: elimination of vertical information silos, enhancement of information access management and development of an economically sustainable storage infrastructure.


------


Improving Inter-Institutional Preservation

Tyler Walters, Associate Director, Technology & Resource Services, Georgia Institute of Technology, David Minor, UCSD


Georgia Tech and the San Diego Supercomputer Center are embarking on a project to develop tools and methods to automate the exchange of data, in particular large data collections, between the MetaArchive Cooperative (LOCKSS-based) and Chronopolis (Storage Resource Broker-based, which will be migrating to iRODS in 2010). The project builds on these two mature digital preservation frameworks (LOCKSS and iRODS) by examining the best methods for exchanging data between them. The project also will be informed by the new curation micro-service technologies (e.g., BagIt and others) developed at the California Digital Library (CDL). A highly robust, easy-to-use data transfer system for these preservation systems will be created, allowing digital objects to be shared between several major preservation networks in the U.S.
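
For readers unfamiliar with BagIt, which the abstract above mentions, the following Python sketch shows the basic fixity idea behind exchanging a collection between two systems: the sender records a checksum for every payload file, and the receiver recomputes the checksums after transfer. This is an illustration only, not the project's tooling. The function names `write_manifest` and `verify_manifest` are hypothetical, and the sketch covers just the `data/` payload directory and a `manifest-sha256.txt` file in the "checksum, then path" line layout that BagIt uses; it omits the other tag files a complete bag requires.

```python
import hashlib
from pathlib import Path


def write_manifest(bag_dir: Path) -> Path:
    """Write manifest-sha256.txt with one 'checksum  path' line per payload file."""
    lines = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # Paths are recorded relative to the bag root, with forward slashes.
            rel = path.relative_to(bag_dir).as_posix()
            lines.append(f"{digest}  {rel}")
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(bag_dir: Path) -> list:
    """Return payload paths that are missing or whose checksum no longer matches."""
    mismatches = []
    manifest = bag_dir / "manifest-sha256.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel = line.split(maxsplit=1)
        target = bag_dir / rel
        if not target.is_file():
            mismatches.append(rel)
        elif hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            mismatches.append(rel)
    return mismatches
```

In this sketch the sending side would call `write_manifest` before transfer and the receiving side would call `verify_manifest` afterwards; an empty list means every payload file arrived intact.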



-----


Permanent Objects, Evolving Services, and Disposable Systems: An Emergent Approach to Digital Curation Infrastructure

John Kunze, Preservation Technologist, California Digital Library


The California Digital Library (CDL) is exploring a new emergent approach to its digital curation infrastructure. This approach is based on devolving repository function into a set of independent, but interoperable, micro-services, where the complex function necessary for effective curation emerges from the strategic combination of individual atomistic services. Since the emphasis is on the persistence of managed content, rather than the systems in which that management occurs, the paradigmatic archival culture is not unduly coupled to any particular technological context. This provides a curation environment that is comprehensive in scope, yet flexible with regard to local policies and practices and the inevitability of disruptive evolution of technology and user expectation. This talk provides an overview of work to date.


Bio:


John Kunze is a preservation technologist for the California Digital Library and has a background in computer science and mathematics. His current work focuses on archiving websites, creating long-term durable information object references using ARK identifiers and the N2T resolver, and promoting lightweight "Kernel" metadata. He contributed heavily to the standardization of URLs, Dublin Core metadata, and the Z39.50 search and retrieval protocol. In an earlier life he created UC Berkeley's first campus-wide information system, which was an early rival and client of the World Wide Web. Before that he was a BSD Unix hacker whose work survives in today's Linux and Apple systems.




-----




Columbia U. Digital Library Architecture

Robert Cartolano, Director, Library IT Office, Columbia U.


Columbia University Libraries/Information Services has developed an institutional repository and digital library long-term archive with a Fedora repository backed by a Sun SAM-FS storage system. This presentation will review the design and implementation of the new sustainable storage architecture, the challenges of deploying an off-site disk mirror at the NYSERNet Data Center, migration from DSpace to Fedora for our Academic Commons institutional repository, and tools developed for ingest, curation and support on the Fedora platform.



------


New Software for the Florida Digital Archive: DAITSS 2.0 Architectural Overview

Priscilla Caplan, Assistant Director for Digital Library Services, Florida Center for Library Automation


Abstract


The Florida Digital Archive has used a locally-developed preservation repository application called DAITSS since its inception in 2005. DAITSS was designed to be strictly OAIS-conformant, to allow many different user institutions to control their own content, and to support the preservation strategies of format normalization and forward migration. After three full years of operation and the ingest of more than 10 TB and 50,000 SIPs, we decided in 2008 to rearchitect the software. Surprisingly, the original functionality built into DAITSS 1 proved to be mostly adequate, and only a few changes were desirable. The program architecture, however, required redesign to allow increased throughput, easier code maintenance, and simpler daily operations. The new DAITSS 2 software is now partially implemented and will be complete in early 2010. It is a set of relatively simple, RESTful Web services, each of which can be run and tested independently. It is designed to make it simple to add support for new file formats and to incorporate externally written tools. It is fully PREMIS-conformant and implements the complete PREMIS Data Model. This presentation will provide a high-level description of DAITSS 2.0, summarizing the functionality of the various Web services and showing how they fit together to implement a flexible, high-performance, low-maintenance preservation repository application.


-----


Data-Intensive Environmental Research: Re-envisioning Science, Cyberinfrastructure, and Institutions

TBD, California Digital Library

Posted at 11:19PM Aug 28, 2009 by Art Pasquinelli in Sun
