Georg Edelmann's Weblog

Wednesday Apr 02, 2008

This week, I have the pleasure of attending Open Repositories 2008 in Southampton, UK. Labelled the "International Conference of Open Repositories", it brings together researchers, scientists, repository managers, software architects and IT providers from all over the world. From our part of the industry, I've met folks from Sun, Microsoft and HP. This is a very mixed crowd, and I'll talk about this aspect a bit later.

Over the next few blog entries, I would like to share some of my impressions from the show.

Tuesday morning's keynote was given by Peter Murray-Rust from Cambridge University, UK. My biggest takeaway was his point that the PDF format is bad, even evil, when it comes to mining document content with machines. He promoted the idea of using Microsoft Word document formats instead. I wonder if his intent was to support the adoption of any industry-standard format, e.g. ODF... or maybe just anything better than PDF.

The following track of sessions was called "Web2.0", which surprised me as a topic for this kind of show to such an extent that I attended all the presentations. The first one was on the issue of inter-repository authentication via OpenID. Interesting talk. The project is called Connotea. I took an action item to find out how this all plays with Sun's Identity Management solutions.

The next talk was about project SNEEP ("Social Networking Extensions for Eprints") from Richard Davis of the University of London. The notion here was to use Web2.0 practices like blogs, comments, bookmarks and tagging for content in repositories. Why? Mainly, to enrich the material by encouraging the participation of the users. Interesting thought, but I wonder if the community is ready to make a leap forward here.

A note about the domain of institutional repositories. These are the people who are in charge of ingesting, maintaining, managing, and preserving a sometimes phenomenal amount of data. One customer I spoke to talked about 80 million objects (think images, theses, research papers, experimental data, etc.), stored within 9200 distinct repositories. When asked about the storage capacity he required, he mentioned "multiple petabytes", growing at steep rates. Wow.

The first day of the show ended with a preamble to the poster sessions called "Minute Madness". Every poster session presenter got 60 seconds to introduce the topic of his/her poster. There was a clock counting down behind the presenter, which turned red when going negative. It was fun to watch the presenters trying to cram as much as possible into their 60 seconds.

The subsequent poster session was an informal gathering of the attendees over a beer. I spoke to the folks who developed Manakin, a tool to develop user interfaces with DSpace. Pretty powerful stuff. I was impressed by the flexibility Manakin provides to users when customizing their web interface to a DSpace repository. There's a demo here.

Sun had a poster stand that demonstrated Honeycomb's fit as a data repository platform. We got tons of interest. Gail Truman, our Honeycomb product manager, was visibly exhausted. Well done, Gail.

Day one ended with a quiet dinner at our hotel.
