Day 2 started with a series of extremely interesting presentations on national and international perspectives on open repositories, followed by six talks describing a wide selection of scientific repositories. The afternoon was occupied by talks about models, architectures and frameworks, and a session on usage. I'll try to pick a representative sample of the day here.
Simon Coles from the University of Southampton gave a fun-packed talk about his experiences with repositories and blogs in laboratories. His R4L project aims to address the gap between actual laboratory experiments and the publication of papers. I picked up quite a bit of context about academic life from his talk. Here's one example. Simon said: "40 years ago a PhD student would determine 3 crystal structures during the course of their study, this can now be done in one day." Now, that's what we call a data explosion!
Christian Gumpenberger, Novartis, gave the audience a deep understanding of the trials and tribulations of introducing an Eprints-based pharmaceutical repository corporate-wide at Novartis. His talk stood out for me, as it was one of the very few sessions in which a commercial entity took it upon itself to organize its knowledge in a repository. Project OAK (Open Access to Knowledge) was a master class in how to navigate the corporate world when it comes to implementing a central knowledge data bank. Challenges were many, most of them in keeping the project going after a successful start. On the technology side, Eprints was, according to Christian, the right choice for Novartis. A good thing to say when you present at the "Home of Eprints".
I then jumped onto the "Models, Architecture and Framework" track for two sessions. One of them was a presentation by Herbert van de Sompel, Los Alamos National Laboratory (LANL), on the aDORe Federation Architecture. This was a brilliant talk. Herbert explained how LANL designed and implemented an architecture to federate repositories for scale. Scale at LANL means 100 million objects in 9200 repositories. Massive scale, I'd say. Tons of ideas popped into my mind here. I could see how one could build hardware platform building blocks that would support the idea of scaling repositories by federation in a completely transparent manner.
My last session of the day was entitled "MESUR: Implications of usage-based evaluations of scholarly status for open repositories" by Johan Bollen, Los Alamos National Laboratory. Going by the show brochure alone, this looked like a less interesting topic. Statistics and numbers, right? Not so. Johan is a skilled presenter, and his fast-paced talk was a blast. The project mined a wide selection of journals and used their citations to build a graphical model of how the publications (and therefore the sciences) interconnect. Very interesting. For me, their work was one of the best visualisations of huge datasets I have ever seen. Check out the project's website.
Before I forget, I also attended the Microsoft session from Lee Dirks, Santosh Balasubramanian and Savas Parastatidis. We met the guys in the hotel earlier the same day, and got talking. They are working on a research project around using Microsoft technologies for repositories. With a platform built on top of Microsoft SQL Server, Santosh and Savas showed a series of impressive demos centered around the ease of development for repository software. From what I have seen during the last couple of days, this is probably the most complete development environment, even at this early stage of the project. It does require the developer to stay within the well-padded Microsoft environment, and as the question and answer session illustrated, cross-platform (read non-Windows) deployment does present a challenge. What did surprise me was the presenters' sincere commitment to being open. Have the winds shifted to a more open-source attitude at Microsoft? I wondered.
This was a long day. Off to the pub for some well-deserved pints of London Pride.