Session 5: Legal

Grace Agnew, from Rutgers University Libraries, presented over 40 PPT slides on Digital Rights Management. Her book is due to be published later this year. It’s still not clear to me why we need DRM in open access repositories. Surely this conference is an opportunity to promote the benefits of Copyleft. A simple way of managing the rights to academic research, which costs nothing, is to attach a Creative Commons license to the work. It’s what software developers have been doing with the similar GPL license for 20 years, to great reward. Of course, this ignores the issue that for most academics the IPR ultimately belongs to the University, and it’s at that level that the discussion needs to be had. An academic who deposits their work in a repository and chooses to license it under a Creative Commons license may be forgetting that they do not own the work in the first place. In practice, academics are usually free to publish their work as they wish, but the explicit application of a Copyleft license is, unfortunately, not a guaranteed right.

Next, Brian Fitzgerald, from QUT Law School, discussed the OAK Law Project. The project looks fascinating and notably links to the Creative Commons initiative in Australia. It’s a shame that Brian didn’t talk about that and its relevance to open access repositories.

Finally, Jenny Brace, from the Version Identification Framework Project, presented the results of their project. By this point, the microphone in the auditorium had stopped working and I couldn’t hear very much, which was a shame, as it’s an important and interesting area of study and something I’ve had to deal with ever since working in Collections Management at the NFTVA, where the correct ‘versioning’ of TV and film materials was a constant issue. ‘Version’ means different things to different groups of people.

Session 4: National & International Perspectives

Arjan Hogenaar & Wilko Steinhoff, from KNAW, gave a presentation on AID, a Dutch Academic Information Domain. I’ll be honest and admit I didn’t pay much attention to this as I was writing up my blog notes for Session 3. Follow the hyperlinks for more information.

I was able to concentrate on the next two presentations, which were both interesting and relevant to our work at Lincoln. The first of these was by Chris Awre, from the University of Hull, who is working on EThOS, a joint project between several HE institutions and the BL to provide a central repository service for e-theses produced in the UK. The idea is that the BL will harvest e-thesis-specific UK ETD metadata provided by University repositories to create a single point of access to this type of academic output. Interestingly, the business model for this is a subscription service, whereby universities are expected to pay for the harvesting of metadata and the digitisation of hard-copy theses when they are requested. The content itself is Open Access (search, download), financially supported by the paid-for harvesting and digitisation service. It’s always interesting to see how people are creating new business models based on freely giving a product away. I hope it’s a success.
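
As far as I understand it, the harvesting model described here is essentially OAI-PMH. A rough sketch of what a harvester like the BL’s might issue against a university repository (the endpoint URL, set name and metadata prefix below are my own placeholders, not EThOS specifics):

```python
# Minimal sketch of an OAI-PMH harvest, the kind of request a central service
# might issue to pull e-thesis metadata from a university repository.
# The base URL, set name and metadataPrefix are illustrative placeholders.
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

BASE_URL = "https://eprints.example.ac.uk/cgi/oai2"  # hypothetical repository endpoint

params = {
    "verb": "ListRecords",
    "metadataPrefix": "oai_dc",  # EThOS would harvest a richer e-theses profile
    "set": "theses",             # assumes the repository exposes a 'theses' set
}

with urlopen(f"{BASE_URL}?{urlencode(params)}") as response:
    tree = ET.parse(response)

ns = {"oai": "http://www.openarchives.org/OAI/2.0/"}
for record in tree.findall(".//oai:record", ns):
    print(record.findtext(".//oai:identifier", namespaces=ns))
```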

The third presentation was by Vanessa Proudman, from Tilburg University and the DRIVER Project. This was excellent, not least because of the rare clarity of presentation but also because the research findings are directly relevant and useful to us at Lincoln as we embark on establishing a repository service in the University. Vanessa looked at the challenges we face in populating our repositories and suggested key methods of increasing the number of deposits, noting that even with a Mandate, the deposit rate is only 40-60%. This work is published as part of a new book (chapter 3), which, naturally, can be downloaded here. Upon return to work, I intend to look at this in detail and begin drafting a plan for the next phase of our repository project, which is to establish an Open Access Mandate at the University and begin the important advocacy work within the Faculties.

Session 3: Interoperability

The final three presentations of the day focussed on interoperability. The first two specifically discussed ways to make it easier for users to deposit materials into repositories. Julie Allinson, from the SWORD Project, discussed the work they have done and the use of the Atom Publishing Protocol as a framework for developing a derivative SWORD deposit profile. The presentation finished by noting that a NISO standard and tools are being developed for this same purpose, and it is hoped that they take into consideration the work done by the SWORD project.
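
To get the idea of a SWORD deposit straight in my own head: it boils down to an HTTP POST of a packaged file to a collection URI advertised in an AtomPub service document. A minimal sketch, assuming a SWORD 1.x-style endpoint; the URL, credentials and packaging value are my own placeholders, not anything shown in the presentation:

```python
# Rough sketch of a SWORD (AtomPub-based) deposit: POST a packaged file to a
# repository collection URI. The endpoint, credentials and packaging value are
# assumptions for illustration only.
import base64
from urllib.request import Request, urlopen

COLLECTION_URI = "https://repository.example.ac.uk/sword/deposit/articles"  # hypothetical
USER, PASSWORD = "depositor", "secret"

with open("article-package.zip", "rb") as f:
    package = f.read()

auth = base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()

request = Request(
    COLLECTION_URI,
    data=package,
    headers={
        "Content-Type": "application/zip",
        "Content-Disposition": "filename=article-package.zip",
        "X-Packaging": "http://purl.org/net/sword-types/METSDSpaceSIP",  # SWORD 1.x packaging hint
        "Authorization": f"Basic {auth}",
    },
    method="POST",
)

with urlopen(request) as response:
    # A successful deposit returns an Atom entry describing the new item.
    print(response.status, response.read()[:200])
```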

Scott Yeadon, from the Australian National University, gave a presentation on tools which the RIFF Project have developed for DSpace and Fedora to facilitate easier deposit of content into these repositories. Their work took real-world examples of content to deposit and developed a submission service, a METS content packaging profile and a dissemination service.
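
I hadn’t looked closely at METS before. Very roughly, a METS package is an XML wrapper that brings together descriptive metadata, a file inventory and a structural map. The skeleton below is only a generic illustration of that shape, not the RIFF profile itself:

```python
# Generic skeleton of a METS content package (not the RIFF profile):
# descriptive metadata, a file inventory and a structural map in one XML wrapper.
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

mets = ET.Element(f"{{{METS_NS}}}mets")

ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", ID="dmd1")  # descriptive metadata would go here

file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", USE="content")
f1 = ET.SubElement(file_grp, f"{{{METS_NS}}}file", ID="file1", MIMETYPE="application/pdf")
ET.SubElement(f1, f"{{{METS_NS}}}FLocat", {f"{{{XLINK_NS}}}href": "article.pdf", "LOCTYPE": "URL"})

struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap")  # ties files into a logical structure
div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", TYPE="article")
ET.SubElement(div, f"{{{METS_NS}}}fptr", FILEID="file1")

print(ET.tostring(mets, encoding="unicode"))
```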

Both the SWORD and RIFF Projects demonstrated working examples of their services, albeit in early form. The main question remaining is whether they will be adopted beyond the confines of the project. Part of project work is research and development, but a significant part is also the marketing of the results of the project, for which OR2008 is clearly an important venue.

Finally, Dean Krafft, from Cornell University, presented NCore, a wide range of open source tools for creating digital repositories. Much bigger in scale than the previous two projects, the NCore platform is notable for being released on SourceForge as a community project. It also has guaranteed funding until 2012, suggesting that even greater work is to come. It’s basically a suite of software tools and services built around the Fedora repository, developed to manage millions of objects, initially at the National Science Digital Library (NSDL). It was an excellent presentation of what appears to be a successful project and set of products. Building a Fedora repository requires a higher investment of resources than installing DSpace or EPrints, and projects which use this platform, although often complex and difficult, tend to produce very interesting and impressive results.

Session 2: Social Networking

Carrying on from the morning’s Web 2.0 session, in the afternoon I attended a session on how social networking tools are being developed for and integrated into repositories.

Jane Hunter, from the University of Queensland, discussed the HarvANA project, a system which supports and exploits repository users’ tags, comments and other annotations through the development of separate collections of user-contributed metadata. It seems like an interesting and ultimately useful idea, acknowledging the ‘added value’ that user annotations can bring to repository objects. Significantly, users can annotate sections of text, images and other media, allowing annotations to be created for parts of the repository object rather than just the whole.
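
The detail that stuck with me is annotating parts of an object rather than the whole. Conceptually, that just means each annotation records a target plus an optional fragment or region selector and is stored separately from the object itself. A toy sketch of the idea (my own illustration, not HarvANA’s actual data model):

```python
# Minimal sketch of the "annotate a part, not the whole" idea: each annotation
# stores the target object plus an optional fragment/region selector, and the
# annotations live in their own collection, separate from the repository object.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Annotation:
    target: str                      # URI of the repository object
    body: str                        # the tag or comment text
    creator: str
    fragment: Optional[str] = None   # e.g. "xywh=120,80,300,200" for an image region

annotations: list[Annotation] = []

annotations.append(Annotation(
    target="https://repository.example.ac.uk/id/eprint/42",  # hypothetical object
    body="Figure 2 contradicts the claim in section 4",
    creator="jbloggs",
    fragment="xywh=120,80,300,200",  # annotate just a region of an image
))

# A simple query over the separate annotation store:
for a in annotations:
    if a.target.endswith("/eprint/42"):
        print(a.creator, "annotated", a.fragment or "the whole object", "-", a.body)
```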

David Millard, from the University of Southampton, presented the Faroes project, a development of EPrints for teachers wishing to deposit learning resources. He said that their experience on previous projects had shown that users neither wanted nor needed content packaging standards, and that repository user interfaces should provide functionality similar to services such as Flickr and YouTube. Their project aims to provide a simple, attractive interface to EPrints (called ‘PuffinShare’) aimed at teachers sharing documents, images and other single files (or ‘learning assets’), rather than packages of learning objects. It looked like a great project, highlighting some of the challenges we’ve faced on the LIROLEM project, and one which I think we would be interested in trying. A public beta is due this summer. He pointed out that the growth of Web 2.0 is due to the popularity of personal services (Flickr, YouTube, Delicious), which also have an optional, additional social value.

Carol Minton, from the National Science Digital Library, discussed the work they have done on embedding Web 2.0 applications, such as MediaWiki and WordPress, into their repository service. Essentially, they have created services that link blog articles and wiki pages to repository objects, enriching the objects with these community ‘annotations’.

Session 1: Web 2.0

Ian Mulvaney, from Nature Publishing, gave a presentation on Connotea. He discussed how their earlier ‘Tagging Tool’ EPrints plugin required repository users to register and sign in to Connotea in order to use the service from participating repositories. This, they found, created a barrier to entry, which he thinks the use of OpenID and OAuth may overcome.

Richard Davis, from the University of London Computer Centre, gave a presentation on SNEEP, the JISC project to develop Web 2.0 plugins for EPrints. They are developing Comments, Bookmarks and Tags (CBT) plugins, which we’re actually going to be using in one form or another in our own repository at Lincoln. He raised the question of whether we really need this functionality in our repositories, and I’d argue that it should be there, or else repositories remain read-only alternatives to publishing. With a ‘user space’ for commenting, bookmarking and tagging, an informal method of peer review is introduced that could mature into something very valuable.

Daniel Smith, from the University of Southampton, presented Rich Tags, a web application for cross-browsing repositories. It uses the mSpace faceted browser to explore multiple repositories in an interface similar to iTunes. It was a bit heavy on resources when I loaded it in my browser, but it provides a more enjoyable interface than the default EPrints UI, with the added benefit of searching more than one repository.
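
For anyone unfamiliar with the term, ‘faceted’ browsing just means narrowing a merged result set by independent metadata dimensions (repository, year, subject) rather than by keyword search alone. A toy sketch of the idea, with invented records rather than anything from Rich Tags:

```python
# Toy illustration of faceted browsing across more than one repository:
# filter a merged record set by independent metadata facets. Records are invented.
records = [
    {"title": "Thesis A", "repository": "Lincoln", "year": 2007, "subject": "Architecture"},
    {"title": "Paper B",  "repository": "Hull",    "year": 2008, "subject": "Chemistry"},
    {"title": "Paper C",  "repository": "Hull",    "year": 2007, "subject": "Architecture"},
]

def facet_counts(items, facet):
    """Count how many records fall under each value of a facet."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts

def narrow(items, **selected):
    """Keep only the records matching every selected facet value."""
    return [i for i in items if all(i[k] == v for k, v in selected.items())]

print(facet_counts(records, "repository"))           # {'Lincoln': 1, 'Hull': 2}
print(narrow(records, year=2007, subject="Architecture"))
```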

Open Repositories 2008 Conference

OR2008, the Third International Conference on Open Repositories, began today and I’ll be posting my session notes to the LL Blog. It’s a bigger conference than I imagined. There are 486 delegates from 35 countries, with about a third from the UK, a third from the rest of Europe, a quarter from the USA and the remaining 10% from Asia, Australia and New Zealand. The conference halls are packed and there is even an overflow room where people watch the main sessions via a video feed. Not me! I get to the conference sessions early in order to get a decent seat.

The Conference website has the full programme in PDF format. True to form, there’s also a repository of conference papers and presentations.

The Keynote speaker discussed the requirements of the scientific community, arguing that open access to scientific data, and its interpretation and display, is essential for these repository users. He said that the Protein Data Bank was an excellent example of such a repository. These repositories need to be embedded into the ‘white coat’ researcher’s daily work and not just be places to deposit finished articles. He told us that researchers are not prepared to change the way they work and that students should be trained in good information management as they are the future of research. He pointed to several ‘open’ endeavours, such as ‘Science Commons’, the ‘Open Knowledge Foundation’, ‘open data’ and ‘open science’.

He reminded everyone that even the simplest of processes are complex to map and that, because these processes are integral to the authoring process, trying to capture and integrate them digitally without intruding on the work itself is the most difficult of challenges. A worthwhile challenge, nonetheless, as 90% of scientific research data is, apparently, lost. The ICE-RS Project is one such endeavour.

Of course, this is something businesses are also concerned with and is the stuff of Enterprise Content Management Systems (ECMS). I do wonder whether it would be a useful project to study and report on current commercial and open source ECMS solutions, as there seems to be little to no overlap between academic content management systems and repositories and those used in the corporate world.

Interestingly, he mentioned the use of Subversion as a way of backing up and versioning research, something I’m familiar with in software development but hadn’t thought about using for version control of my own work. Something to look into…
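
If I do try it, the basic workflow is simply: check out a working copy once, then commit a snapshot after each round of changes. A sketch of how that might be scripted (the repository URL and file names are placeholders):

```python
# Sketch of the basic Subversion workflow for versioning one's own documents,
# scripted from Python. The repository URL and paths are placeholders.
import subprocess

REPO_URL = "https://svn.example.ac.uk/repos/jw/papers"  # hypothetical SVN repository
WORKING_COPY = "papers"

def svn(*args):
    """Run an svn command and stop if it fails."""
    subprocess.run(["svn", *args], check=True)

# One-off: check out a working copy of the repository.
svn("checkout", REPO_URL, WORKING_COPY)

# Day to day: add any new files, then commit a snapshot with a message.
svn("add", f"{WORKING_COPY}/or2008-notes.txt")
svn("commit", WORKING_COPY, "-m", "Conference notes, first draft")
```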