OR2008, the Third International Conference on Open Repositories, began today and I’ll be posting my session notes to the LL Blog. It’s a bigger conference than I imagined. There are 486 delegates from 35 countries, with about a third from the UK, a third from Europe, a quarter from the USA and the remaining 10% from Asia, Australia and New Zealand. The conference halls are packed and there is even an overflow room where people watch the main sessions via a video feed. Not me! I get to the conference sessions early in order to get a decent seat.
The Conference website has the full programme in PDF format. True to form, there’s also a repository of conference papers and presentations.
The keynote speaker discussed the requirements of the scientific community, arguing that open access to scientific data, along with its interpretation and display, is essential for these repository users. He said that the Protein Data Bank was an excellent example of such a repository. These repositories need to be embedded into the ‘white coat’ researcher’s daily work and not just be places to deposit finished articles. He told us that researchers are not prepared to change the way they work and that students should be trained in good information management as they are the future of research. He pointed to several ‘open’ endeavours, such as ‘science commons’, the ‘Open Knowledge Foundation’, ‘open data’ and ‘open science’.
He reminded everyone that even the simplest of processes are complex to map. Because these processes are integral to authoring, capturing and integrating them digitally without intruding on the work itself is the most difficult of challenges. A worthwhile challenge, nonetheless, as 90% of scientific research data is, apparently, lost. The ICE-RS Project is one such endeavour.
Of course, this is something businesses are also concerned with and is the stuff of Enterprise Content Management Systems (ECMS). I do wonder whether it would be a useful project to study and report on current commercial and open source ECMS solutions, as there seems to be little to no overlap between academic content management systems and repositories and those used in the corporate world.
Interestingly, he mentioned the use of Subversion as a way of backing up and versioning research, something I’m familiar with in software development but hadn’t thought about using for version control of my own work. Something to look into…
Hmm.
A lot to take in here! I think the real problem with Content Management Systems is more cultural than technical. The reason so much data gets lost is that it’s not seen as central to the “results”, which, in a “cathedral” culture, are what’s seen as important. (Yes, I’ve been taking advantage of the quiet few days to read Eric Raymond – but I haven’t yet managed to reconcile his arguments, which seem sound, with the reality of the results-driven capitalist culture. I’d be very interested in an open source approach to paying my mortgage! And I wonder how he paid his!)
If I look at my own doctoral research, I have about 5 megabytes of data over the past 5 years – some of which is interview transcripts, some notes on journal articles, some drafts of articles, some early attempts at coding and so on. I can see that this would be of value to future researchers. Yet, from my selfish perspective, all anyone is interested in is the finished thesis and my ability to defend it. Or at least that is what I have been encouraged to believe.
There is a great deal of work to be done in tagging all this data – OK I guess there wouldn’t be if I’d started from the beginning, but why would I? I think what I’m getting at is that for the bazaar model to work, you need a community to bounce your ideas off – to shop at your stall as it were. But much research is incredibly esoteric and researchers tend not to believe that it will be of interest to anyone else.
So I suppose essentially I’m wondering about how we achieve a level of critical mass sufficient to encourage people to engage with a CMS. For most researchers there isn’t a “hacker” community after all. One suggestion that crosses my mind is related to the fact that I have actually kept all this data – so I clearly do feel the need to manage my own content.
You’re right about version control too. It’s something you learn the need for very quickly when working on a major written project – but I’ve never really learnt how to do it properly. I’ll have to have a look at Subversion.
Enjoy the rest of the conference!
Interesting,
These discussions can take all sorts of directions. If we start by looking at the value of ‘the finished product’ in a research process, one begins to wonder if the intended and unintended consequences of such a perspective for repositories are well thought out.
The finished product in a piece of research is, in my experience, a systematically nested work using data and analytical frameworks. This is considered the knowledge (or product) of the research endeavour. This kind of thinking places the research process on a continuum whereby ‘data’ is woven into ‘information’ and furthermore into ‘knowledge’ and maybe other things.
There may be two intriguing presumptions here: one is that repository users have the same (benevolent) purposes; two is that the substance and value of the data has the same level of sensitivity (commercial, social, or scientific).
Another perspective is that of ‘the knowledge economy’. The capitalist proposal of a knowledge economy suggests that the value of knowledge has increased to such an extent that there is no need to have huge factories. Rather, energies are invested in knowledge production, which in turn is sold in various forms for various purposes. This justifies the increase in value and currency of knowledge. You can infer what this means (Julian highlights this: ‘I do not know how he pays his mortgage’).
Alternatively, one could also say that technology creates the opportunity for ‘proliferation’. Free/cheap access reduces the digital/economic divide and increases world productivity etc.
I am just wondering where educational developers stand on this. What is their purpose vis-à-vis the above two perspectives?