Scholarly publishing with WordPress

Working on the JISCPress project, I’ve been thinking quite a lot about scholarly publishing on the web, and in particular with WordPress. This morning, I read a post over on the ArchivePress blog about some WordPress plugins which are useful additions for creating a scholarly blog and it got me thinking a bit more about what features WordPress would need to support scholarly publishing.

JISCPress does away with the idea that WordPress is a blogging tool, and instead uses WordPress Multi-User as a document publishing platform, where one site or ‘blog’ is a document. The way WPMU is structured means that despite serving multiple (potentially millions of) document sites, the platform remains relatively ‘lightweight’, as each document site generates just a handful of additional database tables while sharing the same administrative core as a single WordPress install. So, running 100 WordPress blogs on WPMU is nothing like running 100 separate WordPress blogs, in terms of both resource requirements and administration. In fact, quite soon there will be no such thing as WPMU, as the two products are going to be merged; because they already share 90%+ of the same code, that should not be too difficult to achieve. ((Has anyone done a diff on the two code bases to measure exactly what percentage of the code is shared between WP and WPMU?))

Anyway, my point here is to discuss whether WordPress can be extended to accommodate most conventions found in scholarly publishing and, where it is lacking, to identify the development work required to meet the needs of most academics who wish to write on and publish to the web. ((Actually, I think I’ll save the discussion of its shortfalls for my next post. This one is already long enough.))

Scholarly publishing extends to a wide variety of published outputs. As a Content Management System (CMS) and technology development platform, I believe that WordPress has the potential to support any type of scholarly publishing that the web supports. It is extremely extensible, as can be seen from the 6000+ plugins that are available. However, what I’m interested in is what can be done now, by an academic wishing to publish their work using WordPress as a CMS. What can be achieved with a few quid ((I pay $5/year for my domain name and as many sub-domains as I need. I pay $10/month for my hosting with unlimited storage and bandwidth.)) to self-host WordPress so that a few plugins can be installed and a well-structured, typical scholarly paper can be published?

My Dissertation

For some time, I’ve been meaning to publish my MA dissertation. Back in 2002, I undertook some unique research which has not, to my knowledge, been repeated and I think there is some value in having it easily accessible on the web. I have an OpenOffice file and a PDF and, in the course of a morning, have published it under my own domain. The reason I did not publish it on the university WPMU platform is that I have been experimenting with different plugins and did not want to install plugins that were untested or that we may not support long-term. In this case, I’ve used a single WordPress installation, but ideally an individual researcher, group of researchers or research institution would run a WPMU installation which allowed multiple documents to be authored individually or collaboratively ((Like any decent CMS, WordPress supports role-based authoring and editing and maintains a revision history of edits, auto-saved once per minute. Revisions can be compared alongside each other.)) and published directly to the web as XHTML.

BuddyPress, by the way, can make the experience even more natural, not only because it is based around a community of like-minded people writing together  on the same web publishing platform, but also because, with a few tweaks here and there, we can move away from the language of blogs and towards the language of documents.


BuddyPress admin bar

Profile menu

Enough of BuddyPress on WPMU for now and back to my dissertation. I set up the site in ten minutes, without using FTP or a command line, because I use a host that provides a one-click install of WordPress, and WordPress allows you to search for and install plugins from its Dashboard. Once the site was installed, I made some basic changes to the settings, turning on XML-RPC and AtomPub so that, if I decided to, I could publish to the site using my word processor. ((On a scholarly WPMU installation, plugins could be pre-installed and activated, a default theme selected and settings tweaked so very little work is required by the academic author prior to writing her document.)) I didn’t use this in the end, but trust me, it works very well using recent versions of MS Word, OpenOffice (free) and other blogging clients such as MS Live Writer (free).
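For the curious, this is roughly what a blogging client does once XML-RPC is switched on. The sketch below is Python and only builds the request, it does not send it. The function name, credentials and blog ID of 1 are my own placeholders; metaWeblog.newPost is the MetaWeblog API call that WordPress exposes at /xmlrpc.php.

```python
import xmlrpc.client


def build_new_post_call(username, password, title, body_html, publish=False):
    """Serialise a metaWeblog.newPost request without sending it."""
    # MetaWeblog calls the post body "description".
    post = {"title": title, "description": body_html}
    # Assumption: blog ID 1, which is only a placeholder for a single install.
    params = (1, username, password, post, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


request_xml = build_new_post_call("author", "secret", "Chapter 1", "<p>Draft text.</p>")
print(request_xml)

# To send it for real (hypothetical URL and credentials):
# server = xmlrpc.client.ServerProxy("http://example.com/xmlrpc.php")
# server.metaWeblog.newPost(1, "author", "secret",
#                           {"title": "Chapter 1", "description": "<p>Draft text.</p>"}, False)
```

Passing publish=False should leave the section as a draft, which seems the safer default when authoring a document section by section.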

So, what are the common characteristics of an academic paper? What does WordPress have to support to provide functionality that meets most scholars’ publishing requirements? I scratched my head (and asked on Twitter) and came up with the following:

  • footnotes/endnotes
  • citations
  • use of LaTeX (sciences)
  • tables
  • images
  • bibliography
  • sub-headings
  • annexes
  • appendices
  • dedication
  • abstract
  • table of contents
  • index to figures
  • introduction
  • exposition
  • conclusion

Many of these are supported in WordPress by default and don’t require any additional plugins (tables, images, sub-headings, annexes, appendices, dedication, abstract, introduction, exposition, conclusion, are all either basic literary conventions or just part of a simply structured document).

For additional support, I installed digress.it, which we have funded through the JISCPress project. This is a WordPress plugin which allows readers to comment on the paragraphs of a document, rather than at the document section level. We’re adding a lot more functionality to meet the objectives of the JISCPress project, but I chose digress.it, principally for the reason that it is designed to turn a WordPress blog into a document site. I could have used any other WordPress theme, but digress.it automatically creates a Table of Contents and allows you to re-order WordPress posts when they are read so that you don’t have to author your document in reverse or adjust the publication dates so the document sections appear in the correct order.

My dissertation published using digress.it

I added the abstract for my dissertation to the ‘about’ page, so it shows up on the front of the site. I also uploaded a PDF version so that people can download it directly. You’ll see that I also added some links to a related book and DVD, which will certainly appeal to people who are interested in my dissertation. The links pull an image and some basic metadata from Amazon, using the Amazon Machine Tags plugin. This could be used to link to the book in which your article is published and earn you money in click referrals. An alternative would be the Open Book Book Data plugin, which retrieves a book cover and metadata from Open Library, where your book may already be catalogued. If it’s not on Open Library, catalogue it!

After setting this up, I installed a few more plugins:

Dublin Core for WordPress: Automatically adds ten Dublin Core metadata elements to the document mark up.

wp-footnotes: This allows you to easily add footnotes to your document by enclosing your footnote in double parentheses. ((I am using the plugin on this blog!))

OAI-ORE Resource Map: Automatically marks up the document sections with an OAI-ORE 1.0 resource map.

Google Analyticator: Adds Google Analytics support so you can collect statistics on the readership of your document.

WP Calais Archive Tagger: Analyses your entire document and automatically keywords each section, using the Open Calais API.

Search API: WordPress comes with search built in, but there is a new search API which will eventually make its way into the WordPress core. I’ve installed the plugin to provide full-text search across the document. It can also add Google Search to your document site.

wp-super-cache: This is simple to install and will significantly speed up your document site, making it a pleasure to navigate through and read 🙂

Plugins I didn’t use

wp-latex: Although I didn’t need it for my dissertation, it’s worth noting that WordPress supports the use of \LaTeX.

Academic Citation: You need to add a line of code to your theme for this to display. It supports the concept of an article being a single blog post, rather than a ‘document site’ and displays a variety of citation formats for readers to use.

Do you know of any other plugins for a scholarly blog?

The Beauty of Feeds

The other useful thing about managing a document using WordPress and in particular, using digress.it, is that you automatically get RSS/Atom feeds for the document. I’ve already discussed these in detail. It means that I was able to read my document in my feed reader, with footnotes and images displayed correctly.

Document in Google Reader

See how nicely the formatting is preserved. \LaTeX is also rendered correctly in feed readers.

Document formatted nicely in Google Reader
Reading my dissertation in Google Reader

You’ll see that the document sections are listed in order; that is, first section on top. As I noted above, blogs list posts in reverse (most recent first), so I ran the feed through Yahoo Pipes and sorted the items in ascending order. Yahoo Pipes exports RSS and it’s that feed that I subscribed to in Google Reader. Wouldn’t it be nice if I could import my document feed into an Institutional Repository? Wait a minute, I can! 🙂

Importing an RSS feed into EPrints

Click to see the item in the repository

When importing the default feed, the HTML output is accurate but in reverse order, while the RSS output from Yahoo Pipes didn’t import into EPrints very cleanly at all. I’ll work on this. UPDATE: Forget Yahoo Pipes. WordPress feeds can be sorted with a switch added to the URL: http://example.com/feed/?orderby=post_date&order=ASC
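To make the update concrete, here is a small Python sketch. The first function just adds the orderby and order switches to a feed URL. The second is a fallback of my own, not something WordPress or EPrints provides: it sorts the items of an RSS 2.0 document by pubDate, roughly what I was using Yahoo Pipes for. The function names and the sample feed are made up for illustration.

```python
from email.utils import parsedate_to_datetime
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
import xml.etree.ElementTree as ET


def ascending_feed_url(feed_url):
    """Return feed_url with orderby=post_date&order=ASC in its query string."""
    parts = urlsplit(feed_url)
    query = dict(parse_qsl(parts.query))
    query.update({"orderby": "post_date", "order": "ASC"})
    return urlunsplit(parts._replace(query=urlencode(query)))


def titles_oldest_first(rss_xml):
    """Fallback: list item titles of an RSS 2.0 document, oldest pubDate first."""
    channel = ET.fromstring(rss_xml).find("channel")
    items = channel.findall("item")
    items.sort(key=lambda item: parsedate_to_datetime(item.findtext("pubDate")))
    return [item.findtext("title") for item in items]


# Made-up two-item feed in the default (newest first) order.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Conclusion</title><pubDate>Wed, 15 Jul 2009 10:02:00 +0000</pubDate></item>
<item><title>Introduction</title><pubDate>Wed, 15 Jul 2009 10:00:00 +0000</pubDate></item>
</channel></rss>"""

print(ascending_feed_url("http://example.com/feed/"))
print(titles_oldest_first(SAMPLE_RSS))
```

Sorting at the source with the URL switch is the simpler option; the local sort is only worth having when the feed comes from somewhere that ignores those parameters.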

So there it is. An academic paper, published to the web using a modern CMS which supports most authoring and publishing requirements. I would favour an institutional WPMU platform for academics to author directly to, publish their pre-print to the web for open access and detailed comment, and import their RSS feed into the repository. As a proof of concept, I’m quite pleased with this. We are currently developing a widget that can be embedded in a web page or WordPress sidebar and allow a member of staff to upload a document or zipped folder of documents to the Institutional Repository. I wonder if we can also support the import of a feed from the widget, too?

So, what would your requirements be? Tell me and I’ll do my best to test WordPress against them.

There is a tension between being relevant and being reputable

…or ‘how to get read on the World-Wide-Web.’

This is a presentation about Search Engine Optimisation (SEO), but it is also about literacy and reputation in the age of the Internet. It is about how to understand and write well for the web so that like-minded people can learn about what you’ve got to say and be compelled to tell others about what you’ve got to say, too.

Although it’s not aimed at scholarly writing, that doesn’t matter. To Google’s crawlers, HTML source code is HTML source code, whether you publish articles about research into HIV or have something pointless to say about the latest gadget. No matter what the content is about there are literary as well as technical observations that can improve your communication and the impact of your writing.

Much of the presentation elaborates on this: “There is a tension between relevance and reputable.” It’s interesting.

The FolkSemantic Widget for OER discovery

I like this. A widget that analyses the content of your web page and suggests related Open Educational Resources (OERs), using FolkSemantic, a collaborative website that allows you to “browse and search over 110,000 OERs”. You can see the widget in the sidebar of this blog under the heading of ‘Related Educational Resources’ –>

So, if I dump a load of text relating to ‘physics’, say, you should see physics-related OERs… Does it work? 🙂 Some random tests on other blog posts suggest it is a bit hit and miss, but it is certainly matching some OERs to the content. I wonder if we could use this approach to find related documents on the JISCPress project?

Physics (Greek: physis – φύσις meaning “nature”) is a natural science; it is the study of matter[1] and its motion through spacetime and all that derives from these, such as energy and force.[2] More broadly, it is the general analysis of nature, conducted in order to understand how the world and universe behave.[3][4]

Physics is one of the oldest academic disciplines, perhaps the oldest through its inclusion of astronomy.[5] Over the last two millennia, physics had been considered synonymous with philosophy, chemistry, and certain branches of mathematics and biology, but during the Scientific Revolution in the 16th century, it emerged to become a unique modern science in its own right.[6] However, in some subject areas such as in mathematical physics and quantum chemistry, the boundaries of physics remain difficult to distinguish.

Physics is both significant and influential, in part because advances in its understanding have often translated into new technologies, but also because new ideas in physics often resonate with the other sciences, mathematics and philosophy.

For example, advances in the understanding of electromagnetism led directly to the development of new products which have dramatically transformed modern-day society (e.g., television, computers, and domestic appliances); advances in thermodynamics led to the development of motorized transport; and advances in mechanics inspired the development of calculus.

Physics covers a wide range of phenomena, from the smallest sub-atomic particles, to the largest galaxies. Included in this are the very most basic objects from which all other things are composed, and therefore physics is sometimes said to be the “fundamental science”.[7]

Physics aims to describe the various phenomena that occur in nature in terms of simpler phenomena. Thus, physics aims to both connect the things we see around us to root causes, and then to try to connect these causes together in the hope of finding an ultimate reason for why nature is as it is.

For example, the ancient Chinese observed that certain rocks (lodestone) were attracted to one another by some invisible force. This effect was later called magnetism, and was first rigorously studied in the 17th century.

A little earlier than the Chinese, the ancient Greeks knew of other objects such as amber, that when rubbed with fur would cause a similar invisible attraction between the two. This was also first studied rigorously in the 17th century, and came to be called electricity.

Thus, physics had come to understand two observations of nature in terms of some root cause (electricity and magnetism). However, further work in the 19th century revealed that these two forces were just two different aspects of one force – electromagnetism. This process of “unifying” forces continues today (see section Current research for more information).

Physics uses the scientific method to test the validity of a physical theory, using a methodical approach to compare the implications of the theory in question with the associated conclusions drawn from experiments and observations conducted to test it. Experiments and observations are to be collected and matched with the predictions and hypotheses made by a theory, thus aiding in the determination or the validity/invalidity of the theory.

Theories which are very well supported by data and have never failed any competent empirical test are often called scientific laws, or natural laws. Of course, all theories, including those called scientific laws, can always be replaced by more accurate, generalized statements if a disagreement of theory with observed data is ever found [8]. ((Source: Wikipedia))

Reading ‘The Edgeless University’ and ‘HE in a Web 2.0 World’ reports

I have been asked to present the recent Higher Education in a Web 2.0 World report to the University’s next Teaching and Learning Committee. The report came out shortly before, and is referenced by, The Edgeless University: Why Higher Education Must Embrace Technology, which was launched by David Lammy MP at the end of June. I’ve been going through both reports, pulling out significant quotes and annotating them. Here are my notes. They are neither a comprehensive nor a formal review of the reports, nor a statement from the University of Lincoln; just personal reflections which I will take to my colleagues for discussion. I don’t whole-heartedly agree with every statement made in the two reports, or even with those quoted here, but I do take government-promoted reports, and the funding that accompanies them, seriously.

I include quotes from David Lammy’s speech, as it can be read as a formal statement from government on the recommendations of the ‘Edgeless’ report and a commentary on future funding priorities.

If you’ve not yet read the reports, my notes might provide a useful summary, albeit from the bias of someone charged with supporting the use of technology to enhance teaching and learning.  I am also an advocate of Open Access and Open Education on which the Edgeless report has a lot to say. Methodologically, the writing of both reports combined both current literature reviews and interviews across the sector and as I write, they are the most current documents of their kind that I am aware of.

If you have commented on either of these reports on your own blog or have something to say about the excerpts I include here, please do leave a comment and let me (and others) know.  Thanks.


Ten reasons why you should pay attention to the geeks because actually they have something quite important to say which us non-geeky people should be listening to

Re-broadcasting Mike Ellis’ recent presentation

Books, LibraryThing and me

We recently planned, designed and built our own, small, house. Once the builders had gone, I finished off the interior, laying the finished floors, decorating and tiling. I also put up book shelves that went up one side of the back door, over it and then down the other side, so that when you walk through the door, you walk under our books. There’s about 500 or so, I guess.

Why am I writing about this? Well, those books, accumulated by my wife and me over many years, look good against the wall there, and when people visit, they often stand looking across the shelves at the range of books we’ve bought and sometimes even read. Personally, I buy books on a whim and there are many that I’ve yet to read. I rarely read books cover-to-cover and rarely read fiction. When you look across the shelves, tilting your head to read the spines, you’ll come across the fiction I started reading in my late teens and the books on Buddhism, philosophy and Japanese that I bought as a student; you’ll see the books I bought while a post-graduate student, studying film archiving. Then there are all the books I bought in-between, while teaching myself about computers, not to mention all my wife’s books. I don’t know about you, but those books say a lot about me and about the last 20 years of my life. Many of them reflect my interests and ambitions before I sent my first email, before I first used the World Wide Web and before I knew 70% of the people I now call ‘friends’ (such an abused word these days).

The thing is, I haven’t opened many of those books since I first looked at them. I’ll never read them again and most of the people I know at work and online, will never scan my bookshelves. We can chat over Twitter, subscribe to each other’s FriendFeed and read each other’s blogs, but I’ll tell you now, that’s not even half of me.

I signed up to LibraryThing a couple of years ago, added a couple of Cormac McCarthy books and then left the account alone. I thought my wife would enjoy it more than me. She reads books cover-to-cover all the time. I don’t. I use books to learn about things that I don’t know. With the exception of a few authors, I rarely read books for relaxation. I relax in my own time online and have done for years. It suits my wandering mind.

It occurred to me a few days ago that LibraryThing could enrich my digital identity in a way that no other social networking site could. By importing my book collection into LibraryThing, I could go back over my book collection, dust off the covers and gradually enrich my online identity, and at the same time people would get to know me better, if they cared to look. Never mind Twitter, where I’ve read that we’ll get to know one-another through a glimpse of the small details of our lives. I don’t buy it. You’ll learn more about me by perusing my LibraryThing collection, I can assure you.

As I write this, I’ve added 110 books that I own, about a third of the total, I reckon.

This also got me thinking about e-portfolios and how the books I accumulated as a student reveal a lot more than my CV about my depth of study and research interests during and shortly after those periods of learning. To build a book collection over time is an achievement in itself. We tend to think of a portfolio as an accumulated, curated presentation of work that we have undertaken. It’s a product – something to show; but until recently we couldn’t include a book collection in our portfolio.

As a student, I spent more time reading than writing and I read more widely than my term papers and exams reflected. Despite a good degree, my undergraduate essays aren’t worth your time today, even if I could still find them, but I still have the books I collected and they are worth your time. If I was employing me, I’d take an interest in my book collection. It’s a background check I can recommend.

Of course, I could have added books I don’t own to give a false impression of myself (I haven’t). And I could exclude books from LibraryThing that I do own, because they might give a false impression of who I think I am (I have – the exceptions are trivial). But this is my point about Identity and LibraryThing: I’ve got a collection of physical books that I’m now curating online to develop a ‘portfolio’ that better represents me.

I guess that’s what a lot of LibraryThing members do. Have you?

Commons based peer-production: One minute of Wikipedia edits

The technical conditions of communication and information processing are enabling the emergence of new social and economic practices of information and knowledge production. ((The Wealth of Networks: Direct link))

You may have read Yochai Benkler’s book, The Wealth of Networks, where he discusses Wikipedia as an example of commons-based peer-production. Did you know that you can see this relatively new model of knowledge and economic production live, in real-time? The video below is just one minute of Wikipedia edits recorded from the live changes on the irc.wikimedia.org #en.wikipedia channel. Using the IRC channel, you can watch Wikipedia being created as it happens, which means you can see the incremental production of collective knowledge as it happens. I recommend full-screen HD to see the detail as it passes up your screen. There are different channels for the different language versions. I chose the English version.
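If you would rather watch the channel from a script than from an IRC client, something like the following Python sketch should do it. Treat the details as my assumptions rather than documented behaviour: port 6667, an anonymous connection with a made-up nickname, and edit lines decorated with mIRC colour codes that need stripping. The parsing function is the part that can be checked offline; watch_edits needs a network connection and a server that is still accepting visitors.

```python
import re
import socket
import time

# mIRC colour codes (\x03 plus optional fg[,bg] digits) and other formatting bytes.
FORMATTING = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?|[\x02\x0f\x16\x1d\x1f]")


def parse_privmsg(line):
    """Return (channel, plain_text) for an IRC PRIVMSG line, or None otherwise."""
    match = re.match(r"^:\S+ PRIVMSG (\S+) :(.*)$", line.rstrip("\r\n"))
    if match is None:
        return None
    channel, text = match.groups()
    return channel, FORMATTING.sub("", text)


def watch_edits(seconds=60, host="irc.wikimedia.org", port=6667,
                channel="#en.wikipedia", nick="rc-reader-demo"):
    """Print edit lines for roughly `seconds` seconds. Port and nick are assumptions."""
    deadline = time.monotonic() + seconds
    with socket.create_connection((host, port), timeout=30) as sock:
        sock.sendall(f"NICK {nick}\r\nUSER {nick} 0 * :{nick}\r\n".encode())
        buffer = b""
        while time.monotonic() < deadline:
            data = sock.recv(4096)
            if not data:
                break
            buffer += data
            *lines, buffer = buffer.split(b"\r\n")
            for raw in lines:
                line = raw.decode("utf-8", errors="replace")
                fields = line.split(" ")
                if line.startswith("PING"):
                    # Answer keep-alives so the server does not drop us.
                    sock.sendall(line.replace("PING", "PONG", 1).encode() + b"\r\n")
                elif len(fields) > 1 and fields[1] == "001":
                    # Numeric 001 means registration succeeded, so join now.
                    sock.sendall(f"JOIN {channel}\r\n".encode())
                else:
                    parsed = parse_privmsg(line)
                    if parsed:
                        print(parsed[1])


# Offline check with a made-up line in the style of the channel.
sample = ":bot!bot@host PRIVMSG #en.wikipedia :\x0314[[\x0307Physics\x0314]]\x03 (+12) fixed typo"
print(parse_privmsg(sample))
```

Calling watch_edits(60) would give you the same minute of edits as the video, as plain text scrolling up a terminal.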

The Wikimedia site provides detailed statistics about the use of their sites, although the English Wikipedia statistics stop at October 2006 🙁 Perhaps there’s just too much activity on that site for them to collect and measure?

A lot of people still have an aversion to Wikipedia, but I don’t think they get it. Wikipedia is completely open to anyone to contribute. If you don’t think it’s good enough, ((See the famous Nature article which compared Wikipedia to Encyclopedia Britannica [PDF])) isn’t it your (moral?) responsibility to correct and improve it? Like it or not, as a single source, it has by far the widest reach of any web-based learning resource and although I don’t have the time to substantiate this, I bet that after Google, it’s the second online resource that students visit when beginning their research. ((Via Twitter, AJCann just pointed me to some research he’d done which shows that 100% of his student cohort use Wikipedia)) If you challenge what’s happening on Wikipedia, you’re fighting a losing battle. Stop complaining and start contributing!

Personally, I watch the Wikipedia edits rolling up my screen, seeing contributions as they happen from individuals I’ll never know, and am filled with optimism. Each edit is underwritten by a Creative Commons license which protects and preserves this body of knowledge in perpetuity. If there were world heritage sites on the Internet, Wikipedia would surely be the first to be recognised as such.

PubSubHubbub: Realtime RSS and Atom Feeds

It’s made Dave Winer happy, which is no easy task, so I think PubSubHubbub is worth mentioning here. If it’s working as it should, this post should appear in my Google Reader almost immediately after I’ve published it. That’s because PubSubHubbub is “a simple, open, server-to-server web-hook-based pubsub (publish/subscribe) protocol as an extension to Atom [and RSS].” My blog feed is managed by FeedBurner, which has already implemented the new protocol, as have Google Reader and FriendFeed. They should therefore ‘talk’ to each other in realtime. Watch the video and you’ll see how it works. It’s pretty straightforward. It just takes a company the size of Google to push it through to adoption. The engineers say they were using it like Instant Messaging the night before the demo, which says something about how responsive this is. Technically, it should be another challenge to Twitter in that it allows for a distributed method of near realtime communication. I’d like to see that. I feel like an idiot communicating within the confines of Twitter, sometimes.
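As I understand the draft spec, the publisher’s side of this is tiny: when the feed changes, it sends the hub a form-encoded POST with hub.mode=publish and hub.url set to the feed address, and the hub then fetches the feed and pushes the new entries to subscribers. Here is a hedged Python sketch that builds such a ping without sending it. The hub and feed URLs are placeholders and the function name is mine.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url, feed_url):
    """Build (but do not send) a 'new content' notification for a hub."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder URLs; a real publisher would use its own hub and feed.
ping = build_publish_ping("http://hub.example.com/", "http://example.com/feed/")
print(ping.get_method(), ping.full_url)
print(ping.data.decode("ascii"))

# urllib.request.urlopen(ping) would actually send the notification.
```

The point of the design is that the ping carries no content, only the address of the feed that changed, so the publisher’s job stays trivial and the hub does the fan-out.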