Working on the web

Each month, David, Paul and I offer workshops for ‘Working on the web’, aimed at introducing staff to different aspects of Web 2.0 which might be useful in their research and teaching. Our original outline for these sessions can be seen over on the Learning Lab wiki.

A couple of things have reminded me recently that it might be useful to describe how I work on the web.

First of all, I use an up-to-date browser (Firefox or Chrome) with a few extensions. I block all advertising using AdBlock and all trackers using Ghostery, and I use a password management extension so that I never use the same password on any two websites. Chrome allows me to synchronise all my preferences, bookmarks, passwords and other bits and pieces across different computers, so my experience on my desktop, laptop or home computer is the same. When using Firefox, I have the Sync extension installed, for the same reason.

Next, in terms of my basic set up, I have four useful ‘bookmarklets’: one for j.mp, which allows me to create a short URL for the current site; another for Readability, which makes reading long articles somewhat easier; one for delicious, to bookmark or ‘favourite’ sites; and a Posterous bookmarklet that allows me to quickly take clippings from web pages and post them to my Posterous site.

My Posterous site, ‘things that stick’, is one of a few ways that I organise information on the web. I use Posterous almost exclusively for posting selected text (‘clippings’) from websites or PDF articles that make an impression on me. I use delicious for straightforward social bookmarking of a website, usually copying a piece of text from the site that best describes what it’s about. I use Google Reader to ‘Share’ whatever crops up in my feed reader that interests me. Whether I clip, bookmark or share, none of these actions is any kind of endorsement of the content; it simply means the information is, in some way, of interest to me and I might want to come back to it.

I share what is of interest to me by creating a ‘bundle’ from the RSS feeds of these three services in Google Reader. That bundle has a public web page and Atom feed. However, all the items are presented in full text, which makes it a hassle to get a quick overview of what’s been recently shared. So, I also aggregate the three sources to my own blog, ‘Elsewhere’, where anyone can get a quick summary of the information I gather each day (and you can grab an RSS feed, too). I do this using the lifestream plugin for WordPress. This also means that the links I’m collecting ultimately come back to a site that I own, where I have some kind of control over the retention of that data.
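
For anyone curious about what that kind of aggregation involves, here is a minimal sketch in Python. It is not how the lifestream plugin works internally (I haven't looked); it just assumes each service exposes a plain RSS 2.0 feed with a pubDate on every item, and the function names are my own invention.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def merge_feeds(rss_documents):
    """Combine the items of several RSS 2.0 documents, newest first."""
    entries = []
    for xml_text in rss_documents:
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            entries.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                # RSS 2.0 dates use the RFC 822 format, which this parser handles
                "published": parsedate_to_datetime(item.findtext("pubDate")),
            })
    entries.sort(key=lambda entry: entry["published"], reverse=True)
    return entries


def summary_lines(entries):
    """Reduce each entry to a one-line summary instead of the full text."""
    return [f"{e['published']:%Y-%m-%d} {e['title']} <{e['link']}>" for e in entries]
```

Passing the text of the three feeds to merge_feeds and the result to summary_lines gives one short line per clipping, bookmark or shared item, which is roughly the overview I want from ‘Elsewhere’.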

Google Reader is central to how I work on the web. I subscribe to news feeds from anywhere between 200 and 400 sites at any one time. Currently, it’s at a comfortable 230 subscriptions, which I read on my walk to and from work and occasionally during the day. I scan a couple of hundred headlines a day and click on about 10% of those headlines to read the article. This is my main method of reading the web.

I also use Google Reader to subscribe to every service I use on the web, so it’s a way of aggregating my own footprint on the web and keeping track of services I have used. The other reason for doing this is that Google Reader is searchable, so I can search over any of my activity on the web if I want to go back to something I read, created, shared or wrote.

Next, I have this work blog, which I use as a notebook more than anything else. I regularly refer back to it and search through it to remind me of the work I’ve done, ideas I’ve had and events I’ve been involved in. Whenever I have to report on my work, I refer back to this blog.

I use an Amazon ‘wishlist’ to maintain a list of books that look interesting and that I might buy in the future. It’s a shame that there’s no RSS feed from wishlists. If there were, I’d add it to my daily bookmarks and clippings on my Elsewhere blog.

I use Mendeley to organise research papers in PDF format. Currently, I have over 500 PDF files synchronised across my work desktop and laptop (about 1.3GB). I moved to Mendeley, not for its social features, but simply because it renames and organises the files nicely on my hard drive and synchronises across computers. Before using it, I was in a mess.

I visit Wikipedia more than any other single website. It’s not perfect but its imperfections merely reflect our own imperfections and it is more perfect than any other collected source of information on the web.

I use Google docs for most of my non-blog writing these days. Funding applications, conference papers and articles I’m working on all start off on Google docs and only move to Open Office if formatting requires.

I use slideshare to publish any presentations I give. I used to use Scribd until they started charging people to download content from their site. When slideshare start charging, as I suspect they will, I’ll delete my account there, too.

On the subject of deleting accounts, I stopped using Twitter at the weekend. I’ve been trying to wean myself away from Twitter for months, having moved to using it largely for sharing links and as a news aggregator, picking up links from other people. I’ve never really liked it for conversation, finding the 140 character limit, well, limiting, in a demeaning sort of way. More recently, I’d created a private list of 20 or so people out of the 400 or so that I followed, who regularly pick up on sources of information I value, and this had become the extent of my experience using Twitter as I intentionally tried to wind down my use of it. Last weekend, I felt particularly overwhelmed with work and the intrusion that it can become at home, and so I deleted my account altogether. I know from past experience that not using it, rather than deleting it, wasn’t an option for me. I’d have simply ‘done a Stephen Fry’ and returned to it before too long, sorry addict that I’d become.

I’ve been on Twitter for a couple of years and had over 1000 followers, a few of whom are now real friends, though about half looked like people simply looking for re-follows, another large percentage were people who subscribed en masse to lists of people (usually EdTech lists) and quite a few more were people I’ve never had any contact with whatsoever. I’ve also found that my ability to concentrate has severely diminished over the last couple of years, with the constant distraction of having email/SMS/Twitter present in the back of my mind. Even turning off all notifications on my phone and computers hasn’t helped. Now I just use Google Reader to follow the RSS feeds of about 10 people on Twitter. It’s a bloody relief, to be honest. Here’s to being able to concentrate a little better from now on.

As with Twitter, I stopped using Facebook at the end of last year. The web is my social network and the above tools, my personal working learning environment.

Web 2.0 and endless growth

I’ve just read The Digital Given: 10 Web 2.0 Theses (2009) and found it to be one of those rare pieces of writing that makes me feel like I’m not alone. Here is thesis 9:

Soon the Web 2.0 business model will be obsolete. It is based on the endless growth principle, pushed by the endless growth of consumerism. The business model still echoes the silly 90s dotcom model: if growth stagnates, it means the venture has failed and needs to be closed down. Seamless growth of customised advertising is the fuel of this form of capitalism, decentralized by the user-prosumer. Mental environment pollution is parallel to natural environment pollution. But our world is finished (limited). We have to start elaborating appropriate technologies for a finite world. There is no exteriority, no other worlds (second, third, fourth worlds) where we can dump the collateral effects of insane development. We know that Progress is a bloodthirsty god that extracts a heavy human sacrifice. A good end cannot justify a bad means. On the contrary, technologies are means that have to justify the end of collective freedom. No sacrifice will be tolerated: martyrs are not welcome. Neither are heroes.

We know that the advertising-based consumerist model of the web is as fragile as the pursuit of growth but what alternatives to growth are there? Over the last year, looking at energy, the economy and its impact on the environment, it’s become clear to me that the pursuit of economic growth is deeply flawed and limited. This was the subject of a conference in Leeds in June, on Steady State or no-growth economics. Yesterday, the conference organisers produced a report, Enough is Enough, which is an attempt to outline an alternative to a growth-based economy. You might remember that the new economics foundation also produced a report in January called Growth isn’t possible. The publication that first drew my attention to this subject was the government commissioned report, Prosperity Without Growth.

Thanks to Richard Hall for pointing ‘10 Web 2.0 Theses’ out to me. It was first published on nettime, which is a mailing list I can recommend.

Scholarly publishing with WordPress

Working on the JISCPress project, I’ve been thinking quite a lot about scholarly publishing on the web, and in particular with WordPress. This morning, I read a post over on the ArchivePress blog about some WordPress plugins which are useful additions for creating a scholarly blog and it got me thinking a bit more about what features WordPress would need to support scholarly publishing.

JISCPress does away with the idea that WordPress is a blogging tool, and instead uses WordPress Multi-User as a document publishing platform, where one site or ‘blog’ is a document. The way WPMU is structured means that despite serving multiple (potentially millions of) document sites, the platform remains relatively ‘lightweight’, as each document site generates just a handful of additional database tables while sharing the same administrative core as a single WordPress install. So, running 100 WordPress blogs on WPMU is nothing like the equivalent of running 100 separate WordPress blogs, in terms of either resource requirements or administration. In fact, quite soon, there will be no such thing as WPMU, as the two products are going to be merged; because they already share 90%+ of the same code, this should not be too difficult to achieve. ((Has anyone done a diff on the two code bases to measure exactly what percentage of the code is shared between WP and WPMU?))

Anyway, my point here is to discuss whether WordPress can be extended to accommodate most conventions found in scholarly publishing and, where it is lacking, to identify the development work required to meet the needs of most academics who wish to write on and publish to the web. ((Actually, I think I’ll save the discussion of its shortfalls for my next post. This one is already long enough.))

Scholarly publishing extends to a wide variety of published outputs. As a Content Management System (CMS) and technology development platform, I believe that WordPress has the potential to support any type of scholarly publishing that the web supports. It is extremely extensible, as can be seen from the 6000+ plugins that are available. However, what I’m interested in is what can be done now, by an academic wishing to publish their work through the use of WordPress acting as a CMS. What can be achieved with a few quid ((I pay $5/year for my domain name and as many sub-domains as I need. I pay $10/month for my hosting with unlimited storage and bandwidth.)) to self-host WordPress, so that a few plugins can be installed and a well structured, typical, scholarly paper can be published?

My Dissertation

For some time, I’ve been meaning to publish my MA dissertation. Back in 2002, I undertook some unique research which has not, to my knowledge, been repeated and I think there is some value in having it easily accessible on the web. I have an OpenOffice file and a PDF and, in the course of a morning, have published it under my own domain. The reason I did not publish it on the university WPMU platform is that I have been experimenting with different plugins and did not want to install plugins that were untested or that we may not support long-term. In this case, I’ve used a single WordPress installation, but ideally an individual researcher, group of researchers or research institution would run a WPMU installation which allowed multiple documents to be authored individually or collaboratively ((Like any decent CMS, WordPress supports role-based authoring and editing and maintains a revision history of edits, auto-saved once per minute. Revisions can be compared alongside each other.)) and published directly to the web as XHTML.

BuddyPress, by the way, can make the experience even more natural, not only because it is based around a community of like-minded people writing together on the same web publishing platform, but also because, with a few tweaks here and there, we can move away from the language of blogs and towards the language of documents.


BuddyPress admin bar

Profile menu

Enough of BuddyPress on WPMU for now and back to my dissertation. I set up the site in ten minutes, without using FTP or a command line, because I use a host that provides a one-click install of WordPress and WordPress allows you to search for and install plugins from its Dashboard, rather than having to use FTP. Once the site was installed, I then made some basic changes to the settings, turning on XML-RPC and AtomPub, so that, if I decided to, I could publish to the site using my word processor. ((On a scholarly WPMU installation, plugins could be pre-installed and activated, a default theme selected and settings tweaked so very little work is required by the academic author prior to writing her document.)) I didn’t use this in the end, but trust me, it works very well using recent versions of MS Word, Open Office (free) and other blogging clients such as MS Live Writer (free).
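
To give a feel for what a word processor or blogging client does once XML-RPC is switched on, here is a rough Python sketch that builds (but does not send) a MetaWeblog newPost request, one of the XML-RPC APIs that WordPress understands. The blog id, username, password and section text below are placeholders of mine, and I’m only assuming that a client would post something like this to the site’s xmlrpc.php endpoint.

```python
import xmlrpc.client


def build_new_post_request(blog_id, username, password, title, html_body, publish=False):
    """Return the XML body of a metaWeblog.newPost call without sending it."""
    # The MetaWeblog struct uses 'title' and 'description' (the post content)
    post = {"title": title, "description": html_body}
    params = (blog_id, username, password, post, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


# Placeholder credentials and content, for illustration only
request_xml = build_new_post_request(
    "1", "author", "secret", "Chapter 1", "<p>First section of the document.</p>"
)
print(request_xml)
```

To actually publish, something like xmlrpc.client.ServerProxy pointed at the endpoint would make the same call over HTTP; I’ve left that out so the sketch runs without a live site.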

So, what are the common characteristics of an academic paper? What does WordPress have to support to provide functionality that meets most scholars’ publishing requirements? I scratched my head (and asked on Twitter) and came up with the following:

  • footnotes/endnotes
  • citations
  • use of LaTeX (sciences)
  • tables
  • images
  • bibliography
  • sub-headings
  • annexes
  • appendices
  • dedication
  • abstract
  • table of contents
  • index to figures
  • introduction
  • exposition
  • conclusion

Many of these are supported in WordPress by default and don’t require any additional plugins (tables, images, sub-headings, annexes, appendices, dedication, abstract, introduction, exposition, conclusion, are all either basic literary conventions or just part of a simply structured document).

For additional support, I installed digress.it, which we have funded through the JISCPress project. This is a WordPress plugin which allows readers to comment on the paragraphs of a document, rather than at the document section level. We’re adding a lot more functionality to meet the objectives of the JISCPress project, but I chose digress.it, principally for the reason that it is designed to turn a WordPress blog into a document site. I could have used any other WordPress theme, but digress.it automatically creates a Table of Contents and allows you to re-order WordPress posts when they are read so that you don’t have to author your document in reverse or adjust the publication dates so the document sections appear in the correct order.

My dissertation published using digress.it

I added the abstract for my dissertation to the ‘about’ page, so it shows up on the front of the site. I also uploaded a PDF version so that people can download it directly. You’ll see that I also added some links to a related book and DVD, which will certainly appeal to people who are interested in my dissertation. The links pull an image and some basic metadata from Amazon, using the Amazon Machine Tags plugin. This could be used to link to the book in which your article is published and earn you money in click referrals. An alternative would be the Open Book Book Data plugin, which retrieves a book cover and metadata from Open Library, where your book may already be catalogued. If it’s not on Open Library, catalogue it!

After setting this up, I installed a few more plugins:

Dublin Core for WordPress: Automatically adds ten Dublin Core metadata elements to the document mark up.

wp-footnotes: This allows you to easily add footnotes to your document by enclosing your footnote in double parentheses. ((I am using the plugin on this blog!))

OAI-ORE Resource Map: Automatically marks up the document sections with an OAI-ORE 1.0 resource map.

Google Analyticator: Adds Google Analytics support so you can collect statistics on the readership of your document.

WP Calais Archive Tagger: Analyses your entire document and automatically keywords each section, using the Open Calais API.

Search API: WordPress comes with search built in, but there is a new search API which will eventually make its way into the WordPress core. I’ve installed the plugin to provide full-text search across the document. It can also add Google Search to your document site.

wp-super-cache: This is simple to install and will significantly speed up your document site, making it a pleasure to navigate through and read 🙂
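
Of the plugins above, the double-parentheses convention used by wp-footnotes is the easiest to illustrate. The Python sketch below is my own approximation of the idea, not the plugin’s code: it pulls each ((note)) out of the text, leaves a numbered marker behind and collects the notes for listing at the end. It will trip over a note that itself ends in a closing parenthesis.

```python
import re

# A footnote is any text wrapped in double parentheses; the leading
# whitespace is swallowed so the marker sits directly after the sentence.
FOOTNOTE = re.compile(r"\s*\(\((.+?)\)\)", re.DOTALL)


def extract_footnotes(text):
    """Replace each ((note)) with a numbered marker and collect the notes."""
    notes = []

    def replace(match):
        notes.append(match.group(1).strip())
        return f"[{len(notes)}]"

    body = FOOTNOTE.sub(replace, text)
    return body, notes


body, notes = extract_footnotes(
    "WordPress is extensible. ((Over 6000 plugins.)) It is free."
)
print(body)   # WordPress is extensible.[1] It is free.
print(notes)  # ['Over 6000 plugins.']
```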

Plugins I didn’t use

wp-latex: Although I didn’t need it for my dissertation, it’s worth noting that WordPress supports the use of \LaTeX.

Academic Citation: You need to add a line of code to your theme for this to display. It supports the concept of an article being a single blog post, rather than a ‘document site’ and displays a variety of citation formats for readers to use.

Do you know of any other plugins for a scholarly blog?

The Beauty of Feeds

The other useful thing about managing a document using WordPress and in particular, using digress.it, is that you automatically get RSS/Atom feeds for the document. I’ve already discussed these in detail. It means that I was able to read my document in my feed reader, with footnotes and images displayed correctly.

Document in Google Reader

See how nicely the formatting is preserved. \LaTeX is also rendered correctly in feed readers.

Document formatted nicely in Google Reader
Reading my dissertation in Google Reader

You’ll see that the document sections are listed in order; that is, first section on top. As I noted above, blogs list posts in reverse (most recent first), so I ran the feed through Yahoo Pipes and sorted the items in ascending order. Yahoo Pipes exports as RSS and it’s that feed that I subscribed to in Google Reader. Wouldn’t it be nice if I could import my document feed into an Institutional Repository? Wait a minute, I can! 🙂
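
The re-ordering itself is simple enough to sketch without Yahoo Pipes. The Python below is only an illustration under my own assumptions (a plain RSS 2.0 feed with a pubDate on every item, and a function name I made up): it puts the items of a single feed back into oldest-first order, so the sections read like a document.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def oldest_first(rss_text):
    """Return the RSS document with its items sorted by ascending pubDate."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    items = channel.findall("item")
    # Detach the items, sort them by date, then re-attach them in the new order
    for item in items:
        channel.remove(item)
    items.sort(key=lambda item: parsedate_to_datetime(item.findtext("pubDate")))
    channel.extend(items)
    return ET.tostring(root, encoding="unicode")
```

The channel’s own title and description stay where they are; only the item elements move to the end in date order.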

Importing an RSS feed into EPrints

Click to see the item in the repository

When importing the default feed, the HTML output is accurate but in reverse order, while the RSS output from Yahoo Pipes didn’t import into EPrints very cleanly at all. I’ll work on this. UPDATE: Forget Yahoo Pipes. WordPress feeds can be sorted with a switch added to the URL: http://example.com/feed/?orderby=post_date&order=ASC
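
If you want to build that sorted feed address in a script rather than by hand, a small Python helper like the one below would do it. The orderby and order values are the ones from the URL in the update above; the function name is just my own, and I’m assuming any existing query string on the feed address should be kept.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def ordered_feed_url(feed_url, orderby="post_date", order="ASC"):
    """Add (or overwrite) the sorting switches on a WordPress feed URL."""
    parts = urlsplit(feed_url)
    query = dict(parse_qsl(parts.query))
    query.update({"orderby": orderby, "order": order})
    return urlunsplit(parts._replace(query=urlencode(query)))


print(ordered_feed_url("http://example.com/feed/"))
# http://example.com/feed/?orderby=post_date&order=ASC
```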

So there it is. An academic paper, published to the web using a modern CMS which supports most authoring and publishing requirements. I would favour an institutional WPMU platform for academics to author directly to, publish their pre-print to the web for open access and detailed comment, and import their RSS feed into the repository. As a proof of concept, I’m quite pleased with this. We are currently developing a widget that can be embedded in a web page or WordPress sidebar and allow a member of staff to upload a document or zipped folder of documents to the Institutional Repository. I wonder if we can also support the import of a feed from the widget, too?

So, what would your requirements be? Tell me and I’ll do my best to test WordPress against them.