One of the lasting outcomes of our Total Recal ‘rapid innovation’ project in 2010 was that Alex Bilbie wrote the first (and only) OAuth 2.0 server for the CodeIgniter PHP development framework that we use. Since then, he’s been refining it and, with every new project, we’ve been using it as part of our API-driven approach to development. As far as we know, the OAuth 2.0 specification, which should be finalised at a forthcoming IETF meeting, is not yet being used by any other university in the UK. There are a few examples of OAuth 1.0 revision A in use, but OAuth 2.0 is a major revision currently in its 23rd draft.
As a result of his work, Alex was invited to talk about OAuth 2.0 at Eduserv’s Federated Access Management conference last year.
Nick Jackson gave the same presentation at the Dev8D conference a couple of weeks ago.
Since Total Recal, we’ve used OAuth 2.0 for Jerome, data.lincoln.ac.uk, Zendesk, Get Satisfaction, and more recently Orbital and now ON Course. We’re at the stage where our ‘single sign on’ domain https://sso.lincoln.ac.uk is the gateway to our OAuth 2.0 implementation and it will soon be running on two servers for redundancy. In short, due to various JISC projects helping pave the way, it has been formally adopted by central ICT Services, and staff and students are gradually being given control over what services their identity is bound to and what permissions those services have.
The work Nick is doing on the Orbital project is extending Alex’s OAuth 2.0 server to include some of the optional parts of the specification which we’ve not been using at Lincoln, such as refresh tokens and using HTTP Authentication with the client credentials flow. This means that the server is able to drop straight in to a wider range of projects and services.
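As a rough illustration of those two optional pieces, here is a minimal sketch of how a client might prepare a client credentials request (authenticating with HTTP Basic) and a refresh token request. The endpoint URL, client ID and secret are hypothetical placeholders, not details of the Lincoln server, and the requests are only built, never sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values for illustration only; not the real Lincoln endpoint.
TOKEN_URL = "https://sso.example.ac.uk/oauth/token"
CLIENT_ID = "orbital-demo"
CLIENT_SECRET = "not-a-real-secret"


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Build an HTTP Basic Authorization header value from client credentials."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def token_request(params: dict) -> Request:
    """Prepare (but do not send) a form-encoded POST to the token endpoint."""
    return Request(
        TOKEN_URL,
        data=urlencode(params).encode("ascii"),
        headers={
            "Authorization": basic_auth_header(CLIENT_ID, CLIENT_SECRET),
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def client_credentials_request() -> Request:
    """The client asks for a token on its own behalf, with no user involved."""
    return token_request({"grant_type": "client_credentials"})


def refresh_request(refresh_token: str) -> Request:
    """Exchange a refresh token for a new access token."""
    return token_request(
        {"grant_type": "refresh_token", "refresh_token": refresh_token}
    )
```

Passing either prepared request to `urllib.request.urlopen` would send it; the response format depends on the server and is not shown here.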
Recently, JISC published a call for project proposals around Access and Identity Management (AIM), which I am starting to write a bid for. Appendix E1 states:
JISC is particularly interested in seeing innovative and new uses for OAuth. Bids should show how this technology brings benefits to the community and can help address institutional requirements within research, teaching and learning, work based learning, administration and Business Community Engagement.
In Total Recal, we released version 1 of the server code but have learned a lot since that project through integrating OAuth with other services. Version 2 of our OAuth server is more representative of our current implementation and fully implements the latest draft (23) of the specification.
However, this is what access and identity management currently looks like:
At the moment, the most widespread use of the OAuth server is Zendesk, our ICT and Estates online support service. Projects such as Jerome, Orbital, and ON Course, as well as three 3rd year Computer Science student dissertation projects are using it, too. The plan is to use OAuth alongside Microsoft’s Unified Access Gateway (UAG), which can talk SAML to OAuth via the OAuth SAML 2.0 specification. Here’s what we intend to do:
The primary driver for this is the ‘student experience’ and it cuts three ways:
Richer sharing of data between applications: A student or lecturer should be able to identify themselves to multiple applications and approve access to the sharing of personal data between those applications.
A consistent user experience: What we’re aiming for at first is not strictly ‘single sign on’, but rather ‘consistent sign on’, where the user is presented with a consistent UX when signing into disparate applications.
Rapid deployment: New applications that we develop or purchase should be easier to implement, plugging into either OAuth or the UAG and immediately benefiting from 1) and 2).
Following a recent meeting between ICT and the Library, we agreed to take the following steps:
All library (and ICT) applications that we operate internally must have Active Directory sign-in instead of local databases. Almost all of our applications achieve this already. This is the first step towards step (3).
All web-based applications must offer a consistent looking sign-in screen based on the sso.lincoln.ac.uk design (which uses the Common Web Design). This is the second step towards (3).
All systems must implement web-based single sign on via OAuth, SAML or ADFS and they will be sent to either UAG or the OAuth/SAML server.
The library are going to investigate to what extent we can do (2) with their applications such as Horizon and EPrints, and from then on, systems that are purchased or updated must do (3). It also makes sense to look at EPrints and WordPress in the short-term as applications that can use OAuth.
Two of the outputs we’ll propose to JISC are a case study of this work, as well as further development of the open source server Alex and Nick have been developing including an implementation of the OAuth SAML specification that we’ll share. Like our related work on staff profiles, the need to get access and identity right is becoming increasingly apparent as staff and students become accustomed to the way access and identity works elsewhere on the web. For Lincoln, a combination of OAuth and UAG is the preferred route to achieving consistent sign on across all applications, bridging both the internally facing business applications managed by ICT (e.g. Sharepoint, Exchange, Blackboard) and the more outward facing academic and social applications such as those developed and run by the Library and the Centre for Educational Research and Development.
Just a heads up to say that we’ll be advertising for a Web Developer to work on Orbital, our JISC-funded ‘Managing Research Data’ project. The post, starting in March/April, will be a 12 month, full-time, grade 5 (c.£21K) position.
The Web Developer (‘you’) will be working in the Centre for Educational Research and Development, alongside Nick Jackson, Lead Developer on Orbital, and also benefit from being in a team that includes staff in central ICT services and the Library. Orbital builds on and extends previous work we’ve been doing over the last couple of years, so if you’re interested, you should read through our projects pages.
If we were to summarise our technologies and interests I guess they would be #agile, #opensource, #opendata #LAMP, #php, #codeigniter, #mongoDB, #OAuth, #APIs, #HTML5, #CSS3, #github and moving towards #RDF and #LinkedData.
Just seeing these hashtags listed together should cause your heart to beat with excitement 🙂
When we advertise in January, you’ll see that the job spec is actually a pretty standard affair. What I want to emphasise here is how interesting and fun the job will be.
The key section in the Job Description is what you’d be working on with Nick:
Development and implementation of a set of web services, which re-use and develop our previous, JISC-funded work as well as other initiatives (e.g. SWORD and DataCite DOIs).
Documented source code will be made available under an open source license by the end of the project.
Development and implementation of mechanisms for managing and transferring data, including the use of MongoDB, OAuth, read/write RESTful APIs, SWORD2 interoperability, and integration with the administrative functions of EPrints.
That actually summarises a lot of work.
I’m managing the project and try to run things with as little hierarchy as possible within a university environment. You’ll always know the project priorities and will be trusted to self-organise and deliver on time, working to two-week iterations and, roughly, monthly releases. I regularly reflect on how we work and our overall working environment. For Orbital, I favour the Crystal Clear agile methodology, as does Nick. You’ll be encouraged to reflect on this with us, too.
We work hard, and not always 9-5pm, but we work at a pace that is sustainable over a long period of time. We take our work seriously but, in the spirit of hacking, are always looking for ways to have fun, too. We recognise that we’re fortunate to be working in a diverse and intellectually stimulating academic environment, but are user/product focused at the end of the day. You’ll be working directly with our users, who are Researchers in the School of Engineering and Siemens, and staff in the Library and ICT. You’ll need to be showing them refreshed, working software every couple of weeks and iteratively improving Orbital, based on their feedback and requirements. There may also be times when you’ll be asked to talk publicly about your work and you’ll be encouraged to blog about it every so often, too. I expect the project to produce one or two conference/journal papers, and you’ll be named as a contributor and can take as active a role in that as you like.
I hope this sounds like an interesting job. At £21K, I recognise that it will probably attract younger developers looking to gain experience, though of course, we welcome applications from anyone whatever your age. By the time the post starts, we’ll have set up a decent dev/staging/production environment, hosted in the cloud, and relying on GitHub and Jenkins to keep things versioned, integrated and tested. Nick will have been developing Orbital for a couple of months or more and laid the groundwork for someone to start coding quickly in a supportive environment.
If you’re thinking of applying and don’t live in Lincoln, you’ll be pleased to know that it’s a decent small city, and a relatively cheap place to live. The campus is modern and sits by a Marina in the middle of the city. You can walk to work. I love the place. Oh, and you can choose your own hardware for development, within reason. Most of us use Macs, but whatever suits you. I’ll ask the successful candidate what they prefer when we offer them the job.
If, after reading around the project website, you’ve got any questions about the post, please do get in touch. Thanks.
* Wondering what the hell ‘web scale’ means? Something like this.
In this chapter, I reflect on Wikileaks and its use of technology to achieve freedom in capitalist society. Wikileaks represents an avant-garde form of media (i.e. networked, cryptographic), with traditional democratic values: opposing power and seeking the truth. At times, http://wikileaks.org appears broken and half abandoned and at other times, it is clearly operating beyond the level of government efficiency and military intelligence. It has received both high acclaim and severe criticism from human rights organisations, the mainstream media and governments. It is a really existing threat to traditional forms of power and control yet, I suggest, it is fundamentally restrained by liberal ideology of freedom and democracy and the protocological limits of cybernetic capitalism.
I’ve written before about how I used EPrints as a back end for WordPress, which was a front end for some OERs which are aimed at anyone wanting to learn how to sketch. I didn’t really know where I was going with it, but it worked out OK. I’ve also written about how WordPress can be used for scholarly publishing with the addition of a few plugins. In that post, I showed how I deposited my MA Dissertation into EPrints via RSS from WordPress. I’m going to take a similar approach with the OERs we’ve created for the ChemistryFM project, using the repository as canonical storage and WordPress as a front end for the course. I think that for these reasons, I was asked to provide a brief ‘position paper’ for next week’s JISC CETIS event on repositories and the open web. ((The distinction between the open web and the social web isn’t very clear on the CETIS event page. I think that the open web is not necessarily social and that the social web is not necessarily open. For me, the open web refers to a distributed web built on open source and open standards like HTML, RSS, RDF, OAuth, OpenID. Although the two are converging, Twitter for example is not as good an example as Status.net in terms of the open web, but a better example of the social web in terms of its uptake.))
My position is pretty straightforward really. I don’t think it’s worth developing social features for repositories when there is already an abundance of social software available. It’s a waste of time and effort and the repository scene will never be able to trump the features that the social web scene offers and that people increasingly expect to use. The social web scene is largely market driven (people working in profit making companies develop much of the social web software) and without constantly innovating, businesses fail. Repositories, on the whole, are not developed for profit and do not need to innovate for the sake of something new that will drive revenue. That is a good position to be in. Why change it? When repositories start competing for features with social web software, it is the beginning of the end for them.
EPrints offers versioned storage for the preservation of digital objects and a rich amount of data in a number of formats can be harvested and exported from each EPrint. The significance of the software is the exposure of its data to Google, as you will see from looking at the web analytics for any repository.
In thinking about how to join EPrints to the social web, I’ve toyed with the idea of a socialrepo, where WordPress harvests one or more feeds from the repository. With a little design work, WordPress could be the de facto front end for the repository providing all the social features of a mature blogging platform.
We’ve also commissioned a couple of plugins for EPrints that extend the reach both to and from EPrints. The first is a simple widget that can be placed on any web page and provides a way for a member of staff to upload a paper to their EPrints workspace. The second is an XML-RPC plugin that allows you to post a summary of your EPrint to your blog at the end of the deposit process so that the item can be advertised in a place more meaningful to you than an institutional repository and discussed alongside all your other academic blogging.
As I’ve shown with my own dissertation, EPrints can consume RSS feeds and if we want to add social web compatibility to EPrints, why not focus on improving the ingest process so that data can be harvested from the feed to populate the cataloguing fields? And while we’re at it, recall that the social web is rich in multimedia. EPrints could be much improved in how it ingests multimedia and in the batch editing functionality that is essential when dealing with hundreds of images, for example. Much could be done on the inside of EPrints, but on the outside, EPrints is an excellent example of the open web but a poor example of the social web. But let’s not beat ourselves up about it. The social web thrives on the technologies of the open web. Give it what it needs to thrive and make it easier for users to feed the beast.
I’m at JISC’s #dev8D conference. There’s no end of developer challenges but I’m not a developer. Still, here’s an idea that maybe someone will pick up and run with:
The use of eBook readers is on the rise. Anyone with an iPhone or Android phone, as well as anyone with a Kindle or Sony Reader, has an eBook reader.
Institutional Repositories provide scholarly articles in PDF format, which eBook readers don’t handle very well at all, especially the phone versions.
Why not provide a PDF-to-ePub conversion facility in your repository? EPrints currently offers Word-to-PDF conversion during the deposit process. Why not Word-to-ePub format, too?
Why not provide an ePub file as an alternative to the PDF download? ePub is a free, open, standards-based (XHTML/CSS) file format for eBook Readers. There are many advantages for the reader in having an ePub version rather than a PDF version when using an eBook reader, e.g. better page navigation, search, bookmarks and variable font sizing.
There are PDF-to-ePub converters on the web, so technically it’s possible. They are a bit hit and miss, but so are the Word-to-PDF converters.
Anyone interested? I’d be keen to help if required.
In my previous job as Audiovisual Archivist, I spent a lot of time examining various metadata standards in detail; hours spent poring over PBCore, METS, MODS, MIX, EXIF and IPTC/XMP, because we were designing a content model for an in-house Digital Asset Management system. I thought I had put it all behind me yet here I am staring at Phil Barker’s informative post about ‘metadata and resource description’ and it’s all coming back to me… Arrghhh 🙂
Workpackage six of the Chemistry.fm project aims to:
Plan the storage, delivery and marketing of the course.
Choose a metadata standard
Evaluate third-party hosting such as Flickr, Slideshare and YouTube as well as JORUM and the IR.
Ah, if only life were as simple as a series of bullet points!
As I was creating the project poster yesterday, I was reminded about the various ways that our project OERs could be ‘broadcast’. Although collaboration with our community radio station, SirenFM, is core to the approach of our project, we all know that there are many ways for anyone to be a broadcaster on the web and part of the fun of this project, for me, is being able to explore the different ways that educational content can be pulled and pushed between subscribing students and members of the public.
My plan at the moment is to use our Institutional Repository as the ‘canonical reference’ for the OERs. During our JISC-funded LIROLEM project, we developed EPrints to better accommodate multimedia resources and it makes sense to use a versioned digital archive that supports embedded media enriched by copious amounts of metadata. (I know it’s a requirement to use JORUM, too, but at the first Programme Meeting, it became clear that JORUM can be used simply as a directory where we can register URIs of existing OERs, so that’s what I’ll be doing).
Anyway, Archivists, have you ever feasted your eyes on the source code of an EPrint? Of course you have. Here’s a reminder.
Now, we could choose to lump all the OERs that we create into one single EPrint, but that doesn’t give us much flexibility and remember that EPrints is serving as the canonical reference for the OERs, not necessarily the final presentation layer that people will actually be using to browse, download and use the resources from. So if we were to group the OERs into sets of items that constituted an EPrint and then relate those EPrints to each other, using the “DC.isPartOf” property, from the point of view of metadata, we’ll be creating a consistent whole, but giving ourselves some flexibility in how we ‘broadcast’ the content of the course.
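To make the grouping idea concrete, here is a minimal sketch of related records that each point at a parent course URI, rendered as meta tags. The URIs, titles and the exact tag layout are assumptions for illustration; the markup a real EPrints install emits may differ.

```python
from html import escape

# Hypothetical identifiers; real URIs would come from the repository.
COURSE_URI = "http://repository.example.ac.uk/id/course/chemistry-fm"

RECORDS = [
    {"uri": "http://repository.example.ac.uk/1001/", "title": "Node one",
     "is_part_of": COURSE_URI},
    {"uri": "http://repository.example.ac.uk/1002/", "title": "Node two",
     "is_part_of": COURSE_URI},
    {"uri": "http://repository.example.ac.uk/2001/", "title": "Unrelated item",
     "is_part_of": None},
]


def meta_tags(record: dict) -> list:
    """Render a record's title and, if present, its parent as meta tags."""
    tags = [f'<meta name="DC.title" content="{escape(record["title"])}" />']
    if record["is_part_of"]:
        tags.append(
            f'<meta name="DC.isPartOf" content="{escape(record["is_part_of"])}" />'
        )
    return tags


def parts_of(parent_uri: str, records: list) -> list:
    """List the URIs of every record that declares the given parent."""
    return [r["uri"] for r in records if r["is_part_of"] == parent_uri]
```

Calling `parts_of(COURSE_URI, RECORDS)` gathers the items that make up the course, which is the 'consistent whole' that any presentation layer could then rebuild.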
If we consider the course MindMap that we knocked up a while back, we might decide to create a single EPrint for each of the five major ‘nodes’ of the course. Doing this would then give us an RSS 1.0 (RDF), RSS 2.0 and Atom feed for the course where each node was an item.
Before I move on with this, look at the export formats that EPrints offers for a query. Imagine that the course could be exported in each of these ways:
The zip export allows you to download the entire query and all its resources at once. The HTML citation format allows you to produce some HTML you could copy and paste into any web page. It could just as easily be dropped into Blackboard as it could on any other (and anybody’s) web page. BibTeX would allow you to browse the course via your preferred reference management software and JSON… I still don’t completely get it, but it’s pretty fancy, I know that much.
Anyway, if each of the mindmap nodes is an ‘item’ in the RSS feed, then perhaps we can use that to feed a WordPress site, using the FeedWordPress plugin? Nope. It doesn’t seem to work. FeedWordPress recognises the feed but doesn’t import anything. Testing it with another feed based on keywords does work, but the information included in the feed is sparse, so that’s no good. By the way, the EPrints RSS 2.0 feed does include the xmlns:media=”http://search.yahoo.com/mrss” namespace and marks up the preview thumbnails accordingly:
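A minimal sketch of how a consumer might pull such thumbnails out of a feed, using Python's standard library. The sample feed below is hypothetical and is not real EPrints output; the namespace string is the one quoted above, and the `media:thumbnail` element with a `url` attribute is an assumption about how the previews are marked up.

```python
import xml.etree.ElementTree as ET

# Namespace exactly as quoted in the post.
MEDIA_NS = "http://search.yahoo.com/mrss"

# Hypothetical feed for illustration; not copied from a real repository.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss">
  <channel>
    <title>Hypothetical course feed</title>
    <item>
      <title>Node one</title>
      <link>http://repository.example.ac.uk/1001/</link>
      <media:thumbnail url="http://repository.example.ac.uk/1001/preview.png" />
    </item>
    <item>
      <title>Node two</title>
      <link>http://repository.example.ac.uk/1002/</link>
    </item>
  </channel>
</rss>
"""


def thumbnails(feed_xml: str) -> dict:
    """Map each item title to its thumbnail URL, or None if it has none."""
    root = ET.fromstring(feed_xml)
    result = {}
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        thumb = item.find(f"{{{MEDIA_NS}}}thumbnail")
        result[title] = thumb.get("url") if thumb is not None else None
    return result
```

Items without a thumbnail come back as `None`, which makes it easy to spot how sparse a given feed is.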
(Another way to tackle this might be using our newly developed ‘EPrints2Blog’ plugin, which allows a depositor to post information about their new EPrint to a blog of their choice (using XML-RPC). As we deposit the course EPrints, each could be posted to a WordPress site. The resulting feed from the WordPress site does include some embedded media, but it’s still a bit of a hack. No, scrap this idea).
Right, how about this…?
Using EPrints as the canonical source for each of the files for the course, we could create a WordPress site with the addition of the Dublin Core and OAI-ORE plugins for WordPress.
For each WordPress post, this gives us the following metadata:
This is more like it. Click on the oai-ore link and look at the source code. It’s too big to display here, but it does what you’d expect and produces an OAI-ORE 1.0 compliant Atom/XML file. Contained within the file is a ‘resource map’ of all the WordPress posts and pages marked up with Dublin Core and FOAF terms. Thinking about how the course site might be represented in this way, it makes sense to atomise the course even further so that each of the sub-nodes of the Mind Map is a WordPress post. Using the current course structure, that would result in about 20 separate posts to represent the course. Each post would contain one or more resources such as a PDF, video, audio, slides, etc. Is it worth atomising it even further and creating a post for each of these resources, too, I wonder? Quite possibly.
Unfortunately, the resource map does not include media that are included in each post or page – apparently it’s on the developer’s list of things to do. Maybe we could use some of the project budget to ask Alex, who’s working on the JISCPress project with me, to extend the plugin in this way…
Finally, there’s also a MediaRSS plugin for WordPress, which could enhance the RSS feeds to include all the media used in the course. Here’s an example that’s including images by default. I’ve already written about the various feeds that are available for WordPress; with some careful categorisation and tagging, media-rich feeds would be available for different points (‘nodes’) of entry into the course.
Once we are at this point, I guess we’re ready to think about broadcasting the course via Boxee and DeliTV (no time to dig into that now. Sorry!)
p.s. you’ve probably noticed that I’m a bit weak on the EPrints and OAI-ORE stuff, to say the least. Please do pick me up on where I’m going wrong with this. Thanks 🙂
As is often the case, I struggle at first glance to see the full implications of a new development in technology, which is why I so often rely on others to kick me up the arse before I get it. ((I am not ashamed to admit that I’m finding that my career is increasingly influenced by following the observations of Tony Hirst. Some people are so-called ‘thought-leaders’. I am not one of them and that is fine by me. I was talking to Richard Davis about this recently and, in mutual agreement, he quoted Mario Vargas Llosa, who wrote: “There are men whose only mission is to serve as intermediaries to others; one crosses them like bridges, and one goes further.” That’ll do me.))
Where I ramble about WordPress as a learning tool for the web…
I first read about web hooks while looking at WordPress, XMPP and FriendFeed’s SUP and then again when writing about PubSubHubbub. Since then, Dave Winer’s RSSCloud has come along, too, so there’s now plenty of healthy competition in the world of the real-time web and WordPress is, predictably, a mainstream testing ground for all of it. Before I go on to clarify my understanding of the implications of web hooks+WordPress, I should note that my main interest here is not web hooks nor specifically the real-time web, which is interesting but, realistically, not something I’m going to pursue with fervour. My main interest is that WordPress is an interesting and opportunistic technology platform for users, administrators and developers alike. Whoever you are, if you want to understand how the web works and how innovations become mainstream, WordPress provides a decent space for exercising that interest. I find it increasingly irritating to explain WordPress in terms of ‘blogging’. I’ve very little interest in WordPress as a blog. I tend to treat WordPress as I did Linux ten years ago. Learning about GNU/Linux is a fascinating, addictive and engaging way to learn about Operating Systems and the role of server technology in the world we live in. Similarly, I have found that learning about WordPress and, perhaps more significantly, the ecosystem of plugins and themes ((Note that themes are not necessarily a superficial makeover of a WordPress site. Like plugins, they have access to a rich and extensible set of functions.)) is instructive in learning about the technologies of the web. I encourage anyone with an interest to sign up to a cheap shared host such as Dreamhost, and use their one-click WordPress offering to set up a playground for learning about the web. The cost of a domain name and self-hosting WordPress need not exceed $9 or £7/month.
((I am thinking of taking the idea of WordPress as a window on web technology further and am tentatively planning on designing such a course with online journalism lecturer, Bernie Russell. It would be a boot camp for professional journalists wanting (needing…?) to understand the web as a public space and we would start with and keep returning to WordPress as a mainstream expression of various web technologies and standards.))
… and back to web hooks
Within about 15 minutes of Tony tweeting about HookPress, I had watched the video, installed the plugin and sent a realtime tweet using web hooks from WordPress.
It’s pretty easy to get to grips with and if a repository of web hook scripts develops, even the non-programmers like me could make greater use of what web hooks offer.
Web hooks are user-defined callbacks over HTTP. They’re intended to, in a sense, “jailbreak” our web applications to become more extensible, customizable, and ultimately more useful. Conceptually, web applications only have a request-based “input” mechanism: web APIs. They lack an event-based output mechanism, and this is the role of web hooks. People talk about Unix pipes for the web, but they forget: pipes are based on standard input and standard output. Feeds are not a sufficient form of output for this, which is partly why Yahoo Pipes was not the game changer some people expected. Instead, we need adoption of a simple, real-time, event-driven mechanism, and web hooks seem to be the answer. Web hooks are bringing a new level of event-based programming to the web.
I think the use of the term ‘jailbreak’ is useful in understanding what HookPress brings to the WordPress ecosystem. WordPress is an application written in PHP and if you wish to develop a plugin or theme for WordPress you are required to use the PHP programming language. No bad thing but the HookPress plugin ‘jailbreaks’ the requirement to work with WordPress in PHP by turning WordPress’ hooks (‘actions’ and ‘filters’) into web hooks.
WordPress actions and filters are basically inbuilt features that allow developers to ‘hook’ into WordPress with their plugins and themes. Here’s the official definition:
Hooks are provided by WordPress to allow your plugin to ‘hook into’ the rest of WordPress; that is, to call functions in your plugin at specific times, and thereby set your plugin in motion. There are two kinds of hooks:
Actions: Actions are the hooks that the WordPress core launches at specific points during execution, or when specific events occur. Your plugin can specify that one or more of its PHP functions are executed at these points, using the Action API.
Filters: Filters are the hooks that WordPress launches to modify text of various types before adding it to the database or sending it to the browser screen. Your plugin can specify that one or more of its PHP functions is executed to modify specific types of text at these times, using the Filter API.
In other words, what is happening is that WordPress is posting data to a URL, where lies a script, which takes that data and creates an event which notifies another application. Because the scripts can be hosted elsewhere, on large cloud platforms such as Google’s AppEngine, the burden of processing events can be passed off to somewhere else. I see now, why web hooks are likened to Unix pipes, in that the “output of each process feeds directly as input to the next one” and so on. In the case of HookPress, the output of the ‘publish_post’ hook feeds directly as input to the scriptlet and the output of that feeds directly as input to the Twitter API which outputs to the twitter client.
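The receiving end of that pipe can be sketched in a few lines. This is a minimal, assumed example and not the script from the video: it supposes the hook arrives as a form-encoded POST with hypothetical `post_title` and `post_url` fields, and it stands in for the Twitter API call with a simple `notify` function.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs


def build_status(body: bytes) -> str:
    """Turn a form-encoded hook payload into a short notification string."""
    fields = parse_qs(body.decode("utf-8"))
    # 'post_title' and 'post_url' are assumed field names, not confirmed ones.
    title = fields.get("post_title", ["(untitled)"])[0]
    url = fields.get("post_url", [""])[0]
    return f"New post: {title} {url}".strip()[:140]


def notify(status: str) -> None:
    """Stand-in for the next process in the pipe (e.g. a status update API)."""
    print(status)


class HookHandler(BaseHTTPRequestHandler):
    """Accepts the POST that the publishing application sends to the hook URL."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        notify(build_status(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()
```

To try it locally, `http.server.HTTPServer(("", 8000), HookHandler).serve_forever()` would listen on port 8000; the output of one process (the published post) becomes the input of the next (the notification).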
Besides creating notifications from WordPress actions, the other thing that HookPress does (still with me on this ‘learning journey’ ??? I’ve been reading, writing and revising this blog post for hours now…) is extend the functionality of WordPress through the use of WordPress filters. Remember that filters in WordPress modify text before sending it to the database and/or displaying it on your computer screen. The example in the video shows the web hook simply reversing the text before it is rendered on the screen. ‘This is a test’ becomes ‘tset a si sihT’.
The output of the ‘the_content’ filter has been posted to the web hook, which has reversed the order of the blog post content and returned it to WordPress, which renders the modified blog post.
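The filter case can be sketched as a function that receives the content and returns a modified copy. This is an illustrative stand-in for whatever script the video uses; `apply_filters` here is a hypothetical helper that mimics content passing through a chain of filters, not WordPress's own function.

```python
from typing import Callable, Iterable

Filter = Callable[[str], str]


def reverse_filter(content: str) -> str:
    """Return the content with its characters in reverse order."""
    return content[::-1]


def apply_filters(content: str, filters: Iterable[Filter]) -> str:
    """Pass content through each filter in turn, like a chain of pipes."""
    for f in filters:
        content = f(content)
    return content


print(apply_filters("This is a test", [reverse_filter]))  # prints: tset a si sihT
```

Because each filter takes text in and hands text back, filters written in any language can be chained as long as they honour that contract.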
Whereas the action web hooks are about providing event-driven notifications, the filter web hooks allow developers to extend the functionality of WordPress itself in PHP and other scripting languages. In both cases, web hooks ‘jailbreak’ WordPress by turning it into a single process in a series of piped processes where web hooks create, modify and distribute data.
In the presentation, there are two quotes which I found useful. One from Wikipedia which kind of summarises what HookPress is doing to WordPress:
“In computer programming, hooking is a technique used to alter or augment the behaviour of [a programme], often without having access to its source code.”
and another from Marc Prensky, which relates back to my point about using WordPress as a way to learn about web technologies in a broader sense. WordPress+HookPress is where programming for WordPress leaves the back room:
As programming becomes more important, it will leave the back room and become a key skill and attribute of our top intellectual and social classes, just as reading and writing did in the past.
It’s made Dave Winer happy, which is no easy task, so I think PubSubHubbub is worth mentioning here. If it’s working as it should, this post should appear in my Google Reader almost immediately after I’ve published it. That’s because PubSubHubbub is “a simple, open, server-to-server web-hook-based pubsub (publish/subscribe) protocol as an extension to Atom [and RSS].” My blog feed is managed by FeedBurner, which has already implemented the new protocol, as have Google Reader and FriendFeed. They should therefore ‘talk’ to each other in realtime. Watch the video and you’ll see how it works. It’s pretty straightforward. It just takes a company the size of Google to push it through to adoption. The engineers say they were using it like Instant Messaging the night before the demo, which says something about how responsive this is. Technically, it should be another challenge to Twitter in that it allows for a distributed method of near realtime communication. I’d like to see that. I feel like an idiot communicating within the confines of Twitter, sometimes.
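For the curious, the publisher's side of the protocol amounts to a small 'ping' telling the hub that a feed has changed. The sketch below only prepares that request without sending it. The hub and feed URLs are hypothetical, and the `hub.mode` and `hub.url` parameter names are given as recalled from the early PubSubHubbub drafts, so check them against the specification before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request


def publish_ping(hub_url: str, topic_url: str) -> Request:
    """Prepare (but do not send) a 'this feed has new content' ping to a hub."""
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url})
    return Request(
        hub_url,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical hub and feed addresses for illustration.
ping = publish_ping("https://hub.example.org/", "http://example.org/feed")
```

The hub then fetches the feed and pushes the new entries to each subscriber's callback URL, which is the web hook part of the design.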