LaTeX support in WordPress

My recent proposal to do a workshop session on WordPress MU and BuddyPress at this year’s ALT conference was accepted on the condition of a couple of modifications. It’s been suggested that it should be run as a demonstration rather than workshop and that I offer more detail on what will be demonstrated. Fair enough. The reviewer of my proposal suggested that I might aim the session at “teachers of mathematics-intensive disciplines because of WordPress’ decent support of \textrm{\LaTeX{}} for processing mathematical formulae.”

This isn’t an area I would normally think to support (although I did write my MA dissertation in \textrm{\LaTeX{}}, using LyX – it produces beautiful typeset text, regardless of whether you use it for science-related work). Anyway, a quick search showed that indeed, WordPress has supported \textrm{\LaTeX{}}, on both wordpress.com and as a plugin for a couple of years. You can adjust the size and style of the output and enable it for comments, which, if discussing mathematical formulae with peers, could be of huge benefit.

Maxwell’s Equations

\nabla \cdot \mathbf{D} = \rho_f

\nabla \cdot \mathbf{B} = 0

\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}} {\partial t}

\nabla \times \mathbf{H} = \mathbf{J}_f + \frac{\partial \mathbf{D}} {\partial t}

Realtime WordPress updates to FriendFeed (and therefore to XMPP)

Quickly following on from my previous post about realtime XMPP updates from wordpress.com:

For self-hosted WordPress blogs, there’s the Jabber Feed plugin, which promises something similar but is going to take me a while to set up. For something quick and simple, there’s the wp-sup plugin which “implements FriendFeed’s Simple Update Protocol. Your blog posts will appear on FriendFeed near-instantly after they are published.”

And indeed, they do. Even better, you can receive FriendFeed notifications via Google Talk/XMPP/Jabber, achieving pretty much what wordpress.com are offering via their IM bot.

FriendFeed is a decent social lifestreaming and messaging platform. There’s an iPhone interface, a FaceBook app; it allows for a little more engagement than Twitter with the ability to ‘like’ content and comment extensively (with no 140 character limit). I can see it being quite useful to aggregate and discuss course work. You can have private groups, too.

I’ve been trying to ‘hack’ ((I use the term ‘hack’ loosely. ‘Botch’ might be a better way of describing my approach to code.)) the plugin to also work for comments, but it’s beyond me. I’ve submitted an issue to the plugin’s Google Code site and, if it’s possible, maybe it will be possible in a future version.

Hoorah for the realtime web!

XMPP PubSub on WordPress.com

Just some notes on how to get XMPP notifications from any wordpress.com blog. It’s an experimental service so might not work tomorrow 😉

I use the Pidgin IM client. You need an account on wordpress.com.

  1. Add a new account to your IM client.
  2. The username and password are your wordpress.com credentials
  3. The domain/server address is ‘im.wordpress.com’
  4. You need to add a ‘buddy’ to this account so you can receive notifications from it. The buddy name is ‘bot@im.wordpress.com’
  5. So now you’ve connected to the wordpress.com subpub service and made friends with their bot.
  6. Next, open a chat window with the bot and type ‘help’
  7. You’ll see a message like this:

(10:51:54 PM) ‘bot@im.wordpress.com’: Here I come to save the day! Pubsub bot is on the way!
My prime directive is to deliver blog posts and comments as soon as they are published. I am only a hack. If you can speak Pubsub (XEP-0060) you don’t need me.
Subscribe to a blog and I will deliver posts. Subscribe to a post and I will deliver comments.
Commands: sub/subscribe, unsub/unsubscribe, subs/subscriptions
Try these:
sub en.blog.wordpress.com
sub en.blog.wordpress.com/2009/04/02/march-wrap-up-2
sub en.blog.wordpress.com/comments
You can subscribe to any public WordPress.com blog. Private blogs are available to members. Please limit yourself to a reasonable number of subscriptions.
This service is experimental and subject to change or termination without notice.

Unlike RSS feeds, you can’t get notifications from tags or categories (actually, I haven’t tried categories, but tags didn’t work)

Isn’t it GREAT! 🙂

For self-hosted blogs, the Jabber Feed plugin is our greatest hope. The difficult bit is getting the server side stuff set up but I’m going to try to do that over the next few days and will post any notes here.

Developing BuddyPress for education

In February, I wrote a brief post about setting up BuddyPress with LDAP authentication within a university context (you’re looking at it). Four days ago, BuddyPress reached maturity by hitting version 1.0, marking a time to reflect on what I’d like to see developed for BuddyPress for use within a university context. This is an initial wish list. I’m not looking for BuddyPress to be an all singing, all dancing, social network. I don’t care about image collections and status updates (Flickr and Twitter do those jobs nicely) I would, however, like to see it being used for building group identity (projects, special interest groups, classes, courses) and portfolio/resume building. Right now, it’s pretty limited in those areas.

Privacy controls

As I mentioned previously, our social network is private, while the blogs have five levels of optional privacy controls, ranging from public and indexed by Google, to private, single-user blogs. However, privacy within the social network is currently all or nothing. It’s a hack that works but has no flexibility. The BuddyPress activity plugin is currently turned off because the privacy plugin I use, doesn’t account for the feeds that the plugin exposes. It would be nice to be able to have the site-wide social activity visible when logged in. Currently, only information about new blog posts is published site-wide. What I would like is for everything that the activity plugin logs, to have site admin options to be 1) visible to non-logged in users/public; 2) visible to logged in users; 3) visible to my groups and friends,4)  visible to my friends and 5) not visible. In addition, the feeds that are exposed of site-wide activity and member activity, could also be configurable so that 1) a site admin can choose to expose them or not; 2) if allowed, a member can choose to expose their personal feed or not; 3) a feed key could be used in place of the normal feed URI so that private member feeds could be created. Finally, groups and member profiles could optionally be made public or private. So anything following /groups/ or /members/ has an option to be visible outside the community.

Group activity

Currently, groups don’t publish very much information and you can’t aggregate information from elsewhere into a group profile. I submitted a ‘wishlist’ ticket to BuddyPress for group activity feeds, requesting that feeds for when a new group member joins and changes to the group wire. It would also be nice to be able to aggregate content from other sites via RSS into the group ‘news’ field, or a new lifestream-like field so group photos or videos or whatever, could be sucked in. It was possible to do this via a Yahoo! Pipe which combined various feeds which could then be put through feed2js and dropped into the ‘News’ field. However, embedded javascript is now intentionally blocked 🙁 I guess I could find a work-around.

Member profiles

For both teachers and students, the profile pages could be effective resumes. Currently, the site-admin can build basic grouped fields and there’s a choice of field types, too. I’d like members to be able to build their own fields and for there to be pre-built field types to choose from. It’s possible for the site admin to pre-build fields and probably easy enough for me to pre-build specific fields to design a resume (the examples given of language, country and state are just .csv lists). However, currently, if I provide three ‘Employment’ fields, a member can’t add a fourth ‘Employment’ field, nor can they select dates to correspond to when that employment was. I’m pretty sure I could create the fields, but it’s beyond me to allow a member to build their own profile pages from a selection of pre-built fields.

Finally, in addition to my request for members to be able to make their profiles public, I’d like the member profile to be marked up with hResume markup and exportable in a variety of styled formats: xhtml+css, xml, pdf, txt, doc and rtf.

The entire member profile should use microformat markup where possible. Currently, the profile can export a simple, personal hCard but could also use hCard for company and school addresses, hCalendar for dates, and rel=”tag” for creating a set of tagged skills. LinkedIn partially implements this, by the way.

So, privacy controls, group feeds and a resume builder. Not too much to ask is it? I’d probably be able to pay for the resume builder if anyone is interested…

Getting your Triples into Talis Connected Commons

A few days ago, I wrote about adding Triplify to your web application. Specifically, I wrote about adding it to WordPress, but the same information can be applied to most web publishing platforms. Earlier this month, TALIS announced their Connected Commons platform and yesterday they announced a commercial version of their platform for the structured storage of Linked Data. Storage is all very well, but more importantly they have an API for developers, so that the data can be queried and creatively re-used or mashed up.

So this got me thinking about JISCPress, our recent JISC Rapid Innovation Programme bid, which proposes a WordPress Multi-User based platform for publishing JISC funding calls and the reports of funded projects. This is based on my experience of running WriteToReply with Tony Hirst.

Although a service for comment and discussion around documents, one of the things that interests me most about WriteToReply and, consequently the JISCPress proposal, is the cumulative storage of data on the platform and how that data might be used. No surprise really as my background is in archiving and collections management. As with the University of Lincoln blogs, WriteToReply and the proposed JISCPress platform, aggregate published content into a site-wide ‘tags’ site that allows anyone to search and browse through all content that has been published to the public. In the case of the university blogs, that’s a large percentage of blogs, but for WriteToReply and JISCPress, it would be pretty much every document hosted on the platform.

You can see from the WriteToReply tags site that over time, a rich store of public documents could be created for querying and re-use. The site design is a bit clunky right now but under the hood you’ll notice that you can search across the text of every document, browse by document type and by tag. The tags are created by publishing the content to OpenCalais, which returns a whole bunch of semantic keywords for each document section. You’ll also notice that an RSS feed is available for any search query, any category and any tag or combination of tags.

Last night, I was thinking about the WriteToReply site architecture (note that when I mention WriteToReply, it almost certainly applies to JISCPress, too – same technology, similar principles, different content). Currently, we categorise each document by document type so you’ll see ‘Consultations‘, ‘Action Plans‘ ‘Discussion Papers‘, etc.. We author all documents under the WriteToReply username, too and tag each document section both manually and via OpenCalais. However, there’s more that we could do, with little effort, to mark up the documents and I’ve started sketching it out.

You’ll see from the diagram that I’m thinking we should introduce location and subject categories. There will be formal classification schemes we could use. For example, I found a Local Government Classification Scheme, which provides some high level subjects that are the type of thing I’m thinking about. I’m not suggesting we start ‘cataloguing’ the documents, but simply borrow, at the top level, from recognised classification schemes that are used elsewhere. I’m also thinking that we should start creating a new author for each document and in the case of WriteToReply, the author would be the agency who issued the consultation, report, or whatever.

So following these changes, we would capture the following data (in bold), for example:

The Home Office created Protecting the public in a changing communications environment on April 27th which is a consultation document for England, Wales and Scotland, categorised under Information and communication technology with 18 sections.

Section one is tagged Governor, Home Department, Office of Public Sector Information, Secretary of State, Surrey.

Section two is tagged communications data, communications industry, emergency services, Home Secretary, Jacqui Smith MP, Rt Hon Jacqui Smith MP.

Section three is tagged Broadband, BT, communications, communications changes, communications data, communications data capability, communications data limits, communications environment, communications event, communications industry, communications networks, communications providers, communications service providers, communications services, emergency services, Her Majesty’s Revenue and Customs, Home Office, intelligence agencies, internet browsing, Internet Protocol, Internet Service, IP, mobile telephone system, physical networks, public telecommunications service, registered owner, Serious Organised Crime Agency, social networking, specified communications data, The communications industry, United Kingdom.

Section four is tagged …(you get the picture)

Section five, paragraph six, has the comment “fully compatible with the ECHR” is, of course, an assertion made by the government, about its own legislation. Has that assertion ever been tested in a court? authored by Owen Blacker on April 28th 11:32pm.

Selected text from Section five, paragraph eight, has the comment Over my dead body! authored by Mr Angry on April 28th 9:32pm

Note that every author, document, section, paragraph, text selection, category, tag, comment and comment author has a URI, Atom, RSS and RDF end point (actually, text selection and comment author feeds are forthcoming features).

Now, with this basic architecture mapped out, we might wonder what Triplify could add to this. I’ve already shown in my earlier post that, with little effort, it re-publishes data from a relational database as N-Triples semantic data, so everything you see above, could be published as RDF data (and JSON, too).

So, in my simple view of the world, we have a data source that requires very little effort to generate content for and manage (JISCPress/WriteToReply/WordPress), a method of automatically publishing the data for the semantic web (Triplify) and, with TALIS, an API for data storage, data access, query, and augmentation.  As always, my mantra is ‘I am not a developer’, but from where I’m standing, this high-level ‘workflow’ seems reasonable.

The benefits for the JISC community would primarily be felt by using the JISCPress website, in a similar way (albeit with better, more informed design) to the WriteToReply ‘tags’ site. We could search across the full text of funding calls, browse the reports by author, categories and tags and grab news feeds from favourite authors, searches, tags or categories. This is all in addition to the comment, feedback and discussion features we’ve proposed, too. Further benefits would be had from ‘re-publishing’ the site content as semantic data to a platform such as TALIS. Not only could there be further Rapid Innovation projects which worked on this data, but it would be available for any member of the public to query and re-use, too. No longer would our final project reports, often the distillation of our research, sit idle as PDF files on institutional websites and in institutional repositories. If the documentation we produce it worth anything, then it’s worth re-publishing openly as semantic data.

Finally, in order to benefit from the (free) use of TALIS Connected Commons, the data being published needs to be licensed under a public domain or Creative Commons ‘zero’ licence. I suspect Crown Copyright is not compatible with either of these licenses, although why the hell public consultation documents couldn’t be licensed this way, I don’t know. Do you? For JISCPress, this would be a choice JISC could make. The alternative is to use the commercial TALIS platform or something similar.

As usual, tell me what you think… Thanks.

Triplify: Make your blog mashable

Last week, I wrote about how it is relatively simple to ‘pimp your ride on the semantic web‘. Over the weekend, I stumbled upon Triplify, a small ‘plugin’ for pretty much any web publishing platform, that “reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.” What is so appealing about Triplify is how easy it is to implement, especially alongside a WordPress site.

I can confirm that the three-step installation process is all it takes, although I wouldn’t undertake implementing this blindly as you are, literally, exposing a semantic representation of your database content. In other words, you should look at the configuration file you’re using and check that it’s going to expose the right data and not clear text passwords and unpublished posts and comments. Before I  implemented it, I realised that it would expose comments on a bunch of posts that I have since made private (they were imported from an old, private blog), so I had to ‘unapprove’ those comments so the script didn’t expose them to the public. A five minute job. Alternatively, the script could probably be modified to work around my problem, by only exposing comments after a certain date, for example.

The end result is that, with a WordPress site, you expose a semantic representation of your users, posts, pages, tags, categories, comments and attachments in RDF (N-Triples) and JSON formatted data (for JSON, just add ‘?t-output=json’ to the end of the URI). Like I said though, it could be used on any database driven web application. Here’s what you get when you expose the high level links to your content:


<http://blog.josswinn.org/triplify/> <http://www.w3.org/2000/01/rdf-schema#comment> "Generated by Triplify V0.5 (http://Triplify.org)" .
<http://blog.josswinn.org/triplify/> <http://creativecommons.org/ns#license> <http://creativecommons.org/licenses/by/2.0/uk/> .
<http://blog.josswinn.org/triplify/post> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/attachment> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/tag> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/category> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/user> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/comment> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .

Here’s an example of what you get when you expose the full content:


<http://blog.josswinn.org/triplify/post/154> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdfs.org/sioc/ns#Post> .
<http://blog.josswinn.org/triplify/post/154> <http://rdfs.org/sioc/ns#has_creator> <http://blog.josswinn.org/triplify/user/1> .
<http://blog.josswinn.org/triplify/post/154> <http://purl.org/dc/terms/created> "2008-10-06T05:55:25"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<http://blog.josswinn.org/triplify/post/154> <http://rdfs.org/sioc/ns#content> "Up early to go to Sheffield for LPI exams. The last week has left me underprepared. Never mind." .
<http://blog.josswinn.org/triplify/post/154> <http://purl.org/dc/terms/modified> "2008-10-06T20:12:15"^^<http://www.w3.org/2001/XMLSchema#dateTime> .

...

<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/27> .

...

<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/41> .
<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/42> .

...

<http://blog.josswinn.org/triplify/post/154> <http://sdp.iasi.rdsnet.ro/semantic-wordpress/vocabulary/belongsToCategory> <http://blog.josswinn.org/triplify/category/22> .

...

<http://blog.josswinn.org/triplify/tag/154> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/Tag> .
<http://blog.josswinn.org/triplify/tag/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/tagName> "valentine" .

You can choose to expose different levels of information in your HTML source. If you have more than a moderate amount of content, you’ll probably want to just expose the top level links as in the first example and let the users of your data dig deeper. You’ll also note that you can (and should) attach a license to your data.

A number of namespaces are recognised as well as a WordPress vocabulary.


$triplify['namespaces']=array(
'vocabulary'=>'http://sdp.iasi.rdsnet.ro/semantic-wordpress/vocabulary/',
'rdf'=>'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
'rdfs'=>'http://www.w3.org/2000/01/rdf-schema#',
'owl'=>'http://www.w3.org/2002/07/owl#',
'foaf'=>'http://xmlns.com/foaf/0.1/',
'sioc'=>'http://rdfs.org/sioc/ns#',
'sioctypes'=>'http://rdfs.org/sioc/types#',
'dc'=>'http://purl.org/dc/elements/1.1/',
'dcterms'=>'http://purl.org/dc/terms/',
'skos'=>'http://www.w3.org/2004/02/skos/core#',
'tag'=>'http://www.holygoat.co.uk/owl/redwood/0.1/tags/',
'xsd'=>'http://www.w3.org/2001/XMLSchema#',
'update'=>'http://triplify.org/vocabulary/update#',
);

So, what’s the point in doing this? Well, it’s fairly trivial and if you think that structured, linked, machine-readable licensed data is a Good Thing, why not?  The Triplify website lists an number of advantages:

Such a triplification of your Web application has tremendous advantages:

  • The installations of the Web application are better found and search engines can better evaluate the content.
  • Different installations of the Web application can easily syndicate arbitrary content without the need to adopt interfaces, content representations or protocols, even when the content structures change.
  • It is possible to create custom tailored search engines targeted at a certain niche. Imagine a search engine for products, which can be queried for digital cameras with high resolution and large zoom.

Ultimately, a triplification will counteract the centralization we faced through Google, YouTube and Facebook and lead to an increased democratization of the Web

The vision of the semantic web and semantic publishing is one of meaningfully identifying objects (and people) on the Internet and showing their relationships. This should improve searches for things on the web, but also improve how we exchange knowledge, re-use information and help clarify our identity on the web, too. It’s an ambitious task, but made easier with tools like Triplify.  The semantic web also raises questions over individual privacy and, if data is well formed and accessible, it may be easier to control and therefore censor. The creator of Triplify recently gave a technical presentation on Triplify and how it is being used to publish data collected by the OpenStreetMap project. It shows how geodata exposed in this way can result in mashup applications that directly benefit you and me.

Open Education Project Blueprint

Each participant on the Mozilla Open Education Course, has been asked to develop a project blueprint. Here is the start of mine. It’s basically a ‘Personal Learning Environment’ (PLE) ((See Personal Learning Environments: Challenging the dominant design of educational systems))and I’m going to try to show how WordPress MU is a good technology platform for an institution to easily and effectively support a PLE. I’m going to place an emphasis on ‘identity’ because it’s something I want to learn more about.

Short description

University students are at least 18 years old and have spent many years unconsciously accumulating or deliberately developing a digital identity. When people enter university they are expected to accept a new digital identity, one which may rarely acknowledge and easily exploit their preceding experience and productivity. Students are given a new email address, a university ID, expected to submit course work using new, institutionally unique tools and develop a portfolio of work over three to four years which is set apart from their existing portfolio of work and often difficult to fully exploit after graduation.

I think this will be increasingly questioned and resisted by individuals paying to study at university. Both students and staff will suffer this disconnect caused by institutions not employing available online technologies and standards rapidly enough. There is a legacy of universities expecting and being expected to provide online tools to staff and students. This was useful and necessary several years ago, but it’s now quite possible for individuals in the UK to study, learn and work apart from any institutional technology provision. For example, Google provides many of these tools and will have a longer relationship with the individual than the university is likely to.

Many students and staff are relinquishing institutional technology ties and an indicator of this is the massive % of students who do not use their university email address (96% in one case study). In the UK, universities are keen to accept mature, work-based and part-time students. For these students, university is just a single part of their lives and should not require the development of a digital identity that mainly serves the institution, rather than the individual.

How would it work?

Students identify themselves with their OpenID, which authenticates against a Shibboleth Service Provider. ((See the JISC Review of OpenID.)) They create, publish and syndicate their course work, privately or publicly using the web services of their choice. Students don’t turn in work for assessment, but rather publish their work for assessment under a CC license of their choice.

It’s basically a PLE project blueprint with an emphasis on identity and data-portability. I’m pretty sure I’m not going to get a fully working model to demonstrate by the end of the course, but I will try to show how existing technologies could be stitched together to achieve what I’m aiming for. Of course, the technologies are not really the issue here, the challenge is showing how this might work in an institutional context.

I think it will be possible to show how it’s technically possible using a single platform such as WordPress which has Facebook Connnect, OAuth, OpenID, Shibboleth and RPX plugins. WordPress is also microformat friendly and profile information can be easily exported in the hCard format. hResume would be ideal for developing an academic profile. The Diso project are leading the way in this area.

Similar projects:

UMW Blogs?

Open Technology:

OpenID, OAuth, RPX, Shibboleth, RSS, Atom, Microformats, XMPP, OPML, AtomPub, XML-RPC + WordPress

Open Content / Licensing:

I’ll look at how Creative Commons licensing may be compatible with our staff and student IP policies.

Open Pedagogy

No idea. This is a new area for me. I’m hoping that the Mozilla/CC Open Education course can point me in the right direction for this. Maybe you have some suggestions, too?