Pimping your ride on the semantic web

Yesterday, I wrote about how I’d marked up my home page to create a semantic profile of myself that is both auto-discoverable and portable. A place where my identity on the web can be aggregated; not a hole I’ve dug for myself, but an identity that reaches out across the web but always leads back home.

While I enjoy polishing my text editor regularly and hand-crafting beautifully formed, structured data, we all know it’s a fool’s game and that the semantic web is about machines doing all the work for us. So here’s a quick and dirty rundown of how to pimp your ride on the semantic web with WordPress and a few plugins.

You’ll need a self-hosted WordPress site that allows you to install plugins. I’ve got one on Dreamhost that costs me $6 a month. Next, you’ll want to install some plugins. I’ll explain what they do afterwards. One thing to note here is that I’m using plugins from the official plugin repository whenever possible. It means that you can install them from the WordPress Dashboard and you’ll get automatic updates (and they’re all GPL compatible). In no particular order…

I think that’s quite enough. All but the SIOC plugin are available from the official WordPress plugin repository. Here’s what they provide:

APML: Attention Profiling Markup Language

APML (Attention Profiling Markup Language) is an XML-based format for capturing a person’s interests and dislikes. APML allows people to share their own personal attention profile in much the same way that OPML allows the exchange of reading lists between news readers.

The plugin creates an XML file like this one that marks up and weighs your WordPress tags as a measure of your interests. It also lists your blogroll/links and any embedded feeds.
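
To give a feel for what "weighing your tags" might mean, here is a minimal sketch in Python. It is not the plugin's code: the weighting rule (post count divided by the highest count), the function name `tags_to_apml` and the `source` label are all my assumptions, and the element names follow my reading of the APML 0.6 draft.

```python
import xml.etree.ElementTree as ET


def tags_to_apml(tag_counts, profile_name="Home", source="example-blog"):
    """Build a minimal APML-style document from a {tag: post_count} dict."""
    # Assumption: weight = count / highest count, so the top tag scores 1.00.
    top = max(tag_counts.values(), default=1)

    apml = ET.Element("APML", {"version": "0.6"})
    head = ET.SubElement(apml, "Head")
    ET.SubElement(head, "Title").text = f"APML for {source}"

    body = ET.SubElement(apml, "Body", {"defaultprofile": profile_name})
    profile = ET.SubElement(body, "Profile", {"name": profile_name})
    implicit = ET.SubElement(profile, "ImplicitData")
    concepts = ET.SubElement(implicit, "Concepts")

    # Most-used tags first; each becomes one weighted Concept element.
    for tag, count in sorted(tag_counts.items(), key=lambda kv: -kv[1]):
        ET.SubElement(concepts, "Concept", {
            "key": tag,
            "value": f"{count / top:.2f}",
            "from": source,
        })
    return apml


if __name__ == "__main__":
    doc = tags_to_apml({"wordpress": 12, "openid": 6, "foaf": 3})
    print(ET.tostring(doc, encoding="unicode"))
```

The real plugin may well weigh things differently; the point is only that each tag ends up as a concept with a score a machine can compare.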

Extended Profile

This plugin adds extra fields to your user profile. The profile is encoded with hCard microformat markup and can be displayed in a page or as a sidebar widget. You can import hCard data, too. There might also be another use for this (see below).
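
If you haven't seen hCard before, it is just ordinary HTML with agreed class names. Here is a rough Python sketch of the idea, assuming a profile held in a plain dict. The function `render_hcard` and the field names are hypothetical and have nothing to do with how the plugin itself is written; only the class names (`vcard`, `fn`, `url`, `org`, `email`) come from the hCard microformat.

```python
from html import escape


def render_hcard(profile):
    """Render a dict of profile fields as a minimal hCard block."""
    name = escape(profile["name"])

    # "fn" (formatted name) is the one property every hCard needs.
    if "url" in profile:
        url = escape(profile["url"], quote=True)
        parts = [f'<a class="url fn" href="{url}">{name}</a>']
    else:
        parts = [f'<span class="fn">{name}</span>']

    if "org" in profile:
        parts.append(f'<span class="org">{escape(profile["org"])}</span>')
    if "email" in profile:
        email = escape(profile["email"], quote=True)
        parts.append(f'<a class="email" href="mailto:{email}">{email}</a>')

    return '<div class="vcard">\n  ' + "\n  ".join(parts) + "\n</div>"


if __name__ == "__main__":
    print(render_hcard({
        "name": "Jane Example",            # placeholder data
        "url": "http://example.org/",
        "org": "Example University",
    }))
```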

Micro Anywhere

Provides a couple of additional editor functions that allow you to create an hCard or hCalendar events page. Here’s an example.

OpenID

This plugin allows users to login to their local WordPress account using an OpenID, as well as enabling commenters to leave authenticated comments with OpenID. The plugin also includes an OpenID provider, enabling users to login to OpenID-enabled sites using their own personal WordPress account. XRDS-Simple is required for the OpenID Provider and some features of the OpenID Consumer.

This is key to your identity. You can use your blog URL as your OpenID or delegate a third-party service, such as MyOpenID or ClaimID. In fact, you’ve almost certainly got an OpenID already if you have a Yahoo!, Google, MySpace or AIM account. It’s up to you which one you choose to use as your persistent ID. Read more about OpenID here. It’s important and so are the issues it addresses.

XRDS-Simple

This is required to add further functionality to the OpenID plugin. It adds Attribute Exchange (AX) to your OpenID, which basically means that certain profile information can be passed to third-party services (less form filling for you!). Like a lot of these plugins, install it and forget about it.

SIOC

Provides auto-discoverable SIOC metadata. “A SIOC profile describes the structure and contents of a weblog in a machine readable form.”

wp-RDFa

Provides an auto-discoverable FOAF (Friend of a Friend) profile, based on the members of your blog. I’ve been in touch with the author of this plugin and suggested that the extended profile information could also be pulled into the FOAF profile. This is largely dependent on the FOAF specification being finalised, but expect this plugin to do more as FOAF develops.

OAI-ORE Map

Provides an auto-discoverable OAI-ORE resource map of your blog. It conforms to version 0.9 of the specification, which recently made it to v1.0, so I imagine it will be updated in the near future. OAI-ORE metadata describes aggregated resources, so instead of seeing your blog post permalink as the single identifier for, say, a collection of text and multimedia, it creates a map of those resources and links them.

LinkedIn hResume

LinkedIn hResume for WordPress grabs the hResume microformat block from your LinkedIn public profile page allowing you to add it to any WordPress page and apply your own styles to it.

I like this plugin because you benefit from all the features of LinkedIn, but can bring your profile home. Ideal for students or anyone who wants to create a portfolio of work and offer their resume/CV on a single site. Depending on the theme you use, it does require some additional styling.

Get_OPML

This is a nice way to create an OPML file of your sidebar links. If, like on my personal blog, your links point to resources related to you, you can easily create an OPML file like this one. There are a couple of things to note about this plugin though. First, the instructions mention a Technorati API key. I didn’t bother with this. When you create your links, just scroll down the page to the ‘advanced’ section and add the RSS feed there. Secondly, the plugin author has, for some stupid reason, hard-coded the feed to their own site into the plugin. Assuming you don’t want this spamming your personal OPML file, download a modified version from here or comment out line 101 in get-opml.php. I guess the plugin author thinks that you’ll be using this to import the OPML into a feed reader and from there, you can delete his feed. That’s no good to us though. Finally, you’ll want to make your OPML file auto-discoverable. You can do this by adding a line of HTML in your header, using the Header-Footer plugin below.
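
For the curious, an OPML file is a very simple thing: a list of outline elements, one per link. The Python sketch below builds one from a list of links. It is an illustration only, not what Get_OPML does internally; the function name `links_to_opml` and the example URLs are made up, and I've assumed OPML 2.0 with the usual `text`, `type`, `htmlUrl` and `xmlUrl` attributes.

```python
import xml.etree.ElementTree as ET


def links_to_opml(title, links):
    """Build an OPML string from (text, site_url, feed_url) tuples."""
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title

    body = ET.SubElement(opml, "body")
    for text, site_url, feed_url in links:
        # One outline element per sidebar link; xmlUrl carries the RSS feed.
        ET.SubElement(body, "outline", {
            "text": text,
            "type": "rss",
            "htmlUrl": site_url,
            "xmlUrl": feed_url,
        })
    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    print(links_to_opml("My links", [
        ("Blog", "http://example.org/", "http://example.org/feed/"),
    ]))
```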

Header-Footer

This simply allows you to add code to the header and footer of your blog. In our case, you can use it to add an auto-discovery link to the header of every page of your blog.


<link rel="outline" type="text/xml+opml" title="ADD YOUR TITLE HERE" href="http://YOUR_BLOG_ADDRESS/opml.xml" />

WP Calais * + tagaroo

These three plugins use the OpenCalais API to examine your blog posts and return a bunch of semantic tags. I’ve written about this in more detail here (towards the end).

The Calais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.

It’s an easy way to add relevant tags to your content and broadcast your content for indexing by OpenCalais. They place an additional link in your header that lists the tags for web crawlers and, I guess, improves the SEO for your site.

Extra Feed Links

I’ve written about this plugin previously, too. It adds additional autodiscovery links to your blog for author, category and tag feeds. WordPress feed functionality is very powerful and this plugin makes it especially easy to make those feeds visible.

Lifestream

This isn’t a semantic web plugin, but is a powerful way of aggregating all of your activity across the web into a single activity stream. See my example, here. It also produces a single RSS feed from your aggregated activity. Nice ;-)

Wrapping things up

If you set all of this up, you’ll have a WordPress site that can act as your primary identity across the web, aggregates much of your activity on the web into a single site and also offers multiple ways for people to discover and read your site. You also get a ‘well-formed’ portfolio that is enriched with semantic markup and links you to the wider online community in a way that you control.

Bear in mind that some of these plugins might not appear to do anything at all. The semantic web is about machines being able to read and link data, right? If you look closely in the source of your home page, you’ll see a few lines that speak volumes about you in machine talk.


<link rel="meta" href="./wp-content/plugins/wp-rdfa/foaf.php" type="application/rdf+xml" title="FOAF"/>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
<link rel="meta" type="text/xml" title="APML" href="http://blog.josswinn.org/apml/" />
<link rel="alternate" type="application/rss+xml" title="NoteStream RSS Feed" href="http://blog.josswinn.org/feed/" />
<link rel="resourcemap" type="application/atom+xml" href="http://blog.josswinn.org/wp-content/plugins/oai-ore/rem.php"/>
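
To see roughly how a machine picks these lines up, here is a small Python sketch that uses only the standard library to pull every link element out of a page's source. The class `LinkFinder`, the function `discover` and the sample page are mine, for illustration; a real crawler would do a good deal more.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect (rel, type, href) for every <link> element seen."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are routed here as well.
        if tag != "link":
            return
        attr = dict(attrs)
        if "rel" in attr and "href" in attr:
            self.links.append((attr["rel"], attr.get("type"), attr["href"]))


def discover(html_source):
    """Return all auto-discovery style links found in the HTML source."""
    finder = LinkFinder()
    finder.feed(html_source)
    return finder.links


if __name__ == "__main__":
    # Placeholder page source, modelled on the lines shown above.
    page = """<head>
    <link rel="meta" type="text/xml" title="APML" href="http://example.org/apml/" />
    <link rel="alternate" type="application/rss+xml" title="RSS" href="http://example.org/feed/" />
    </head>"""
    for rel, mime, href in discover(page):
        print(rel, mime, href)
```

The `rel` and `type` values are what tell a tool whether it has found a feed, a FOAF profile or a resource map.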

If you do want a way to view the data, I recommend the following Firefox add-ons:

Operator: Auto-discovers any embedded microformats and provides useful ways to search for similar data via third-party services elsewhere on the web.

OPML Reader: Auto-discovers an OPML file if you have one linked in your header. Allows you to either download the file or read it on Grazr.

Semantic Radar: Auto-discovers embedded RDF data. Displays custom icons to indicate the presence of FOAF, SIOC, DOAP and RDFa formats.

The Tabulator Extension: Auto-discovers and provides a table-based display for RDF data on the Semantic Web. Makes RDF data readable to the average person and shows how data are linked together across different sites.

As always, please let me know how this overview could be improved or if you know of other ways to add semantic functionality to your WordPress blog. Thanks.

A few notes on data portability

I had a bit of fun over the weekend looking at how I could both aggregate my online presence and make it portable, all under my own domain name. I ended up touching on a bunch of interesting initiatives revolving around web and data standards. The minor output of this is over on my personal ‘home page’ at http://josswinn.org

You’ll see that there’s an Attention Profile (APML), Friend of a Friend document (FOAF), hCard generated from my contact details, an OPML file of the significant feeds I have spotted around the web (Delicious, this blog, Twitter, Last.fm, etc), an aggregated feed of my OPML file, and a link to my LinkedIn profile, which I happily learned includes hResume microformat markup. My OPML, FOAF profile and RSS feed are all auto-discoverable.

All links on the page are marked up using the XFN rel="me" attribute, which should help consolidate my identity on the web. There’s an interesting discussion over on Marshall Kirkpatrick’s blog about how our Twitter profiles are starting to rank higher in search engines than our personal blogs or home pages because Twitter is using rel="me". Marshall suggests that we start using rel="me" somewhere on our own sites to counteract that.

To add to the fun, I also tried to get the page to validate as HTML5, but in doing so, I had to remove the meta tag that provides OpenID Attribute Exchange via my OpenID Service Provider. I get the error:

Bad value X-XRDS-Location for attribute http-equiv on element meta.

Apparently the draft HTML5 spec currently only allows a fixed set of values for http-equiv, and X-XRDS-Location isn’t one of them. OpenID AX is a good thing if you want to consolidate your identity while at the same time ensuring it is portable. It’s certainly more useful to me than validating as HTML5.

In addition to this, I added a Google Friend Connect (OpenSocial) widget and integrated Apture. I thought about adding the ability to leave comments via Disqus, the advantage being that comment authors could retain control over their own comments. But to be honest, I don’t think you or I need yet another method of communicating with each other. There are plenty of ways to do that already.

Other than providing a playground for fun, what this bit of tinkering on my home page has taught me is that microformats and the ethos of data portability are being embraced quite widely on the web. Although I spent my time hand-crafting my new home page, there are opportunities to do much the same, quite easily, through the use of a WordPress blog and a bunch of third-party services. More on that later…

ALT-C 2008: A different approach.

Today, I took a different approach to the conference and relaxed. I usually take the approach of trying to attend as many sessions as possible and absorb and report back on as much as I can.  However, I’ve found that this approach quickly leaves me exhausted and somewhat removed from the rest of the conference as it allows little time for reflection.

So, my third day in Leeds was a much more enjoyable and stimulating one as I attended sessions, picking up on one or two things that were being presented and following threads and tangents that I found online and from talking with people. One term that I’ve heard mentioned a few times is ‘lifestream’, that is, an aggregation of online activity into a timeline that can be shared with others. You can see my lifestream by going to this page. You’ll see that following a conversation I had at F-ALT08, I looked again at OpenID and set up my own personal website as an OpenID server, learning a great deal at the same time.

You can also see that I joined identi.ca, an open source microblogging site like Twitter, and found details on setting up Laconica, the software behind identi.ca, on my own server and potentially, the Learning Lab. My experience using Twitter at the conference has really demonstrated the value of microblogging within a defined community as a way of rapidly communicating one-to-many messages and engaging in large asynchronous conversations.

In the morning Digital Divide Slam session, we formed small groups and with two people I’d met previously at the fringe events, created a ‘performance’ that reflected on a form of digital divide. We chose ‘gender’, and produced this (prize winning) video which is now on YouTube.

During the second keynote, I drifted off and began to think about e-portfolios and aggregating our online social activity into a profile/portfolio that is controlled by the individual and is dynamically updated. I’d heard about the Attention Profiling Markup Language (APML), and spent time considering whether this could be used or adapted for aggregating a portfolio of work and experience. APML is primarily aimed at individuals’ relationship with advertisers, and at a later F-ALT session I was able to discuss the suitability of APML or an APML-like standard for aggregating a portfolio of work. Consequently, I’m developing an interest in this area and in other online relationships that can be made between people (see this link, too) and the data that we generate through purposeful and serendipitous online activity.

Having listened to quite a lot of discussion about web2.0 applications over the last few days, I’m even more pleased with the decision to use WordPress as a platform for blogging, web publishing and collaboration in the Learning Lab. With WordPress, we’re able to evaluate many of the latest social web technologies and standards through their plugin system.  This flexible plugin and theming system has led to the development of an entire social networking platform based on WordPress, called BuddyPress, and because it’s basically WordPress with some specific plugins and clever use of a theme, it can use any of the available WordPress plugins to connect to Facebook, Twitter, YouTube, Flickr and other popular web 2 services.  I’m looking forward to watching BuddyPress develop.

In the evening, we attended the conference dinner at Headingley Cricket Club. It was a great location, with good food and excellent service. One of my digital slam partners, who was sitting next to me, showed me JoikuSpot, an application that turns a mobile phone into a wifi router. There on our dinner table, he ran Joiku on a 3G Nokia phone and provided wifi access to his iPod Touch. What a great way to share high speed network access among friends, while meeting at a cafe or park to discuss work or study.

I was impressed. The Learning Landscape had extended to the cricket ground.