
Metadata …arrghhh!

Joss Winn / October 14, 2009 / Fun, Projects, Standards & Specs

In my previous job as Audiovisual Archivist, I spent a lot of time examining various metadata standards in detail; hours spent poring over PBCore, METS, MODS, MIX, EXIF and IPTC/XMP, because we were designing a content model for an in-house Digital Asset Management system. I thought I had put it all behind me, yet here I am staring at Phil Barker’s informative post about ‘metadata and resource description’ and it’s all coming back to me… Arrghhh 🙂

Workpackage six of the Chemistry.fm project aims to:

Plan the storage, delivery and marketing of the course.
Choose a metadata standard
Evaluate third-party hosting such as Flickr, Slideshare and YouTube as well as JORUM and the IR.

Ah, if only life were as simple as a series of bullet points!

As I was creating the project poster yesterday, I was reminded of the various ways that our project OERs could be ‘broadcast’. Although collaboration with our community radio station, SirenFM, is core to the approach of our project, we all know that there are many ways for anyone to be a broadcaster on the web, and part of the fun of this project for me is being able to explore the different ways that educational content can be pulled and pushed between subscribing students and members of the public.

My plan at the moment is to use our Institutional Repository as the ‘canonical reference’ for the OERs. During our JISC-funded LIROLEM project, we developed EPrints to better accommodate multimedia resources and it makes sense to use a versioned digital archive that supports embedded media enriched by copious amounts of metadata. (I know it’s a requirement to use JORUM, too, but at the first Programme Meeting, it became clear that JORUM can be used simply as a directory where we can register URIs of existing OERs, so that’s what I’ll be doing).

Anyway, Archivists, have you ever feasted your eyes on the source code of an EPrint? Of course you have. Here’s a reminder.

Looking at the (draft) Metadata Guidelines for the OER Programme, you can see that the following are covered by EPrints:

programme tag [there is no “DC.keyword” term, so EPrints uses name=”eprints.keywords”]
title [name=”DC.title”]
author [name=”DC.creator”]
date [name=”DC.date”]
url [name=”DC.identifier”]
technical information [name=”DC.format”]
language [hmmm, nowhere to be seen. Can we add that?]
subject classification [name=”DC.subject”]
keywords/tags [there is no “DC.keyword” term, so EPrints uses name=”eprints.keywords”]
comments [We use the SNEEP plugins but the comments are not showing in the source code – do we need to make sure they are crawlable? Some people aren’t keen…]
description [name=”DC.description”]
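As a rough way of checking that mapping against a real page, the sketch below collects the meta tags from a page head and reports which guideline fields are filled. The sample HTML is hypothetical (a cut-down head in the style of an EPrints abstract page), and `DC.language` is an assumed name for the field that is currently missing.

```python
from html.parser import HTMLParser

# Hypothetical sample: a cut-down <head> in the style of an EPrints abstract page.
SAMPLE_HEAD = """
<head>
<meta name="DC.title" content="An example resource" />
<meta name="DC.creator" content="Example, Author" />
<meta name="DC.date" content="2009" />
<meta name="eprints.keywords" content="chemistry, OER" />
</head>
"""

# Guideline field -> meta name, following the list above.
# "DC.language" is an assumption: the post notes that no such tag is emitted yet.
GUIDELINE_FIELDS = {
    "title": "DC.title",
    "author": "DC.creator",
    "date": "DC.date",
    "url": "DC.identifier",
    "technical information": "DC.format",
    "language": "DC.language",
    "subject classification": "DC.subject",
    "keywords/tags": "eprints.keywords",
    "description": "DC.description",
}


class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs, keeping repeated names."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also arrive here via handle_startendtag.
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content is not None:
            self.fields.setdefault(name, []).append(content)


def coverage(html):
    """Return {guideline field: [values found]}; an empty list means a gap."""
    collector = MetaCollector()
    collector.feed(html)
    return {field: collector.fields.get(meta, [])
            for field, meta in GUIDELINE_FIELDS.items()}


if __name__ == "__main__":
    for field, values in coverage(SAMPLE_HEAD).items():
        print(f"{field}: {values or 'MISSING'}")
```

Any field that comes back empty is one to chase up in the repository configuration.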

I’ve highlighted the Dublin Core terms above, but happily, the data is available in several other alternate formats:

<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/HTML/lirolem-eprint-1543.html" title="HTML Citation" type="text/html; charset=utf-8" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/Text/lirolem-eprint-1543.txt" title="ASCII Citation" type="text/plain; charset=utf-8" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/ContextObject/lirolem-eprint-1543.xml" title="OpenURL ContextObject" type="text/xml" />

<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/EndNote/lirolem-eprint-1543.enw" title="EndNote" type="text/plain" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/BibTeX/lirolem-eprint-1543.bib" title="BibTeX" type="text/plain" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/MODS/lirolem-eprint-1543.xml" title="MODS" type="text/xml" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/COinS/lirolem-eprint-1543.txt" title="OpenURL ContextObject in Span" type="text/plain" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/DIDL/lirolem-eprint-1543.xml" title="DIDL" type="text/xml" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/XML/lirolem-eprint-1543.xml" title="EP3 XML" type="text/xml" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/JSON/lirolem-eprint-1543.js" title="JSON" type="text/javascript; charset=utf-8" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/DC/lirolem-eprint-1543.txt" title="Dublin Core" type="text/plain" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/RIS/lirolem-eprint-1543.ris" title="Reference Manager" type="text/plain" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/EAP/lirolem-eprint-1543.xml" title="Eprints Application Profile" type="text/xml" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/Simple/lirolem-eprint-1543.txt" title="Simple Metadata" type="text/plain" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/Refer/lirolem-eprint-1543.refer" title="Refer" type="text/plain" />
<link rel="alternate" href="http://eprints.lincoln.ac.uk/cgi/export/1543/METS/lirolem-eprint-1543.xml" title="METS" type="text/xml" />
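The export links above all seem to follow one URL pattern, so a small helper can build them. This is a sketch only: the pattern is inferred from the links listed here rather than from any EPrints documentation, the extension table covers just a few of the formats, and the function name and defaults are my own.

```python
# File extensions copied from a few of the export links listed above.
EXPORT_EXTENSIONS = {
    "HTML": "html",
    "Text": "txt",
    "BibTeX": "bib",
    "MODS": "xml",
    "JSON": "js",
    "DC": "txt",
    "RIS": "ris",
    "METS": "xml",
}


def export_url(eprint_id, fmt,
               base="http://eprints.lincoln.ac.uk", repo="lirolem"):
    """Build an export URL following the pattern seen in the <link> tags.

    The pattern is an inference from this one repository; other EPrints
    installations or versions may differ.
    """
    ext = EXPORT_EXTENSIONS[fmt]  # raises KeyError for formats not listed
    return f"{base}/cgi/export/{eprint_id}/{fmt}/{repo}-eprint-{eprint_id}.{ext}"


if __name__ == "__main__":
    print(export_url(1543, "BibTeX"))
```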

Now, we could choose to lump all the OERs that we create into one single EPrint, but that doesn’t give us much flexibility. Remember that EPrints is serving as the canonical reference for the OERs, not necessarily the final presentation layer that people will actually use to browse and download the resources. So if we were to group the OERs into sets of items, each set constituting an EPrint, and then relate those EPrints to each other using the “DC.isPartOf” property, then from the point of view of metadata we’d be creating a consistent whole while giving ourselves some flexibility in how we ‘broadcast’ the content of the course.

EPrints DC.relation — Dublin Core relationships

If we consider the course MindMap that we knocked up a while back, we might decide to create a single EPrint for each of the five major ‘nodes’ of the course. Doing this would then give us an RSS 1.0 (RDF), RSS 2.0 and Atom feed for the course where each node was an item.

Introductory Chemistry Mindmap — Course MindMap

Before I move on with this, look at the export formats that EPrints offers for a query. Imagine that the course could be exported in each of these ways:

EPrints export formats — Exporting from EPrints

The zip export allows you to download the entire query and all its resources at once. The HTML citation format allows you to produce some HTML you could copy and paste into any web page. It could just as easily be dropped into Blackboard as it could on any other (and anybody’s) web page. BibTeX would allow you to browse the course via your preferred reference management software and JSON… I still don’t completely get it, but it’s pretty fancy, I know that much.

Anyway, if each of the mindmap nodes is an ‘item’ in the RSS feed, then perhaps we can use that to feed a WordPress site, using the FeedWordPress plugin? Nope. It doesn’t seem to work. FeedWordPress recognises the feed but doesn’t import anything. Testing it with another feed based on keywords does work, but the information included in the feed is sparse, so that’s no good. By the way, the EPrints RSS 2.0 feed does include the xmlns:media=”http://search.yahoo.com/mrss” namespace and marks up the preview thumbnails accordingly:


<media:thumbnail url="http://eprints.lincoln.ac.uk/1543/thumbnails/15/small.png" type="image/png"></media:thumbnail><media:content url="http://eprints.lincoln.ac.uk/1543/thumbnails/15/preview.png" type="image/png"></media:content>
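To see what a feed consumer could do with that markup, here is a minimal sketch that pulls the thumbnail and preview URLs out of each item. The feed wrapper is hypothetical; only the two media elements are taken from the real feed. The namespace string is copied as quoted above, and since ElementTree matches it exactly it is worth checking against the live feed (the Media RSS specification itself writes it with a trailing slash).

```python
import xml.etree.ElementTree as ET

# Namespace as quoted in the post; adjust if the live feed differs.
MEDIA_NS = "http://search.yahoo.com/mrss"

# Hypothetical feed wrapper around the two media elements shown above.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss">
  <channel>
    <title>Hypothetical feed</title>
    <item>
      <title>Example EPrint</title>
      <link>http://eprints.lincoln.ac.uk/1543/</link>
      <media:thumbnail url="http://eprints.lincoln.ac.uk/1543/thumbnails/15/small.png" type="image/png"></media:thumbnail>
      <media:content url="http://eprints.lincoln.ac.uk/1543/thumbnails/15/preview.png" type="image/png"></media:content>
    </item>
  </channel>
</rss>
"""


def item_media(feed_xml):
    """Return one dict per <item> with its title, thumbnail and preview URLs."""
    root = ET.fromstring(feed_xml)
    ns = {"media": MEDIA_NS}
    results = []
    for item in root.iter("item"):
        thumb = item.find("media:thumbnail", ns)
        content = item.find("media:content", ns)
        results.append({
            "title": item.findtext("title"),
            "thumbnail": thumb.get("url") if thumb is not None else None,
            "content": content.get("url") if content is not None else None,
        })
    return results


if __name__ == "__main__":
    for entry in item_media(SAMPLE_FEED):
        print(entry)
```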

(Another way to tackle this might be using our newly developed ‘EPrints2Blog’ plugin, which allows a depositor to post information about their new EPrint to a blog of their choice (using XML-RPC). As we deposit the course EPrints, each could be posted to a WordPress site. The resulting feed from the WordPress site does include some embedded media, but it’s still a bit of a hack. No, scrap this idea).

Post2Blog: An XML-RPC plugin for EPrints

Podcasting from Eprints in WordPress

Right, how about this…?

Using EPrints as the canonical source for each of the files for the course, we could create a WordPress site with the addition of the Dublin Core and OAI-ORE plugins for WordPress.

For each WordPress post, this gives us the following metadata:


<meta name="DC.publisher" content="../learninglab/joss" />

<meta name="DC.publisher.url" content="https://joss.blogs.lincoln.ac.uk/" />

<meta name="DC.title" content="Thinking the unthinkable" />

<meta name="DC.identifier" content="https://joss.blogs.lincoln.ac.uk/2009/10/08/thinking-the-unthinkable/" />

<meta name="DC.date.created" scheme="WTN8601" content="2009-10-08T16:14:54" />

<meta name="DC.creator" content="Joss" />

<meta name="DC.rights.rightsHolder" content="Joss" />

<meta name="DC.subject" content="Funding" />

<meta name="DC.rights.license" content="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/" />

<link rel="alternate" type="application/rss+xml" title="Comments: Thinking the unthinkable" href="https://joss.blogs.lincoln.ac.uk/2009/10/08/thinking-the-unthinkable/feed/" />

<!-- OAI-ORE -->

<link rel="resourcemap" type="application/atom+xml" href="https://joss.blogs.lincoln.ac.uk/wp-content/plugins/oai-ore/rem.php"/>

This is more like it. Click on the oai-ore link and look at the source code. It’s too big to display here, but it does what you’d expect and produces an OAI-ORE 1.0 compliant Atom/XML file. Contained within the file is a ‘resource map’ of all the WordPress posts and pages marked up with Dublin Core and FOAF terms. Thinking about how the course site might be represented in this way, it makes sense to atomise the course even further so that each of the sub-nodes of the Mind Map is a WordPress post. Using the current course structure, that would result in about 20 separate posts to represent the course. Each post would contain one or more resources such as a PDF, video, audio, slides, etc. Is it worth atomising it even further and creating a post for each of these resources, too, I wonder? Quite possibly.

Unfortunately, the resource map does not include media that are included in each post or page – apparently it’s on the developer’s list of things to do. Maybe we could use some of the project budget to ask Alex, who’s working on the JISCPress project with me, to extend the plugin in this way…

Finally, there’s also a MediaRSS plugin for WordPress, which could enhance the RSS feeds to include all the media used in the course. Here’s an example that’s including images by default. I’ve already written about the various feeds that are available for WordPress; with some careful categorisation and tagging, media-rich feeds would be available for different points (‘nodes’) of entry into the course.

Once we are at this point, I guess we’re ready to think about broadcasting the course via Boxee and DeliTV (no time to dig into that now. Sorry!)

Metadata… arrghhh!

p.s. you’ve probably noticed that I’m a bit weak on the EPrints and OAI-ORE stuff, to say the least. Please do pick me up on where I’m going wrong with this. Thanks 🙂

Triplify: Make your blog mashable

Joss Winn / April 27, 2009 / August 30, 2009 / Commons, Data, Fun, Mashups, Standards & Specs, Web

Last week, I wrote about how it is relatively simple to ‘pimp your ride on the semantic web’. Over the weekend, I stumbled upon Triplify, a small ‘plugin’ for pretty much any web publishing platform, that “reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.” What is so appealing about Triplify is how easy it is to implement, especially alongside a WordPress site.

I can confirm that the three-step installation process is all it takes, although I wouldn’t undertake implementing this blindly as you are, literally, exposing a semantic representation of your database content. In other words, you should look at the configuration file you’re using and check that it’s going to expose the right data and not clear text passwords and unpublished posts and comments. Before I implemented it, I realised that it would expose comments on a bunch of posts that I have since made private (they were imported from an old, private blog), so I had to ‘unapprove’ those comments so the script didn’t expose them to the public. A five minute job. Alternatively, the script could probably be modified to work around my problem, by only exposing comments after a certain date, for example.
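The date-based workaround boils down to an extra condition in whatever query selects the comments. The sketch below shows the idea against an in-memory SQLite table that borrows the WordPress `wp_comments` column names; it is an illustration of the filter, not Triplify’s actual configuration, and the sample rows and cutoff date are made up.

```python
import sqlite3


def public_comments(conn, cutoff):
    """Return approved comments dated on or after the cutoff (ISO-style string)."""
    rows = conn.execute(
        "SELECT comment_ID, comment_content FROM wp_comments "
        "WHERE comment_approved = '1' AND comment_date >= ? "
        "ORDER BY comment_ID",
        (cutoff,),
    )
    return rows.fetchall()


# In-memory stand-in for the WordPress comments table (WordPress itself uses MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_comments ("
    "comment_ID INTEGER PRIMARY KEY, comment_content TEXT, "
    "comment_approved TEXT, comment_date TEXT)"
)
conn.executemany(
    "INSERT INTO wp_comments VALUES (?, ?, ?, ?)",
    [
        (1, "Old private note", "1", "2007-03-01 10:00:00"),  # imported, pre-cutoff
        (2, "Unapproved", "0", "2009-04-01 09:00:00"),        # never approved
        (3, "Public comment", "1", "2009-04-20 12:30:00"),    # safe to expose
    ],
)

if __name__ == "__main__":
    print(public_comments(conn, "2009-01-01 00:00:00"))
```

Only the third row survives both conditions, which is the behaviour you would want before pointing a script like Triplify at the table.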

The end result is that, with a WordPress site, you expose a semantic representation of your users, posts, pages, tags, categories, comments and attachments in RDF (N-Triples) and JSON formatted data (for JSON, just add ‘?t-output=json’ to the end of the URI). Like I said though, it could be used on any database driven web application. Here’s what you get when you expose the high level links to your content:


<http://blog.josswinn.org/triplify/> <http://www.w3.org/2000/01/rdf-schema#comment> "Generated by Triplify V0.5 (http://Triplify.org)" .
<http://blog.josswinn.org/triplify/> <http://creativecommons.org/ns#license> <http://creativecommons.org/licenses/by/2.0/uk/> .
<http://blog.josswinn.org/triplify/post> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/attachment> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/tag> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/category> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/user> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/comment> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .

Here’s an example of what you get when you expose the full content:


<http://blog.josswinn.org/triplify/post/154> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdfs.org/sioc/ns#Post> .
<http://blog.josswinn.org/triplify/post/154> <http://rdfs.org/sioc/ns#has_creator> <http://blog.josswinn.org/triplify/user/1> .
<http://blog.josswinn.org/triplify/post/154> <http://purl.org/dc/terms/created> "2008-10-06T05:55:25"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<http://blog.josswinn.org/triplify/post/154> <http://rdfs.org/sioc/ns#content> "Up early to go to Sheffield for LPI exams. The last week has left me underprepared. Never mind." .
<http://blog.josswinn.org/triplify/post/154> <http://purl.org/dc/terms/modified> "2008-10-06T20:12:15"^^<http://www.w3.org/2001/XMLSchema#dateTime> .

...

<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/27> .

...

<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/41> .
<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/42> .

...

<http://blog.josswinn.org/triplify/post/154> <http://sdp.iasi.rdsnet.ro/semantic-wordpress/vocabulary/belongsToCategory> <http://blog.josswinn.org/triplify/category/22> .

...

<http://blog.josswinn.org/triplify/tag/154> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/Tag> .
<http://blog.josswinn.org/triplify/tag/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/tagName> "valentine" .
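Output like this is easy to consume because each line is one subject, predicate and object. The sketch below reads a few of the lines above with a deliberately simple regular expression. It is not a full N-Triples parser (blank nodes and language tags are skipped, and escapes are left as-is), and the function names are my own; for real work an RDF library would be the safer choice.

```python
import re

# One triple per line: <subject> <predicate> then a <URI> or a "literal"
# (optionally typed with ^^<datatype>), ending in " ."
TRIPLE = re.compile(
    r'^<(?P<s>[^>]*)>\s+<(?P<p>[^>]*)>\s+'
    r'(?:<(?P<o_uri>[^>]*)>|"(?P<o_lit>(?:[^"\\]|\\.)*)"(?:\^\^<(?P<dtype>[^>]*)>)?)'
    r'\s*\.\s*$'
)

# A few lines taken from the Triplify output above.
SAMPLE = '''
<http://blog.josswinn.org/triplify/post/154> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdfs.org/sioc/ns#Post> .
<http://blog.josswinn.org/triplify/post/154> <http://purl.org/dc/terms/created> "2008-10-06T05:55:25"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/41> .
<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/42> .
'''


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) string tuples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = TRIPLE.match(line)
        if match is None:
            continue  # anything this simple pattern does not cover is skipped
        obj = match.group("o_uri")
        if obj is None:
            obj = match.group("o_lit")
        triples.append((match.group("s"), match.group("p"), obj))
    return triples


def objects_of(triples, subject, predicate):
    """All objects for a given subject and predicate, in document order."""
    return [o for s, p, o in triples if s == subject and p == predicate]


POST = "http://blog.josswinn.org/triplify/post/154"
TAGGED = "http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag"

if __name__ == "__main__":
    print(objects_of(parse_ntriples(SAMPLE), POST, TAGGED))
```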

You can choose to expose different levels of information in your HTML source. If you have more than a moderate amount of content, you’ll probably want to just expose the top level links as in the first example and let the users of your data dig deeper. You’ll also note that you can (and should) attach a license to your data.

A number of namespaces are recognised as well as a WordPress vocabulary.


$triplify['namespaces']=array(
'vocabulary'=>'http://sdp.iasi.rdsnet.ro/semantic-wordpress/vocabulary/',
'rdf'=>'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
'rdfs'=>'http://www.w3.org/2000/01/rdf-schema#',
'owl'=>'http://www.w3.org/2002/07/owl#',
'foaf'=>'http://xmlns.com/foaf/0.1/',
'sioc'=>'http://rdfs.org/sioc/ns#',
'sioctypes'=>'http://rdfs.org/sioc/types#',
'dc'=>'http://purl.org/dc/elements/1.1/',
'dcterms'=>'http://purl.org/dc/terms/',
'skos'=>'http://www.w3.org/2004/02/skos/core#',
'tag'=>'http://www.holygoat.co.uk/owl/redwood/0.1/tags/',
'xsd'=>'http://www.w3.org/2001/XMLSchema#',
'update'=>'http://triplify.org/vocabulary/update#',
);
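Those prefixes are what let a short name such as sioc:has_creator stand in for the full URI seen in the output. As an illustration of the idea (not of how Triplify does it internally), here is a small sketch that expands and shortens names using a few of the same namespaces; the function names are my own.

```python
# A subset of the prefixes from the Triplify configuration above.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sioc": "http://rdfs.org/sioc/ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "tag": "http://www.holygoat.co.uk/owl/redwood/0.1/tags/",
}


def expand(name, namespaces=NAMESPACES):
    """Turn 'prefix:local' into a full URI, e.g. 'dcterms:created'."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"unknown prefix in {name!r}")
    return namespaces[prefix] + local


def shorten(uri, namespaces=NAMESPACES):
    """Turn a full URI back into 'prefix:local', preferring the longest match."""
    best = None
    for prefix, base in namespaces.items():
        if uri.startswith(base) and (best is None or len(base) > len(namespaces[best])):
            best = prefix
    if best is None:
        return uri  # no known namespace: leave the URI untouched
    return f"{best}:{uri[len(namespaces[best]):]}"


if __name__ == "__main__":
    print(expand("sioc:has_creator"))
    print(shorten("http://purl.org/dc/terms/created"))
```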

So, what’s the point in doing this? Well, it’s fairly trivial and if you think that structured, linked, machine-readable licensed data is a Good Thing, why not? The Triplify website lists a number of advantages:

Such a triplification of your Web application has tremendous advantages:

The installations of the Web application are better found and search engines can better evaluate the content.

Different installations of the Web application can easily syndicate arbitrary content without the need to adopt interfaces, content representations or protocols, even when the content structures change.

It is possible to create custom tailored search engines targeted at a certain niche. Imagine a search engine for products, which can be queried for digital cameras with high resolution and large zoom.

Ultimately, a triplification will counteract the centralization we faced through Google, YouTube and Facebook and lead to an increased democratization of the Web

The vision of the semantic web and semantic publishing is one of meaningfully identifying objects (and people) on the Internet and showing their relationships. This should improve searches for things on the web, but also improve how we exchange knowledge, re-use information and help clarify our identity on the web, too. It’s an ambitious task, but made easier with tools like Triplify. The semantic web also raises questions over individual privacy and, if data is well formed and accessible, it may be easier to control and therefore censor. The creator of Triplify recently gave a technical presentation on Triplify and how it is being used to publish data collected by the OpenStreetMap project. It shows how geodata exposed in this way can result in mashup applications that directly benefit you and me.

Pimping your ride on the semantic web

Joss Winn / April 21, 2009 / August 30, 2009 / Fun, Identity, Standards & Specs, Tips, Web

Yesterday, I wrote about how I’d marked up my home page to create a semantic profile of myself that is both auto-discoverable and portable. A place where my identity on the web can be aggregated; not a hole I’ve dug for myself, but an identity that reaches out across the web but always leads back home.

While I enjoy polishing my text editor regularly and hand-crafting beautifully formed, structured data, we all know it’s a fool’s game and that the semantic web is about machines doing all the work for us. So here’s a quick and dirty run down of how to pimp your ride on the semantic web with WordPress and a few plugins.

You’ll need a self-hosted WordPress site that allows you to install plugins. I’ve got one on Dreamhost that costs me $6 a month. Next, you’ll want to install some plugins. I’ll explain what they do afterwards. One thing to note here is that I’m using plugins from the official plugin repository whenever possible. It means that you can install them from the WordPress Dashboard and you’ll get automatic updates (and they’re all GPL compatible). In no particular order…

I think that’s quite enough. All but the SIOC plugin are available from the official WordPress plugin repository. Here’s what they provide:

APML: Attention Profile Markup Language

APML (Attention Profiling Mark-up Language) is an XML-based format for capturing a person’s interests and dislikes. APML allows people to share their own personal attention profile in much the same way that OPML allows the exchange of reading lists between news readers.

The plugin creates an XML file like this one that marks up and weights your WordPress tags as a measure of your interests. It also lists your blogroll/links and any embedded feeds.
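I haven’t checked how the plugin actually calculates its numbers, but the general idea of turning tag usage into a measure of interest can be sketched very simply: count how often each tag is used and scale against the most-used tag. The scaling rule and function name here are assumptions for illustration only.

```python
from collections import Counter


def tag_weights(post_tags):
    """Weight each tag by usage relative to the most-used tag (1.0 = top tag).

    post_tags is a list of tag lists, one per post. The scaling is an
    illustrative assumption, not the APML plugin's own formula.
    """
    counts = Counter(tag for tags in post_tags for tag in tags)
    if not counts:
        return {}
    top = max(counts.values())
    return {tag: round(n / top, 2) for tag, n in counts.items()}


if __name__ == "__main__":
    posts = [["metadata", "eprints"], ["metadata"], ["metadata", "wordpress"]]
    print(tag_weights(posts))
```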

Extended Profile

This plugin adds additional fields to your user profile, which are encoded with hCard microformat markup and can then be displayed in a page or as a sidebar widget. You can import hCard data, too. There might also be another use for this (see below).

Micro Anywhere

Provides a couple of additional editor functions that allow you to create an hCard or hCalendar events page. Here’s an example.

OpenID

This plugin allows users to login to their local WordPress account using an OpenID, as well as enabling commenters to leave authenticated comments with OpenID. The plugin also includes an OpenID provider, enabling users to login to OpenID-enabled sites using their own personal WordPress account. XRDS-Simple is required for the OpenID Provider and some features of the OpenID Consumer.

This is key to your identity. You can use your blog URL as your OpenID or delegate a third-party service, such as MyOpenID or ClaimID. In fact, you’ve almost certainly got an OpenID already if you have a Yahoo!, Google, MySpace or AIM account. It’s up to you which one you choose to use as your persistent ID. Read more about OpenID here. It’s important and so are the issues it addresses.

XRDS-Simple

This is required to add further functionality to the OpenID plugin. It adds Attribute Exchange (AX) to your OpenID which basically means that certain profile information can be passed to third-party services (less form filling for you!) Like a lot of these plugins, install it and forget about it.

SIOC

Provides auto-discoverable SIOC metadata. “A SIOC profile describes the structure and contents of a weblog in a machine readable form.”

wp-RDFa

Provides an auto-discoverable FOAF (Friend of a Friend) profile, based on the members of your blog. I’ve been in touch with the author of this plugin and suggested that the extended profile information could also be pulled into the FOAF profile. This is largely dependent on the FOAF specification being finalised, but expect this plugin to do more as FOAF develops.

OAI-ORE Map

Provides an auto-discoverable OAI-ORE resource map of your blog. It conforms to version 0.9 of the specification, which recently made it to v1.0, so I imagine it will be updated in the near future. OAI-ORE metadata describes aggregated resources, so instead of seeing your blog post permalink as the single identifier for, say, a collection of text and multimedia, it creates a map of those resources and links them.

LinkedIn hResume

LinkedIn hResume for WordPress grabs the hResume microformat block from your LinkedIn public profile page allowing you to add it to any WordPress page and apply your own styles to it.

I like this plugin because you benefit from all the features of LinkedIn, but can bring your profile home. Ideal for students or anyone who wants to create a portfolio of work and offer their resume/CV on a single site. Depending on the theme you use, it does require some additional styling.

Get_OPML

This is a nice way to create an OPML file of your sidebar links. If, like on my personal blog, your links point to resources related to you, you can easily create an OPML file like this one. There’s a couple of things to note about this plugin though. The instructions mention a Technorati API key. I didn’t bother with this. When you create your links, just scroll down the page to the ‘advanced’ section and add the RSS feed there. Secondly, the plugin author has, for some stupid reason, hard-coded the feed to their own site into the plugin. Assuming you don’t want this spamming your personal OPML file, download a modified version from here or comment out line 101 in get-opml.php. I guess the plugin author thinks that you’ll be using this to import the OPML into a feed reader and from there, you can delete his feed. That’s no good to us though. Finally, you’ll want to make your OPML file auto-discoverable. You can do this by adding a line of html in your header, using the Header-Footer plugin below.

Header-Footer

This simply allows you to add code to the header and footer of your blog. In our case, you can use it to add an auto-discovery link to the header of every page of your blog.


<link rel="outline" type="text/xml+opml" title="ADD YOUR TITLE HERE" href="http://YOUR_BLOG_ADDRESS/opml.xml" />
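For reference, the kind of file that link points to is just a short XML outline. The sketch below builds one from a list of links; the titles and feed addresses are placeholders, and the output is a minimal OPML 2.0 document rather than a copy of what the Get_OPML plugin produces.

```python
import xml.etree.ElementTree as ET


def build_opml(title, feeds):
    """Build a minimal OPML 2.0 document.

    feeds is a list of (text, xml_url, html_url) tuples.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for text, xml_url, html_url in feeds:
        ET.SubElement(body, "outline", type="rss", text=text,
                      xmlUrl=xml_url, htmlUrl=html_url)
    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values: swap in your own sidebar links and feeds.
    print(build_opml("ADD YOUR TITLE HERE", [
        ("Example blog", "http://example.org/feed/", "http://example.org/"),
    ]))
```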

WP Calais * + tagaroo

These three plugins use the OpenCalais API to examine your blog posts and return a bunch of semantic tags. I’ve written about this in more detail here (towards the end).

The Calais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.

It’s an easy way to add relevant tags to your content and broadcast your content for indexing by OpenCalais. They place an additional link in your header that lists the tags for web crawlers and, I guess, improves the SEO for your site.

Extra Feed Links

I’ve written about this plugin previously, too. It adds additional autodiscovery links to your blog for author, category and tag feeds. WordPress feed functionality is very powerful and this plugin makes it especially easy to make those feeds visible.

Lifestream

This isn’t a semantic web plugin, but is a powerful way of aggregating all of your activity across the web into a single activity stream. See my example, here. It also produces a single RSS feed from your aggregated activity. Nice 😉

Wrapping things up

If you set all of this up, you’ll have a WordPress site that can act as your primary identity across the web, aggregates much of your activity on the web into a single site and also offers multiple ways for people to discover and read your site. You also get a ‘well-formed’ portfolio that is enriched with semantic markup and links you to the wider online community in a way that you control.

Bear in mind that some of these plugins might not appear to do anything at all. The semantic web is about machines being able to read and link data, right? If you look closely in the source of your home page, you’ll see a few lines that speak volumes about you in machine talk.


<link rel="meta" href="./wp-content/plugins/wp-rdfa/foaf.php" type="application/rdf+xml" title="FOAF"/>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
<link rel="meta" type="text/xml" title="APML" href="http://blog.josswinn.org/apml/" />
<link rel="alternate" type="application/rss+xml" title="NoteStream RSS Feed" href="http://blog.josswinn.org/feed/" />
<link rel="resourcemap" type="application/atom+xml" href="http://blog.josswinn.org/wp-content/plugins/oai-ore/rem.php"/>
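This is roughly what ‘auto-discovery’ means from the machine’s side: fetch the page, read the link elements in the header and pick out the relations it understands. A minimal sketch, using three of the lines above as sample input (the class and function names are my own):

```python
from html.parser import HTMLParser

# Three of the header lines shown above, used as sample input.
SAMPLE_HEAD = """
<link rel="meta" type="text/xml" title="APML" href="http://blog.josswinn.org/apml/" />
<link rel="alternate" type="application/rss+xml" title="NoteStream RSS Feed" href="http://blog.josswinn.org/feed/" />
<link rel="resourcemap" type="application/atom+xml" href="http://blog.josswinn.org/wp-content/plugins/oai-ore/rem.php"/>
"""


class LinkCollector(HTMLParser):
    """Collect the attributes of every <link> element."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also arrive here.
        if tag == "link":
            self.links.append(dict(attrs))


def discover(html, rel):
    """Return the <link> elements whose rel attribute includes the given value."""
    collector = LinkCollector()
    collector.feed(html)
    return [link for link in collector.links
            if rel in (link.get("rel") or "").split()]


if __name__ == "__main__":
    for link in discover(SAMPLE_HEAD, "resourcemap"):
        print(link["href"])
```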

If you do want a way to view the data, I recommend the following Firefox add-ons:

Operator: Auto-discovers any embedded microformats and provides useful ways to search for similar data via third-party services elsewhere on the web.

OPML Reader: Auto-discovers an OPML file if you have one linked in your header. Allows you to either download the file or read it on Grazr.

Semantic Radar: Auto-discovers embedded RDF data. Displays custom icons to indicate the presence of FOAF, SIOC, DOAP and RDFa formats.

The Tabulator Extension: Auto-discovers and provides a table-based display for RDF data on the Semantic Web. Makes RDF data readable to the average person and shows how data are linked together across different sites.

As always, please let me know how this overview could be improved or if you know of other ways to add semantic functionality to your WordPress blog. Thanks.