Spinning a different kind of WPMU platform with JISCPress

We finished JISCPress. If you’re interested, I’ve written a long overview of the work we’ve done with WPMU as a document discussion platform, based on WriteToReply. You’ll see that the project has, among other things, produced three plugins: digress.it, and two Linked Data plugins that run as background services across the platform, create relationships between documents and document sections and post RDF to the Talis Data Store. Fancy!

The FolkSemantic Widget for OER discovery

I like this. A widget that analyses the content of your web page and suggests related Open Educational Resources (OERs), using FolkSemantic, a collaborative website that allows you to “browse and search over 110,000 OERs”. You can see the widget in the sidebar of this blog under the heading of ‘Related Educational Resources’ –>

So, if I dump a load of text relating to ‘physics’, say, you should see physics-related OERs… Does it work? 🙂 Some random tests on other blog posts suggest it is a bit hit and miss, but it is certainly matching some OERs to the content. I wonder if we could use this approach to find related documents on the JISCPress project?

Physics (Greek: physis – φύσις, meaning “nature”) is a natural science; it is the study of matter[1] and its motion through spacetime and all that derives from these, such as energy and force.[2] More broadly, it is the general analysis of nature, conducted in order to understand how the world and universe behave.[3][4]

Physics is one of the oldest academic disciplines, perhaps the oldest through its inclusion of astronomy.[5] Over the last two millennia, physics had been considered synonymous with philosophy, chemistry, and certain branches of mathematics and biology, but during the Scientific Revolution in the 16th century, it emerged to become a unique modern science in its own right.[6] However, in some subject areas such as in mathematical physics and quantum chemistry, the boundaries of physics remain difficult to distinguish.

Physics is both significant and influential, in part because advances in its understanding have often translated into new technologies, but also because new ideas in physics often resonate with the other sciences, mathematics and philosophy.

For example, advances in the understanding of electromagnetism led directly to the development of new products which have dramatically transformed modern-day society (e.g., television, computers, and domestic appliances); advances in thermodynamics led to the development of motorized transport; and advances in mechanics inspired the development of calculus.

Physics covers a wide range of phenomena, from the smallest sub-atomic particles, to the largest galaxies. Included in this are the very most basic objects from which all other things are composed, and therefore physics is sometimes said to be the “fundamental science”.[7]

Physics aims to describe the various phenomena that occur in nature in terms of simpler phenomena. Thus, physics aims to both connect the things we see around us to root causes, and then to try to connect these causes together in the hope of finding an ultimate reason for why nature is as it is.

For example, the ancient Chinese observed that certain rocks (lodestone) were attracted to one another by some invisible force. This effect was later called magnetism, and was first rigorously studied in the 17th century.

A little earlier than the Chinese, the ancient Greeks knew of other objects such as amber, that when rubbed with fur would cause a similar invisible attraction between the two. This was also first studied rigorously in the 17th century, and came to be called electricity.

Thus, physics had come to understand two observations of nature in terms of some root cause (electricity and magnetism). However, further work in the 19th century revealed that these two forces were just two different aspects of one force – electromagnetism. This process of “unifying” forces continues today (see section Current research for more information).

Physics uses the scientific method to test the validity of a physical theory, using a methodical approach to compare the implications of the theory in question with the associated conclusions drawn from experiments and observations conducted to test it. Experiments and observations are to be collected and matched with the predictions and hypotheses made by a theory, thus aiding in the determination or the validity/invalidity of the theory.

Theories which are very well supported by data and have never failed any competent empirical test are often called scientific laws, or natural laws. Of course, all theories, including those called scientific laws, can always be replaced by more accurate, generalized statements if a disagreement of theory with observed data is ever found [8]. ((Source: Wikipedia))

My revised ALT-C proposal

I’ve just re-submitted this proposal for a demonstration at ALT-C 2009. It’s called WordPress Multi-User: BuddyPress and Beyond. It won’t be confirmed until June, but for the record, here it is…

‘BuddyPress’ is a new social networking layer for WordPress Multi-User blogs. It provides familiar, easy to use social networking features in addition to a high-quality and popular blogging platform. The University of Lincoln have been trialing WordPress MU since May 2008 and have been using BuddyPress since February 2009 to promote an institutional social networking community built around personalised and collaborative web publishing.

This session will demonstrate the versatility of the WordPress MU platform. We’ll look at an installation that is enhanced with BuddyPress, LDAP authentication, mobile phone support and advanced privacy controls. You’ll see how simple it is to set up site-wide RSS syndication and aggregation, enhance your blog with semantic web tools, publish mathematical formulae with LaTeX, send realtime notifications to Facebook, Twitter and IM, publish podcasts to iTunes, and embed GPX and KML mapping files. We’ll also look at how to embed WordPress content in your VLE and other institutional websites. The use of a temporary ‘ALT-C 2009 BuddyPress’ installation will be encouraged.

There will be opportunities throughout for questions and answers and participants will leave with a good understanding of the advantages and disadvantages of WordPress and the resources and skills required to provide a social networking and blogging platform in your institution.

Getting your Triples into Talis Connected Commons

A few days ago, I wrote about adding Triplify to your web application. Specifically, I wrote about adding it to WordPress, but the same information can be applied to most web publishing platforms. Earlier this month, TALIS announced their Connected Commons platform and yesterday they announced a commercial version of their platform for the structured storage of Linked Data. Storage is all very well, but more importantly they have an API for developers, so that the data can be queried and creatively re-used or mashed up.

So this got me thinking about JISCPress, our recent JISC Rapid Innovation Programme bid, which proposes a WordPress Multi-User based platform for publishing JISC funding calls and the reports of funded projects. This is based on my experience of running WriteToReply with Tony Hirst.

Although a service for comment and discussion around documents, one of the things that interests me most about WriteToReply and, consequently, the JISCPress proposal is the cumulative storage of data on the platform and how that data might be used. No surprise really as my background is in archiving and collections management. As with the University of Lincoln blogs, WriteToReply and the proposed JISCPress platform aggregate published content into a site-wide ‘tags’ site that allows anyone to search and browse through all content that has been published to the public. In the case of the university blogs, that’s a large percentage of blogs, but for WriteToReply and JISCPress, it would be pretty much every document hosted on the platform.

You can see from the WriteToReply tags site that over time, a rich store of public documents could be created for querying and re-use. The site design is a bit clunky right now but under the hood you’ll notice that you can search across the text of every document, browse by document type and by tag. The tags are created by publishing the content to OpenCalais, which returns a whole bunch of semantic keywords for each document section. You’ll also notice that an RSS feed is available for any search query, any category and any tag or combination of tags.

Last night, I was thinking about the WriteToReply site architecture (note that when I mention WriteToReply, it almost certainly applies to JISCPress, too – same technology, similar principles, different content). Currently, we categorise each document by document type so you’ll see ‘Consultations’, ‘Action Plans’, ‘Discussion Papers’, etc. We author all documents under the WriteToReply username, too, and tag each document section both manually and via OpenCalais. However, there’s more that we could do, with little effort, to mark up the documents and I’ve started sketching it out.

You’ll see from the diagram that I’m thinking we should introduce location and subject categories. There will be formal classification schemes we could use. For example, I found a Local Government Classification Scheme, which provides some high level subjects that are the type of thing I’m thinking about. I’m not suggesting we start ‘cataloguing’ the documents, but simply borrow, at the top level, from recognised classification schemes that are used elsewhere. I’m also thinking that we should start creating a new author for each document and in the case of WriteToReply, the author would be the agency who issued the consultation, report, or whatever.

So following these changes, we would capture the following data (in bold), for example:

The Home Office created Protecting the public in a changing communications environment on April 27th which is a consultation document for England, Wales and Scotland, categorised under Information and communication technology with 18 sections.

Section one is tagged Governor, Home Department, Office of Public Sector Information, Secretary of State, Surrey.

Section two is tagged communications data, communications industry, emergency services, Home Secretary, Jacqui Smith MP, Rt Hon Jacqui Smith MP.

Section three is tagged Broadband, BT, communications, communications changes, communications data, communications data capability, communications data limits, communications environment, communications event, communications industry, communications networks, communications providers, communications service providers, communications services, emergency services, Her Majesty’s Revenue and Customs, Home Office, intelligence agencies, internet browsing, Internet Protocol, Internet Service, IP, mobile telephone system, physical networks, public telecommunications service, registered owner, Serious Organised Crime Agency, social networking, specified communications data, The communications industry, United Kingdom.

Section four is tagged …(you get the picture)

Section five, paragraph six, has the comment “fully compatible with the ECHR” is, of course, an assertion made by the government, about its own legislation. Has that assertion ever been tested in a court? authored by Owen Blacker on April 28th 11:32pm.

Selected text from Section five, paragraph eight, has the comment Over my dead body! authored by Mr Angry on April 28th 9:32pm

Note that every author, document, section, paragraph, text selection, category, tag, comment and comment author has a URI, Atom, RSS and RDF end point (actually, text selection and comment author feeds are forthcoming features).

Now, with this basic architecture mapped out, we might wonder what Triplify could add to this. I’ve already shown in my earlier post that, with little effort, it re-publishes data from a relational database as N-Triples semantic data, so everything you see above, could be published as RDF data (and JSON, too).
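To make that a little more concrete, here is a minimal Python sketch of how the example record above could be flattened into N-Triples. It is not what Triplify or WriteToReply actually does: the `example.org` base URI, the URL layout, the class names and the choice of Dublin Core terms are all assumptions made for illustration, and plain literals are used throughout for brevity where a real mapping would probably point at URIs.

```python
from dataclasses import dataclass, field

DCTERMS = "http://purl.org/dc/terms/"
BASE = "http://example.org/"  # hypothetical base URI, not a real endpoint


@dataclass
class Section:
    number: int
    tags: list = field(default_factory=list)


@dataclass
class Document:
    slug: str
    title: str
    creator: str
    coverage: list
    subject: str
    sections: list = field(default_factory=list)


def literal(text):
    """Quote a plain N-Triples literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_ntriples(doc):
    """Return one N-Triples line per statement about the document."""
    doc_uri = f"<{BASE}document/{doc.slug}>"
    lines = [
        f"{doc_uri} <{DCTERMS}title> {literal(doc.title)} .",
        f"{doc_uri} <{DCTERMS}creator> {literal(doc.creator)} .",
        f"{doc_uri} <{DCTERMS}subject> {literal(doc.subject)} .",
    ]
    for place in doc.coverage:
        lines.append(f"{doc_uri} <{DCTERMS}spatial> {literal(place)} .")
    for section in doc.sections:
        sec_uri = f"<{BASE}document/{doc.slug}/section/{section.number}>"
        lines.append(f"{doc_uri} <{DCTERMS}hasPart> {sec_uri} .")
        for tag in section.tags:
            lines.append(f"{sec_uri} <{DCTERMS}subject> {literal(tag)} .")
    return lines


consultation = Document(
    slug="protecting-the-public",
    title="Protecting the public in a changing communications environment",
    creator="The Home Office",
    coverage=["England", "Wales", "Scotland"],
    subject="Information and communication technology",
    sections=[
        Section(1, ["Governor", "Home Department"]),
        Section(2, ["communications data", "emergency services"]),
    ],
)
triples = to_ntriples(consultation)
```

Each author, document and section ends up as a subject with its own URI, which is the property that makes the data easy to query once it is in a store.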

So, in my simple view of the world, we have a data source that requires very little effort to generate content for and manage (JISCPress/WriteToReply/WordPress), a method of automatically publishing the data for the semantic web (Triplify) and, with TALIS, an API for data storage, data access, query, and augmentation.  As always, my mantra is ‘I am not a developer’, but from where I’m standing, this high-level ‘workflow’ seems reasonable.

The benefits for the JISC community would primarily be felt by using the JISCPress website, in a similar way (albeit with better, more informed design) to the WriteToReply ‘tags’ site. We could search across the full text of funding calls, browse the reports by author, categories and tags and grab news feeds from favourite authors, searches, tags or categories. This is all in addition to the comment, feedback and discussion features we’ve proposed, too. Further benefits would be had from ‘re-publishing’ the site content as semantic data to a platform such as TALIS. Not only could there be further Rapid Innovation projects which worked on this data, but it would be available for any member of the public to query and re-use, too. No longer would our final project reports, often the distillation of our research, sit idle as PDF files on institutional websites and in institutional repositories. If the documentation we produce is worth anything, then it’s worth re-publishing openly as semantic data.

Finally, in order to benefit from the (free) use of TALIS Connected Commons, the data being published needs to be licensed under a public domain or Creative Commons ‘zero’ licence. I suspect Crown Copyright is not compatible with either of these licenses, although why the hell public consultation documents couldn’t be licensed this way, I don’t know. Do you? For JISCPress, this would be a choice JISC could make. The alternative is to use the commercial TALIS platform or something similar.

As usual, tell me what you think… Thanks.

Triplify: Make your blog mashable

Last week, I wrote about how it is relatively simple to ‘pimp your ride on the semantic web‘. Over the weekend, I stumbled upon Triplify, a small ‘plugin’ for pretty much any web publishing platform, that “reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.” What is so appealing about Triplify is how easy it is to implement, especially alongside a WordPress site.

I can confirm that the three-step installation process is all it takes, although I wouldn’t undertake implementing this blindly as you are, literally, exposing a semantic representation of your database content. In other words, you should look at the configuration file you’re using and check that it’s going to expose the right data and not clear text passwords and unpublished posts and comments. Before I  implemented it, I realised that it would expose comments on a bunch of posts that I have since made private (they were imported from an old, private blog), so I had to ‘unapprove’ those comments so the script didn’t expose them to the public. A five minute job. Alternatively, the script could probably be modified to work around my problem, by only exposing comments after a certain date, for example.
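The kind of restriction I have in mind can be sketched as a query. The Python below uses an in-memory SQLite table whose column names mirror WordPress's `wp_comments` table; the rows, the cutoff date and the `exposable_comments` helper are made up for illustration, and this is not Triplify's own configuration syntax.

```python
import sqlite3

# Columns mirror WordPress's wp_comments table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE wp_comments (
           comment_ID INTEGER PRIMARY KEY,
           comment_content TEXT,
           comment_approved TEXT,
           comment_date TEXT)"""
)
conn.executemany(
    "INSERT INTO wp_comments VALUES (?, ?, ?, ?)",
    [
        (1, "Old private note", "1", "2007-03-01 10:00:00"),
        (2, "Held for moderation", "0", "2009-02-01 09:00:00"),
        (3, "Public comment", "1", "2009-04-28 21:32:00"),
    ],
)


def exposable_comments(conn, cutoff):
    """Return only approved comments made on or after the cutoff date."""
    rows = conn.execute(
        "SELECT comment_ID, comment_content FROM wp_comments "
        "WHERE comment_approved = '1' AND comment_date >= ? "
        "ORDER BY comment_ID",
        (cutoff,),
    )
    return rows.fetchall()


public = exposable_comments(conn, "2008-01-01 00:00:00")
```

The same `WHERE` clause could, in principle, be carried over to whatever query the configuration file uses for comments, but check it against your own database before trusting it.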

The end result is that, with a WordPress site, you expose a semantic representation of your users, posts, pages, tags, categories, comments and attachments in RDF (N-Triples) and JSON formatted data (for JSON, just add ‘?t-output=json’ to the end of the URI). Like I said though, it could be used on any database driven web application. Here’s what you get when you expose the high level links to your content:


<http://blog.josswinn.org/triplify/> <http://www.w3.org/2000/01/rdf-schema#comment> "Generated by Triplify V0.5 (http://Triplify.org)" .
<http://blog.josswinn.org/triplify/> <http://creativecommons.org/ns#license> <http://creativecommons.org/licenses/by/2.0/uk/> .
<http://blog.josswinn.org/triplify/post> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/attachment> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/tag> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/category> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/user> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://blog.josswinn.org/triplify/comment> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
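
If you want to build these addresses in a script, a small helper is enough. This Python sketch assumes only what is described above: the endpoint lives at `/triplify/`, resources sit beneath it, and `t-output=json` switches the format. The function name `triplify_url` is my own.

```python
from urllib.parse import urlencode


def triplify_url(endpoint, resource="", output=None):
    """Join the Triplify endpoint and a resource path, optionally asking for JSON."""
    url = endpoint.rstrip("/") + "/" + resource.strip("/")
    if output:
        url += "?" + urlencode({"t-output": output})
    return url


top_level = triplify_url("http://blog.josswinn.org/triplify")
post_json = triplify_url("http://blog.josswinn.org/triplify", "post/154", output="json")
```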

Here’s an example of what you get when you expose the full content:


<http://blog.josswinn.org/triplify/post/154> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdfs.org/sioc/ns#Post> .
<http://blog.josswinn.org/triplify/post/154> <http://rdfs.org/sioc/ns#has_creator> <http://blog.josswinn.org/triplify/user/1> .
<http://blog.josswinn.org/triplify/post/154> <http://purl.org/dc/terms/created> "2008-10-06T05:55:25"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<http://blog.josswinn.org/triplify/post/154> <http://rdfs.org/sioc/ns#content> "Up early to go to Sheffield for LPI exams. The last week has left me underprepared. Never mind." .
<http://blog.josswinn.org/triplify/post/154> <http://purl.org/dc/terms/modified> "2008-10-06T20:12:15"^^<http://www.w3.org/2001/XMLSchema#dateTime> .

...

<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/27> .

...

<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/41> .
<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/42> .

...

<http://blog.josswinn.org/triplify/post/154> <http://sdp.iasi.rdsnet.ro/semantic-wordpress/vocabulary/belongsToCategory> <http://blog.josswinn.org/triplify/category/22> .

...

<http://blog.josswinn.org/triplify/tag/154> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/Tag> .
<http://blog.josswinn.org/triplify/tag/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/tagName> "valentine" .
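
Because N-Triples is one statement per line, it is easy to pick apart. The following Python sketch groups statements by subject using a regular expression. It is deliberately naive: it handles the simple lines shown above, not the full N-Triples grammar (no blank nodes, no escape handling), and the function names are hypothetical.

```python
import re
from collections import defaultdict

# <subject> <predicate> object .   (object is kept as raw text)
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+(.+?)\s*\.\s*$")
TAGGED_WITH = "http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag"

SAMPLE = '''
<http://blog.josswinn.org/triplify/post/154> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdfs.org/sioc/ns#Post> .
<http://blog.josswinn.org/triplify/post/154> <http://rdfs.org/sioc/ns#content> "Up early to go to Sheffield for LPI exams. The last week has left me underprepared. Never mind." .
...
<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/41> .
<http://blog.josswinn.org/triplify/post/154> <http://www.holygoat.co.uk/owl/redwood/0.1/tags/taggedWithTag> <http://blog.josswinn.org/triplify/tag/42> .
'''


def parse_ntriples(text):
    """Map each subject URI to a list of (predicate, raw object) pairs."""
    graph = defaultdict(list)
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if match:  # lines such as "..." or blanks are skipped
            subject, predicate, obj = match.groups()
            graph[subject].append((predicate, obj))
    return dict(graph)


def tags_of(graph, post_uri):
    """Return the tag URIs attached to a post."""
    return [
        obj.strip("<>")
        for predicate, obj in graph.get(post_uri, [])
        if predicate == TAGGED_WITH
    ]


graph = parse_ntriples(SAMPLE)
post_tags = tags_of(graph, "http://blog.josswinn.org/triplify/post/154")
```

For anything beyond a quick look, a proper RDF library would be the safer choice.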

You can choose to expose different levels of information in your HTML source. If you have more than a moderate amount of content, you’ll probably want to just expose the top level links as in the first example and let the users of your data dig deeper. You’ll also note that you can (and should) attach a license to your data.

A number of namespaces are recognised as well as a WordPress vocabulary.


$triplify['namespaces']=array(
'vocabulary'=>'http://sdp.iasi.rdsnet.ro/semantic-wordpress/vocabulary/',
'rdf'=>'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
'rdfs'=>'http://www.w3.org/2000/01/rdf-schema#',
'owl'=>'http://www.w3.org/2002/07/owl#',
'foaf'=>'http://xmlns.com/foaf/0.1/',
'sioc'=>'http://rdfs.org/sioc/ns#',
'sioctypes'=>'http://rdfs.org/sioc/types#',
'dc'=>'http://purl.org/dc/elements/1.1/',
'dcterms'=>'http://purl.org/dc/terms/',
'skos'=>'http://www.w3.org/2004/02/skos/core#',
'tag'=>'http://www.holygoat.co.uk/owl/redwood/0.1/tags/',
'xsd'=>'http://www.w3.org/2001/XMLSchema#',
'update'=>'http://triplify.org/vocabulary/update#',
);
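
The point of a namespace table like this is that short names such as `sioc:has_creator` stand in for the full URIs you see in the output. A small Python sketch of that expansion, using a subset of the prefixes above (the `expand` and `shorten` helpers are illustrative, not part of Triplify):

```python
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sioc": "http://rdfs.org/sioc/ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "tag": "http://www.holygoat.co.uk/owl/redwood/0.1/tags/",
}


def expand(curie, namespaces=NAMESPACES):
    """Turn 'prefix:local' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"unknown prefix in {curie!r}")
    return namespaces[prefix] + local


def shorten(uri, namespaces=NAMESPACES):
    """Turn a full URI back into 'prefix:local'; the longest matching namespace wins."""
    matches = [p for p, ns in namespaces.items() if uri.startswith(ns)]
    if not matches:
        return uri
    best = max(matches, key=lambda p: len(namespaces[p]))
    return f"{best}:{uri[len(namespaces[best]):]}"


creator = expand("sioc:has_creator")
created = shorten("http://purl.org/dc/terms/created")
```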

So, what’s the point in doing this? Well, it’s fairly trivial and if you think that structured, linked, machine-readable licensed data is a Good Thing, why not?  The Triplify website lists a number of advantages:

Such a triplification of your Web application has tremendous advantages:

  • The installations of the Web application are better found and search engines can better evaluate the content.
  • Different installations of the Web application can easily syndicate arbitrary content without the need to adopt interfaces, content representations or protocols, even when the content structures change.
  • It is possible to create custom tailored search engines targeted at a certain niche. Imagine a search engine for products, which can be queried for digital cameras with high resolution and large zoom.

Ultimately, a triplification will counteract the centralization we faced through Google, YouTube and Facebook and lead to an increased democratization of the Web

The vision of the semantic web and semantic publishing is one of meaningfully identifying objects (and people) on the Internet and showing their relationships. This should improve searches for things on the web, but also improve how we exchange knowledge, re-use information and help clarify our identity on the web, too. It’s an ambitious task, but made easier with tools like Triplify.  The semantic web also raises questions over individual privacy and, if data is well formed and accessible, it may be easier to control and therefore censor. The creator of Triplify recently gave a technical presentation on Triplify and how it is being used to publish data collected by the OpenStreetMap project. It shows how geodata exposed in this way can result in mashup applications that directly benefit you and me.

Pimping your ride on the semantic web

Yesterday, I wrote about how I’d marked up my home page to create a semantic profile of myself that is both auto-discoverable and portable. A place where my identity on the web can be aggregated; not a hole I’ve dug for myself, but an identity that reaches out across the web but always leads back home.

While I enjoy polishing my text editor regularly and hand-crafting beautifully formed, structured data, we all know it’s a fool’s game and that the semantic web is about machines doing all the work for us. So here’s a quick and dirty run down of how to pimp your ride on the semantic web with WordPress and a few plugins.

You’ll need a self-hosted WordPress site that allows you to install plugins. I’ve got one on Dreamhost that costs me $6 a month. Next, you’ll want to install some plugins. I’ll explain what they do afterwards. One thing to note here is that I’m using plugins from the official plugin repository whenever possible. It means that you can install them from the WordPress Dashboard and you’ll get automatic updates (and they’re all GPL compatible). In no particular order…

I think that’s quite enough. All but the SIOC plugin are available from the official WordPress plugin repository. Here’s what they provide:

APML: Attention Profile Markup Language

APML (Attention Profiling Mark-up Language) is an XML-based format for capturing a person’s interests and dislikes. APML allows people to share their own personal attention profile in much the same way that OPML allows the exchange of reading lists between news readers.

The plugin creates an XML file like this one that marks up and weighs your WordPress tags as a measure of your interests. It also lists your blogroll/links and any embedded feeds.
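
I don't know the exact formula the plugin uses, but the idea of weighting tags can be sketched in a few lines of Python. The assumption here is the simplest one: count how often each tag is used and scale against the most-used tag, giving a value between 0 and 1. The sample posts and the `tag_weights` name are invented.

```python
from collections import Counter


def tag_weights(posts):
    """Weight each tag by how often it is used, relative to the most-used tag."""
    counts = Counter(tag for tags in posts for tag in tags)
    if not counts:
        return {}
    top = max(counts.values())
    return {tag: round(n / top, 2) for tag, n in counts.items()}


# Each inner list is the set of tags on one (made-up) post.
weights = tag_weights([
    ["wordpress", "rdf"],
    ["wordpress", "openid"],
    ["wordpress", "rdf", "foaf"],
])
```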

Extended Profile

This plugin adds additional fields to your user profile, which are encoded with hCard microformat markup and can then be displayed in a page or as a sidebar widget. You can import hCard data, too. There might also be another use for this (see below).

Micro Anywhere

Provides a couple of additional editor functions that allow you to create an hCard or hCalendar events page. Here’s an example.

OpenID

This plugin allows users to login to their local WordPress account using an OpenID, as well as enabling commenters to leave authenticated comments with OpenID. The plugin also includes an OpenID provider, enabling users to login to OpenID-enabled sites using their own personal WordPress account. XRDS-Simple is required for the OpenID Provider and some features of the OpenID Consumer.

This is key to your identity. You can use your blog URL as your OpenID or delegate a third-party service, such as MyOpenID or ClaimID. In fact, you’ve almost certainly got an OpenID already if you have a Yahoo!, Google, MySpace or AIM account. It’s up to you which one you choose to use as your persistent ID. Read more about OpenID here. It’s important and so are the issues it addresses.

XRDS-Simple

This is required to add further functionality to the OpenID plugin. It adds Attribute Exchange (AX) to your OpenID, which basically means that certain profile information can be passed to third-party services (less form filling for you!). Like a lot of these plugins, install it and forget about it.

SIOC

Provides auto-discoverable SIOC metadata. “A SIOC profile describes the structure and contents of a weblog in a machine readable form.”

wp-RDFa

Provides an auto-discoverable FOAF (Friend of a Friend) profile, based on the members of your blog. I’ve been in touch with the author of this plugin and suggested that the extended profile information could also be pulled into the FOAF profile. This is largely dependent on the FOAF specification being finalised, but expect this plugin to do more as FOAF develops.

OAI-ORE Map

Provides an auto-discoverable OAI-ORE resource map of your blog. It conforms to version 0.9 of the specification, which recently made it to v1.0, so I imagine it will be updated in the near future. OAI-ORE metadata describes aggregated resources, so instead of seeing your blog post permalink as the single identifier for, say, a collection of text and multimedia, it creates a map of those resources and links them.

LinkedIn hResume

LinkedIn hResume for WordPress grabs the hResume microformat block from your LinkedIn public profile page allowing you to add it to any WordPress page and apply your own styles to it.

I like this plugin because you benefit from all the features of LinkedIn, but can bring your profile home. Ideal for students or anyone who wants to create a portfolio of work and offer their resume/CV on a single site. Depending on the theme you use, it does require some additional styling.

Get_OPML

This is a nice way to create an OPML file of your sidebar links. If, like on my personal blog, your links point to resources related to you, you can easily create an OPML file like this one. There’s a couple of things to note about this plugin though. The instructions mention a Technorati API key. I didn’t bother with this. When you create your links, just scroll down the page to the ‘advanced’ section and add the RSS feed there. Secondly, the plugin author has, for some stupid reason, hard-coded the feed to their own site into the plugin. Assuming you don’t want this spamming your personal OPML file, download a modified version from here or comment out line 101 in get-opml.php. I guess the plugin author thinks that you’ll be using this to import the OPML into a feed reader and from there, you can delete his feed. That’s no good to us though. Finally, you’ll want to make your OPML file auto-discoverable. You can do this by adding a line of html in your header, using the Header-Footer plugin below.

Header-Footer

This simply allows you to add code to the header and footer of your blog. In our case, you can use it to add an auto-discovery link to the header of every page of your blog.


<link rel="outline" type="text/xml+opml" title="ADD YOUR TITLE HERE" href="http://YOUR_BLOG_ADDRESS/opml.xml" />

WP Calais * + tagaroo

These three plugins use the OpenCalais API to examine your blog posts and return a bunch of semantic tags. I’ve written about this in more detail here (towards the end).

The Calais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.

It’s an easy way to add relevant tags to your content and broadcast your content for indexing by OpenCalais. They place an additional link in your header that lists the tags for web crawlers and, I guess, improves the SEO for your site.

Extra Feed Links

I’ve written about this plugin previously, too. It adds additional autodiscovery links to your blog for author, category and tag feeds. WordPress feed functionality is very powerful and this plugin makes it especially easy to make those feeds visible.
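
For reference, the feed addresses themselves follow a predictable pattern. This Python sketch builds them on the assumption that the site uses WordPress's usual 'pretty' permalinks with the default `tag`, `category` and `author` bases; a site with custom permalink settings will differ, and the helper names and `example.org` address are my own.

```python
from urllib.parse import quote, urlencode


def tag_feed(site, *tags):
    """Feed for posts carrying all of the given tag slugs (joined with '+')."""
    joined = "+".join(quote(tag) for tag in tags)
    return f"{site.rstrip('/')}/tag/{joined}/feed/"


def category_feed(site, slug):
    return f"{site.rstrip('/')}/category/{quote(slug)}/feed/"


def author_feed(site, nicename):
    return f"{site.rstrip('/')}/author/{quote(nicename)}/feed/"


def search_feed(site, query):
    """Feed of search results, using the query-string form."""
    return f"{site.rstrip('/')}/?{urlencode({'s': query, 'feed': 'rss2'})}"


combined = tag_feed("http://example.org", "rdf", "wordpress")
search = search_feed("http://example.org/", "linked data")
```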

Lifestream

This isn’t a semantic web plugin, but is a powerful way of aggregating all of your activity across the web into a single activity stream. See my example, here. It also produces a single RSS feed from your aggregated activity. Nice 😉

Wrapping things up

If you set all of this up, you’ll have a WordPress site that can act as your primary identity across the web, aggregates much of your activity on the web into a single site and also offers multiple ways for people to discover and read your site. You also get a ‘well-formed’ portfolio that is enriched with semantic markup and links you to the wider online community in a way that you control.

Bear in mind that some of these plugins might not appear to do anything at all. The semantic web is about machines being able to read and link data, right? If you look closely in the source of your home page, you’ll see a few lines that speak volumes about you in machine talk.


<link rel="meta" href="./wp-content/plugins/wp-rdfa/foaf.php" type="application/rdf+xml" title="FOAF"/>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
<link rel="meta" type="text/xml" title="APML" href="http://blog.josswinn.org/apml/" />
<link rel="alternate" type="application/rss+xml" title="NoteStream RSS Feed" href="http://blog.josswinn.org/feed/" />
<link rel="resourcemap" type="application/atom+xml" href="http://blog.josswinn.org/wp-content/plugins/oai-ore/rem.php"/>
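
This is roughly what 'auto-discovery' means in practice: a program reads the page and collects the `<link>` elements. A minimal Python sketch using only the standard library is below; the sample markup is a trimmed copy of the lines above, and `LinkFinder` and `discover` are hypothetical names.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect (rel, type, href) for every <link> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are routed here as well.
        if tag == "link":
            attributes = dict(attrs)
            self.links.append(
                (attributes.get("rel"), attributes.get("type"), attributes.get("href"))
            )


def discover(html):
    finder = LinkFinder()
    finder.feed(html)
    return finder.links


SAMPLE = """
<head>
<link rel="meta" type="text/xml" title="APML" href="http://blog.josswinn.org/apml/" />
<link rel="alternate" type="application/rss+xml" title="NoteStream RSS Feed" href="http://blog.josswinn.org/feed/" />
<link rel="resourcemap" type="application/atom+xml" href="http://blog.josswinn.org/wp-content/plugins/oai-ore/rem.php" />
</head>
"""

found = discover(SAMPLE)
```

A crawler or browser add-on does essentially the same thing, then decides what to do with each link based on its `rel` and `type`.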

If you do want a way to view the data, I recommend the following Firefox add-ons:

Operator: Auto-discovers any embedded microformats and provides useful ways to search for similar data via third-party services elsewhere on the web.

OPML Reader: Auto-discovers an OPML file if you have one linked in your header. Allows you to either download the file or read it on Grazr.

Semantic Radar: Auto-discovers embedded RDF data. Displays custom icons to indicate the presence of FOAF, SIOC, DOAP and RDFa formats.

The Tabulator Extension: Auto-discovers and provides a table-based display for RDF data on the Semantic Web. Makes RDF data readable to the average person and shows how data are linked together across different sites.

As always, please let me know how this overview could be improved or if you know of other ways to add semantic functionality to your WordPress blog. Thanks.

Microformats and Firefox

When I have time, I like to read about new and developing web standards and specifications. Sad, you might think, but it’s a way of learning about some of the theoretical developments that eventually turn into practical functionality for all users of the Internet.  Also, I am an Archivist (film, audiovisual, multimedia) by trade, and am somewhat reassured by the development of standards and specifications as a way of achieving consensus among peers and avoiding wasted time and effort in managing ‘stuff’.

So, while poking around on Wikipedia last night, I came across ‘Operator‘, an add-on for Firefox that makes part of the ‘hidden’ semantic web immediately visible and useful to everybody. If you’re using Firefox, click here to install it. It’s been available for over a year now and is mature and extensible through the use of user scripts.  It’s been developed by Michael Kaply, who works on web browsers for IBM and is responsible for microformat support in Firefox.

Operator leverages microformats and other semantic data that are already available on many web pages to provide new ways to interact with web services.

In practice, Operator is a Firefox tool bar (and/or location/status bar icon) that identifies microformats and other semantic data in a web page and allows you to combine the value of that information with other web services such as search, bookmarking, mapping, etc. For example, this blog has tags. Operator identifies the tags and then offers the option of searching various services such as Amazon, YouTube, delicious and Upcoming, for a particular tag.  If Operator finds geo-data, it offers the option of mapping that to Google Maps and, on this page for example, it identifies me as author and allows you to download my contact details, which are embedded in the XHTML. Because it is extensible through user-scripts, there are many other ways that the microformat data can be used.

Of particular interest to students and staff are perhaps the microformat specifications for resumes and contact details. Potentially, a website, properly marked up (and WordPress allows for some of this already), could provide a rich and useful portfolio of their work and experience which is semantically linked to other services such as Institutional Repositories or other publications databases where their work is held.

After using it for a few hours, I now find myself disappointed when a website doesn’t offer at least one piece of semantic data that is found by Operator (currently, most don’t but some do). Microformat support will be built in (rather than provided by an add-on) in Firefox 3.1 and IE 8, so we can expect to see much more widespread adoption of it. A good thing.

There’s a nice demonstration of microformats here, using the Operator plugin.