Pimping your ride on the semantic web

Yesterday, I wrote about how I’d marked up my home page to create a semantic profile of myself that is both auto-discoverable and portable. A place where my identity on the web can be aggregated; not a hole I’ve dug for myself, but an identity that reaches out across the web but always leads back home.

While I enjoy polishing my text editor regularly and hand-crafting beautifully formed, structured data, we all know it’s a fool’s game and that the semantic web is about machines doing all the work for us. So here’s a quick and dirty rundown of how to pimp your ride on the semantic web with WordPress and a few plugins.

You’ll need a self-hosted WordPress site that allows you to install plugins. I’ve got one on Dreamhost that costs me $6 a month. Next, you’ll want to install some plugins. I’ll explain what they do afterwards. One thing to note here is that I’m using plugins from the official plugin repository whenever possible. It means that you can install them from the WordPress Dashboard and you’ll get automatic updates (and they’re all GPL compatible). In no particular order…

I think that’s quite enough. All but the SIOC plugin are available from the official WordPress plugin repository. Here’s what they provide:

APML: Attention Profiling Mark-up Language

APML (Attention Profiling Mark-up Language) is an XML-based format for capturing a person’s interests and dislikes. APML allows people to share their own personal attention profile in much the same way that OPML allows the exchange of reading lists between news readers.

The plugin creates an XML file like this one that marks up and weighs your WordPress tags as a measure of your interests. It also lists your blogroll/links and any embedded feeds.
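To give a feel for what ‘weighing’ tags could mean, here is a minimal Python sketch. It scales tag usage counts so that the most-used tag scores 1.0 and writes the result as a simplified Concepts fragment. The scaling rule, the function names and the sample tags are my own assumptions for illustration; the plugin’s actual algorithm may differ, and a real APML file wraps these elements in more structure.

```python
import xml.etree.ElementTree as ET


def tag_weights(tag_counts):
    """Scale raw tag counts so the most-used tag gets 1.0 (an assumed rule)."""
    top = max(tag_counts.values(), default=0)
    if top == 0:
        return {tag: 0.0 for tag in tag_counts}
    return {tag: round(count / top, 2) for tag, count in tag_counts.items()}


def concepts_xml(tag_counts):
    """Render the weights as a simplified <Concepts> fragment."""
    root = ET.Element("Concepts")
    for tag, weight in sorted(tag_weights(tag_counts).items()):
        ET.SubElement(root, "Concept", key=tag, value=f"{weight:.2f}")
    return ET.tostring(root, encoding="unicode")


# Hypothetical tag counts, as if taken from a blog's tag cloud
print(concepts_xml({"semantic web": 8, "wordpress": 4, "openid": 2}))
```

The more often you use a tag, the closer its value gets to 1.0, which is roughly the idea of an attention profile.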

Extended Profile

This plugin adds extra fields to your user profile, which are encoded with hCard microformat markup and can then be displayed in a page or as a sidebar widget. You can import hCard data, too. There might be another use for this as well (see below).

Micro Anywhere

Provides a couple of additional editor functions that allow you to create an hCard or hCalendar events page. Here’s an example.

OpenID

This plugin allows users to log in to their local WordPress account using an OpenID, as well as enabling commenters to leave authenticated comments with OpenID. The plugin also includes an OpenID provider, enabling users to log in to OpenID-enabled sites using their own personal WordPress account. XRDS-Simple is required for the OpenID Provider and some features of the OpenID Consumer.

This is key to your identity. You can use your blog URL as your OpenID or delegate a third-party service, such as MyOpenID or ClaimID. In fact, you’ve almost certainly got an OpenID already if you have a Yahoo!, Google, MySpace or AIM account. It’s up to you which one you choose to use as your persistent ID. Read more about OpenID here. It’s important and so are the issues it addresses.
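If you go the delegation route by hand, it comes down to a few link tags in the head of your blog. The Python sketch below builds them for both OpenID 1.1 and 2.0. The endpoint and identity URLs are made-up placeholders, so check your provider’s documentation for the real values; the OpenID plugin may also be able to set this up for you.

```python
from html import escape


def delegation_links(provider_endpoint, local_id):
    """Return <link> tags that delegate a page's OpenID to another provider."""
    pairs = [
        ("openid.server", provider_endpoint),     # OpenID 1.1
        ("openid.delegate", local_id),
        ("openid2.provider", provider_endpoint),  # OpenID 2.0
        ("openid2.local_id", local_id),
    ]
    return "\n".join(
        f'<link rel="{rel}" href="{escape(href, quote=True)}" />'
        for rel, href in pairs
    )


# Placeholder URLs: substitute the values your provider gives you
print(delegation_links("https://openid.example.com/server",
                       "https://you.openid.example.com/"))
```

Your blog URL stays your persistent ID, and you can swap the provider behind it later without anyone noticing.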

XRDS-Simple

This is required to add further functionality to the OpenID plugin. It adds Attribute Exchange (AX) to your OpenID, which basically means that certain profile information can be passed to third-party services (less form filling for you!). Like a lot of these plugins, install it and forget about it.

SIOC

Provides auto-discoverable SIOC metadata. “A SIOC profile describes the structure and contents of a weblog in a machine readable form.”

wp-RDFa

Provides an auto-discoverable FOAF (Friend of a Friend) profile, based on the members of your blog. I’ve been in touch with the author of this plugin and suggested that the extended profile information could also be pulled into the FOAF profile. This is largely dependent on the FOAF specification being finalised, but expect this plugin to do more as FOAF develops.

OAI-ORE Map

Provides an auto-discoverable OAI-ORE resource map of your blog. It conforms to version 0.9 of the specification, which recently made it to v1.0, so I imagine it will be updated in the near future. OAI-ORE metadata describes aggregated resources, so instead of seeing your blog post permalink as the single identifier for, say, a collection of text and multimedia, it creates a map of those resources and links them.

LinkedIn hResume

LinkedIn hResume for WordPress grabs the hResume microformat block from your LinkedIn public profile page allowing you to add it to any WordPress page and apply your own styles to it.

I like this plugin because you benefit from all the features of LinkedIn, but can bring your profile home. Ideal for students or anyone who wants to create a portfolio of work and offer their resume/CV on a single site. Depending on the theme you use, it does require some additional styling.

Get_OPML

This is a nice way to create an OPML file of your sidebar links. If, like on my personal blog, your links point to resources related to you, you can easily create an OPML file like this one. There are a couple of things to note about this plugin though. Firstly, the instructions mention a Technorati API key. I didn’t bother with this. When you create your links, just scroll down the page to the ‘advanced’ section and add the RSS feed there. Secondly, the plugin author has, for some stupid reason, hard-coded the feed to their own site into the plugin. Assuming you don’t want this spamming your personal OPML file, download a modified version from here or comment out line 101 in get-opml.php. I guess the plugin author thinks that you’ll be using this to import the OPML into a feed reader and from there, you can delete his feed. That’s no good to us though. Finally, you’ll want to make your OPML file auto-discoverable. You can do this by adding a line of HTML in your header, using the Header-Footer plugin below.
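If you’re curious what ends up inside such a file, here is a minimal Python sketch that builds an OPML document from a list of links. It is not how Get_OPML works internally; the function name and the sample links are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def build_opml(title, links):
    """Build a minimal OPML document from (name, site_url, feed_url) tuples."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, site_url, feed_url in links:
        attrs = {"text": name, "htmlUrl": site_url}
        if feed_url:  # only links with an RSS address become feed outlines
            attrs.update(type="rss", xmlUrl=feed_url)
        ET.SubElement(body, "outline", attrs)
    return ET.tostring(opml, encoding="unicode")


# Hypothetical sidebar links; the second one has no RSS address set
links = [
    ("My bookmarks", "http://example.org/bookmarks", "http://example.org/bookmarks/rss"),
    ("My photos", "http://example.org/photos", None),
]
print(build_opml("Links about me", links))
```

Each sidebar link becomes one outline element, and the RSS address you add in the ‘advanced’ section is what a feed reader would pick up as xmlUrl.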

Header-Footer

This simply allows you to add code to the header and footer of your blog. In our case, you can use it to add an auto-discovery link to the header of every page of your blog.


<link rel="outline" type="text/xml+opml" title="ADD YOUR TITLE HERE" href="http://YOUR_BLOG_ADDRESS/opml.xml" />

WP Calais * + tagaroo

These three plugins use the OpenCalais API to examine your blog posts and return a bunch of semantic tags. I’ve written about this in more detail here (towards the end).

The Calais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.

It’s an easy way to add relevant tags to your content and broadcast your content for indexing by OpenCalais. They place an additional link in your header that lists the tags for web crawlers and, I guess, improves the SEO for your site.

Extra Feed Links

I’ve written about this plugin previously, too. It adds additional autodiscovery links to your blog for author, category and tag feeds. WordPress feed functionality is very powerful and this plugin makes it especially easy to make those feeds visible.

Lifestream

This isn’t a semantic web plugin, but is a powerful way of aggregating all of your activity across the web into a single activity stream. See my example, here. It also produces a single RSS feed from your aggregated activity. Nice 😉

Wrapping things up

If you set all of this up, you’ll have a WordPress site that can act as your primary identity across the web, aggregates much of your activity on the web into a single site and also offers multiple ways for people to discover and read your site. You also get a ‘well-formed’ portfolio that is enriched with semantic markup and links you to the wider online community in a way that you control.

Bear in mind that some of these plugins might not appear to do anything at all. The semantic web is about machines being able to read and link data, right? If you look closely in the source of your home page, you’ll see a few lines that speak volumes about you in machine talk.


<link rel="meta" href="./wp-content/plugins/wp-rdfa/foaf.php" type="application/rdf+xml" title="FOAF"/>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
<link rel="meta" type="text/xml" title="APML" href="http://blog.josswinn.org/apml/" />
<link rel="alternate" type="application/rss+xml" title="NoteStream RSS Feed" href="http://blog.josswinn.org/feed/" />
<link rel="resourcemap" type="application/atom+xml" href="http://blog.josswinn.org/wp-content/plugins/oai-ore/rem.php"/>
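To see how a machine might pick up lines like these, here is a small Python sketch that scans a page for link elements and filters them by their rel value. It uses only the standard library; the class and function names are my own, and real tools such as the Firefox add-ons mentioned below will do considerably more than this.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect the attributes of every <link> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            self.links.append(dict(attrs))


def discover(html, rel):
    """Return the <link> elements whose rel attribute includes `rel`."""
    finder = LinkFinder()
    finder.feed(html)
    return [l for l in finder.links if rel in (l.get("rel") or "").split()]


# Two of the header lines shown above, wrapped in a head element
page = """
<head>
<link rel="meta" type="text/xml" title="APML" href="http://blog.josswinn.org/apml/" />
<link rel="alternate" type="application/rss+xml" title="NoteStream RSS Feed" href="http://blog.josswinn.org/feed/" />
</head>
"""
for link in discover(page, "meta"):
    print(link["title"], link["href"])
```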

If you do want a way to view the data, I recommend the following Firefox add-ons:

Operator: Auto-discovers any embedded microformats and provides useful ways to search for similar data via third-party services elsewhere on the web.

OPML Reader: Auto-discovers an OPML file if you have one linked in your header. Allows you to either download the file or read it on Grazr.

Semantic Radar: Auto-discovers embedded RDF data. Displays custom icons to indicate the presence of FOAF, SIOC, DOAP and RDFa formats.

The Tabulator Extension: Auto-discovers and provides a table-based display for RDF data on the Semantic Web. Makes RDF data readable to the average person and shows how data are linked together across different sites.

As always, please let me know how this overview could be improved or if you know of other ways to add semantic functionality to your WordPress blog. Thanks.

Microformats and Firefox

When I have time, I like to read about new and developing web standards and specifications. Sad, you might think, but it’s a way of learning about some of the theoretical developments that eventually turn into practical functionality for all users of the Internet.  Also, I am an Archivist (film, audiovisual, multimedia) by trade, and am somewhat reassured by the development of standards and specifications as a way of achieving consensus among peers and avoiding wasted time and effort in managing ‘stuff’.

So, while poking around on Wikipedia last night, I came across ‘Operator’, an add-on for Firefox that makes part of the ‘hidden’ semantic web immediately visible and useful to everybody. If you’re using Firefox, click here to install it. It’s been available for over a year now and is mature and extensible through the use of user scripts. It’s been developed by Michael Kaply, who works on web browsers for IBM and is responsible for microformat support in Firefox.

Operator leverages microformats and other semantic data that are already available on many web pages to provide new ways to interact with web services.

In practice, Operator is a Firefox toolbar (and/or location/status bar icon) that identifies microformats and other semantic data in a web page and allows you to combine the value of that information with other web services such as search, bookmarking, mapping, etc. For example, this blog has tags. Operator identifies the tags and then offers the option of searching various services such as Amazon, YouTube, delicious and Upcoming, for a particular tag. If Operator finds geo-data, it offers the option of mapping that to Google Maps and, on this page for example, it identifies me as author and allows you to download my contact details, which are embedded in the XHTML. Because it is extensible through user scripts, there are many other ways that the microformat data can be used.
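To show roughly what ‘identifying’ contact details involves, here is a much simplified Python sketch that reads a name and a URL out of hCard markup by looking at class names. It is not how Operator itself is implemented. The sample markup is invented, and the reader ignores the enclosing vcard element, nested tags and most hCard properties.

```python
from html.parser import HTMLParser


class HCardFinder(HTMLParser):
    """Very small hCard reader: collects the 'fn' text and the 'url' link."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._capture = None  # property whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "url" in classes and attrs.get("href"):
            self.card["url"] = attrs["href"]
        if "fn" in classes:
            self._capture = "fn"
            self.card["fn"] = ""

    def handle_data(self, data):
        if self._capture:
            self.card[self._capture] += data

    def handle_endtag(self, tag):
        self._capture = None  # stop reading at the first closing tag


# Invented sample markup using the hCard class names vcard, fn and url
sample = '<div class="vcard"><a class="url fn" href="http://example.org/">Jane Example</a></div>'
finder = HCardFinder()
finder.feed(sample)
print(finder.card)
```

Once the data is out of the page like this, handing it to a mapping, search or address book service is the easy part.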

Of particular interest to students and staff are perhaps the microformat specifications for resumes and contact details. Potentially, a properly marked-up website (and WordPress allows for some of this already) could give them a rich and useful portfolio of their work and experience, semantically linked to other services such as Institutional Repositories or other publications databases where their work is held.

After using it for a few hours, I now find myself disappointed when a website doesn’t offer at least one piece of semantic data that is found by Operator (currently, most don’t but some do). Microformat support will be built into Firefox 3.1 and IE 8 (rather than provided by an add-on), so we can expect to see much more widespread adoption of it. A good thing.

There’s a nice demonstration of microformats here, using the Operator plugin.

Bruce Sterling’s Preface to ‘The Hacker Crackdown’

Last night, I downloaded the ebook of Bruce Sterling’s The Hacker Crackdown to my iPod Touch. With the exception of poor battery life, the iPod Touch running the Stanza ebook reader software is a decent arrangement, even more so because you can easily download books from FeedBooks.

Anyway, opening Bruce Sterling’s book on my iPod this morning while walking to work, I had great pleasure reading his preface to the ebook version and his ‘license’ to the reader to distribute it non-commercially. In case you’re late to this book, as I am, here it is for your enjoyment, too:

Preface to the Electronic Release of THE HACKER CRACKDOWN

October 31, 1993-Austin, Texas
Hi, I’m Bruce Sterling, the author of this electronic book. Out in the traditional world of print, this book is still a part of the traditional commercial economy, because it happens to be widely available in paperback (for a while, at least).

Out in the world of print, THE HACKER CRACKDOWN is ISBN 0-553-08058-X, and is formally catalogued by the Library of Congress as “1. Computer crimes-United States. 2. Telephone-United States-Corrupt practices. 3. Programming (Electronic computers)-United States-Corrupt practices.” ‘Corrupt practices,’ I always get a kick out of that description. Librarians are very ingenious people.

If you go and buy the print version of THE HACKER CRACKDOWN, an action I encourage heartily, you may notice that in the front of the book, right under the copyright sign-“Copyright (C) 1992 by Bruce Sterling”-it has this little block of printed legal boilerplate from the publisher. It says, and I quote:

“No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without permission in writing from the publisher. For information address: Bantam Books.”

This is a pretty good disclaimer, as such disclaimers go. I collect intellectual-property disclaimers, and I’ve seen dozens of them, and this one is at least pretty straightforward. Unfortunately, it doesn’t have much to do with reality. Bantam Books puts that disclaimer on every book they publish, but Bantam Books does not, in fact, own the electronic rights to this book. I do. And I’ve chosen to give them away. Bantam Books is not going to fuss about this. They are not going to bother you for what you do with the electronic copy of this book. If you want to check this out personally, you can ask them; they’re at 1540 Broadway NY NY 10036. However, if you were so foolish as to print this book and start retailing it for money in violation of my copyright and the commercial interests of Bantam Books, then Bantam, a part of the gigantic Bertelsmann multinational publishing combine, would roust some of their heavy-duty attorneys out of hibernation and crush you like a bug. This is only to be expected. I didn’t write this book so that you could make money out of it. If anybody is gonna make money out of this book, it’s gonna be me and my publisher.

My publisher deserves to make money out of this book. Not only did the folks at Bantam Books commission me to write the book, and pay me a hefty sum to do so, but they bravely printed, in text, an electronic document the reproduction of which was once alleged to be a federal felony. Bantam Books and their numerous attorneys were very brave and forthright about this book. Furthermore, my former editor at Bantam Books, Betsy Mitchell, genuinely cared about this project, and worked hard on it, and had a lot of wise things to say about the manuscript. Betsy deserves genuine credit for this book, credit that editors too rarely get.

The critics were very kind to THE HACKER CRACKDOWN, and commercially the book has done well. On the other hand, I didn’t write this book in order to squeeze every last nickel and dime out of the mitts of impoverished sixteen-year-old cyberpunk high-school students. Teenagers don’t have any money-no, not even enough for HACKER CRACKDOWN. That’s a major reason why they sometimes succumb to the temptation to do things they shouldn’t, such as swiping my books out of libraries. Kids: this one is all yours, all right? Go give the paper copy back. *8-)

Well-meaning, public-spirited civil libertarians don’t have much money, either. And it seems almost criminal to snatch cash out of the hands of America’s grotesquely underpaid electronic law enforcement community.

If you’re a computer cop, a hacker, or an electronic civil liberties activist, you are the target audience for this book. I wrote this book because I wanted to help you, and help other people understand you and your unique, uhm, problems. I wrote this book to aid your activities, and to contribute to the public discussion of important political issues. In giving the text away in this fashion, I am directly contributing to the book’s ultimate aim: to help civilize cyberspace.

Information WANTS to be free. And the information inside this book longs for freedom with a peculiar intensity. I genuinely believe that the natural habitat of this book is inside an electronic network. That may not be the easiest direct method to generate revenue for the book’s author, but that doesn’t matter; this is where this book belongs by its nature. I’ve written other books-plenty of other books-and I’ll write more and I am writing more, but this one is special. I am making THE HACKER CRACKDOWN available electronically as widely as I can conveniently manage, and if you like the book, and think it is useful, then I urge you to do the same with it.

You can copy this electronic book. Copy the heck out of it, be my guest, and give those copies to anybody who wants them. The nascent world of cyberspace is full of sysadmins, teachers, trainers, cybrarians, netgurus, and various species of cybernetic activist. If you’re one of those people, I know about you, and I know the hassle you go through to try to help people learn about the electronic frontier. I hope that possessing this book in electronic form will lessen your troubles. Granted, this treatment of our electronic social spectrum is not the ultimate in academic rigor. And politically, it has something to offend and trouble almost everyone. But hey, I’m told it’s readable, and at least the price is right.

You can upload the book onto bulletin board systems, or Internet nodes, or electronic discussion groups. Go right ahead and do that, I am giving you express permission right now.
Enjoy yourself.

You can put the book on disks and give the disks away, as long as you don’t take any money for it.

But this book is not public domain. You can’t copyright it in your own name. I own the copyright. Attempts to pirate this book and make money from selling it may involve you in a serious litigative snarl. Believe me, for the pittance you might wring out of such an action, it’s really not worth it. This book don’t “belong” to you. In an odd but very genuine way, I feel it doesn’t “belong” to me, either. It’s a book about the people of cyberspace, and distributing it in this way is the best way I know to actually make this information available, freely and easily, to all the people of cyberspace-including people far outside the borders of the United States, who otherwise may never have a chance to see any edition of the book, and who may perhaps learn something useful from this strange story of distant, obscure, but portentous events in so-called “American cyberspace.”

This electronic book is now literary freeware. It now belongs to the emergent realm of alternative information economics. You have no right to make this electronic book part of the conventional flow of commerce. Let it be part of the flow of knowledge: there’s a difference. I’ve divided the book into four sections, so that it is less ungainly for upload and download; if there’s a section of particular relevance to you and your colleagues, feel free to reproduce that one and skip the rest.

Just make more when you need them, and give them to whoever might want them.

Now have fun.

Bruce Sterling—

Zotero 1.5 Sync Preview

Zotero is a great Firefox extension but had one major limitation for me in that the data it collected was tied to the web browser it was installed on. However, there is now a preview of the next version of Zotero, which includes syncing your Zotero data between browsers by storing the data on remote servers. I encourage anyone who collects bibliographic information to give Zotero a try as an alternative to RefWorks and other bibliographic tools. Note that the preview version is BETA software and you may lose information.

Zotero [zoh-TAIR-oh] is a free, easy-to-use Firefox extension to help you collect, manage, and cite your research sources. It lives right where you do your work — in the web browser itself.

Zotero is an easy-to-use yet powerful research tool that helps you gather, organize, and analyze sources (citations, full texts, web pages, images, and other objects), and lets you share the results of your research in a variety of ways. An extension to the popular open-source web browser Firefox, Zotero includes the best parts of older reference manager software (like EndNote)—the ability to store author, title, and publication fields and to export that information as formatted references—and the best parts of modern software and web applications (like iTunes and del.icio.us), such as the ability to interact, tag, and search in advanced ways. Zotero integrates tightly with online resources; it can sense when users are viewing a book, article, or other object on the web, and—on many major research and library sites—find and automatically save the full reference information for the item in the correct fields. Since it lives in the web browser, it can effortlessly transmit information to, and receive information from, other web services and applications; since it runs on one’s personal computer, it can also communicate with software running there (such as Microsoft Word). And it can be used offline as well (e.g., on a plane, in an archive without WiFi).