PDF – Joss Winn

I’m at JISC’s #dev8D conference. There’s no end of developer challenges but I’m not a developer. Still, here’s an idea that maybe someone will pick up and run with:

The use of eBook readers is on the rise. Anyone with an iPhone, Android phone, as well as Kindles and Sony Readers, has an eBook reader.

Institutional Repositories provide scholarly articles in PDF format, which eBook readers don’t handle very well at all, especially the phone versions.

Why not provide a Word-to-PDF conversion facility in your repository? EPrints currently offers Word-to-PDF conversion durinng the deposit process. Why not Word-to-ePub format, too?

Why not provide an ePub file as an alternative to the PDF download? ePub is a free, open, standards-based (XHTML/CSS) file format for eBook Readers. There are many advantages for the reader to having an ePub version rather than a PDF version when using an e-Book reader. i.e. better page navigation, search, bookmarks, variable font sizing.

There are PDF-to-ePub converters on the web, so technically it’s possible. They are a bit hit and miss, but so are the Word-to-PDF converters.

Anyone interested? I’d be keen to help if required.

This morning, I found myself on Baseline Scenario, a well-known site which discusses the economic crisis. I noticed that the authors of the site had laboured over producing a PDF version for each month of their archive, by copying and pasting to Word and producing a PDF. There’s a nicer way of doing this, I think. When you’ve done it once, it should take you no more than ten minutes to go through the whole process any other time.

WordPress provides a way to filter content by date. In our example, we’ll grab the RSS feed from the first month of publications: http://baselinescenario.com/2008/09/feed The permalink structure is clear enough on WordPress. For Blogger, it’s nowhere near as intuitive.
The feed will display the articles in descending date order. When you are reading the PDF or eBook version, you don’t want to read the last article first, as you would on the website. To reverse the order of the feed, use Yahoo Pipes (or for WordPress, see @mhawksey’s comment below). You can clone my example. If you’ve not used Yahoo Pipes before, don’t worry. You just need a Yahoo account. The example I give is as simple a pipe as you will see and should make sense as soon as you look at it.
Once you’ve created the pipe of the feed in ascending order, save and run the pipe. Look for the RSS icon and copy the pipe’s RSS link, which should look like this: http://pipes.yahoo.com/pipes/pipe.run?_id=cb438b51b2819eb1f4f5ec6f10daf09e&_render=rss
Next, go to FeedBooks. Sign up for an account if you don’t already have one. Now, we create a Newspaper.
Click on News in the menu and then on Create a newspaper. Give it a name and tag it. In our example, we’ll call it Baseline Scenario Archive.
Click on ‘Add a RSS feed’. Give it a name (in our case ‘September 2008’) and paste your RSS feed into the box. Once it’s found and accepted your feed, click ‘Publish’.
You can now click on the name of the specific feed and you’ll be presented with a page that offers an ePub, Kindle and PDF versions of your feed. Here’s the Baseline Scenario September 2008 example.
That’s it. You can do it with whole sites, too, if you like. Here’s one I did earlier (Blogger). The only thing you need to remember is to ensure that the RSS feed contains all the items you’re looking for. For the Blogger site, the source feed looks like this: http://www.blogger.com/feeds/27481991/posts/default?max-results=1000 A thousand items is more than enough to capture this site for quite some time. For WordPress, the site owner has to change their Reading Settings to include sufficient items. For the Baseline Scenario, they need to set this at a number high enough to ensure that a month’s worth of posts are included. I would just set it at 3000 and then forget about it. It would mean the entire site could be captured this way for the next year or so.

Having problems? Got a question? Leave a comment.

Tag: PDF

ePub downloads from EPrints

Creating a PDF or eBook from an RSS feed