Turn Your Roller Blog Into A Book
From an individual point of view, there are a couple of places where it's not as simple and straightforward as I'd like.
- The handling of resources: all those files I've uploaded (mostly small images that appear at the beginning of each post). There are over a thousand of them now. They are managed in a flat file system, which makes the page updates slow and makes it hard for me to find the exact file I'm looking for. You can't tag or annotate them. It doesn't even prompt you if you are about to overwrite an existing file. There is definitely room for improvement here, although I'll note that this is not a problem for most Sun bloggers, who don't use a lot of resources.
- Seeing all of your blog as a single entity: there is a limit on the number of blog entries that can appear on a single page. What I'd like to be able to do is view (and/or save) the whole of my blog as a single file. I don't know of a way to do that.
Until now.
Over the Christmas break I wrote a very simple Python script that takes all of the blog entries I'd backed up locally using Grabber and turns them into a single HTML file. Currently it's 2.7 MB for 916 blog entries over 2½ years.
(If others want to use the script, you will need to adjust the
blogPostsDir and title definitions near the
beginning. Note that there is minimal bullet-proofing in the script. If
you find any problems, please let me know.)
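For readers who just want the general idea, here is a minimal sketch of that kind of script. It is not the script itself: it assumes each backed-up entry is an HTML fragment stored as its own .html file in one directory, and that sorting by file name gives a sensible order. blogPostsDir and title are the two definitions mentioned above; build_book and the placeholder values are made up for illustration.

```python
import html
from pathlib import Path

# The two definitions to adjust; the values here are placeholders.
blogPostsDir = "blog-backup"
title = "My Blog"


def build_book(posts_dir, book_title):
    """Return one HTML document containing every *.html file in posts_dir."""
    parts = [
        "<html><head>",
        '<meta charset="utf-8">',
        f"<title>{html.escape(book_title)}</title>",
        "</head><body>",
        f"<h1>{html.escape(book_title)}</h1>",
    ]
    # Assumption: file names sort into the order the entries should appear.
    for path in sorted(Path(posts_dir).glob("*.html")):
        # Undecodable bytes are replaced rather than aborting the whole run.
        entry = path.read_text(encoding="utf-8", errors="replace")
        parts.append(entry)
        parts.append("<hr>")  # visual separator between entries
    parts.append("</body></html>")
    return "\n".join(parts)
```

Writing the result out would then be a one-liner such as `Path("book.html").write_text(build_book(blogPostsDir, title), encoding="utf-8")`. Like the real script, this has minimal bullet-proofing.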
What this means now is that I can easily determine where the broken links are. I've already fixed up the broken image links in my blog and I'll regenerate the single HTML file in a little while. Over the next few days, I also plan to find out how many broken web page links there are and see how easy it would be to fix up the important ones.
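One way to hunt for broken links in a single HTML file can be sketched with nothing but the Python standard library. This is an illustration rather than the method I actually used; LinkCollector, collect_links and is_broken are hypothetical names, and only absolute http(s) links are checked. Some servers reject HEAD requests, so a link reported as broken is worth a second look by hand.

```python
from html.parser import HTMLParser
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collect the targets of <a href=...> and <img src=...> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Pick the attribute that carries the link for this tag, if any.
        wanted = {"a": "href", "img": "src"}.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(value)


def collect_links(html_text):
    """Return every link and image target found in html_text, in order."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


def is_broken(url, timeout=10):
    """Return True if a HEAD request for url fails."""
    if not url.startswith(("http://", "https://")):
        return False  # relative and mailto: links are skipped in this sketch
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return False
    except (urllib.error.URLError, ValueError, OSError):
        return True
```

Feeding the contents of the combined file to collect_links and then filtering the result with is_broken would give a first list of candidates to fix.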
I also wanted to see what the blog looked like as a PDF file. I've no intention of self-publishing it, but I was curious to see if I'd written something that was novel length yet.
I converted it two ways:
- With cups2pdf:
This generated a 37 MB file, 404 pages long, but didn't retain all the hyperlinks (a known problem).
- With OpenOffice Writer:
This generated a 23 MB file, 813 pages long, and kept all the hyperlinks.
It's not War and Peace but it's getting up there. It certainly has more laughs.
( Jan 02 2007, 08:18:44 AM PST ) Comments [5]
Posted by John Clingan on January 02, 2007 at 09:00 AM PST #
Posted by Martin-Éric on January 02, 2007 at 10:42 AM PST #
Posted by Rich Burridge on January 02, 2007 at 02:25 PM PST #
Maybe I've just got bogus HTML, but that's not a great excuse. OOo does a much better job of the conversion.
Posted by Rich Burridge on January 02, 2007 at 05:47 PM PST #
Posted by Daniel on January 03, 2007 at 12:41 AM PST #