Friday February 22, 2008

Another Python Library Script

Something I've been meaning to do for a while.

When I go to the library, I'll first look in the "new books" section. If there is nothing there that I'm interested in, I'll look for books off one or more lists that I have. One of those lists is for the books on my Amazon Wish Lists.

My library is part of the Santa Clara County library system. Several branches work together. If the book is in the county library system, but not available at my local branch, I can put in a request, and they'll ship a copy to me as soon as one is available.

It's also possible that my local branch has a copy and it's out. What I really want to know is which books I'm interested in are available now in my local branch, so I can grab them if I immediately visit the library.

This script helps me do that. Here's how it works. For each of the books on each of the given Amazon Wish List IDs, it'll extract the ISBN and use that to query my library's online catalog.
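A minimal sketch of that outer loop, written for Python 3 and not taken from the script itself. The URL template, the function names, and the shape of the wish_lists dictionary are all assumptions made for illustration; a real catalog will have its own search URL, and the wish list entries would come from whatever Amazon library you use.

```python
from urllib.parse import quote

# Hypothetical URL template; replace it with the ISBN search URL of your catalog.
CATALOG_URL = "https://catalog.example.org/search?isbn={isbn}"


def catalog_query_url(isbn):
    """Build the catalog search URL for one ISBN."""
    # ISBNs are often written with hyphens or spaces; keep digits and a check "X".
    cleaned = "".join(ch for ch in isbn if ch.isdigit() or ch in "xX").upper()
    return CATALOG_URL.format(isbn=quote(cleaned))


def query_urls(wish_lists):
    """Yield (title, url) for every book on every wish list.

    wish_lists is assumed to map a wish list ID to a list of (title, isbn) pairs.
    """
    for list_id, books in wish_lists.items():
        for title, isbn in books:
            yield title, catalog_query_url(isbn)
```

Each URL would then be fetched (for example with urllib.request.urlopen) and the HTML reply handed to the checking step.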

It'll first check the HTML reply, looking for the string "Sorry, could not find anything matching". If it finds that, it'll go on to the next book. If it doesn't find it, then the county library has at least one copy of the book. It then looks for the string "Los Altos Library". If it finds that, it'll then grab the reply from that point up to the sub-string "Add Copy to MyList" and divide it into tokens, using "<" as a separator. It'll then look for tokens that start with "a class" and grab the sub-string from the ">" character to the end of the token. When that's complete for all the tokens, it'll check to see if the last one is "In". If it is, we've found a book that's in at my local library, and the book results are written to standard out. The script also writes a few messages to stderr that give information on the books that are in the county library system and/or are available from my local branch but are not currently in.
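The checks above can be sketched roughly as follows. This is an illustration under assumptions, not the script's actual code: the function name, the four status labels, and the sample HTML in the usage note are made up, and only the three marker strings come from the description.

```python
NOT_FOUND = "Sorry, could not find anything matching"
BRANCH = "Los Altos Library"
END_MARKER = "Add Copy to MyList"


def check_reply(html):
    """Classify one catalog reply page.

    Returns (status, fields), where status is one of the illustrative labels
    "absent", "county", "branch" or "in", and fields holds the text pulled
    from the 'a class' tokens.
    """
    if NOT_FOUND in html:
        return "absent", []          # the county system has no copy
    start = html.find(BRANCH)
    if start == -1:
        return "county", []          # in the county system, not at my branch
    end = html.find(END_MARKER, start)
    if end == -1:
        end = len(html)              # marker missing: scan to the end
    fields = []
    for token in html[start:end].split("<"):
        if token.startswith("a class"):
            # Keep the text that follows the ">" closing the opening tag.
            fields.append(token[token.find(">") + 1:].strip())
    if fields and fields[-1] == "In":
        return "in", fields
    return "branch", fields
```

Fed a hand-made fragment such as '<td>Los Altos Library</td><a class="s">In</a><a href="#">Add Copy to MyList</a>', this returns ("in", ["In"]). The real catalog markup will differ, so the token test is the part most likely to need adjusting.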

Here's a partial listing of what the program output to stderr looks like as it's running:

$ python ./check_los_altos.py >booklist.txt
...
Found in County Library: Watercolor: Painting Smart
Found in County Library: Chaos and Fractals: New Frontiers of Science
Los Altos Library has a copy
Found in County Library: Ships-In-Bottles: A Step-By-Step Guide to a Venerable Nautical Craft
Los Altos Library has a copy
Currently IN
Found in County Library: Complete Stories of Robert Bloch: Final Reckonings (Complete Stories of Robert Bloch)
Los Altos Library has a copy
Currently IN
Found in County Library: Looking for Jake: Stories
Los Altos Library has a copy
Found in County Library: 123 Robotics Experiments for the Evil Genius (TAB Robotics)
Los Altos Library has a copy
Found in County Library: The Art and Craft of Paper Sculpture: A Step-By-Step Guide to Creating 20 Outstanding and Original Paper Projects
Found in County Library: Bad Science: The Short Life and Weird Times of Cold Fusion
...
$

The booklist.txt entries for the two books above that are currently in at my local library look like:

Ships-In-Bottles: A Step-By-Step Guide to a Venerable Nautical Craft  Nonfiction Section  745.5928 HUBBARD  In
Complete Stories of Robert Bloch: Final Reckonings (Complete Stories of Robert Bloch)  Science Fiction Section  SF BLOCH ROBERT  In

As you can see, the output even tells me which section of the library to look in for each book.
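The split between the two output streams could look something like this in Python 3. It is a sketch under assumptions: the report function, its status labels ("absent", "county", "branch", "in") and its parameters are hypothetical, while the message texts and the two-space separator are copied from the listings above.

```python
import sys


def report(title, status, fields, out=None, err=None):
    """Write progress messages to stderr and 'in' books to stdout.

    status is an illustrative label: "absent", "county", "branch" or "in".
    fields is the list of strings scraped for the local copy.
    """
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    if status == "absent":
        return                       # nothing to say about this book
    print("Found in County Library: " + title, file=err)
    if status in ("branch", "in"):
        print("Los Altos Library has a copy", file=err)
    if status == "in":
        print("Currently IN", file=err)
        # Only books that are in right now reach stdout, so redirecting
        # stdout to a file yields just the list of books to pick up.
        print("  ".join([title] + fields), file=out)
```

Running the script with stdout redirected to booklist.txt then leaves the progress messages on the terminal and only the available books in the file.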

For anybody who wants to use the script as the basis for doing something similar for their library, you are going to have to make small changes in four areas. Three are trivial. The fourth will take a little Python programming.

For the Python naming pedants, I've found that I simply like CamelCase variable names better than the current Python naming "standard", so you're going to have to just deal with it. Other constructive Python criticisms are appreciated.

As I was writing this, I realized I didn't take into consideration the case where my local library might have multiple copies of the same book, and one or more of them might be in, even though the first one wasn't.

But that'll be a fix for the next version.
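One possible shape for that fix, sketched under a big assumption: that the reply lists each copy in its own stretch of HTML starting at "Los Altos Library" and ending at "Add Copy to MyList". I haven't confirmed that the catalog is laid out that way, and the function names here are made up.

```python
BRANCH = "Los Altos Library"
END_MARKER = "Add Copy to MyList"


def copy_statuses(html):
    """Return the last 'a class' field of every local copy in the reply.

    Assumes one BRANCH ... END_MARKER segment per copy (unverified).
    """
    statuses = []
    start = html.find(BRANCH)
    while start != -1:
        end = html.find(END_MARKER, start)
        if end == -1:
            end = len(html)
        fields = [
            token[token.find(">") + 1:].strip()
            for token in html[start:end].split("<")
            if token.startswith("a class")
        ]
        if fields:
            statuses.append(fields[-1])
        # Continue the search after this segment to pick up the next copy.
        start = html.find(BRANCH, end)
    return statuses


def any_copy_in(html):
    """True if at least one local copy shows the status "In"."""
    return "In" in copy_statuses(html)
```

With this, a book counts as available as soon as any one copy is in, rather than only when the first copy listed is.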


( Feb 22 2008, 08:10:32 AM PST )

Comments:

Looks like a nice script (too bad I don't live in the bay area anymore....).

Two python comments:

* It makes sense to use camel case (for this script), since the amazon lib uses it

* You might want to check out "beautiful soup", if you are interested in screen scraping via a dom like object, rather than regex/text processing.

Posted by Matt Harrison on February 22, 2008 at 08:47 AM PST #

ok, two more comments:

* Why are you writing to stderr? Wouldn't print suffice?

* You might want to change configurable global variables to UPPER_CASE_UNDERSCORE. But since you already don't like PEP8..... ;)

Posted by Matt Harrison on February 22, 2008 at 08:52 AM PST #

Hi Matt,

Note that this could be adjusted to work with any
library that makes their catalog available over
the web. Check Jon Udell's bookmarklet library
generator with your local library. Link in post.

I'll definitely check out Beautiful Soup:
http://www.crummy.com/software/BeautifulSoup/
in more detail.

I'm using both stdout and stderr. The books that
are in my local library get written to stdout
using "print". The books that are in the county
library system and/or are not currently available
in the Los Altos library branch are written to
sys.stderr. That allows me to easily get the
"in" book list by just redirecting stdout to a file.
I use sys.stderr.write because I've
never found the equivalent of "fprintf(stderr, ...)"
in Python.

Yup, I'll agree with you on using UPPER_CASE_UNDERSCORE
for configurable variables. I'll adjust that
for the next version.

Thanks!

Posted by Rich Burridge on February 22, 2008 at 09:41 AM PST #
