20080430 Wednesday April 30, 2008

BeautifulSoup - Get A 10 Day Weather Forecast For Your Zip Code

After Matt Harrison mentioned BeautifulSoup in a comment on an old Python script post of mine, I've been looking for somewhere I could use it.

BeautifulSoup is a Python HTML/XML parser designed for quick turnaround projects like screen-scraping.

I initially played around with it, seeing if I could use it to get listings of when new episodes of my favorite TV programs were appearing, now that Zap2It Labs are no longer making their listings available for free. The problem there (I think) is that, because of the dynamically generated content on their TV listings website, I can't find a URL that BeautifulSoup can parse.

So I picked something different to cut my teeth on.

I often go to Weather.com and get a 10 day forecast for the city where I live. That's easy to do by hand, but it made a good example of something to extract from a web page and then email to myself so I have it handy.

This script does that. You will also need a copy of BeautifulSoup.py for it to work properly. I've simply put both files in the same directory and run it with:

  $ python ./get_weather.py

If others are interested in running this, then there are two variables that you will need to change in the script to meet your needs:

# Zip code to get 10 day forecast for.
#
zipCode = "94024"

# Email address to send results to.
#
emailAddr = "someone@somewhere.com"

Just like my early attempts at using XPath in some of my JavaScript scripts, I suspect that I'm not doing it the best way. I predict that there are much nicer ways of writing the extractForecast() routine.

Still it works and that's the first step in programming.
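
For readers wondering what a routine like extractForecast() might look like, here is a minimal sketch. It is not the script's actual code: it uses the standard library's html.parser in place of BeautifulSoup, and the sample markup (a table with one row per day) is invented for illustration, since Weather.com's real page layout isn't shown here.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real Weather.com page is laid out differently.
SAMPLE_PAGE = """
<table id="forecast">
  <tr><td>Wed</td><td>Sunny</td><td>72/51</td></tr>
  <tr><td>Thu</td><td>Partly Cloudy</td><td>68/49</td></tr>
</table>
"""


class ForecastParser(HTMLParser):
    """Collect the text of <td> cells, one list of cells per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, e.g. ["Wed", "Sunny", "72/51"]
        self._row = None       # cells of the row being read, or None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append([cell.strip() for cell in self._row])
            self._row = None


def extract_forecast(html):
    """Return one 'day: conditions, temps' line per three-cell row."""
    parser = ForecastParser()
    parser.feed(html)
    return ["%s: %s, %s" % tuple(row) for row in parser.rows if len(row) == 3]


if __name__ == "__main__":
    print("\n".join(extract_forecast(SAMPLE_PAGE)))
```

With BeautifulSoup the row and cell walking would likely be shorter; the point here is only the overall shape: parse, pick out the rows, format one line per day.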

( Apr 30 2008, 08:58:05 AM PDT ) [Listen] Permalink Comments [5]

20080222 Friday February 22, 2008

Another Python Library Script

Something I've been meaning to do for a while.

When I go to the library, I'll first look in the "new books" section. If there is nothing there that I'm interested in, I'll look for books off one or more lists that I have. One of those lists is for the books on my Amazon Wish Lists.

My library is part of the Santa Clara County library system. Several branches work together. If the book is in the county library system, but not available at my local branch, I can put in a request, and they'll ship a copy to me as soon as one is available.

It's also possible that my local branch has a copy and it's out. What I really want to know is which books I'm interested in are available now in my local branch, so I can grab them if I immediately visit the library.

This script helps me do that. Here's how it works. For each of the books on each of the given Amazon Wish List IDs, it'll extract the ISBN and use that to query my library's online catalog.

It'll first check the HTML reply, looking for the string "Sorry, could not find anything matching". If it finds that, it'll go on to the next book. If it doesn't find it, then the county library has at least one copy of the book. It then looks for the string "Los Altos Library". If it finds that, it'll grab the reply from that point up to the sub-string "Add Copy to MyList" and divide it into tokens, using "<" as a separator. It'll then look for tokens that start with "a class" and grab the sub-string from the ">" character to the end of the string. When that's complete for all the tokens, it'll check to see if the last one is "In". If it is, we've found a book that's in at my local library, and the book results are written to standard out. The script also writes a few messages to stderr that give information on the books that are in the county library system and/or are held by my local branch but are not currently in.
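
The steps above can be sketched roughly as follows. This is an illustration of the described logic, not the script itself; the function name, the status labels and the sample reply are made up, and the real catalog markup will differ.

```python
NOT_FOUND = "Sorry, could not find anything matching"
BRANCH = "Los Altos Library"
END_MARK = "Add Copy to MyList"

# Hypothetical catalog reply, for illustration only.
SAMPLE_REPLY = (
    '<td>Los Altos Library</td>'
    '<td><a class="loc" href="#">Nonfiction Section</a></td>'
    '<td><a class="call" href="#">745.5928 HUBBARD</a></td>'
    '<td><a class="status" href="#">In</a></td>'
    '<td><a href="#">Add Copy to MyList</a></td>'
)


def check_reply(html):
    """Return (status, fields) for one catalog reply.

    status is "missing", "county", "branch" or "in"; fields holds the link
    texts scraped for the local branch (empty unless the branch has a copy).
    """
    if NOT_FOUND in html:
        return "missing", []
    start = html.find(BRANCH)
    if start == -1:
        return "county", []
    end = html.find(END_MARK, start)
    chunk = html[start:end] if end != -1 else html[start:]
    fields = []
    for token in chunk.split("<"):
        if token.startswith("a class"):
            # Keep the text that follows the tag's closing ">".
            fields.append(token[token.find(">") + 1:].strip())
    if fields and fields[-1] == "In":
        return "in", fields
    return "branch", fields


if __name__ == "__main__":
    status, fields = check_reply(SAMPLE_REPLY)
    if status == "in":
        print("  ".join(fields))
```

Plain string searching like this is fragile if the catalog's markup changes, but it keeps the script free of any dependencies.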

Here's a partial listing of what the program output to stderr looks like as it's running:

$ python ./check_los_altos.py >booklist.txt
...
Found in County Library: Watercolor: Painting Smart
Found in County Library: Chaos and Fractals: New Frontiers of Science
Los Altos Library has a copy
Found in County Library: Ships-In-Bottles: A Step-By-Step Guide to a Venerable Nautical Craft
Los Altos Library has a copy
Currently IN
Found in County Library: Complete Stories of Robert Bloch: Final Reckonings (Complete Stories of Robert Bloch)
Los Altos Library has a copy
Currently IN
Found in County Library: Looking for Jake: Stories
Los Altos Library has a copy
Found in County Library: 123 Robotics Experiments for the Evil Genius (TAB Robotics)
Los Altos Library has a copy
Found in County Library: The Art and Craft of Paper Sculpture: A Step-By-Step Guide to Creating 20 Outstanding and Original Paper Projects
Found in County Library: Bad Science: The Short Life and Weird Times of Cold Fusion
...
$

The booklist.txt entries for those two books above that are found in my local library look like:

Ships-In-Bottles: A Step-By-Step Guide to a Venerable Nautical Craft  Nonfiction Section  745.5928 HUBBARD  In
Complete Stories of Robert Bloch: Final Reckonings (Complete Stories of Robert Bloch)  Science Fiction Section  SF BLOCH ROBERT  In

As you can see, the output even tells me which section of the library to look in for each book.

For anybody who wants to use the script as the basis for doing something similar for their library, you are going to have to make small changes in four areas. Three are trivial. The fourth will take a little Python programming.

For the Python naming pedants, I've found that I simply like CamelCase variable names better than the current Python naming "standard", so you're going to have to just deal with it. Other constructive Python criticisms are appreciated.

As I was writing this, I realized I didn't take into consideration the case where my local library might have multiple copies of the same book, and one or more of them might be in, even though the first one wasn't.

But that'll be a fix for the next version.
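
One possible shape for that fix is sketched below. It assumes that each copy held by the branch shows up as its own "Los Altos Library" ... "Add Copy to MyList" stretch of the reply, which may not match the real catalog markup; the function names are hypothetical.

```python
BRANCH = "Los Altos Library"
END_MARK = "Add Copy to MyList"


def copy_statuses(html):
    """Return the last link text of every branch copy found in the reply."""
    statuses = []
    pos = html.find(BRANCH)
    while pos != -1:
        end = html.find(END_MARK, pos)
        if end == -1:
            end = len(html)
        texts = [t[t.find(">") + 1:].strip()
                 for t in html[pos:end].split("<") if t.startswith("a class")]
        if texts:
            statuses.append(texts[-1])
        # Continue searching after this copy's stretch of the reply.
        pos = html.find(BRANCH, end)
    return statuses


def any_copy_in(html):
    """True if at least one copy at the branch is currently in."""
    return "In" in copy_statuses(html)
```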

( Feb 22 2008, 08:10:32 AM PST ) [Listen] Permalink Comments [3]

20080124 Thursday January 24, 2008

Take 3 - LifeHacker Category Viewer GreaseMonkey Script Working Again

You may remember a post from last November that described a GreaseMonkey script that would display the list of categories of the LifeHacker web site and allow you to dynamically display all the posts associated with each category.

Tyler Trafford (who gave me lots of help getting that working) emailed me today with a change I would need to make because of the new security enhancements in the latest version of the GreaseMonkey Firefox add-on.

In testing it, we discovered that the script no longer worked with the LifeHacker site, with or without the suggested change.

Before I could get to it (darn RealWork™ getting in the way again), Tyler went and worked out what other changes had to be made to the GreaseMonkey script and sent them to me (thank you!).

For you conspiracy theorists, I should let you know that when I posted the previous version back in November, I sent an email to the LifeHacker folks telling them about it. I was under the naïve impression that they might want to let their users know. Hah! Not a dickie bird. Nothing posted to their web site. No acknowledgment whatsoever.

And now we find that the old script doesn't work anymore because they've changed their website layout!

Coincidence? I think not. Let this be just our little secret this time.

( Jan 24 2008, 01:21:12 PM PST ) [Listen] Permalink Comments [3]

20080123 Wednesday January 23, 2008

Your Ultimate Hacking Tools

Hack a Day have an interesting post today. It's a contest.

Here's the challenge: Given a budget of $600, put together the best hacking workbench you can. Don't include computers or the actual bench in your budget. Oh, and you have to spend it all.

This is for hardware hacking. See the comments on their post for the replies so far. I should probably wait until they pick the five winners to see which tools I should add to my collection.

This post got me thinking about the ultimate tools for software hacking. Hacking in the nice sense of the word. If they are open source and/or freely available, then the cost would just be time not money.

So if you have any recommendations on essential tools for your hacking arsenal (especially if you code in Python), please feel free to comment. If I get a sufficient response, I'll summarise in a future post.

( Jan 23 2008, 02:32:11 PM PST ) [Listen] Permalink

20080116 Wednesday January 16, 2008

Roly Poly Pot Redux

I saw this post on Hackszine about a flower pot that will tilt over when it needs water.

That's cute, but why stop there? One article below it shows how you can use an Arduino board for helicopter control to stabilize the roll and pitch.

Why not combine the two? When the pot tips, the Arduino detects this and waters the plant. Hopefully the pot goes back to vertical and the water is turned off.

Now that would be a neat hack!

( Jan 16 2008, 09:42:32 AM PST ) [Listen] Permalink

20080108 Tuesday January 08, 2008

Wii Remote and Nunchuck Projects

If you can pry the Wii Remote and/or Nunchuck from your child's hands, then you could possibly use it for one of these interesting projects:

Let's hope the Wii parts all go back together again afterwards.

( Jan 08 2008, 12:04:34 PM PST ) [Listen] Permalink Comments [2]

20071214 Friday December 14, 2007

ListsofBests List of Lists GreaseMonkey Script

Friday is hacking day, so here's another little GreaseMonkey hack.

Previously I'd created a GreaseMonkey script that would take one of the ListsofBests lists and turn it into a plain text list, making it easier to read.

This new script will take their list of lists, for the awards, definitive or personal lists categories, for Books, Music, Movies, Places, People or More, and turn it into a simple list of links. No having to click through numerous pages to get to something you might be interested in. No web site bling to distract you. Just the list.

If you're like me, you'll then save away a copy and bookmark it, because these lists do take a while to regenerate, especially for the personal categories.

If I get enthused, the next step is to adjust the script so that clicking on a list entry will expand that list inline, rather than going off to the actual list web page and then using my other GreaseMonkey script.

But that's for another day. Back to RealWork™.

( Dec 14 2007, 09:02:54 AM PST ) [Listen] Permalink

20071126 Monday November 26, 2007

LifeHacker Category Viewer GreaseMonkey Script

Another GreaseMonkey script, this time to list out all the posts under all the categories at the LifeHacker web site.

If you are running Firefox with GreaseMonkey installed and this script enabled, then when you visit their archives web page, it'll do its thing.

Note that there are a lot of posts there and most of them have been cross-categorized, so this will take a long time. It generates a new web page that's over 3.8 MB when saved. It also loads a lot of extra web pages very quickly, which must be disruptive to their web servers.

If anybody can tell me how I can adjust this script to "throttle back", I'd very much appreciate it.
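
One common way to throttle is to fetch the pages one at a time with a pause between requests, rather than firing them all off at once. The GreaseMonkey script itself is JavaScript, where a timer would play the same role; the sketch below shows the idea in Python, the language of the other scripts on this blog. The fetch function, the URLs and the delay are placeholders, not anything taken from the script.

```python
import time


def fetch_throttled(urls, fetch, delay=2.0, sleep=time.sleep):
    """Fetch urls one at a time, pausing `delay` seconds between requests."""
    pages = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)       # no pause needed before the first request
        pages.append(fetch(url))
    return pages


if __name__ == "__main__":
    # Stand-in fetch function and placeholder URLs, so the sketch runs offline.
    def fake_fetch(url):
        return "<html>%s</html>" % url

    category_urls = ["http://example.com/tag/a", "http://example.com/tag/b"]
    for page in fetch_throttled(category_urls, fake_fetch, delay=0.1):
        print(page)
```

Passing the sleep function in as a parameter is only there to make the pacing easy to test; in normal use the default time.sleep does the waiting.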

( Nov 26 2007, 01:02:20 PM PST ) [Listen] Permalink

20071119 Monday November 19, 2007

Coloring Your Own Roller Blog Comments

Somebody asked today, on our internal blog users alias, how they could color their own comments to make them stand out more. It seems that this is a built-in feature of WordPress. Here's an example (see comment #11).

GreaseMonkey to the rescue. If you are running Firefox and have GreaseMonkey installed, then install this script.

Unfortunately this isn't one of those scripts that'll "just work". Each blog owner is going to have to customize it for their blog. There are two lines to change:

Now (hopefully), when you display one of your blog entries and there are comments by you, then you should see them with a light blue background and black text. If somebody can come up with some better CSS (which shouldn't be too hard), then they need to just adjust these lines:

        div.style.backgroundColor = "#CEEBEB";
        div.style.color = "black";

But this was a thirty minute hack so that's what you get for a first version.

Hopefully a future version of the roller software will have a feature that just does this automatically.

( Nov 19 2007, 09:02:41 PM PST ) [Listen] Permalink Comments [5]

20071107 Wednesday November 07, 2007

Working GreaseMonkey Script For Expanding All Hackszine Categories

With great help again from Tyler Trafford (thanks!), there is now a GreaseMonkey script that will automatically expand all the categories on the Hackszine website, even those that have several pages worth.

This is useful to see all the posts they've created in the past, if you aren't exactly sure what you are looking for.

If you are running Firefox and have GreaseMonkey installed, then just install this script.

Now if you go to the Hackszine website, it'll automatically reconfigure the page for you. Note that this is doing the equivalent of loading over 120 more web pages, so be patient with it.

If you just want the normal behavior when you visit the Hackszine site, then just disable the script (right click on the monkey icon on the right hand side of the Firefox status bar).

( Nov 07 2007, 03:58:06 PM PST ) [Listen] Permalink

20071016 Tuesday October 16, 2007

GreaseMonkey Script To Improve Blogs.sun.com Recent Posts Display

Like several other people, I find the current design of the blogs.sun.com home page frustrating when it comes to reading the summaries of recently posted entries.

It initially shows you only ten entries. Click on the "See All" link and you get the most recent 25 entries. If you want to see (say) the most recent 500 entries, you have to page through 19 more pages.

GreaseMonkey to the rescue. If you are running Firefox and have GreaseMonkey installed, then install this script.

Now when you click on "See All", you will see the most recent 500 entries. Note that it'll take a few moments to construct the new page. It's interactively adding in the other 19 pages.

I'm sure this can be improved, but it was a quick-n-dirty hack.

Hopefully the blogs.sun.com people will come up with a proper fix.

( Oct 16 2007, 09:44:43 AM PDT ) [Listen] Permalink

20071007 Sunday October 07, 2007

Improved Lists Of Bests Book List To HTML Script

See yesterday's post for the background on this.

There is now an improved version of the script. See the Change Log at the end of the script for the changes made.

I've used it to regenerate the Pulitzer list. I then set it loose on the 1001 Books You Must Read Before You Die list. I've no idea who put this together, but it really needs work. Entries were incomplete or incorrect. Most promoted a specific version of the book. It's Persuasion by Jane Austen, not Persuasion (Penguin Modern Classics). And so on...

After extensive editing of the list, and multiple runs of the script to adjust the entries so that it could more quickly select the correct title, I ended up with this text version, which created this HTML version. Even now, it's not always picking the best Amazon entry. Sometimes it selects one that doesn't have a book cover image.

It can process this list in about 32 minutes. Most of that time is spent doing the Amazon lookups to get the ISBNs. Even now there are still seven books on the list that I can't find:

  1. Adjunct: An Undigest - Peter Manson
  2. The Taebek Mountains - Jo Jung-Rae
  3. Disobedience - Alberto Moravia
  4. A Day Off - Storm Jameson
  5. The Last Days of Humanity
  6. The Stechlin - Theodore Fontane
  7. On the eve - Ivan S. Turgenev

I'll need to borrow Boxall's book from the library again to see if I can work out what they really should be.

( Oct 07 2007, 02:40:39 PM PDT ) [Listen] Permalink Comments [7]

20071006 Saturday October 06, 2007

Convert Your Lists Of Bests Book Lists To HTML Web Pages With Amazon Links

I've created a Python script that will take one of the Lists of Bests book lists generated by my GreaseMonkey script (see a previous post for more details on this), and convert it to an HTML web page with Amazon links for each book.

To use it you will need to have installed the Lists of Bests GreaseMonkey script. Then go to the Lists Of Bests book list you are interested in. The GreaseMonkey script will automatically convert it to a simple text-like list. You should then cut and paste it into a text file and save it.

You will also need to edit the make_HTML_list.py script and adjust the amazonAccessKey line to your Amazon Access License Key.

To use it, simply run:

  % python make_HTML_list.py < <your-book-list.txt> > <your-book-list.html>

The script uses the Amazon web services to try to work out the book's ISBN from the title and the author. It tries to do as much as it can to make sure this works, but the simple fact is that some of the entries in these lists are incorrect or incomplete. If it can't find the ISBN, it uses the standard Amazon "no image available" image and doesn't generate a link.
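
That fallback can be sketched as follows. The Amazon lookup itself is left out; isbn is whatever that lookup returned, or None if it failed. The URL patterns and the placeholder image path are assumptions for illustration, not values taken from make_HTML_list.py, and the ISBN in the usage example is a dummy.

```python
from html import escape

# Assumed values, for illustration only.
NO_IMAGE_URL = "no-image-available.gif"
PRODUCT_URL = "https://www.amazon.com/dp/%s"
COVER_URL = "covers/%s.jpg"


def book_entry(title, author, isbn=None):
    """Return one HTML list item; link the title only when an ISBN is known."""
    label = "%s - %s" % (escape(title), escape(author))
    if isbn is None:
        # No ISBN found: placeholder image, no link.
        return '<li><img src="%s" alt=""> %s</li>' % (NO_IMAGE_URL, label)
    return '<li><img src="%s" alt=""> <a href="%s">%s</a></li>' % (
        COVER_URL % isbn, PRODUCT_URL % isbn, label)


if __name__ == "__main__":
    print(book_entry("Persuasion", "Jane Austen", "0000000000"))
    print(book_entry("On the Eve", "Ivan S. Turgenev"))
```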

Here's an example. Taking the Pulitzer prize list in text form, it generates this HTML web page.

There is a debug flag near the beginning of the script that, if set to True, will generate copious debug messages that should help you determine what the book title in the list should have been.

I have a problem with this script that I haven't been able to solve yet. Pointers on how to fix it would be most appreciated.

Tips on how to improve the script and/or the Python code are most welcome too.

( Oct 06 2007, 12:07:59 PM PDT ) [Listen] Permalink Comments [4]

20070614 Thursday June 14, 2007

An Alternate Del.icio.us Export / Backup Webpage

del.icio.us allows you to export / backup your bookmarks to an HTML file. That file is just one long list of links.

What I wanted was a web page with my tagcloud at the top, and each tag in the tagcloud taking me to a section of the same web page which contained all my bookmarks with that tag. I realize that del.icio.us sort of gives you this with the tag cloud on the right-hand side of their web site, but having a local copy would be much faster.

So I wrote a small Python script that does this. It writes the new web page to standard out. You can then just view it in your web browser.

If others want to use it, you'll need to adjust the delBookmarkFile definition on line 33 to point to the location of your exported del.icio.us bookmark file before you run it.

It doesn't save or use all the information that del.icio.us exports, but it does generate a smaller faster web page.
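
For anyone curious about the approach, here is a rough sketch rather than the script itself. It assumes the export is a Netscape-style bookmark file in which each link is an A element carrying an HREF and a comma-separated TAGS attribute; check your own export before relying on that. The sample data and the function names are made up.

```python
from collections import defaultdict
from html import escape
from html.parser import HTMLParser

# Made-up sample in the assumed export format.
SAMPLE_EXPORT = """
<DL><p>
<DT><A HREF="http://example.com/py" TAGS="python,howto">Python tips</A>
<DT><A HREF="http://example.com/gm" TAGS="greasemonkey">GM scripts</A>
<DT><A HREF="http://example.com/bs" TAGS="python">Parsing HTML</A>
</DL><p>
"""


class BookmarkParser(HTMLParser):
    """Group (title, url) pairs under each tag found on a bookmark link."""

    def __init__(self):
        super().__init__()
        self.by_tag = defaultdict(list)
        self._link = None      # (url, tags) while inside an <a> element
        self._title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "a":         # html.parser lower-cases tag and attribute names
            attrs = dict(attrs)
            tags = [t for t in (attrs.get("tags") or "").split(",") if t]
            self._link = (attrs.get("href") or "", tags)
            self._title = ""

    def handle_data(self, data):
        if self._link is not None:
            self._title += data

    def handle_endtag(self, tag):
        if tag == "a" and self._link is not None:
            url, tags = self._link
            for name in tags:
                self.by_tag[name].append((self._title.strip(), url))
            self._link = None


def build_page(export_html):
    """Return an HTML page: tag cloud on top, one anchored section per tag."""
    parser = BookmarkParser()
    parser.feed(export_html)
    tags = sorted(parser.by_tag)
    out = ["<html><body>", "<p>"]
    for name in tags:
        out.append('<a href="#%s">%s (%d)</a>'
                   % (escape(name), escape(name), len(parser.by_tag[name])))
    out.append("</p>")
    for name in tags:
        out.append('<h2 id="%s">%s</h2>' % (escape(name), escape(name)))
        out.append("<ul>")
        for title, url in parser.by_tag[name]:
            out.append('<li><a href="%s">%s</a></li>' % (escape(url), escape(title)))
        out.append("</ul>")
    out.append("</body></html>")
    return "\n".join(out)


if __name__ == "__main__":
    print(build_page(SAMPLE_EXPORT))
```

A bookmark with several tags is listed once under each of them, which is what makes the page longer than the tag count alone would suggest.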

The look & feel could be improved with some nicer CSS.

Suggestions welcome.

( Jun 14 2007, 07:45:08 AM PDT ) [Listen] Permalink

20070418 Wednesday April 18, 2007

New Version Of the Get TV Listings Script

For background on this, see a previous post.

I've adjusted the Python script to now automatically email you the results rather than pump the results out to standard output which then had to be piped to the mailx program.

If you are interested in using this script, you will first need to do the setup as described in the previous post. You will also need to adjust the emailAddr variable in the script (as well as the TV_GRAB, XMLTV_FILE and programs ones), to suit your needs.
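
Mailing from inside a script, instead of piping to mailx, can be done with the standard smtplib and email modules. The sketch below is not the script's actual code: it assumes an SMTP server is listening on localhost, and the subject line, the default sender and the function names are placeholders.

```python
import smtplib
from email.message import EmailMessage

emailAddr = "someone@somewhere.com"


def build_message(body, to_addr, from_addr=None, subject="TV listings"):
    """Wrap the plain-text results in an email message."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = from_addr or to_addr   # default: mail it to yourself
    msg["To"] = to_addr
    msg.set_content(body)
    return msg


def send_results(body, to_addr=emailAddr, host="localhost"):
    """Send the results through the SMTP server at `host` (assumed to exist)."""
    with smtplib.SMTP(host) as server:
        server.send_message(build_message(body, to_addr))
```

Calling send_results(text) with the collected listings as a string would then replace the pipe to mailx.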

( Apr 18 2007, 11:29:10 AM PDT ) [Listen] Permalink