BeautifulSoup - Get A 10 Day Weather Forecast For Your Zip Code
After Matt Harrison mentioned BeautifulSoup in a comment on an old Python script post of mine, I've been looking for a place to use it.
BeautifulSoup is a Python HTML/XML parser designed for quick turnaround projects like screen-scraping.
I initially played around with it to see if I could get listings of when new episodes of my favorite TV programs were airing, now that Zap2It Labs is no longer making its listings available for free. The problem there (I think) is that, because the content on their TV listings website is dynamically generated, I can't find a URL that BeautifulSoup can parse.
So I picked something different to cut my teeth on.
I often go to Weather.com and get a 10 day forecast for the city where I live. That's easy to do by hand, but it made a good example of something to extract from a web page and then email to myself so I have it handy.
This script does that. You will also need a copy of BeautifulSoup.py for it to work properly. I've simply put both files in the same directory and run the script with:
$ python ./get_weather.py
If others are interested in running this, then there are two variables that you will need to change in the script to meet your needs:
# Zip code to get 10 day forecast for.
#
zipCode = "94024"

# Email address to send results to.
#
emailAddr = "someone@somewhere.com"
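For anyone curious about the emailing half, here is a minimal sketch of how it might be done with the standard library. It is not taken from get_weather.py: the function names buildForecastEmail() and sendForecastEmail(), the row layout, the sample row, and the local SMTP server are all assumptions made for illustration. Only zipCode and emailAddr come from the script.

```python
import smtplib
from email.message import EmailMessage

zipCode = "94024"
emailAddr = "someone@somewhere.com"


def buildForecastEmail(rows, zipCode, emailAddr):
    """Build a plain-text message from forecast rows.

    Assumption: each row is [date, forecast, temperature, precipitation].
    """
    lines = ["%s: %s, %s, %s" % tuple(row) for row in rows]
    msg = EmailMessage()
    msg["Subject"] = "10 day forecast for %s" % zipCode
    msg["From"] = emailAddr
    msg["To"] = emailAddr
    msg.set_content("\n".join(lines))
    return msg


def sendForecastEmail(msg, host="localhost"):
    # Assumption: an SMTP server is listening on this host.
    with smtplib.SMTP(host) as server:
        server.send_message(msg)


# Made-up sample row, only here to show the message body.
sampleRows = [["Apr 30", "Sunny", "72 degrees F", "10% chance of rain"]]
message = buildForecastEmail(sampleRows, zipCode, emailAddr)
print(message.get_content())
```

Building the message is kept separate from sending it, so the text can be checked without a mail server; calling sendForecastEmail(message) would do the actual delivery.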
Just like my early attempts at using XPath in some of my JavaScript scripts, I suspect that I'm not doing it the best way. I predict that there are much nicer ways of writing the extractForecast() routine.
Still, it works, and that's the first step in programming.
( Apr 30 2008, 08:58:05 AM PDT )
Comments:
With BeautifulSoup, you can do things like this:
tag.div.strong
This gets the first strong tag inside the first div tag in tag.
As such, here is my modified version of your extractForecast():
def extractForecast(soup):
    # Each forecast column lives in its own class of div.
    tdDates = soup.findAll("div", { "class" : "tdDate" })
    tdForecasts = soup.findAll("div", { "class" : "tdForecast" })
    tdTemps = soup.findAll("div", { "class" : "tdTemps" })
    tdPrecips = soup.findAll("div", { "class" : "tdPrecip" })
    results = []
    # Walk the four columns in step, one day per iteration.
    for item in zip(tdDates, tdForecasts, tdTemps, tdPrecips):
        date = item[0].p.contents[-1]
        forecast = item[1].p.contents[-1]
        temperature = item[2].strong.string.replace("°", " degrees F")
        precipitation = item[3].p.string.replace("%", "% chance of rain")
        results.append([ date, forecast, temperature, precipitation ])
    return results
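To try the ideas above without fetching a live page, something like the following sketch may help. It assumes the current bs4 package (beautifulsoup4) rather than the BeautifulSoup.py file used in the post, so findAll becomes find_all. The sample markup is hypothetical: only the four class names come from the function above, and the tags nested inside each div are a guess.

```python
from bs4 import BeautifulSoup  # assumption: beautifulsoup4, not the 2008-era BeautifulSoup.py

# Attribute-style navigation: the first <strong> inside the first <div>.
tag = BeautifulSoup(
    "<div><em>skip</em><strong>first</strong></div><div><strong>second</strong></div>",
    "html.parser",
)
firstStrong = tag.div.strong.string

# Hypothetical markup: the class names are from the function above,
# the inner structure is invented for this example.
SAMPLE_HTML = """
<div class="tdDate"><p><b>Today</b>Apr 30</p></div>
<div class="tdForecast"><p><span></span>Sunny</p></div>
<div class="tdTemps"><p><strong>72°</strong></p></div>
<div class="tdPrecip"><p>10%</p></div>
<div class="tdDate"><p><b>Thu</b>May 01</p></div>
<div class="tdForecast"><p><span></span>Showers</p></div>
<div class="tdTemps"><p><strong>65°</strong></p></div>
<div class="tdPrecip"><p>40%</p></div>
"""


def extractForecast(soup):
    # One list of divs per column, in the order the rows are built.
    columns = [soup.find_all("div", {"class": name})
               for name in ("tdDate", "tdForecast", "tdTemps", "tdPrecip")]
    results = []
    # zip() pairs up the n-th div of every column, one day per iteration.
    for date, forecast, temps, precip in zip(*columns):
        results.append([
            str(date.p.contents[-1]),       # last child of the <p> is the text
            str(forecast.p.contents[-1]),
            temps.strong.string.replace("°", " degrees F"),
            precip.p.string.replace("%", "% chance of rain"),
        ])
    return results


forecastRows = extractForecast(BeautifulSoup(SAMPLE_HTML, "html.parser"))
for row in forecastRows:
    print(", ".join(row))
```

Note that zip() stops at the shortest column, so a page with a missing div would silently drop the trailing days rather than raise an error.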
Posted by Stefan Stuhr on April 30, 2008 at 10:05 AM PDT #
Thanks Stefan! That's much nicer.
Posted by Rich Burridge on April 30, 2008 at 10:49 AM PDT #
Hi Rich,
I made a similar forecast here: http://code.google.com/p/coopera-weather/
It uses pygtk and shows the forecast in a tray icon :-)
Posted by Leonardo Gregianin on May 01, 2008 at 05:52 AM PDT #
Nice. Maybe I should have googled before I started on this. Still, it was a learning experience and that's what counts.
Posted by Rich Burridge on May 01, 2008 at 06:42 AM PDT #
Thanks for the pointer to BeautifulSoup. I've rolled my own parser for an application like this; I think I'll switch it over to BeautifulSoup.
Posted by Dan Price on May 01, 2008 at 12:48 PM PDT #