This is another thing I haven't been able to blog about lately, so I'm finally doing so now.
Previously,
I had created a simple Google map with the locations of a few CAs in India
to start with, and using the "collaborate" feature, many other CAs were
able to placemark their locations on the map too.
Recently, Sun created a completely
new map of all Sun Campus Ambassadors
on the Sun Developer Network site, showing all 500 of the CAs in a
long table, grouped by region. But it still wasn't a map showing each CA
dotted onto it. It only maps the regions where we have CAs.
Owing to my recent discovery of two great Web 2.0-ish tools, namely
Dapper.net and
Yahoo Pipes,
and a hackish mindset, I figured I could take the information given on
this page, scrape it out somehow, and feed it to a map. That's exactly
what I've done now :)
First I'll tell you a bit about the tools I've used:
- Dapper.net
- In the broadest sense, Dapper takes web content from
normal web pages and outputs that content in the form of RSS
feeds, widgets and APIs. You can scrape a list off any web page
and create an RSS feed, XML, Google / Yahoo Map, Flash Widget, Google
Gadget, Netvibes module, Pageflakes module, Facebook application, CSV,
JSON, etc., or even set it up to send email alerts on updates. Or you can
link its output to another Dapp. It's like an ETL tool, only with a very
different purpose than data mining.
It
is usually meant for creating feeds out of a website / web page which
doesn't have feeds, without any programming of course. You can do stuff
like create a search for movie results with a specific keyword at IMDB
and create a "Dapp" out of it, show it on your blog, create Google
map mashups, get alerted about updates to a particular web page, or scrape
content off web pages (without feeds) and show it on your own
(though that has some issues, mostly ethical). You can use the same
Dapp with a different input URL later. Oh, did I mention, Dapper also
has an API to programmatically call previously created Dapps from your
applications (obviously; what doesn't have an API these days?).
Dapper.net also allows you to share your Dapps. There you go, a
community too. Most of all, Dapper is free and its UI rocks! Know more..
- Yahoo Pipes
- is an amazing free online web service (with an awesome UI) which
allows you to "remix popular feed types and create data mashups using a
visual editor" (from their website).
It is a feed remixing/processing and web service engine, in my terms.
It is an online IDE to take data of any format, from multiple sources,
mix and mash them, process them, aggregate them, and create something
useful out of it. It has a drag-and-drop UI. There are some preset
modules laid out on the left-hand side of the editor. First you add a
"Source", which can be anything: an RSS feed, XML, a web page, a Flickr
feed, a Yahoo search result or the output from another "Pipe". After fetching
the data you can process it using "Operators": split, sort, merge,
reverse, rename, loop, truncate, union, unique, filter, count, geocode,
etc. These operators operate on each item in the source. Now
comes the best part: you can connect all these modules to each other
using "pipes", which are actually drawn as flexible tubes
in the editor (you have to see it to believe it!).
The above pipe monitors Twitter for tweets with your name :)
You can do a lot of cool things with Yahoo Pipes. Browse the popular pipes
at their website. See the current featured pipe, which allows you to
find an apartment at some location "near" a park, school, or your
favorite restaurant: Apartment near something.
- Feedburner - I assume you know all about it ;)
- Yahoo GeoCoder REST API
- a little web service from Yahoo which lets you get location data
(latitude, longitude, etc.) from a place's name, street, city, country,
zip code, etc.
What I've done on May 19th, 2008:
- Created a Dapp to extract the names and city names of all the 500+ CAs from the SDN webpage, where they are shown in plain HTML. My Dapp extracts the table and converts it into an RSS feed here.
- Next,
I created a Feedburner feed from the "Sun CA Location RSS" created in
Step 1, to speed things up a bit. If we use the Dapper feed directly to
show the map, it takes a lot of time, because whenever
someone views the map, Dapper is called to scrape that web page again,
which is a huge overhead. Plus, Dapper has restrictions on the number of
hits, etc. So I thought it would be much better to create a "cache" of the
Dapper feed using our good ol' Feedburner. Here is that Feedburner feed.
- Now we finally come to the exciting part: creating the Yahoo Pipe. Open up the Yahoo Pipe I created
and read on (yes, it's open source). It first fetches the Feedburner
feed created in Step 2 and splits it into 2. Why split into 2? Well, let
me explain. The CA list on the SDN student zone webpage has 2 types of
records: ones with city names and ones without. I could have stuck
to showing only those CAs on the map whose locations are in
that list, but later I went a bit further and also cooked up a way to
show those who don't have locations, thus showing all the CAs that are
there. Using filters, we make sure one split contains those that have
locations and the other contains those that don't. We then loop through
each "item" (a CA's university name and location in this specific case)
and pass the "location identifier" (as I like to call it) to another
submodule I created called "Get geodate (coordinates) from city name".
The submodule uses the Yahoo Geocoder web service to get the lat-long
data from the "location identifier". The location identifier is the city
name in split 1 and the CA's university name in split 2. After that, we
just rename the XML fields to convert the result into a Yahoo Map
compatible GeoRSS feed. Finally, we use the union operator to merge
both splits into one final feed.
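The pipe itself is built visually, but its logic can be roughly sketched in Python. This is only an illustration under some assumptions, not the actual pipe: the item keys `university` and `city`, the function names, and the `geo:lat` / `geo:long` output keys are my own placeholders, and the geocoder is a stub lookup table standing in for the Yahoo Geocoder web service. Dropping items that can't be geocoded is also an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Item = Dict[str, str]
Coords = Tuple[float, float]
# Stand-in for the geocoding web service: place name -> (lat, long) or None.
Geocoder = Callable[[str], Optional[Coords]]


def locate(split: List[Item], key: str, geocode: Geocoder) -> List[Item]:
    """Loop over one split, geocode item[key], and rename fields for the map."""
    located = []
    for item in split:
        coords = geocode(item[key])  # the "location identifier" lookup
        if coords is None:
            continue  # assumption: skip items the geocoder cannot resolve
        lat, lon = coords
        located.append({
            "title": item["university"],
            "geo:lat": str(lat),   # illustrative geo field names
            "geo:long": str(lon),
        })
    return located


def build_geo_feed(items: List[Item], geocode: Geocoder) -> List[Item]:
    """Split by presence of a city, geocode each split, then union them."""
    with_city = [i for i in items if i.get("city", "").strip()]
    without_city = [i for i in items if not i.get("city", "").strip()]
    # Split 1 is located by city name, split 2 by university name.
    return (locate(with_city, "city", geocode)
            + locate(without_city, "university", geocode))


def fake_geocode(place: str) -> Optional[Coords]:
    """Hypothetical offline geocoder with rough, made-up sample coordinates."""
    table = {"Noida": (28.54, 77.39), "Example University": (10.0, 20.0)}
    return table.get(place)


if __name__ == "__main__":
    sample = [
        {"university": "Uni A", "city": "Noida"},
        {"university": "Example University", "city": ""},
    ]
    for entry in build_geo_feed(sample, fake_geocode):
        print(entry)
```

Passing the geocoder in as a function keeps the sketch runnable offline; a real version would swap `fake_geocode` for a call to whatever geocoding service is available.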
Data Flow: SDN CA
list (web page) > Dapper.net (feed) > Feedburner.com (cached
feed) > Yahoo Pipes (GeoRSS feed) > Map > Your website / blog /
pipe / another mashup
The end result is a Yahoo Map
showing the locations of all the 500+ Sun Campus Ambassadors, each
having their own placemark. You can even click on the university name
to go to the respective CA's landing page, where you can see his/her
blog address and sign up for upcoming events in their university's Sun
club. Speaking of which, mine's here ;)
So now we have a dynamically updating Sun Campus Ambassador map 2.0 showing all the CAs in the world, updating from the official Sun Developer Network Ambassadors Map page,
which in turn is, I guess, updating from ambassadorzone.com! You can
do pretty much anything with it: show it on your blog (with the "get
as a badge" feature), add it to Google Fusion or My Yahoo, get email
alerts about its changes, or simply get it as RSS, JSON, PHP or even KML.
I used the Google Mashup Editor to create a Google map out of this Yahoo Pipe. Here's the Google mashup map. (You have to zoom in to see all the CAs, unlike the Yahoo map, which shows them all at once.)
I also created a version
of the above Yahoo Pipe which takes its input from a CSV file instead of the
Feedburner'd Dapper feed, for faster loading (though remember, this one's
not dynamic).
All links in one place:
I tried this out to see what the Canada Campus Ambassador landscape looks like...
Interesting how it places U.B.C. somewhere in the frozen tundra in the far north of Canada's Nunavut territory... home to only the most hardy Inuit peoples... and polar bears.
UBC is also identified as being in Waterloo, Canada.
UBC is actually in Vancouver, Canada, where the temperature is much more moderate, and NEVER hits the -50C that symbol location would lead me to believe :)
Posted by Rob on June 12, 2008 at 04:27 PM IST #
Hey Rob,
At the time I made this, the state/province column showed just the state/province, so the Yahoo geocoder service easily provided the latitude and longitude for the locations. But now, someone has edited the page and added country names as a prefix to every state/province, crippling the Yahoo geocoder service's ability to return correct coordinates. E.g., it's "India (Noida)" instead of "Noida". That is why the symbols within a country now all point to the same place, namely the country's location.
Nonetheless, I had foreseen this, and that is why I also backed up the page's earlier state in a CSV file uploaded to mediacast.sun.com. See the static map, which takes its input from the CSV file instead of the dynamic Dapp, at: http://pipes.yahoo.com/pipes/pipe.info?_id=302b65971c90577cbf7dc0cfa1228fae
Making this dynamic again is beyond the ability of Dapper or Yahoo Pipes. It would require custom coding.
Posted by Angad Singh on June 12, 2008 at 05:40 PM IST #