The Sun BabelFish Blog
Don't panic !
excel and rdf
Scott McNealy rarely had anything nice to say about spreadsheet software that was not web enabled. And indeed spreadsheets come with a huge number of problems. Off the top of my head, some of these are:
- Hidden formulas that nobody looks at and that get tweaked without alerting people
- Data that is never synchronized, with parts of it that are out of date
- Data that cannot be merged
- Some products even had virus problems...
And yet they are immensely popular, especially with the people who never see the problems that they lead to.
As it happens, these are problems within the scope of the semantic web. Every spreadsheet is like a mini SQL database. As long as you query the information inside one database owned by one administrator, all is fine. But what happens when you want to merge information from different databases? Ouch! That's really tough, because there is usually no clear understanding of which pieces should fit together. Do the columns in each database mean the same thing? Well, if you have just a few big databases you can tediously link them together, but what if you have thousands of such databases? And what if each person wielding one is a complete novice to this problem? What if someone just renames a column in one spreadsheet? What does that mean?
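To make the renamed-column problem concrete, here is a minimal sketch in Python. The two sheets, their headers, and the figures are invented for illustration; the point is only that when the header text is the sole carrier of meaning, a merge goes wrong without anyone noticing.

```python
# Two hypothetical sheets holding the same kind of data under different headers.
sheet_a = [{"Region": "EMEA", "Revenue": 120}]
sheet_b = [{"Region": "APAC", "Turnover": 95}]  # someone renamed the column


def naive_merge(*sheets):
    """Concatenate rows, treating the header text as the only meaning of a column."""
    merged = []
    for sheet in sheets:
        merged.extend(dict(row) for row in sheet)
    return merged


def column_total(rows, header):
    """Sum a column by header name; rows lacking that header contribute nothing."""
    return sum(row.get(header, 0) for row in rows)


rows = naive_merge(sheet_a, sheet_b)
print(column_total(rows, "Revenue"))   # the APAC figure is silently left out
print(column_total(rows, "Turnover"))  # the EMEA figure is silently left out
```

Neither total raises an error, which is exactly why this kind of problem goes unseen.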
The topic of spreadsheets and the semantic web came to be one of the highlights of the conferences I went to in May. Dean Allemang, in his talk at JavaOne (the slides with added soundtrack are now online!), used this problem in one of his examples. Eric Miller talked about a solution that involved using the momentum behind spreadsheets to help build ontologies (I think, it's a while back now). This is not all new, of course. In a reply to this post Mike Bergman pointed to his year-old article entitled "RDF123 Makes Generating Flexible RDF a Snap".
But often a demo helps a lot, and the one that made me see the light was given by Lee Feigenbaum of Cambridge Semantics just before the end of the Semantic Tech Conference. Lee, who had been working on semantic web tools at IBM before going to start his own company, gave me a quick summary of the benefits of his SHAPE middleware. Essentially, by adding URLs into the spreadsheet you can tie the meaning of its contents down a lot more carefully. By writing a plugin for Microsoft Excel (they had a prototype working for OpenOffice before deciding to focus on M$ tools) that works together with the middleware, they let users keep on behaving as they are used to, whilst helping link all the information together. Instead of working against each other, people in a company can build a web of information together. Here is a highlight from Lee's talk entitled Getting to Web Semantics for Spreadsheets in the U.S. Government:
- Tight integration into Excel allows semantic concepts to be dragged and dropped from the semantic repository onto data tables
- The data table's implicit row/column relations are explicitly stored in an RDF semantic database
- Cells, columns, and regions are tagged with explicit semantics
- Publish the data tables on the Web
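The idea in the list above can be sketched roughly as follows. This is not Cambridge Semantics' implementation, just a plain Python illustration under stated assumptions: the example.org URIs, the sheet contents, and the function name sheet_to_triples are all hypothetical. Each cell becomes a (subject, predicate, object) triple, with the row as subject and the URI tagged onto the column as predicate.

```python
# Hypothetical vocabulary; in practice the URIs would come from a shared repository.
EX = "http://example.org/vocab#"


def sheet_to_triples(sheet_uri, rows, column_uris):
    """Make the table's implicit row/column relations explicit as triples.

    The row is the subject, the URI tagged onto the column is the predicate,
    and the cell value is the object.
    """
    triples = set()
    for index, row in enumerate(rows):
        subject = f"{sheet_uri}#row{index}"
        for header, value in row.items():
            triples.add((subject, column_uris[header], value))
    return triples


# Two invented sheets whose owners chose different header names.
sheet_a = [{"Region": "EMEA", "Revenue": 120}]
sheet_b = [{"Region": "APAC", "Turnover": 95}]

# Each owner tags their columns; "Revenue" and "Turnover" get the same URI.
tags_a = {"Region": EX + "region", "Revenue": EX + "revenue"}
tags_b = {"Region": EX + "region", "Turnover": EX + "revenue"}

# Merging is now just a set union of triples.
graph = (sheet_to_triples("http://example.org/a.xls", sheet_a, tags_a)
         | sheet_to_triples("http://example.org/b.xls", sheet_b, tags_b))

# Query by meaning (the URI), not by whatever the header happened to say.
total = sum(obj for (_, pred, obj) in graph if pred == EX + "revenue")
print(total)
```

Because the two differently named columns share one URI, the total covers both sheets, and renaming a header no longer changes what the column means.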
Spreadsheets may yet be back again, but for the good.
PS. Please send me further links on this so I can flesh out this story better.
Update
13 September 2008:
- Sean Martin, President and CTO of Cambridge Semantics, talked to Paul Miller in a long and instructive podcast. The discussion takes a bit of time to get going, so those who would like to zoom straight into the matters discussed here should skip to about the 34th minute of the interview.
- By the way, this is different from but complementary to the idea of linking office documents to the web. Here is a video of how one can use Open Office as a wiki, where one can edit a spreadsheet and publish it directly to the web.
Posted at 07:12PM Aug 29, 2008 [permalink/trackback] by Henry Story in SemWeb | Comments[5]

Hi Henry,
I have been hearing about Lee's Excel app for some time and am anxious to see it myself. Maybe this will prompt an update or an access link! :)
I have not seen any recent updates, but about a year ago (http://www.mkbergman.com/?p=394) I covered RDF123 from Tim Finin's group at UMBC Ebiquity. It, too, was trying to move toward spreadsheets as RDF input frameworks. My write-up has some links to still earlier initiatives.
Maybe Tim can provide an update as well?!
Thanks, Mike
Posted by Mike Bergman on August 29, 2008 at 11:44 PM CEST #
There is a very interesting blog post with video showing how SDS allows one to integrate two or more spreadsheets, and even spreadsheets and databases:
http://www.readwriteweb.com/archives/the_semantic_desktop_sds_brings_semantics_to_excel.php
I have not tried it, as I don't work a lot with spreadsheets. But those who do will find this alluring.
Posted by Henry Story on September 02, 2008 at 11:32 PM CEST #
Hi Mike,
There is a new update from RDF123. Look at this website: http://logos.cs.umbc.edu:8080/termpredict/wordswithtype.html
It tries to find the most standard and consistent schema for a set of English words/phrases (concepts).
Best,
Lushan
Posted by Lushan on September 11, 2008 at 08:49 PM CEST #
I am not sure exactly how that relates to spreadsheets. Can you develop your thought a little?
Posted by Henry Story on September 11, 2008 at 08:58 PM CEST #
First, people need to create a semantic graph to describe the relations between the columns in a spreadsheet. However, people are allowed to use English words for the names of the classes and properties occurring in the semantic graph. The above web service will try to map the set of names to the most standard and consistent RDF schema. What is important is that if the sets of names reflect the same domain/context information, they should be mapped to the same RDF schema, in spite of the different ways people may name their concepts.
Posted by Lushan on September 11, 2008 at 11:02 PM CEST #