The Sun BabelFish Blog
Don't panic !
Web Finger proposals overview
If all you had was an email address, would it not be nice to be able to have a mechanism to find someone's home page or OpenId from it? Two proposals have been put forward to show how this could be done. I will look at them and add a sketch of my own that hopefully should lead us to a solution that takes the best of both proposals.
The WebFinger GoogleCode page explains what webfinger is very well:
Back in the day you could, given somebody's UNIX account (email address), type $ finger email@example.com and get some information about that person, whatever they wanted to share: perhaps their office location, phone number, URL, current activities, etc.
The new ideas generalize this to the web, by following a very simple insight: If you have an email address like henry.story@sun.com, then the owner of sun.com is responsible for managing the email. That is the same organization responsible for managing the web site http://sun.com. So all that is needed is some machine readable pointer from http://sun.com/ to a lookup giving more information about the owner of the email address. That's it!
The WebFinger proposal
The WebFinger proposed solution showed the way so I will start from here. It is not too complicated, at least as described by John Panzer's "Personal Web Discovery" post.
John suggests that there should be a convention that servers have a file in the /host-meta root location of the HTTP server to describe metadata about the site. (This seems to me to break web architecture. But never mind: the resource http://sun.com/ can have a link to some file that describes a mapping from email ids to information about it.) The WebFinger solution is to have that resource be in a new application/host-meta file format (not xml, btw). This would have mappings of the form
Link-Pattern: <http://meta.sun.com/?q={%uri} >; rel="describedby";type="application/xrd+xml"
So if you wanted to find out about me, you'd be able to do a simple HTTP GET request on http://meta.sun.com/?q=henry.story@sun.com, which will return a representation in another new application/xrd+xml format about the user.
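To make the lookup step concrete, here is a minimal Python sketch of how a client might read the Link-Pattern line quoted above and turn an email address into the lookup URL. The function names are mine, and the encoding rule (percent-encode everything except '@') is an assumption chosen only so that the result matches the URL given in this post; the host-meta draft itself may specify something different.

```python
import re
from urllib.parse import quote

# Matches "Link-Pattern: <template>; param=value; ..." as in the example line.
LINK_PATTERN_RE = re.compile(
    r"^Link-Pattern:\s*<\s*(?P<template>[^>]*?)\s*>\s*(?P<params>.*)$"
)


def parse_link_pattern(line):
    """Split a Link-Pattern line into its URI template and its parameters."""
    match = LINK_PATTERN_RE.match(line.strip())
    if match is None:
        raise ValueError("not a Link-Pattern line: %r" % line)
    params = {}
    for part in match.group("params").split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip()] = value.strip().strip('"')
    return match.group("template"), params


def expand(template, identifier):
    """Substitute the identifier for the {%uri} placeholder.

    Assumption: the value is percent-encoded except for '@', which keeps
    the result identical to the lookup URL quoted in the post.
    """
    return template.replace("{%uri}", quote(identifier, safe="@"))


line = ('Link-Pattern: <http://meta.sun.com/?q={%uri} >; '
        'rel="describedby";type="application/xrd+xml"')
template, params = parse_link_pattern(line)
lookup_url = expand(template, "henry.story@sun.com")
print(lookup_url)
```

A client would then issue a plain HTTP GET on lookup_url and expect a document of the type named in params["type"].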
The idea is really good, but it has three more or less important flaws:
- It seems to require by convention all web sites to set up a /host-meta location on their web servers. Making such a global requirement seems a bit strong, and does not in my opinion follow web architecture. It is not up to a spec to describe the meaning of URIs, especially those belonging to other people.
- It seems to require a non xml application/host-meta format.
- It creates yet another file format to describe resources: application/xrd+xml. It is better to describe resources at a semantic level using the Resource Description Framework, and not enter the format battle zone. To describe people there is already the widely known friend of a friend ontology, which can be clearly extended by anyone. Luckily it would be easy for the XRD format to participate in this, by simply creating a GRDDL mapping to the semantics.
All these new format creations are a real pain. They require new parsers, testing of the spec, mapping to semantics, etc... There is no reason to do this anymore; it is a solved problem.
But lots of kudos for the good idea!
The FingerPoint proposal
Toby Inkster, co inventor of foaf+ssl, authored the fingerpoint proposal, which avoids the problems outlined above.
Fingerpoint defines one useful relation, sparql:fingerpoint (available at the namespace of the relation of course, as all good linked data should be), which is defined as
sparql:fingerpoint
a owl:ObjectProperty ;
rdfs:label "fingerpoint" ;
rdfs:comment """A link from a Root Document to an Endpoint Document
capable of returning information about people having
e-mail addresses at the associated domain.""" ;
rdfs:subPropertyOf sparql:endpoint ;
rdfs:domain sparql:RootDocument .
It is then possible to have the root page link to a SPARQL endpoint that can be used to query very flexibly for information. Because the link is defined semantically there are a number of ways to point to the sparql endpoint:
- Using the up and coming HTTP-Link HTTP header,
- Using the well tried html <link> element.
- Using RDFa embedded in the html of the page
- By having the home page return any other representation that may be popular or not, such as rdf/xml, N3, or XRD...
So Toby gets more power than the WebFinger proposal, by only inventing 1 new relation! All the rest is already defined by existing standards.
The only problem one can see with this is that SPARQL, though not that difficult to learn, is perhaps a bit too powerful for what is needed. You can really ask anything of a SPARQL endpoint!
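As an illustration of the html <link> option listed above, here is a small Python sketch that scans a root page for a link carrying a given relation and returns the endpoints it points to. The relation URI and the endpoint address below are placeholders I made up for the example (the real namespace of sparql:fingerpoint is not spelled out in this post), so treat this as a sketch under those assumptions rather than as part of Toby's proposal.

```python
from html.parser import HTMLParser

# Placeholder relation URI, invented for this example only.
FINGERPOINT_REL = "http://example.org/ns/sparql#fingerpoint"


class LinkFinder(HTMLParser):
    """Collect href values of <link> elements carrying a given rel token."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing <link ... /> elements.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").split()
        if self.rel in rel_tokens and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_links(html, rel):
    """Return every href linked from the page with the given relation."""
    finder = LinkFinder(rel)
    finder.feed(html)
    finder.close()
    return finder.hrefs


root_page = """<html><head>
<link rel="http://example.org/ns/sparql#fingerpoint"
      href="http://example.org/sparql" />
</head><body>Welcome</body></html>"""

endpoints = find_links(root_page, FINGERPOINT_REL)
print(endpoints)
```

The same helper works for any space-separated rel value, so it is not tied to the fingerpoint relation.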
A possible intermediary proposal: semantic forms
What is really going on here? Let us think in simple HTML terms, and forget about machine readable data a bit. If this were done for a human being, what we really would want is a page that looks like the webfinger.org site, which currently is just one query box and a search button (just like Google's front page). Let me reproduce this here:
Here is the html for this form at its purest, without styling:
<form action='/lookup' method='GET'>
<img src='http://webfinger.org/images/finger.png' />
<input name='email' type='text' value='' />
<button type='submit' value='Look Up'>Look Up</button>
</form>
What we want is some way to make it clear to a robot, that the above form somehow maps into the following SPARQL query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
[] foaf:mbox ?email;
foaf:homepage ?homepage
}
Perhaps this could be done with something as simple as an RDFa extension such as:
<form action='/lookup' method='GET'>
<img src='http://webfinger.org/images/finger.png' />
<input name='email' type='text' value='' />
<button type='submit' value='homepage'
sparql='PREFIX foaf: <http://xmlns.com/foaf/0.1/>
GET ?homepage
WHERE {
[] foaf:mbox ?email;
foaf:homepage ?homepage
}'>Look Up</button>
</form>
When the user (or robot) submits the form, the page he ends up on is the result of the SPARQL query in which the variables have been replaced by the values of the identically named form fields. So if I entered henry.story@sun.com in the form, I would end up on the page
http://sun.com/lookup?email=henry.story@sun.com, which could perhaps just be a redirect to this blog page... This would then be the answer to the SPARQL query
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
[] foaf:mbox "henry.story@sun.com";
foaf:homepage ?homepage
}
(note: that would be wrong as far as the definition of foaf:mbox goes, which relates a person to an mbox, not a string... but let us pass on this detail for the moment)
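The substitution described here can be sketched in a few lines of Python. This is only an illustration of the idea, with function names of my own choosing: each ?name variable in the query that has an identically named form field is replaced by the submitted value, written as a plain string literal exactly as in the query above (and so with the same caveat about foaf:mbox).

```python
import re

QUERY_TEMPLATE = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?homepage
WHERE {
  [] foaf:mbox ?email;
     foaf:homepage ?homepage
}"""


def sparql_string(value):
    """Render a Python string as a double-quoted SPARQL literal."""
    if "\n" in value or "\r" in value:
        raise ValueError("line breaks are not handled in this sketch")
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def bind_form_values(template, form):
    """Replace each ?name variable that has a same-named form field."""
    def substitute(match):
        name = match.group(1)
        if name in form:
            return sparql_string(form[name])
        return match.group(0)  # leave unbound variables untouched
    return re.sub(r"\?(\w+)", substitute, template)


bound_query = bind_form_values(QUERY_TEMPLATE, {"email": "henry.story@sun.com"})
print(bound_query)
```

Only ?email is bound, so ?homepage stays a variable and is what the query asks for.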
Here we would be defining a new GET method in SPARQL, which finds the type of web page that the form submission would end up landing on: namely a page that is the homepage of whoever's email address we have.
The nice thing about this is that as with Toby Inkster's proposal we would only need one new relation from the home page to such a finder page, and once such a sparql form mapping mechanism is defined, it could be used in many other ways too, so that it would make sense for people to learn it. For example it could be useful to make web sites available to shopping agents, as I had started thinking about in RESTful semantic web services before RDFa was out.
But most of all, something along these lines would allow services to have a very simple CGI to answer such a query, without needing to invest in a full blown SPARQL query engine. At the same time it makes the mapping to the semantics of the form very clear. Perhaps someone has a solution to do this already. Perhaps there is a better way of doing it. But it is along these lines that I would be looking for a solution...
(See also an earlier post of mine SPARQLing AltaVista: the meaning of forms)
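To show how little a server would need, here is a rough Python sketch of the logic behind such a /lookup script: no SPARQL engine, just a table from mailbox to homepage and a redirect. The table contents and the homepage URL are invented for the example, and the function returns a status code and headers rather than talking to a real web server.

```python
from urllib.parse import parse_qs

# Hypothetical directory: mailbox -> homepage. A real site would read
# this from its user database.
HOMEPAGES = {
    "henry.story@sun.com": "http://example.org/henry/",
}


def lookup(query_string, homepages=HOMEPAGES):
    """Answer GET /lookup?email=... with a redirect or an error status."""
    fields = parse_qs(query_string)
    emails = fields.get("email", [])
    if len(emails) != 1:
        return 400, {}  # the form defines exactly one email field
    homepage = homepages.get(emails[0].strip().lower())
    if homepage is None:
        return 404, {}  # nobody with that mailbox here
    return 302, {"Location": homepage}


print(lookup("email=henry.story@sun.com"))
```

The semantic annotation on the form is what tells a robot that the page it gets redirected to is the foaf:homepage of the mailbox owner; the script itself stays this simple.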
How this relates to OpenId and foaf+ssl
One of the key use cases for such a Web Finger comes from the difficulty people have in thinking of URLs as identifiers of people. Such a WebFinger proposal, if successful, would allow people to type their email address into an OpenId login box, and from there the Relying Party (the server that the user wants to log into) could find their homepage (usually the same as their OpenId page), and from there find their FOAF description (see "FOAF and OpenID").
Of course this user interface problem does not come up with foaf+ssl, because by using client side certificates, foaf+ssl does not require the user to remember his WebID. The browser does that for him - it's built in.
Nevertheless it is good that OpenId is creating the need for such a service. It is a good idea, and could be very useful even for foaf+ssl, but for different reasons: making it easy to help people find someone's foaf file from the email address could have many very neat applications, if only for enhancing email clients in interesting new ways.
Updates
It was remarked in the comments to this post that the format for /host-meta is now XRD. So that removes one criticism of the first proposal. I wonder how flexible XRD is now. Can it express everything RDF/XML can? Does it have a GRDDL?
Posted at 11:10PM Nov 29, 2009 [permalink/trackback] by Henry Story in SemWeb | Comments[9]
http://openid4.me/ -- OpenId ♥ foaf+ssl
OpenId4.me is the bridge between foaf+ssl and OpenId we have been waiting for.
OpenId and foaf+ssl have a lot in common:
- They both allow one to log into a web site without requiring one to divulge a password to that web site
- They both allow one to have a global identifier to log in, so that one does not need to create a username for each web site one wants to identify oneself at.
- They also allow one to give more information to the site about oneself, automatically, without requiring one to type that information into the site all over again.
OpenId4.me allows a person with a foaf+ssl profile to automatically log in to the millions of web sites that enable authentication with OpenId. The really cool thing is that this person never has to set up an OpenId service. OpenId4.me does not even store any information about that person on its server: it uses all the information in the user's foaf profile and authenticates him with foaf+ssl. OpenId4.me does not yet implement attribute exchange I think, but it should be relatively easy to do (depending on how easy it is to hack the initial OpenId code I suppose).
If you have a foaf+ssl cert (get one at foaf.me) and are logging into an openid 2 service, all you need to type in the OpenId box is openid4.me. This will then authenticate you using your foaf+ssl certificate, which works with most existing browsers without change!
If you then want to own your OpenId, then just add a little html to your home page. This is what I placed on http://bblfish.net/:
<link rel="openid.server" href="http://openid4.me/index.php" />
<link rel="openid2.provider openid.server" href="http://openid4.me/index.php"/>
<link rel="meta" type="application/rdf+xml" title="FOAF" href="http://bblfish.net/people/henry/card%23me"/>
And that's it. Having done that you can then in the future change your openid provider very easily. You could even set up your own OpenId4.me server, as it is open source.
More info at OpenId4.me.
Posted at 07:57PM Nov 19, 2009 [permalink/trackback] by Henry Story in SemWeb | Comments[3]
November 2nd: Join the Social Web Camp in Santa Clara
The W3C Social Web Incubator Group is organizing a free Bar Camp in the Santa Clara Sun Campus on November 2nd to foster a wide ranging discussion on the issues required to build the global Social Web.
Imagine a world where everybody could participate easily in a distributed yet secure social web. In such a world every individual will control their own information, and every business could enter into a conversation with customers, researchers, government agencies and partners as easily as they can now start a conversation with someone on Facebook. What is needed to go in the direction of The Internet of Subjects Manifesto? What existing technologies can we build on? What is missing? What could the W3C contribute? What could others do? To participate in the discussion and meet other people with similar interests, and push the discussion further visit the Santa Clara Social Web Camp wiki and
If you are looking for a reason to be in the Bay Area that week, then here are some other events you can combine with coming to the Bar Camp:
- The W3C is meeting in Santa Clara for its Technical Plenary that week.
- The following day, the Internet Identity Workshop is taking place in Mountain View until the end of the week. Go there to push the discussion further by meeting up with the OpenId, OAuth, Liberty crowd, which are all technologies that can participate in the development of the Social Web.
- You may also want to check out ApacheCon which is also taking place that week.
If you can't come to the west coast at all due to budget cuts, then not all is lost. :-) If you are on the East coast go and participate in the ISWC Building Semantic Web Applications for Government tutorial, and watch my video on The Social Web which I gave at the Free and Open Source Conference this summer. Think: if the government wants to play with Social Networks, it certainly cannot put all its citizens' information on Facebook.
Posted at 12:35AM Oct 16, 2009 [permalink/trackback] by Henry Story in SemWeb | Comments[1]
One month of Social Web talks in Paris
As I was in Berlin preparing to come to Paris, I wondered if I would be anywhere near as active in France as I had been in Germany. I had lived for 5 years in Fontainebleau, an hour from Paris, close but just too far to be in the swing of things. And from that position, I got very little feel for what was happening in the capital. This is what had made me long to live in Paris. So this was the occasion to test it out: I was going to spend one month in the capital. On my agenda there was just a Social Web Bar Camp and a few good contacts.
The Social Web Bar Camp at La Cantine, which I blogged about in detail, was like a powder keg for my stay here. It launched the whole next month of talks, which I detail below. It led me to make a very wide range of contacts, which in turn led to my giving talks at 2 major conferences, 2 universities and one other Bar Camp, presenting to a couple of companies, getting one implementation of foaf+ssl in Drupal, and meeting a lot of great people.
Through other contacts, I also had an interview with a journalist from Le Monde, and met the very interesting European citizen journalism agency Cafe Babel (for more on them see this article).
Here follows a short summary of each event I presented the Social Web at during my short stay in Paris.
- Friday, 18 September 2009
- Arrived in plane from Berlin, and met the journalists at the Paris offices of Cafe Babel, after reading an article on them in the July/August issue of Internationale Politik, "Europa aus Erster Hand".
- Saturday, 19 September 2009
- Went to the Social Web Bar Camp at La Cantine which I blogged about in detail. Here I met many people, who connected me up with the right people in the Paris conference scene, where I was then able to present. A couple of these did not work out due to calendar clashes, such as an attempted meeting with engineers and users of Elgg, a distributed Open Source Social Networking Platform popular at Universities here in France and the UK.
- Monday, 21 September 2009
- Visited the offices of Le Monde, and had lunch with a journalist there. I explained my vision of the Social Web and the functioning of foaf+ssl. He won't be writing about it directly, he told me, but will develop these ideas over time in a number of articles. (I'll post updates here, though it is sadly very difficult to link to articles in Le Monde, as they change the URLs for their articles, make them paying only after a period of time, and then don't even make an abstract available for non paying members.)
- Friday, 25 September 2009
- I visited the new offices of af83.com, a startup with a history: they participated in the building of the web site of Ségolène Royal, the contender against Nicolas Sarkozy, during the last French Presidential Elections.
There I met up with Damien Tournoud, an expert Drupal Developer, explained the basics of foaf+ssl, pointed him to the Open Source project foaf.me, and let him work on it. With a bit of help from Benjamin Nowack, the creator of the ARC2 Semantic Web library for PHP, Damien had a working implementation the next day. We waited a bit, before announcing it the following Wednesday on the foaf-protocols mailing list.
- Tuesday 29 September, 2009
- La Cantine organised another Bar Camp, on a wide range of topics, which I blogged about in detail. There I met people from Google, Firefox, and reconnected up with others. We also had a more open round table discussion on the Social Web.
- Thursday 1st and Friday 2nd October, 2009
- I visited the Open World Forum, which started among others with a track on the Semantic Desktop "Envisioning the Open Desktop of the future", headed by Prof Stefan Decker, with examples of implementations in the latest KDE (K Desktop Environment).
I met a lot of people here, including Eric Mahé, previously Technology Advisor at Sun Microsystems France. In fact I met so many people that I missed most of the talks. One really interesting presentation, by someone from a major open source code search engine, explained that close to 60% of Open Source software came from Eastern and Western Europe combined. (Anyone with a link to the talk?)
- Saturday, 3rd October 2009
- I presented The Social Web in French at the Open Source Developer Conference France which took place in La Villette.
I was really happily surprised to find that I was part of a 3 hour track dedicated to the Semantic Web. This started with a talk by Oliver Berger, "Bugtracking sur le web sémantique". Oliver has been working on the Baetle ontology as part of the 2 year government financed HELIOS project. This is something I talked about a couple of years ago and wrote about here in my presentation Connecting Software and People. It is really nice to see this evolving. I really look forward to seeing the first implementations :-)
Oliver's was followed by a talk by Jean-Marc Vanel, introducing Software and Ontology Development, who introduced many of the key Semantic Web concepts.
- Tuesday 6th October, morning
- Milan Stankovitch whom I had met at the European Semantic Web Conference, and again at the Social Web Bar Camp, invited me to talk to the developers of hypios.com, a very interesting web platform to help problem seekers find problem solvers. The introductory video is really worth watching. I gave them the talk I keep presenting, but with a special focus on how this could help them in the longer term make it easier for people to join and use their system.
- Tuesday 6th October, afternoon
- I talked and participated in a couple of round table talks at the 2nd Project Accelerator on Identity at the University of Paris 1, organised by the FING. Perhaps the most interesting talk there was the one by François Hodierne , who works for the Open Source Web Applications & Platforms company h6e.net, and who presented the excellent project La Distribution whose aim it is to make installing the most popular web applications as easy as installing an app on the iPhone. This is the type of software needed to make The Internet of Subjects Manifesto a reality. In a few clicks everyone should be able to get a domain name, install their favorite web software on it - Wordpress, mail, wikis, social network, photo publishing tool - and get on with their life, whilst owning their data, so that if they at a later time find the need to move, they can, and so that nobody can kick them off their network. This will require rewriting a little each of the applications so as to enable them to work with the distributed secure Social Web, made possible by foaf+ssl: an application without a social network no longer being very valuable.
- Thursday 9th October, 2009
- Pierre Antoine Champin from the CNRS, the National French Research organisation, had invited me to Lyon to present The Social Web. So I took the TGV from Paris at 10:54 and was there 2 hours later, which by car would have been a distance of 464km (288.3 miles) according to Google Maps. The talk was very well attended with close to 50 students showing up, and the session lasted two full hours: 1 hour of talks followed by many good questions.
After a chat and a few beers, I took the train back to Paris, where it arrived just after 10pm.
- Saturday October 10, 2009
- I gave a talk on the Social Web at Paris-Web, on the last day of a 3 day conference. This again went very well.
After lunch I attended two very good talks that complemented mine perfectly:
- David Larlet had a great presentation on Data Portability, which sparked a very lively and interesting discussion. Issues of Data ownership, security, confidentiality, centralization versus decentralization came up. One of his slides made the point very well, by showing the number of Web 2.0 sites that no longer exist, some of them having disappeared by acquisition, others simply through technical meltdown, leaving the data of all their users lost forever. (Also see David's Blog summary of Paris-Web.)
- Right after coffee we had a great presentation on the Semantic Web by Fabien Gandon, who managed to give, in the limited amount of time available to him, an overview of the Semantic Web stack from bottom to top, including OWL 1 and 2, Microformats, RDFa, and Linked data, and various very cool applications of it, so that even I learned a lot. His slides are available here. He certainly inspired a lot of people.
- Tuesday, 13 October 2009
- Finally I presented at the hacker space La suite Logique, which takes place in a very well organized very low cost lodging space in Paris. They had presentations on a number of projects happening there:
- One project is to build a grid by taking pieces from the remains of computers that people have brought them. They have a room stashed full of those.
- Another project is to add wifi to the lighting to remotely control the projectors for theatrical events taking place there.
- There was some discussion on how to add sensors to dancers, as one Daito Manabe a Japanese artist has done, in order to create a high tech butoh dance (see the great online videos).
- Three engineers presented the robots they are constructing for a well known robot fighting competition
Posted at 07:16PM Oct 12, 2009 [permalink/trackback] by Henry Story in travel | Comments[0]
Sketch of a RESTful photo Printing service with foaf+ssl
Let us imagine a future where you own your data. It's all on a server you control, under a domain name you own, hosted at home, in your garage, or on some cloud somewhere. Just as your OS gets updates, so all your server software will be updated, and patched automatically. The user interface for installing applications may be as easy as installing an app on the iPhone ( as La Distribution is doing).
A few years back, with one click, you installed a myPhoto service, a distributed version of fotopedia. You have been uploading all your work, social, and personal photos there. These services have become really popular and all your friends are working the same way too. When your friends visit you, they are automatically and seamlessly recognized using foaf+ssl in one click. They can browse the photos you made with them, share interesting tidbits, and more... When you organize a party, you can put up a wiki where friends of your friends can have write access, leave notes as to what they are going to bring, and whether or not they are coming. Similarly your colleagues have access to your calendar schedule, your work documents and your business related photos. Your extended family, defined through a linked data of family relationship (every member of your family just needs to describe their relation to their close family network) can see photos of your family, see the videos of your new born baby, and organize Christmas reunions, as well as tag photos.
One day you wish to print a few photos. So you go to web site we will provisionally call print.com. Print.com is neither a friend of yours, nor a colleague, nor family. It is just a company, and so it gets minimal access to the content on your web server. It can't see your photos, and all it may know of you is a nickname you like to use, and perhaps an icon you like. So how are you going to allow print.com access to the photos you wish to print? This is what I would like to try to sketch a solution for here. It should be very simple, RESTful, and work in a distributed and decentralized environment, where everyone owns and controls their data, and is security conscious.
Before looking at the details of the interactions detailed in the UML Sequence diagram below, let me describe the user experience at a general level.
- You go to the print.com site after clicking on a link a friend of yours suggested on a blog. On the home web page is a button you can click to add your photos.
- You click it, and your browser asks you which WebID you wish to use to identify yourself. You choose your personal ID, as you wish to print some personal photos of yours. Having done that, you are authenticated, and print.com welcomes you using your nickname and displays your icon on the resulting page.
- When you click a button that says "Give Print.com access to the pictures you wish us to print", a new frame is opened on your web site
- This frame displays a page from your server, where you are already logged in. The page recognized you and asks if you want to give print.com access to some of your content. It gives you information about print.com's current stock value on NASDAQ, and recent news stories about the company. There is a link to more information, which you don't bother exploring right now.
- You agree to give Print.com access, but only for 1 hour.
- When your web site asks you which content you want to give it access to, you select the pictures you would like it to have. Your server knows how to do content negotiation, so even though copying each one of the pictures over is feasible, you'd rather give print.com access to the photos directly, and let the two servers negotiate the best representation to use.
- Having done that you drag and drop an icon representing the set of photos you chose from this frame to a printing icon on the print.com frame.
- Print.com thanks you, shows you icons of the pictures you wish to print, and tells you that the photos will be on their way to the address of your choosing within 2 hours.
In more detail then we have the following interactions:
- Your browser GETs print.com's home page, which returns a page with a "publish my photos" button.
- You click the button, which starts the foaf+ssl handshake. The initial ssl connection requests a client certificate, which leads your browser to ask for your WebID in a nice popup as the iPhone can currently do. Print.com then dereferences your WebId in (2a) to verify that the public key in the certificate is indeed correct. Your WebId (Joe's foaf file) contains information about you, your public keys, and a relation to your contact addition service. Perhaps something like the following:
:me xxx:contactRegistration </addContact> .
Print.com uses this information when it creates the resulting html page to point you to your server.
- When you click the "Give Print.com access to the pictures you wish us to print" button, you are sending a POST form to the <addContact> resource on your server, with the WebId of Print.com, <https://nasdaq.com/co/PRNT#co>, in the body of the POST. The results of this POST are displayed in a new frame.
- Your web server dereferences Print.com, where it gets some information about it from the NASDAQ URL. Your server puts this information together (4a) in the html it returns to you, asking what kind of access you want to give this company, and for how long you wish to give it.
- You give print.com access for 1 hour by filling in the forms.
- You give access rights to Print.com to your individual pictures using the excellent user interface available to you on your server.
- When you drag and drop the resulting icon depicting the collection of the photos accessible to Print.com, onto its "Print" icon in the other frame - which is possible with html5 - your browser sends off a request to the printing server with that URL.
- Print.com dereferences that URL which is a collection of photos it now has access to, and which it downloads one by one. Print.com had access to the photos on your server after having been authenticated with its WebId using foaf+ssl. (note: your server did not need to GET print.com's foaf file, as it still had a fresh version in its cache). Print.com builds small icons of your photos, which it puts up on its server, and then links to in the resulting html before showing you the result. You can click on those previews to get an idea what you will get printed.
So all the above requires very little in addition to foaf+ssl. Just one relation, to point to a contact-addition POST endpoint. The rest is just good user interface design.
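The time limited access given in steps 5 and 6, and checked in step 8, could be modelled on your server along the following lines. This is a minimal Python sketch of my own: the Grant record, the function names, the photo URLs and the clock values are all invented for illustration, and it assumes the WebID of the requester has already been verified with foaf+ssl before the check is made.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

PRINT_COM = "https://nasdaq.com/co/PRNT#co"  # WebId used in the steps above


@dataclass(frozen=True)
class Grant:
    agent: str            # WebID of the party being given access
    resources: frozenset  # URLs the agent may read
    expires_at: datetime  # moment after which the grant is void


def grant_access(agent, resources, duration, now):
    """Record that agent may read the given resources for a limited time."""
    return Grant(agent, frozenset(resources), now + duration)


def is_allowed(grant, agent, resource, now):
    """Check a request from an already authenticated agent against a grant."""
    return (
        agent == grant.agent
        and resource in grant.resources
        and now < grant.expires_at
    )


# Hypothetical photo URLs on your own server.
photos = ["https://joe.example/photos/1.jpg", "https://joe.example/photos/2.jpg"]
start = datetime(2009, 10, 7, 21, 15, tzinfo=timezone.utc)
grant = grant_access(PRINT_COM, photos, timedelta(hours=1), start)

print(is_allowed(grant, PRINT_COM, photos[0], start + timedelta(minutes=30)))
print(is_allowed(grant, PRINT_COM, photos[0], start + timedelta(hours=2)))
```

The first check falls inside the hour and the second outside it, so Print.com can fetch the photos while it prepares the order but not afterwards.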
What do you think? Have I forgotten something obvious here? Is there something that won't work? Comment on this here, or on the foaf-protocols mailing list.
Notes

print.com sequence diagram by Henry Story is licensed under a Creative Commons Attribution 3.0 United States License.
Based on a work at blogs.sun.com.
Posted at 09:15PM Oct 07, 2009 [permalink/trackback] by Henry Story in SemWeb | Comments[0]
The foaf+ssl world tour
As you can see from the map here I have been cycling from Fontainebleau to Vienna (covering close to 1000km of road), and now around Cyprus in my spare time. On different occasions along my journey I had the chance to present foaf+ssl and combine it with a hands on session, where members of the audience were encouraged to create their own foaf file and certificates, and also start looking into what it takes to develop foaf+ssl enabled services. This seems like a very good way to proceed: it helps people get some hands on experience which they can then hopefully pass on to others, it helps me prioritize what needs to be done next, and should also lead to the development of foaf+ssl services that will increase the network value of the community, creating I hope a viral effect.
I started this cycle tour in order to lose some weight. I still have 10kg to lose or so, which at the rate of 3kg per 1000km will require me to cycle another 3000km. So that should enable me to visit quite a few places yet. I will be flying back to Vienna where I will stay 10 days or so, after which I will cycle to Prague for a Kiwi meeting on the 3rd of July. After that I could cycle on to Berlin. But really it's up to you to decide. If you know a good hacker group that I can present to and cycle to, let me know, and I'll see how I can fit it into my timetable. So please get in contact! :-)
Posted at 12:21PM Jun 11, 2009 [permalink/trackback] by Henry Story in travel | Comments[5]
You are a Terrorist!
Every country in Europe seems to be on the verge of introducing extremely powerful legislation for state monitoring of the internet, bringing us a lot closer to the dystopia described in George Orwell's novel Nineteen Eighty Four. Under the guise of laws to help combat terrorism or pedophilia - emotional subjects that immediately get everybody's unthinking assent - massive powers are to be given to the state, which could very easily be misused. As internauts we all need to make it our duty to follow very closely these debates, and participate actively in them, if we do not want to find ourselves waking up one morning in a world that is the exact opposite of what we have been dreaming of.
Germany
In Germany a new Data Retention law, apparently already passed in 2008, allows the state (quote)
to trace who has contacted whom via telephone, mobile phone or e-mail for a period of six months. In the case of mobile calls or text messages via mobile phone, the user's location is also logged. Anonymising services will be prohibited as of 2009.
To increase awareness of this law Alexander Lehmann put together this excellent presentation, with English subtitles, Du bist Terrorist!:
Du bist Terrorist (You are a Terrorist) english subtitles from lexela on Vimeo.
France
The passage of the Hadopi law in France will create a strong incentive for citizens to place state built snooper software on each of their computers in order to make it possible to defend themselves against accusations of copyright infringement. But that is nothing compared to the incredibly broad powers the state wishes to give itself with the Loppsi 2 law (detailed article in Le Monde, and Ars Technica) which would give the president the power to insert spyware onto users' computers (which could record anything being done of course), create a very large database of people's activities, help link information from various databases, and much more... The recent case of the sacking of the web site director of the once national, now private, TF1 television channel for having communicated his doubts on Hadopi privately to his Member of Parliament - as reported on Slashdot recently - does not give one much faith in the way privacy is being handled currently by the government.
The United Kingdom
In the UK the Home Secretary Jacqui Smith had proposed to create a database dubbed Big Brother to log every single activity of every one of its citizens - in order of course to root out the very 21st century crimes of pedophilia and terrorism (did the IRA not operate before the internet? Are pedophile rings something that only emerged with the internet, or is it that they just became more visible?). She had to pull back somewhat from the initial proposal, and now wishes all that information still to be tracked, but only to be kept on the service provider's databases as reported by the Daily Mail, The Telegraph, The Independent...
Conclusion
So are we now all suspected terrorists, pornographers, pedophiles, murderers, subversives, ... that the governments must know all about us? We may have voted for the current government and have complete faith in their use of these tools. But what when the opposition comes in, and takes hold of those same powers? Will we be as comfortable then? The excellent 2006 film The Lives of Others shows just how intrusive the East German state was on its own citizens during the cold war - and that with the very limited tools they had available. With modern computing tools, that type of spy operation could be done at much much lower cost and so perhaps even be viable for the state.
If you feel things just can't go this wrong, then I would also recommend watching Julie Taymor's adaptation of Shakespeare's Titus Andronicus. It really is important to realize that things can go badly, very very badly wrong. Ignoring a problem and not taking responsibility for fighting it will lead to disaster, as the current economic crisis - predicted years before it occurred, but without any action being taken - should have amply proven by now. Sadly for people who predict danger, if people do act on the danger and avoid it, nobody may even notice how close to danger they really were. So our actions may remain unsung. But at least we may put some chances on our side not to wake up in a new form of dictatorship, worse than any ever dreamed of by those who helped forge our democracies.
Posted at 09:39AM May 20, 2009 [permalink/trackback] by Henry Story in Art | Comments[0]
FOAF+SSL: RESTful Authentication for the Social Web
The European Semantic Web Conference (ESWC) will be held in Heraklion on the Island of Crete in Greece from 31 May to 4 June. I will be presenting the paper "FOAF+SSL: RESTful Authentication for the Social Web" which I co-authored with Bruno Harbulot, Ian Jacobi and Mike Jones. Here is the abstract:
We describe a simple protocol for RESTful authentication, using widely deployed technologies such as HTTP, SSL/TLS and Semantic Web vocabularies. This protocol can be used for one-click sign-on to web sites using existing browsers — requiring the user to enter neither an identifier nor a password. Upon this, distributed, open yet secure social networks and applications can be built. After summarizing each of these technologies and how they come together in FOAF+SSL, we describe declaratively the reasoning of a server in its authentication decision. Finally, we compare this protocol to others in the same space.
The paper was accepted by the Trust and Privacy on the Social and Semantic Web track of the ESWC. There are quite a number of interesting papers there.
I have never been to Greece, so I have a feeling I will really enjoy this trip. Hope to see many of you there.
Posted at 11:54PM May 14, 2009 [permalink/trackback] by Henry Story in SemWeb | Comments[4]
A Simple foaf+ssl Identity Provider (IdP)
In order to help people get started with foaf+ssl, we have put together a very simple Identity Provider service (IdP). This removes the need for web services to have to deal with setting up https certificates and changing much to their current web setup. With a few lines of server side code any server can now easily find the WebId of a user, and try out some interesting ideas at little cost. If the experiment is useful, for extra security and reliability a business case can then be made for integrating a full foaf+ssl stack.
The protocol is very much as we outlined in an earlier post entitled "Sketch of a foaf+ssl+openid service". The details of the API are listed directly on the root of the first foaf+ssl IdP service, available here: https://foafssl.org/srv/idp. All the Service Provider - that is the consumer of the IdP - needs to do is to add a login button or link to his web page that points to the above IdP with an authreqissuer=$url parameter that points back to a CGI controlled by the Service Provider that can parse the redirect containing the user's WebId. That url comes with a timestamp to avoid replay attacks, and is signed to assure authenticity.
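To make the Service Provider side a little more concrete, here is a minimal sketch in Python of the two things it has to do: build the login link, and reject stale redirects. Only the IdP URL and the authreqissuer parameter come from the description above. The callback URL, the function names and the five minute window are assumptions made for illustration, and checking the signature on the redirect is deliberately left out, since its exact format is documented on the IdP itself rather than here.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlencode, urlparse

IDP_URL = "https://foafssl.org/srv/idp"  # the IdP named in this post


def login_link(callback_url: str) -> str:
    """Build the href for the login button: the IdP URL plus an
    authreqissuer parameter pointing back at our own callback."""
    return IDP_URL + "?" + urlencode({"authreqissuer": callback_url})


def is_fresh(timestamp: datetime, now: datetime,
             max_age: timedelta = timedelta(minutes=5)) -> bool:
    """Accept a redirect only if its timestamp is recent (and not in the
    future). The five minute window is an assumed value, not part of the
    IdP's documentation."""
    return timedelta(0) <= now - timestamp <= max_age


callback = "https://example.org/cgi/login"  # hypothetical Service Provider CGI
link = login_link(callback)

# Round trip: the IdP can read the callback back out of the query string.
assert parse_qs(urlparse(link).query)["authreqissuer"] == [callback]

now = datetime(2009, 5, 12, 12, 0, tzinfo=timezone.utc)
assert is_fresh(now - timedelta(minutes=1), now)
assert not is_fresh(now - timedelta(hours=1), now)
```

The login link can be completely static html, which is the point: nothing on the page depends on who the user is.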
Bruno Harbulot wrote the code, which is published under a BSD licence by the University of Manchester, where he studies. The code is available on the So(m)mer Subversion repository. You can download it with:
$ svn checkout https://sommer.dev.java.net/svn/sommer/foafssl/trunk foafssl --username guest
and start your own IdP if you want. Please feel free to contribute back improvements, or ping us for missing features.
Update September 14, 2009
The IdP is now RDFa enabled, using Damian Steer's RDFa parser for Jena which I ported to Sesame. The war file can be downloaded directly from the dev.java.net Maven repository. To set up your own IdP use that WAR and follow the foaf+ssl setup instructions for Tomcat. This war may only work for Tomcat 7.
Posted at 12:56PM May 12, 2009 [permalink/trackback] by Henry Story in SemWeb | Comments[0]
Sun Initiates Social Web Interest Group
I am very pleased to announce that Sun Microsystems is one of the initiating members of the Social Web Incubator Group launched at the W3C.
Quoting from the Charter:
The mission of the Social Web Incubator Group, part of the Incubator Activity, is to understand the systems and technologies that permit the description and identification of people, groups, organizations, and user-generated content in extensible and privacy-respecting ways.
The topics covered with regards to the emerging Social Web include, but are not limited to: accessibility, internationalization, portability, distributed architecture, privacy, trust, business metrics and practices, user experience, and contextual data. The scope includes issues such as widget platforms (such as OpenSocial, Facebook and W3C Widgets), as well as other user-facing technology, such as OpenID and OAuth, and mobile access to social networking services. The group is concerned also with the extensibility of Social Web descriptive schemas, so that the ability of Web users to describe themselves and their interests is not limited by the imagination of software engineers or Web site creators. Some of these technologies are independent projects, some were standardized at the IETF, W3C or elsewhere, and users of the Web shouldn't have to care. The purpose of this group is to provide a lightweight environment designed to foster and report on collaborations within the Social Web-related industry or outside which may, in due time affect the growth and usability of the Social Web, rather than to create new technology.
I am glad we are supporting this along with these other prestigious players:
- ASemantics
- Boeing
- Cisco
- DERI Galway at the National University of Ireland, Galway, Ireland
- Garlik
- Institut National de Recherche en Informatique et en Automatique (INRIA)
- Institute of Informatics and Telecommunications (IIT), NCSR
- NICTA
- Rochester Institute of Technology
- SUN Microsystems
- Talis
- Telecom Italia
- University of Bristol
- University of Edinburgh
- Universidad Politécnica de Madrid
- University of Versailles
- Vrije Universiteit
- Vodafone
This should certainly help create a very interesting forum for discussing what I believe is one of the most important issues on the web today.
Posted at 10:22AM Apr 07, 2009 [permalink/trackback] by Henry Story in SemWeb | Comments[4]
sketch of a foaf+ssl+openid service
Discussing foaf+ssl with Melvin Carvalho he pointed out that we need a service to help non https enabled servers to participate in our distributed open secure social network. This discussion led me to sketch out the following simple protocol, where I make use of parts of the OpenId protocol at key points. This results in something that does what OpenId does, but without the need for users to remember their URL, and so without many of the problems that plague that protocol. And all this with minimal protocol invention.
So first here is the UML sequence diagram for what I am calling here tentatively foaf+ssl+openid.
- First Romeo arrives on a public page with a login button.
- On an OpenId server there would be a field for the user to enter their ID; with foaf+ssl this is not needed. So we have a simple login button.
- That button's action attribute points to some foaf+ssl+openid service that the server trusts (it is therefore an https URL). It can be any such service. In OpenId the Id entered by the user points the server to a web page that points the service to an openid server the user (Romeo here) trusts. All of this is no longer needed with this protocol. The html for the login button can be static.
- The URL has to encode information for the foaf+ssl service to know who to contact back. One should use exactly the same URL format here as OpenId does (minus the need to encode the User's URL, since that will be in the X509 certificate).
- When Romeo clicks the login button he opens an https request to the foaf+ssl+openid service.
- The foaf+ssl+openid service on opening the connection asks for the client's certificate after sending its own. This would contain
  - The User's Public key
    Subject Public Key Info:
        Public Key Algorithm: rsaEncryption
        RSA Public Key: (1024 bit)
            Modulus (1024 bit):
                00:b6:bd:6c:e1:a5:ef:51:aa:a6:97:52:c6:af:2e:
                71:94:8a:b6:da:9e:5a:5f:08:6d:ba:75:48:d8:b8:
                01:50:d3:92:11:7d:90:13:89:48:06:2e:ec:6e:cb:
                57:45:a4:54:91:ee:a0:3a:46:b0:a1:c2:e6:32:4d:
                54:14:4f:42:cd:aa:05:ca:39:93:9e:b9:73:08:6c:
                fe:dc:8e:31:64:1c:f7:f2:9a:bc:58:31:0d:cb:8e:
                56:d9:e6:da:e2:23:3a:31:71:67:74:d1:eb:32:ce:
                d1:52:08:4c:fb:86:0f:b8:cb:52:98:a3:c0:27:01:
                45:c5:d8:78:f0:7f:64:17:af
            Exponent: 65537 (0x10001)
  - The Subject's Alternative Name WebId
    X509v3 extensions:
        ...
        X509v3 Subject Alternative Name:
            URI:http://romeo.net/#romeo
- The server looks in the client certificate for the Subject Alternative Name in the X509v3 extensions, and fetches the foaf file at that URL
- The service then does a simple match on the information from the foaf file and the information from the certificate. If they match the foaf+ssl+openid service knows that the user <http://romeo.net/#romeo> controls the <http://romeo.net/> web page. This is enough for simple authentication. (For more on this see Creating a Web of Trust without Key Signing Parties )
- Depending on the result, the foaf+ssl+openid service can return a redirect with an authentication token to the original service Romeo wanted to log into. This can also be done using the patterns developed in the OpenId community.
- The browser then redirects to the Original service.
- The service now has Romeo's URL. But to avoid a man in the middle attack, or replay attacks it follows the OpenId protocol and does a little check with its service on a token sent to it in the redirect in step 6.
((Perhaps this step could be avoided if the foaf+ssl+openid service made public its public key, and encrypted some token sent by the client to the server. But we could just stick closely to the well trodden OpenId path and just reuse their libraries.))
- Having verified the identity of the user, the service could optionally GET the user's foaf file, for public information about him.
- Or it could check the relation that user has to its trusted graph of friends,
- and return a personalised resource
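The "simple match" between certificate and foaf file in the list above can be sketched as follows. This is only an illustration under stated assumptions: the RsaKey class, the authenticate function and the profile_keys dictionary (standing in for the result of fetching and parsing the foaf file) are hypothetical names, and the modulus is a shortened made-up number rather than the one in the certificate dump.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class RsaKey:
    """An RSA public key, reduced to the two numbers that identify it."""
    modulus: int
    exponent: int


def authenticate(cert_webid: str, cert_key: RsaKey,
                 profile_keys: Dict[str, Set[RsaKey]]) -> bool:
    """Return True if the profile document lists the certificate's key
    as belonging to the WebId found in the Subject Alternative Name.

    profile_keys is an assumed stand-in for the parsed foaf file: it maps
    each WebId described there to the public keys the file claims for it.
    """
    return cert_key in profile_keys.get(cert_webid, set())


romeo = "http://romeo.net/#romeo"
romeo_key = RsaKey(modulus=0xB6BD6CE1A5EF51, exponent=65537)  # shortened, illustrative
foaf_file = {romeo: {romeo_key}}

assert authenticate(romeo, romeo_key, foaf_file)
# Someone presenting a different key under Romeo's WebId is refused.
assert not authenticate(romeo, RsaKey(modulus=12345, exponent=65537), foaf_file)
```

The SSL handshake has already proven that the client holds the private key for the certificate, so all that is left for the service is this lookup.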
One could also imagine a foaf+ssl+openid server enabled with attribute exchange functionality, which it could get access to simply by reading the foaf file.
I am not sure how much of a problem it really is for servers not to have SSL access. But this could easily fill that gap.
Posted at 06:35PM Feb 12, 2009 [permalink/trackback] by Henry Story in SemWeb | Comments[1]
Typealizer: analyzing your personality through your blog
Thanks to Mark Dixon I discovered Typealizer, a service that reads your blog and finds your psychological type. So of course I tried it on my own blog, as you will on yours shortly :-) . This is what it had to say:
INTJ - The Scientists
The long-range thinking and individualistic type. They are especially good at looking at almost anything and figuring out a way of improving it - often with a highly creative and imaginative touch. They are intellectually curious and daring, but might be pshysically hesitant to try new things.
The Scientists enjoy theoretical work that allows them to use their strong minds and bold creativity. Since they tend to be so abstract and theoretical in their communication they often have a problem communcating their visions to other people and need to learn patience and use conrete examples. Since they are extremly good at concentrating they often have no trouble working alone.
Well that's not bad for flattery. So I reward them with this blog post.
They accompany their analysis with a brain activity diagram. This is the one I got:
There is a lot in the cross section intuition and thinking, with some but not a lot of positioning in the practical.
So being all happily scientifical I decided to try out what it would say if I pointed Typealizer to the Travel category on this blog. This is what it has to say on that aspect of my personality, which, it is perhaps true, has been a little in retreat recently.
ESTP - The Doers
The active and play-ful type. They are especially attuned to people and things around them and often full of energy, talking, joking and engaging in physical out-door activities.
The Doers are happiest with action-filled work which craves their full attention and focus. They might be very impulsive and more keen on starting something new than following it through. They might have a problem with sitting still or remaining inactive for any period of time.
This also came with a brain activity diagram for that part of the blog
So clearly a lot more biased towards action, as a travel blog should.
Still both of these blogs are not allowing me to capture around half of my brain activity. The spiritual idealistic side is not very visible. I wonder if that means I should speak more about open source and linux? ;-) I tried the Art category of my blog but that did not move me more to the feeling type, nor did the philosophy section make me more idealistic, just again more of a thinker, which they characterise like this:
INTP - The Thinkers
The logical and analytical type. They are espescially attuned to difficult creative and intellectual challenges and always look for something more complex to dig into. They are great at finding subtle connections between things and imagine far-reaching implications.
They enjoy working with complex things using a lot of concepts and imaginative models of reality. Since they are not very good at seeing and understanding the needs of other people, they might come across as arrogant, impatient and insensitive to people that need some time to understand what they are talking about.
Now what could be interesting would be some way then to do the inverse search. Find out what your brain's activity diagram should look like, and ask to find blogs that fit those categories, which one could then use as a guide to help one develop that aspect of one's personality - or find a partner :-)
Ps. a thought: after categorizing people into 16 different groups this still leaves you with 8 billion people/16 = 500 million people to choose from, and if every person just had 1000 web pages that would leave you with half a trillion pages to look at. So this character analysis can be useful, but there still have to be a lot of other criteria to make a good judgement call.
PPS. Oddly enough - or not - Ken Wilber's blog is categorised as being of the "executive type".
Posted at 11:02AM Dec 13, 2008 [permalink/trackback] by Henry Story in General | Comments[2]
variation on @timoreilly: hyperdata is the new intel outside
Context: Tim O'Reilly said "Data is the new Intel Inside".
Recently in a post "Why I love Twitter":
What's different, of course, is that Twitter isn't just a protocol. It's also a database. And that's the old secret of Web 2.0, Data is the Intel Inside. That means that they can let go of controlling the interface. The more other people build on Twitter, the better their position becomes.
The meme was launched in the well known "What is Web 2.0" paper in the section entitled "Data is the next Intel Inside"
Applications are increasingly data-driven. Therefore: For competitive advantage, seek to own a unique, hard-to-recreate source of data.
Most of the data is outside your database. It can only be that way, the world is huge, and you are just one small link in the human chain. Linking that data is knowledge and value creation. Hyperdata is the foundation of Web 3.0.
Posted at 03:19PM Nov 30, 2008 [permalink/trackback] by Henry Story in SemWeb | Comments[0]
REST APIs must be hypertext driven
Roy Fielding recently wrote in "REST APIs must be hypertext-driven"
I am getting frustrated by the number of people calling any HTTP-based interface a REST API. Today's example is the SocialSite REST API. That is RPC. It screams RPC. There is so much coupling on display that it should be given an X rating.
That was pretty much my thought when I saw that spec. In a comment to his post he continues.
The OpenSocial RESTful protocol is not RESTful. It could be made so with some relatively small changes, but right now it is just wrapping RPC results in common Web media types.
Clarification of Roy's points
Roy then goes on to list some key criteria for what makes an application RESTful.
A REST API should not be dependent on any single communication protocol, though its successful mapping to a given protocol may be dependent on the availability of metadata, choice of methods, etc. In general, any protocol element that uses a URI for identification must allow any URI scheme to be used for the sake of that identification.
In section 2.2 of the O.S. protocol we have the following JSON representation for a Person.
{
  "id" : "example.org:34KJDCSKJN2HHF0DW20394",
  "displayName" : "Janey",
  "name" : {"unstructured" : "Jane Doe"},
  "gender" : "female"
}
Note that the id is not a URI. Further down in the XML version of the above JSON, it is made clear that by prepending "urn:guid:" you can turn this string into a URI. By doing this the protocol has in essence tied itself to a URI scheme, since there is no way of expressing another URI type in the JSON - the JSON being the key representation in this Javascript specific API by the way, the aim of the exercise being to make the writing of social network widgets interoperable. Furthermore this scheme has some serious limitations: for example it limits one to 1 social network per internet domain, is tied to a quite controversial XRI spec that has been rejected by OASIS, and does not provide a clear mechanism for retrieving information about it. But that is not the point. The definition of the format is tying itself unnecessarily to a URI scheme, and moreover one that ties one to what is clearly a client/server model.
A REST API should not contain any changes to the communication protocols aside from filling-out or fixing the details of underspecified bits of standard protocols, such as HTTP's PATCH method or Link header field.
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]
Most of these so called RESTful APIs spend a huge amount of time specifying what response a certain resource should give to a certain message. Note for example section 2.1 entitled Responses
A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC's functional coupling].
In section 6.3 one sees this example:
/activities/{guid}/@self -- Collection of activities generated by given user
/activities/{guid}/@self/{appid} -- Collection of activities generated by an app for a given user
/activities/{guid}/@friends -- Collection of activities for friends of the given user {guid}
/activities/{guid}/@friends/{appid} -- Collection of activities generated by an app for friends of the given user {guid}
/activities/{guid}/{groupid} -- Collection of activities for people in group {groupid} belonging to given user {uid}
/activities/{guid}/{groupid}/{appid} -- Collection of activities generated by an app for people in group {groupid} belonging to given user {uid}
/activities/{guid}/@self/{appid}/{activityid} -- Individual activity resource; usually discovered from collection
/activities/@supportedFields -- Returns all of the fields that the container supports on activity objects as an array in json and a repeated list in atom.
For some reason it seems that this protocol does require a very precise layout of the patterns of URLs. Now it is true that this is then meant to be specified in an XRDS document. But this document is not linked to from any of the representations as far as I can see. So there is some "out of band" information exchange that has happened and on which the rest of the protocol relies. Furthermore it ties the whole service again to one server. How open is a service which ties you to one server?
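To illustrate the alternative Roy is asking for, here is a small sketch in Python of a client that never builds a URL from a pattern, but only follows links found in the representations it receives. Everything in it is made up for the purpose: the two example servers, the "activities" relation name, the dictionary shaped representations and the follow function are assumptions, not part of OpenSocial or any other spec.

```python
from typing import Callable, Dict, Optional

# A toy "web": each URI maps to a representation carrying typed links.
# The two servers lay out their URI space completely differently.
WEB: Dict[str, dict] = {
    "http://a.example/people/romeo": {
        "links": {"activities": "http://a.example/people/romeo/activities"},
    },
    "http://b.example/u?id=42": {
        "links": {"activities": "http://feeds.b.example/42.atom"},
    },
}


def fetch(uri: str) -> dict:
    """Stand-in for an HTTP GET followed by parsing the representation."""
    return WEB[uri]


def follow(get: Callable[[str], dict], start_uri: str, rel: str) -> Optional[str]:
    """Return the URI the server itself advertises for the relation `rel`,
    or None if it offers no such link. No URI is ever constructed here."""
    return get(start_uri).get("links", {}).get(rel)


# The same client code works against both servers, because it relies on
# the relation name and not on where each server chose to put things.
assert follow(fetch, "http://a.example/people/romeo", "activities") == \
    "http://a.example/people/romeo/activities"
assert follow(fetch, "http://b.example/u?id=42", "activities") == \
    "http://feeds.b.example/42.atom"
```

Either server can reorganise its namespace tomorrow without breaking the client, which is exactly the freedom the fixed /activities/{guid}/... patterns take away.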
A REST API should never have "typed" resources that are significant to the client. Specification authors may use resource types for describing server implementation behind the interface, but those types must be irrelevant and invisible to the client. The only types that are significant to a client are the current representation's media type and standardized relation names. [ditto]
Now clearly one does want to have URIs name resources, things, and these things have types. I think Roy is here warning against the danger that expectations are placed on types that depend on the resources themselves. This seems to be tied to the previous point that one should not have fixed resource names or hierarchies as we saw above. To see how this is possible check out my foaf file:
$ cwm http://bblfish.net/people/henry/card --ntriples | grep knows | head
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://axel.deri.ie/~axepol/foaf.rdf#me> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://b4mad.net/FOAF/goern.rdf#goern> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://bigasterisk.com/foaf.rdf#drewp> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://crschmidt.net/foaf.rdf#crschmidt> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://danbri.org/foaf.rdf#danbri> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://data.boab.info/david/foaf.rdf#me> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://davelevy.info/foaf.rdf#me> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://dblp.l3s.de/d2r/page/authors/Christian_Bizer> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/James_Gosling> .
<http://bblfish.net/people/henry/card#me> <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/Roy_Fielding> .
Notice that there is no pattern in the URIs to the right. (As it happens there are no ftp URLs there, but it would work just as well if there were). Yet the Tabulator extension for Firefox knows from the relations above alone that (if it believes my foaf file of course) the URIs to the right refer to people. This is because the foaf:knows relation is defined as
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
foaf:knows a rdf:Property, owl:ObjectProperty;
    :comment "A person known by this person (indicating some level of reciprocated interaction between the parties).";
    :domain <http://xmlns.com/foaf/0.1/Person>;
    :isDefinedBy <http://xmlns.com/foaf/0.1/>;
    :label "knows";
    :range foaf:Person .
This information can then be used by a reasoner (such as the javascript one in the tabulator) to deduce that the resources pointed to by the URIs to the right and to the left of the foaf:knows relation are members of the foaf:Person class.
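The deduction described here is small enough to sketch in a few lines of Python. This is not the Tabulator's reasoner, just an illustration of the domain and range rule over plain tuples; the infer_types function and the DOMAINS and RANGES dictionaries are names invented for this sketch, filled in by hand from the definition of foaf:knows.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# What the ontology says about foaf:knows: its domain and its range.
DOMAINS: Dict[str, str] = {FOAF + "knows": FOAF + "Person"}
RANGES: Dict[str, str] = {FOAF + "knows": FOAF + "Person"}


def infer_types(triples: Iterable[Triple],
                domains: Dict[str, str],
                ranges: Dict[str, str]) -> Set[Triple]:
    """Apply the two rules: the subject of a property is a member of its
    domain, and the object of a property is a member of its range."""
    inferred: Set[Triple] = set()
    for subject, prop, obj in triples:
        if prop in domains:
            inferred.add((subject, RDF_TYPE, domains[prop]))
        if prop in ranges:
            inferred.add((obj, RDF_TYPE, ranges[prop]))
    return inferred


henry = "http://bblfish.net/people/henry/card#me"
danbri = "http://danbri.org/foaf.rdf#danbri"
facts = infer_types([(henry, FOAF + "knows", danbri)], DOMAINS, RANGES)

# Both ends of the relation are deduced to be people, whatever their URIs look like.
assert (henry, RDF_TYPE, FOAF + "Person") in facts
assert (danbri, RDF_TYPE, FOAF + "Person") in facts
```

Note that nothing in the function looks inside the URIs: the types come from the relation, not from any naming pattern.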
Note also that there is no knowledge as to how those resources are served. In many cases they may be served by simple web servers sending resources back. In other cases the RDF may be generated by a script. Perhaps the resources could be generated by java objects served up by Jersey. The point is that the Tabulator does not need to know.
Furthermore, the ontology information above is not out of band. It is GETable at the foaf:knows URI itself. The name of the relation links to the information about the relation, which gives us enough to be able to deduce further facts. This is hypertext - hyperdata in this case - at its best. Compare that with the JSON example given above. There is no way to tell what that JSON means outside of the context of the totally misnamed 'Open Social RESTful API'. This is a limitation of JSON, or at least of this namespace-less version. One would have to add a mime type to the JSON to make it clear that the JSON had to be interpreted in a particular manner for this application, but I doubt most JSON tools would know what to do with mime typed JSON versions. And do you really want to go through a mime type registration process every time a social networking application wants to add a new feature or interact with new types of data?
As Roy summarizes in one of the replies to this blog post:
When representations are provided in hypertext form with typed relations (using microformats of HTML, RDF in N3 or XML, or even SVG), then automated agents can traverse these applications almost as well as any human. There are plenty of examples in the linked data communities. More important to me is that the same design reflects good human-Web design, and thus we can design the protocols to support both machine and human-driven applications by following the same architectural style.
To get a feel for this it really helps to play with hyperdata applications other than the ones residing in web browsers. The semantic address book, which I spent some time writing, is one such.
A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API). From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user’s manipulation of those representations. The transitions may be determined (or limited by) the client's knowledge of media types and resource communication mechanisms, both of which may be improved on-the-fly (e.g., code-on-demand). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]
That is the out of band point made previously, and confirms the point made about the danger of protocols that depend on URI patterns or resources that are somehow typed at the protocol level. You should be able to pick up a URI and just go from there. With the tabulator plugin you can in fact do just that on any of the URLs listed in my foaf file, or in other RDF.
What's the point?
Engineers under the spell of the client/server architecture will find some of this very counter intuitive. This is indeed why Roy's thesis, and the work done by the people who engineered the web before that and whose wisdom is distilled in various writings by the Technical Architecture Group, did something that was exceedingly original. These very simple principles, which can feel unintuitive to someone who is not used to thinking at a global information scale, make a lot of sense when you do come to think at that level. When you do write such an Open system, one that can allow people to access information globally, you want it to be such that you can send people a URI to any resource you are working with, so that both of you can speak about the same resource. What the resource named by that URL is about should be discoverable by GETting the URL. If the meaning of that URL depends on the way you accessed it, then you will no longer be able to just send a URL, but you will have to send 8 or 9 URLs with explanations on how to jump from one representation to the other. If some out of band information is needed to understand that one has to inspect the URL itself to understand what it is about, then you are not setting up an Open protocol, but a secret one. Secret protocols may indeed be very useful in some circumstances, and so, as Roy points out, may non RESTful ones be:
That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style.
But note: it is much more difficult for them to make use of the network effect: the value of information grows exponentially with its ability to be linked to other information. In another reply to a comment Roy puts this very succinctly:
encoding knowledge within clients and servers of the other side’s implementation mechanism is what we are trying to avoid.
Posted at 02:02PM Nov 11, 2008 [permalink/trackback] by Henry Story in SemWeb | Comments[0]
Possible Worlds and the Web
Tim Berners-Lee, pressed to define his creation, said recently (from memory): "...my short definition is that the web is a mapping from URI's onto meaning".
Meaning is defined in terms of possible interpretations of sentences, also known as possible worlds. Possible Worlds under the guise of the 5th and higher dimensions are fundamental components of contemporary physics. When logic and physics meet we are in the realm of metaphysics. To find these two meet in the basic architecture of the web should give anyone pause for thought.
The following extract from RDF Semantics spec is a good starting point:
The basic intuition of model-theoretic semantics is that asserting a sentence makes a claim about the world: it is another way of saying that the world is, in fact, so arranged as to be an interpretation which makes the sentence true. In other words, an assertion amounts to stating a constraint on the possible ways the world might be. Notice that there is no presumption here that any assertion contains enough information to specify a single unique interpretation. It is usually impossible to assert enough in any language to completely constrain the interpretations to a single possible world, so there is no such thing as 'the' unique interpretation of an RDF graph. In general, the larger an RDF graph is - the more it says about the world - then the smaller the set of interpretations that an assertion of the graph allows to be true - the fewer the ways the world could be, while making the asserted graph true of it.
A few examples may help here. Take the sentence "Barack Obama is the 44th president of the U.S.A". There are many many ways the world/universe/complete 4 dimensional space time continuum from the beginning of the universe to the end if there is one, yes, there are many ways the world could be and that sentence be true. For example I could not have bothered to write this article now, I could have written it just a little later, or perhaps even not at all. There is a world in which you did not read it. There is a world in which I went out this morning to get a baguette from one of the many delicious local French bakeries. The world could be all these ways and yet still Barack Obama be the 44th president of the United States.
In N3 we speak about the meaning of a sentence by quoting it with '{' '}'. So for our example we can write:
@prefix dbpedia: <http://dbpedia.org/resource/> .
{ dbpedia:Barack_Obama a dbpedia:President_of_the_United_States . } = :g1 .
:g1 is the set of all possible worlds in which Obama is president of the USA. The only worlds that are not part of that set are the worlds where Obama is not President, but say McCain or Sarah Palin is. That McCain might have become president of the United States is quite conceivable. Both those meanings are understandable, and we can speak about both of them:
@prefix dbpedia: <http://dbpedia.org/resource/> .
{ dbpedia:Barack_Obama a dbpedia:President_of_the_United_States . } = :g1 .
{ dbpedia:John_McCain a dbpedia:President_of_the_United_States . } = :g2 .
:g1 :hopedBy :george .
:g2 :fearedBy :george .
:g1 :fearedBy :jane .
Ie. we can say that George hopes Barack Obama to be the 44th president of the United States, but that Jane fears it.
Assume wikipedia had a resource for each member of the list of presidents of the USA, and that we were pointing to the 44th element above. Then even though we can speak about :g1 and :g2, there is no world that fits them both: The intersection of both :g1 and :g2 is { } , the empty set, whose extension according to David Lewis' book on Mereology is the fusion of absolutely all possibilities. The thing that is everything and everywhere and around at all times. Ie. you don't make any distinction when you say that: you don't say anything.
The definition of meaning in terms of possible worlds makes a few things very simple to explain, implication being one of them. If every president has to be human, then
@prefix log: <http://www.w3.org/2000/10/swap/log#> .
{ dbpedia:Barack_Obama a dbpedia:President_of_the_United_States . } log:implies { dbpedia:Barack_Obama a dbpedia:Human . } .
Ie the set of possible worlds in which Obama is a president of the United States is a subset of the set of worlds in which he is Human. There are worlds after all where Barack is just living a normal Lawyer's life.
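The subset reading of implication can be made concrete with a toy model. The sketch below is only an illustration: it assumes a tiny universe of four invented worlds, each described by the facts that hold in it, and treats the meaning of a sentence as the set of worlds in which it is true. The world names, the fact labels and the `meaning` and `implies` helpers are all made up for the example.

```python
# Toy model: each "world" is a name mapped to the set of facts true in it.
# The four worlds and their facts are invented for illustration only.
WORLDS = {
    "w1": {"obama_president", "obama_human"},
    "w2": {"mccain_president", "obama_human"},  # Obama lives a lawyer's life
    "w3": {"obama_president", "obama_human", "author_bought_baguette"},
    "w4": {"mccain_president", "obama_human", "author_bought_baguette"},
}


def meaning(fact):
    """Return the set of worlds in which the given fact holds."""
    return {name for name, facts in WORLDS.items() if fact in facts}


def implies(a, b):
    """a implies b when every world in a is also a world in b."""
    return a <= b


g1 = meaning("obama_president")   # worlds where Obama is president
g2 = meaning("mccain_president")  # worlds where McCain is president
human = meaning("obama_human")    # worlds where Obama is human

# No world makes both sentences true, so the intersection is empty.
both = g1 & g2
```

In this model `implies(g1, human)` holds while `implies(human, g1)` does not, which mirrors the claim that being president implies being human but not the other way round.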
So what is this mapping from URIs to meaning that Tim Berners-Lee is talking about? I interpret him as speaking of the log:semantics relation.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
log:semantics a rdf:Property;
:label "semantics";
:comment """The log:semantics of a document is the formula.
achieved by parsing representation of the document.
For a document in Notation3, log:semantics is the
log:parsedAsN3 of the log:contents of the document.
For a document in RDF/XML, it is parsed according to the
RDF/XML specification to yield an RDF formula [snip]""";
:domain foaf:Document;
:range log:Formula .
Of course it is easier to automate the mapping from resources that return RDF based representations, but log:semantics can be applied to any document. Any web page, even one written in a natural language, has some semantics. It is just that such pages currently require very advanced wetware processors to interpret them. These can indeed be very specialised wetware processors, as for example those that one meets at airports.
Posted at 12:14PM Nov 10, 2008 [permalink/trackback] by Henry Story in Philosophy | Comments[0]
RDF: Reality Distortion Field
Here is Kevin Kelly's presentation on the next 5000 days on the web, in clear easy English that every member of the family can watch and understand. It explains what the semantic web, also known as Web 3.0, is about and how it will affect technology and life on earth. Where is the web going? I can find no fault in this presentation.
This is a great introduction. He explains how Metcalfe's law brought us to the web of documents and is leading us inexorably to a web of things, in which we will be the eyes and the hands of this machine called the internet that never stops running. For those with a more technical mind, who want to see how this is possible, follow this up with a look at the introductory material to RDF.
Warning: This may change the way you think. Don't Panic! Things will seem normal after a while.
Posted at 12:32PM Sep 12, 2008 [permalink/trackback] by Henry Story in SemWeb | Comments[5]
My Semantic Web BlogRoll
I have not had time to automate my blog roll publication yet. Here is the first step down that path. The following are the semantic web blogs I follow closely. I am sure I must be missing many others that are interesting. Though I already am way past the point of information overload. (For those in the same position here are some tips (via Danny))
- AI3:::Adaptive Information - Atom
- Mike Bergman on the semantic Web and structured Web
- About the social semantic web - RSS
- Web 2.0 - what's next?
- Bnode - atom
- bobdc.blog - RSS
- Bob DuCharme's weblog, mostly on technology for representing and linking information.
- Bill de hOra - atom
- Bill de HOra's blog
- captsolo weblog - RSS 1.0
- CaptSolo weblog
- connolly's blog - RSS
- Dan Connolly's blog
- Cloudlands - RSS
- John Breslin's Blog
- Daniel Lewis - RSS
- A technological, personal, spiritual, and academic blog.
- Dave Beckett - Journalblog - RSS 1.0
- RDF and free software hacking
- David Seth - RSS
- Semantic Web & my backyard
- dowhatimean.net - RSS
- Richard Cyganiak's Weblog
- Elastic Grid Blog - RSS
- The ultimate blog about the Elastic Grid solution...
- Elias Torres - RSS
- I'm working on a tagline. I promise.
- Inchoate Curmudgeon - RSS
- I'm getting there. What's the rush? It's about the journey, right?
- Internet Alchemy - RSS
- Seeing the world through RDF goggles since 2007
- Kashori - RSS
- Kingsley Idehen's Blog Data Space - RSS atom
- Data Space Endpoint for - Knowledge, Information, and Raw Data
- Les petites cases - Fourre-tout personnel virtuel de Got - RSS
- Lost Boy - RSS 1.0
- A journal of no fixed aims or direction by Leigh Dodds. If you see him wandering, point him in the direction of home.
- Mark Wahl, CISA - RSS
- Discussions on organizing principles for identity systems
- Michael Levin's Weblog and Swampcast! - RSS
- Software development, technobuzz, and everything else.
- Minding the Planet - RSS
- Nova Spivack's Journal of Unusual News & Ideas
- More News - RSS
- Nodalities - RSS
- From Semantic Web to Web of Data
- opencontentlawyer.com - RSS
- copyright, content, and you
- Perspectives - RSS
- Interfaces, web sémantique, hypermédia
- Planet Kiwi - RSS
- ... where all the KiwiKnows is!
- Planet RDF - RSS
- It's triples all the way down
- Planete Web Semantique - RSS
- French Semantic Web planet
- Raw - RSS 1.0
- Danny's linkiness
- Rinke Hoekstra - RSS
- "Time is nature's way to keep everything from happening at once." - John Wheeler
- S is for Semantics - Atom
- Dean Allemang's Blog - Check out our new book on the Semantic Web!
- Semantic Focus - RSS
- On the Semantic Web, Semantic Web technology and computational semantics
- Semantic Wave - RSS
- News feeds and commentary maintained by semantic web developer Jamie Pitts.
- Semantic Web Interest Group Scratchpad - RSS
- Semantic Web Interest Group IRC scratchpad where items mentioned and commented on in IRC get collected.
- Semantic Web Wire - RSS
- Comprehensive News Feed for Semantic Web.
- semantic weltbild 2.0 (Building the Semantic Web is easier together) - RSS 1.0
- Building the Semantic Web is easier together
- SemanticMetadata.net - Atom
- Speaking my mind - RSS
- The whole is more than the sum
- TagCommons - RSS
- toward a basis for sharing tag data
- TechBrew - RSS
- Informative geekery on software and technology
- Technical Ramblings - RSS
- Ramblings of a GIS Hacker
- Thinking Clearly - RSS
- Make lots of money through stealth in shadows
- W3C Semantic Web Activity News - RSS
I automated the creation of this blogroll by transforming the OPML of my blog reader with the following XQuery:
declare namespace loc = "http://test.org/";
declare function loc:string($t as xs:string) {
$t
};
<html>
<body>
<dl>
{
for $outline in //outline
order by $outline/@title
return
<span>
<dt><a href="{ $outline/@htmlUrl}">{ loc:string($outline/@text) }</a> - <a href="{ $outline/@xmlUrl}">{ loc:string($outline/@version)}</a> </dt>
<dd>{ loc:string($outline/@description) }</dd>
</span>
}
</dl>
</body>
</html>
I then had to edit a bit of the generated html by hand to make it presentable.
Thanks to the Oxygen editor for making this really easy to do.
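For readers without an XQuery processor at hand, roughly the same transformation can be sketched in Python with the standard library only. This is an approximation under stated assumptions, not the script used for this post: the `blogroll_html` function and the sample OPML with its example.org URLs are invented for illustration, and missing attributes are rendered as empty strings.

```python
import xml.etree.ElementTree as ET
from html import escape


def blogroll_html(opml_text):
    """Turn the <outline> elements of an OPML document into an HTML <dl>."""
    root = ET.fromstring(opml_text)
    # Same ordering key as the XQuery above: the title attribute.
    outlines = sorted(root.iter("outline"), key=lambda o: o.get("title", ""))
    lines = ["<dl>"]
    for o in outlines:
        # Missing attributes become empty strings instead of raising.
        text = escape(o.get("text", ""))
        kind = escape(o.get("version", ""))
        home = escape(o.get("htmlUrl", ""))
        feed = escape(o.get("xmlUrl", ""))
        desc = escape(o.get("description", ""))
        lines.append(
            f'<dt><a href="{home}">{text}</a> - <a href="{feed}">{kind}</a></dt>'
        )
        lines.append(f"<dd>{desc}</dd>")
    lines.append("</dl>")
    return "\n".join(lines)


# Hypothetical two-entry OPML file; the URLs are placeholders.
SAMPLE_OPML = """<opml version="1.1"><body>
  <outline title="Planet RDF" text="Planet RDF" version="RSS"
           htmlUrl="http://example.org/planet"
           xmlUrl="http://example.org/planet/rss"
           description="Triples all the way down"/>
  <outline title="Bnode" text="Bnode" version="atom"
           htmlUrl="http://example.org/bnode"
           xmlUrl="http://example.org/bnode/atom"/>
</body></opml>"""

html_out = blogroll_html(SAMPLE_OPML)
```

Unlike the XQuery, this version emits only the `<dl>` fragment and escapes attribute values, so the output can be pasted into an existing page.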
Posted at 05:52PM Jul 24, 2008 [permalink/trackback] by Henry Story in General | Comments[3]
Firefox 3 is out
Firefox 3.0 is out. It looks really, really good! Get it here! and help set a world record :-)
Posted at 02:18PM Jun 18, 2008 [permalink/trackback] by Henry Story in General | Comments[1]
FOAF & SSL: creating a global decentralised authentication protocol
Following on my previous post RDFAuth: sketch of a buzzword compliant authentication protocol, Toby Inkster came up with a brilliantly simple scheme that builds very neatly on top of the Secure Sockets Layer of https. I describe the protocol briefly here, and will describe an implementation of it in my next post.
Simple global ( passwordless if using a device such as the Aladdin USB e-Token ) authentication around the web would be extremely valuable. I am currently crumbling under the number of sites asking me for authentication information, and for each site I need to remember a new id and password combination. I am not the only one with this problem as the data portability video demonstrates. OpenId solves the problem but the protocol consumes a lot of ssl connections. For hyperdata user agents this could be painfully slow. This is because they may need access to just a couple of resources per server as they jump from service to service.
As before we have a very simple scenario to consider. Romeo wants to find out where Juliette is. Juliette's hyperdata Address Book updates her location on a regular basis by PUTing information to a protected resource which she only wants her friends and their friends to have access to. Her server knows from her foaf:PersonalProfileDocument who her friends are. She identifies them via dereferenceable URLs, as I do, which themselves usually (the web is flexible) return more foaf:PersonalProfileDocuments describing them, and pointing to further such documents. In this way the list of people able to find out her location can be specified in a flexible and distributed manner. So let us imagine that Romeo is a friend of a friend of Juliette's and he wishes to talk to her. The following sequence diagram continues the story...
The stages of the diagram are listed below:
- First Romeo's User Agent HTTP GETs Juliette's public foaf file located at http://juliette.net/. The server returns a representation ( in RDFa perhaps ) with the same semantics as the following N3:
@prefix : <#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix todo: <http://eg.org/todo#> .
@prefix openid: <http://eg.org/openid/todo#> .
<> a foaf:PersonalProfileDocument;
   foaf:primaryTopic :juliette ;
   openid:server <https://aol.com/openid/service>; # see The Openid Sequence Diagram
   .
:juliette a foaf:Person;
   foaf:name "Juliette";
   foaf:openid <>;
   foaf:blog </blog>;
   rdfs:seeAlso <https://juliette.net/protected/location>;
   foaf:knows <http://bblfish.net/people/henry/card#me>,
              <http://www.w3.org/People/Berners-Lee/card#i> .
<https://juliette.net/protected/location> a todo:LocationDocument .
Romeo's user agent receives this representation and decides to follow the https protected resource because it is a todo:LocationDocument.
- The todo:LocationDocument is at an https URL, so Romeo's User Agent connects to it via a secure socket. Juliette's server, who wishes to know the identity of the requestor, sends out a Certificate Request, to which Romeo's user agent responds with an X.509 certificate. This is all part of the SSL protocol.
- In the communication in stage 2, Romeo's user agent also passes along his foaf id. This can be done either by:
  - Sending in the HTTP header of the request an Agent-Id header pointing to the foaf Id of the user. Like this:
  Agent-Id: http://romeo.net/#romeo
  This would be similar to the current From: header, but instead of requiring an email address, a direct name of the agent would be required. (An email address is only an indirect identifier of an agent).
  - The Certificate could itself contain the Foaf ID of the Agent in the X509v3 extensions section:
  X509v3 extensions:
     ...
     X509v3 Subject Alternative Name:
        URI:http://romeo.net/#romeo
  I am not sure if it would be correct use of the X509 Alternative names field. So this would require more standardization work with the X509 community. But it shows a way where the two communities could meet. The advantage of having the id as part of the certificate is that this could add extra weight to the id, depending on the trust one gives the Certificate Authority that signed the Certificate.
- At this point Juliette's web server knows of the requestor (Romeo in this case):
- his alleged foaf Id
- his Certificate ( verified during the ssl session )
If the Certificate is signed by a CA that Juliette trusts and the foaf id is part of the certificate, then she will trust that the owner of the User Agent is the entity named by that id. She can then jump straight to step 6 if she knows enough about Romeo that she trusts him.
Having Certificates signed by CA's is expensive though. The protocol described here will work just as well with self signed certificates, which are easy to generate.
- Juliette's hyperdata server then GETs the foaf document associated with the foaf id, namely <http://romeo.net/>. Romeo's foaf server returns a document containing a graph of relations similar to the graph described by the following N3:
@prefix : <#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix wot: <http://xmlns.com/wot/0.1/> .
@prefix wotodo: <http://eg.org/todo#> .
<> a foaf:PersonalProfileDocument;
   foaf:primaryTopic :romeo .
:romeo a foaf:Person;
   foaf:name "Romeo";
   is wot:identity of [ a wotodo:X509Certificate;
      wotodo:dsaWithSha1Sig """30:2c:02:14:78:69:1e:4f:7d:37:36:a5:8f:37:30:58:18:5a:
         f6:10:e9:13:a4:ec:02:14:03:93:42:3b:c0:d4:33:63:ae:2f:
         eb:8c:11:08:1c:aa:93:7d:71:01""" ;
   ] ;
   foaf:knows <http://bblfish.net/people/henry/card#me> .
- By querying the semantics of the returned document with a SPARQL query such as
PREFIX wot: <http://xmlns.com/wot/0.1/>
PREFIX wotodo: <http://eg.org/todo#>
SELECT ?sig
WHERE {
   [] a wotodo:X509Certificate;
      wotodo:dsaWithSha1Sig ?sig;
      wot:identity <http://romeo.net/#romeo> .
}
Juliette's web server can discover the certificate signature and compare it with the one sent by Romeo's user agent. If the two are identical, then Juliette's server knows that the User Agent who has access to the private key of the certificate sent to it, and who claims to be the person identified by the URI http://romeo.net/#romeo, is in agreement as to the identity of the certificate with the person who has write access to the foaf file http://romeo.net/. So by proving that it has access to the private key of the certificate sent to the server, the User Agent has also proven that it is the person described by the foaf file.
- Finally, now that Juliette's server knows an identity of the User Agent making the request on the protected resource, it can decide whether or not to return the representation. In this case we can imagine that my foaf file says that
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://bblfish.net/people/henry/card#me> foaf:knows <http://romeo.net/#romeo> .
As a result of the policy of allowing all friends of Juliette's friends to be able to read the location document, the server sends out a document containing relations such as the following:
@prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
@prefix : <http://juliette.org/#> .
:juliette contact:location [
   contact:address [ contact:city "Paris";
                     contact:country "France";
                     contact:street "1 Champs Elysees" ]
] .
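The server side checks in the last stages can be sketched in a few lines of Python. This is a simplified illustration and not the so(m)mer implementation: the web is replaced by an in-memory dictionary of triples, the certificate signature is a shortened placeholder, the SSL handshake (which proves possession of the private key) is assumed to have already succeeded, and every function name here is hypothetical.

```python
from urllib.parse import urldefrag

FOAF_KNOWS = "foaf:knows"
WOT_IDENTITY = "wot:identity"
CERT_SIG = "wotodo:dsaWithSha1Sig"

ROMEO = "http://romeo.net/#romeo"
JULIETTE = "http://juliette.net/#juliette"
HENRY = "http://bblfish.net/people/henry/card#me"

# Stand-in for the web: document URL -> set of (subject, predicate, object).
# A real server would do an HTTP GET and parse the returned representation.
WEB = {
    "http://romeo.net/": {
        ("_:cert", WOT_IDENTITY, ROMEO),
        ("_:cert", CERT_SIG, "30:2c:02:14"),  # shortened placeholder value
        (ROMEO, FOAF_KNOWS, HENRY),
    },
    "http://juliette.net/": {(JULIETTE, FOAF_KNOWS, HENRY)},
    "http://bblfish.net/people/henry/card": {(HENRY, FOAF_KNOWS, ROMEO)},
}


def fetch(agent_id):
    """Return the graph of the document that the id (minus fragment) names."""
    doc_url, _fragment = urldefrag(agent_id)
    return WEB.get(doc_url, set())


def normalise(sig):
    """Ignore whitespace and case when comparing signatures."""
    return "".join(sig.split()).lower()


def published_signatures(agent_id):
    """Signatures of certificates whose wot:identity is agent_id."""
    graph = fetch(agent_id)
    certs = {s for s, p, o in graph if p == WOT_IDENTITY and o == agent_id}
    return {normalise(o) for s, p, o in graph if p == CERT_SIG and s in certs}


def identity_verified(agent_id, presented_sig):
    """Does the presented certificate match one listed in the foaf file?"""
    return normalise(presented_sig) in published_signatures(agent_id)


def knows(graph, person):
    return {o for s, p, o in graph if s == person and p == FOAF_KNOWS}


def may_read_location(owner_id, agent_id, presented_sig):
    """Policy: friends, and friends of friends, of the owner may read."""
    if not identity_verified(agent_id, presented_sig):
        return False
    friends = knows(fetch(owner_id), owner_id)
    if agent_id in friends:
        return True
    return any(agent_id in knows(fetch(f), f) for f in friends)
```

With the sample data, `may_read_location(JULIETTE, ROMEO, "30:2c:02:14")` succeeds because Juliette knows Henry and Henry's file says he knows Romeo, while a request presenting any other signature is refused before the friendship graph is consulted.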
Todo
- Create an ontology for X509 certificates.
- test this. Currently there is some implementation work going on in the so(m)mer repository in the misc/FoafServer directory.
- Can one use the Subject Alternative name of an X509 certificate as described here?
- For self signed certificates, what should the X509 Distinguished Name (DN) be? The DN is really being replaced here by the foaf id, since that is where the key information about the user is going to be located. Can one ignore the DN in a X509 cert, as one can in RDF with blank nodes? One could I imagine create a dummy DN where one of the elements is the foaf id. These would at least, as opposed to DN, be guaranteed to be unique.
- what standardization work would be needed to make this
Discussion on the Web
- Peter Williams is very positive, in his response on the OpenId mailing list where he gives a short overview of the history of the URI Subject Alternative name in the X509 spec.
- Paul Madsen gives a short description of how this would be implemented in the Liberty stack.
- The foaf+ssl proposal here is placed in the larger context in the audio presentation "Building Secure, Open and Distributed Social Network Applications".
Posted at 02:00PM Apr 21, 2008 [permalink/trackback] by Henry Story in SemWeb | Comments[4]
The OpenId Sequence Diagram
OpenId very neatly solves the global identity problem within the constraints of working with legacy browsers. It is a complex protocol though as the following sequence diagram illustrates, and this may be a problem for automated agents that need to jump around the web from hyperlink to hyperlink, as hyperdata agents tend to do.
The diagram illustrates the following scenario. Romeo wants to find the current location of Juliette. So his semantic web user agent GETs her current foaf file. But Juliette wants to protect information about her current whereabouts and reveal it only to people she trusts, so she configures her server to require the user agent to authenticate itself in order to get more information. If the user agent can prove that it is owned by one of her trusted friends, and Romeo in particular, she will deliver the information to it (and so to him).
The steps numbered in the sequence diagram are as follows:
- A User Agent fetches a web page that requires authentication. OpenId was designed with legacy web browsers in mind, for which it would return a page containing an OpenId login box such as the one to the right.
In the case of a hyperdata agent as in our use case, the agent would GET a public foaf file, which might contain a link to an OpenId authentication endpoint. Perhaps with some rdf such as the following N3:
<> openid:login </openidAuth.cgi> .
Perhaps some more information would indicate which resources were protected.
- In current practice a human user notices the login box and types his identifying URL in it, such as http://openid.sun.com/bblfish. This is the brilliant invention of OpenId: getting hundreds of millions of people to find it natural to identify themselves via a URL, instead of an email. The user then clicks the "Login" button.
In our semantic use case the hyperdata agent would notice the above openid link and would deduce that it needs to login to the site to get more information. Romeo's Id (http://romeo.net/ perhaps) would then be POSTed to the /openidAuth.cgi authentication endpoint.
- The OpenId authentication endpoint then fetches the web page by GETing Romeo's url http://romeo.net/. This returned representation contains a link in the header of the page pointing to Romeo's OpenId server url. If the representation returned is html then this would contain the following in the header:
<link rel="openid.server" href="https://openid.sun.com/openid/service" />
- The representation returned in step 3, could contain a lot of other information too. A link to a foaf file may not be a bad idea as I described in foaf and openid. The returned representation in step 3 could even be RDFa extended html, in which case this step may not even be necessary. For a hyperdata server the information may be useful, as it may suggest a connection Romeo could have to some other people that would allow it to decide whether it wishes to continue the login process.
- Juliette's OpenId authentication endpoint then sends a redirect to Romeo's user agent, directing it towards his OpenId Identity Provider. The redirect also contains the URL of the OpenId authentication cgi, so that in step 8 below the Identity Provider can redirect a message back.
- Romeo's user agent dutifully redirects Romeo to the identity provider, which then returns a form with a username and password entry box.
- Romeo's user agent could learn to fill the user name password pair in automatically and even skip the previous step 6. In any case given the user name and password, the Identity Provider then sends back some cryptographic tokens to the User Agent to have it redirect to the OpenId Authentication cgi at http://juliette.net/openidAuth.cgi.
- Romeo's Hyperdata user agent then dutifully redirects back to the OpenId authentication endpoint.
- The authentication endpoint sends a request to the Openid Identity provider to verify that the cryptographic token is authentic. If it is, a conventional answer is sent back.
- The OpenId authentication endpoint finally sends a response back with a session cookie, giving access to various resources on Juliette's web site. Perhaps it even knows to redirect the user agent to a protected resource, though that would have required some information concerning this to have been sent in stage 2.
- Finally Romeo's user agent can GET Juliette's protected information if Juliette's hyperdata web server permits it. In this case it will, because Juliette loves Romeo.
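The discovery part of step 3, where the authentication endpoint looks for the openid.server link in the page it fetched, is easy to sketch with Python's standard html.parser module. The class and function names below are invented for this illustration, the page is a hard coded sample rather than the result of a real GET, and the rest of the OpenId exchange is not modelled.

```python
from html.parser import HTMLParser


class OpenIdLinkFinder(HTMLParser):
    """Remember the href of the first <link rel="openid.server"> element."""

    def __init__(self):
        super().__init__()
        self.server = None

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link" or self.server is not None:
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "openid.server" in rels:
            self.server = attributes.get("href")


def find_openid_server(html_text):
    """Return the OpenId server URL advertised by the page, or None."""
    finder = OpenIdLinkFinder()
    finder.feed(html_text)
    return finder.server


# Sample of what GETting http://romeo.net/ might return.
PAGE = """<html><head>
<title>Romeo</title>
<link rel="openid.server" href="https://openid.sun.com/openid/service" />
</head><body>Hello</body></html>"""

server_url = find_openid_server(PAGE)
```

A page without such a link yields `None`, in which case the endpoint has no Identity Provider to redirect to and the login cannot proceed.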
All of the steps above could be automated, so from the user's point of view they may not be complicated. The user agent could even learn to fill in the user name and password required by the Identity Provider. But there are still a very large number of connections between the User Agent and the different services. If these connections are to be secure they would need to be protected by SSL (as hinted at by the double line arrows). And SSL connections are not cheap. So the above may be unacceptably slow. On the other hand it would work with a protocol that is growing fast in acceptance.
It is certainly worth comparing this sequence diagram with the very light weight one presented in "FOAF & SSL: creating a global decentralised authentication protocol".
Thanks again to Benjamin Nowack for bringing the discussion on RDFAuth to thinking about using the OpenId protocol directly as described above. See his post on the semantic web mailing list. Benjamin also pointed to the HTTP OpenID Authentication proposal, which shows how some of the above can be simplified if certain assumptions about the capabilities of the client are made. It would be worth making a sequence diagram of that proposal too.
Posted at 06:31PM Apr 18, 2008 [permalink/trackback] by Henry Story in SemWeb | Comments[8]


