The Sun BabelFish Blog

Don't panic !

Friday Jan 16, 2009

The W3C Workshop on the Future of Social Networking Position Papers

picture by Salvador Dalí

I am in Barcelona, Spain (the country of Dali) for the W3C Workshop on the Future of Social Networking. To prepare for this I decided to read through the 75 position papers. This is the conference I have been the best prepared for ever. It really changes the way I can interact with other attendees. :-)

I wrote down a few notes on most of the papers I read through, to help me remember what I read. This took me close to a week, a good part of which I spent trying to track down the authors on the web, find their pictures, familiarise myself with their work, and fill out my Address Book: anything I could do to find as many connections as possible to help me remember the work. I used delicious to save some subjective notes, which can be found under the w3csn tag. I was going to publish this on Wednesday, but had not quite finished reading through all the papers. I got back to my hotel this evening to find that Libby Miller, who co-authored the foaf ontology, had beaten me to it with the extent and quality of her reviews, which she published in two parts:

Amazing work Libby!

70 papers is more than most people can afford to read. If I were to recommend just a handful of papers that stand out in my mind for now these would be:

  • Ching-man Au Yeung, Ilaria Liccardi, Kanghao Lu, Oshani Seneviratne and Tim Berners-Lee wrote the must-read paper 36, entitled "Decentralization: The Future of Online Social Networking". I completely agree with this outlook. It also mentions my foaf+ssl position paper, which of course gives it full marks :-) I would perhaps use "distribution" rather than "decentralisation", or some word that better suggests that the social network should be able to be as much of a peer to peer system as the web itself.
  • "Leveraging Web 2.0 Communities in Professional Organisations" really proves why we need distributed social networks. The paper focuses on the problem faced by Emergency Response organisations. Social Networks can massively improve the effectiveness of such responses, as some recent catastrophes have shown. But ER teams just cannot expect everyone they deal with to be part of just one social network silo. They need to get help from anywhere it can come from: from professional ER teams, from people wherever they are, from information wherever it finds itself. Teams need to be formed ad hoc, on the spot. Not all data can be made public. Distributed Open Secure Social Networks are what is needed in such situations. Perhaps the foaf+ssl proposal (wiki page) can help to make this a reality.
  • In "Social networking across devices: opportunity and risk for the disabled and older community", Henny Swan explains how much social networking information could be put to use to help make better user interfaces for the disabled. Surprisingly enough, none of the web sites so taken by web 2.0 technologies seem to put any serious effort into this space. Apparently though this can be done with web 2.0 technologies, as Henny explains in her blog. The Semantic Web could help even further, I suggested to her at her talk today, by splitting the data from the user interface. Specialised browsers for the disabled could adapt the information for their needs, making it easy for them to navigate the graph.
  • "Trust and Privacy on the Social Web" starts the discussion in this very important space. If there are to be distributed social networks, they have to be secure, and the privacy and trust issues need to be looked at carefully.
  • On a lighter note, Peter Ferne's very entertaining paper "Collaborative Filtering and Social Capital" comes with a lot of great links and is a pleasure to read. Did you know about the Whuffie Index or CELEBDAQ? Find out here.
  • Many of the telecoms papers, among them Telefonica's "The social network behind telecom networks", reveal the elephant in the room that nobody saw in social networking: the telecoms. Who has the most information about everyone's social network? What could they do with this information? How many people have phones, compared to internet access? Something to think about.
  • Nokia's position paper can then be seen in a different light. How can handset manufacturers help put to use the social networking and location information contemporary devices are able to access? The Address Book in cell phones is the most important application in a telephone. But do people want to only connect to other Nokia users? This has to be another reason for distributed social networks.

    I will blog about other papers as the occasion presents itself in future posts. This is enough for now. I have to get up early and be awake for tomorrow's talks, which start at 8:30 am.

    In the mean time you can follow a lively discussion of the ongoing conference on twitter under the w3csn tag.

    Tuesday Dec 30, 2008

    foaf+ssl, pki and the duck-rabbit

    In part II §xi of the "Philosophical Investigations", Ludwig Wittgenstein introduces the duck-rabbit figure:

    I shall call the following figure derived from Jastrow, the duck-rabbit. It can be seen as a rabbit's head or as a duck's. And I must distinguish between the 'continuous seeing' of an aspect and the 'dawning' of an aspect.

    The picture might have been shewn me, and I never have seen anything but a rabbit in it.

    It is worth stopping here and considering that illustration carefully, making sure you can see it one way then the other. Notice that there is no illusion here. There is not one correct way to see the line. The figure itself is ambiguous. The duck-rabbit therefore shows very simply how the way we perceive the world can change without any new fact appearing in the world.

    Is that not what magic does?

    Much more complex examples of this phenomenon can be found. In some cases it is much more difficult to switch between meanings. I find this for the Young Woman Old Woman image for example. I really need to work hard there to see the other interpretation, and when I find that interpretation I find switching back very difficult.

    Recently I have felt that the foaf+ssl protocol does something similar to Public Key Infrastructure (PKI). We use a tool that was always meant to be used one way in a completely different way, a way of course that was always permitted, but that nobody saw (or if they did they did not pursue it openly).

    To perceive this different way of using this tool one has to - just as with the duck-rabbit - look at it differently. One has to see it in a new way, or perhaps even use it in a new way. Whereas PKI is used for hierarchical trust, we use it to build a web of trust. Where X509 certs built up a lot on the Distinguished Name hierarchy, we nearly ignore it. Where X509 tried to place information in the certificate, we place it outside at the name location. Even though SSL can request client certificates in the browser, nobody does this, yet we build on this little known feature. Self signed client certificates, which would not have made sense in a traditional PKI because they prove nearly nothing about the client, are what we build everything on....

    All the usual X509 and ssl tools work just as they should, but magically it seems they are suddenly found to be doing something completely different.

    Friday Dec 19, 2008

    what does foaf+ssl give you that openid does not?

    Jason Kolb asked on Twitter "what does foaf+ssl give you that openid does not?". I can make the answer short but not short enough for a tweet. So here are my initial thoughts on this.

    • foaf+ssl gives people and other agents a URL for Identification, just like OpenId does. But in the case of foaf+ssl the user does not need to remember the URL, the browser or keychain does. A login button on a foaf+ssl web site is just a button. No need to enter any identifier. Just click the button. Your browser will then ask you what identity you wish to use. The user does not need to remember the password either (except perhaps that of the keychain if the browser requires it).
    • The foaf+ssl protocol requires a minimum of one to two network connections. Compare this to the much more complex OpenId sequence diagram. In a world of distributed data where each site can point to data on any other site, this can become really important.
    • The description of foaf+ssl fits on one page. A page is required just to list the OpenId specs.
    • foaf+ssl builds on well established standards: REST, RDF, SSL, X509. That is why of course it takes much less space to explain. It does not invent anything new.
    • foaf+ssl is clearly RESTful. You can GET your foaf file and, if needed, update it with PUT. You could create it with POST. No need to reinvent those verbs as OpenId has to do in the OpenId Attribute Exchange spec.
    • It is easy to add new attributes to the rdf file. It is easy to extend, and to give the extensions meaning. Every attribute is a URI, which when clicked on can give you yet more information about the relation, and participate in the Linked Data cloud. New classes can be created. You can add relations to objects, and those objects themselves can have yet more relations (see my foaf file, and how it relates me to an address, which is related to a country). The complex OpenId attribute exchange spec does not offer any of this.
    • You can reason about the foaf. Well that just comes for free with RDF and OWL. (So you can do this too with OpenId, but you'd have to treat it as a special case of RDF for that to work.)
    • Being simpler, it will be easier to
    • With foaf+ssl you get a web of trust. With OpenId you only get trust indirectly if you trust the OpenId provider. So for example you may trust the information gathered by the OpenId attribute exchange mechanism of someone who has an OpenId provider at the url http://openid.sun.com/, because you trust Sun Microsystems. With foaf+ssl you can come to trust some foaf file on some web server you never heard about, because all your friends point to it.
    • Foaf+ssl is distributed. There is no need for an OpenId provider. You just need a web server, ideally your own at your own domain name. Yes you can run your OpenId server locally too, but then you lose the trust that might have been associated with that domain name. Have you ever wondered why there were so many very large OpenId providers, and not many small ones?
    • Foaf+ssl requires no HTTP redirects: these are problematic on many cell phones I am told, often in part because telecoms proxies get in the way.
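
    The RESTful point in the list above (GET to read a foaf file, PUT to update it) can be illustrated with a small sketch. This is only an illustration, not part of any foaf+ssl implementation: the URL is a made-up example, the helper names are hypothetical, and the choice of the text/turtle media type is an assumption. The requests are built but not sent.

```python
from urllib.request import Request

# Hypothetical location of a foaf file; any web server you control would do.
FOAF_URL = "https://example.org/people/alice/card"


def fetch_request(url):
    """Build a plain HTTP GET asking for an RDF serialisation of the profile."""
    return Request(url, headers={"Accept": "text/turtle"}, method="GET")


def update_request(url, turtle_text):
    """Build an HTTP PUT that would replace the profile with new content."""
    return Request(
        url,
        data=turtle_text.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="PUT",
    )


if __name__ == "__main__":
    get_req = fetch_request(FOAF_URL)
    put_req = update_request(
        FOAF_URL, "<#me> a <http://xmlns.com/foaf/0.1/Person> ."
    )
    # Nothing is sent over the network; we only inspect the requests.
    print(get_req.get_method(), get_req.full_url)
    print(put_req.get_method(), len(put_req.data), "bytes")
```

    Passing either request to urllib.request.urlopen would perform the call. Whether a given server accepts PUT on the profile is up to that server.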

    OpenId is very well known and widely used now. It has made people aware of the power of a URL for identifying people, and is what helped me find this solution. Furthermore it would be quite easy to create a foaf+openid service as I proposed some time ago, simplifying OpenId in the process. So the two technologies are not incompatible.

    More on foaf+ssl on the esw wiki

    foaf+ssl user story 1: web site personalisation

    In Agile development one creates simple User Stories. Here is the simplest one I can think of for foaf+ssl. It only uses the authentication piece, not the authorization part, so all the steps up to and including 5 in the sequence diagram.

    Prerequisite: A User has a foaf+ssl certificate in his browser and corresponding foaf file.

    The User arrives at a new web site he has never been to before. An https connection is made and the server asks for the client certificate. The User chooses one. The web site fetches the user's foaf file at the URI contained in the certificate and uses this to personalise the site. Some things it could do would be:

    • Welcome the user by name
    • List friends the user may know on the site
    • List projects the user may be interested in
    • Create an account for the user, ie, some space on the server dedicated to the user.
    What it can do will depend on the site, the information in the foaf file, and the location of the user's URI in the social network known to the web service.
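
    As a rough sketch of this user story, here is how a site might turn an already fetched and parsed foaf profile into personalisation hints. Everything here is an assumption made for illustration: the profile is represented as a plain dictionary (parsing the RDF is left to whatever library the site uses), and the function and key names are hypothetical.

```python
def personalise(profile, site_members, site_projects):
    """Return personalisation hints for a user who has just authenticated.

    profile       -- dict with 'uri', and optionally 'name', 'knows' (URIs)
                     and 'interests' (topic strings), taken from the foaf file
    site_members  -- set of URIs of people already known to this site
    site_projects -- dict mapping a project name to its topic
    """
    # Friends of the user that this site already knows about.
    friends_here = sorted(set(profile.get("knows", [])) & site_members)
    # Projects whose topic matches one of the user's declared interests.
    interests = set(profile.get("interests", []))
    projects = sorted(
        name for name, topic in site_projects.items() if topic in interests
    )
    return {
        "greeting": "Welcome, %s!" % profile.get("name", profile["uri"]),
        "friends_on_site": friends_here,
        "suggested_projects": projects,
    }


if __name__ == "__main__":
    romeo = {
        "uri": "http://romeo.net/#romeo",
        "name": "Romeo",
        "knows": ["http://juliette.net/#juliet"],
        "interests": ["poetry"],
    }
    hints = personalise(
        romeo,
        {"http://juliette.net/#juliet"},
        {"Sonnets": "poetry", "Fencing": "sport"},
    )
    print(hints)
```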

    Thursday Dec 18, 2008

    python and php implementations of foaf+ssl

    We now have two new implementations of the foaf+ssl authentication protocol, in addition to the java one I blogged about earlier. If you have followed the procedure there to create your certificate, add it to your browser, and publish a minimal foaf file, you can then try out these two servers.

    Melvin Carvalho, who owns the great domain name foaf.me, has implemented this in PHP in a very nicely layered fashion. In a recent mail to the foaf protocols list he published the following end points:

    1. a test ssl resource which, over a simple ssl connection that asks for the client certificate, will:
      • Display the output of the $_SERVER global variable
      • Display the details in the supplied Client Certificate
      • Display the Client Public Key info
      • Function returning the Client Public Key info in HEX
      • Function returning the subjectAltName in the Client Certificate
    2. foaf tester that after getting the URI in your certificate from the X509 v3 extensions section will fetch the foaf at that URL and
      • Convert the FOAF into an array of triples which it displays
      • Find the RSA Key of the declared subject ("owner") within a FOAF file
      • Get the list of friends in a FOAF file
    3. and finally the foaf+ssl tester, which Melvin pointed to in another email to the list, which will use the foaf+ssl protocol to log you into a server in one https connection. The server only does authentication and the minimal authorization: if it can authenticate you, then you are authorized
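
    To give a feel for the smallest of these steps, here is a sketch in Python (not Melvin's PHP code) of a function returning the URIs found in the subjectAltName of a certificate. It assumes the certificate has already been dumped to text in the layout that `openssl x509 -text` prints, and it assumes entries are separated by commas; the function name is hypothetical.

```python
import re


def subject_alt_uris(cert_text):
    """Return the URI entries listed under 'X509v3 Subject Alternative Name'.

    cert_text is the human readable dump of a certificate. Entries of other
    kinds (email, DNS, ...) on the same line are ignored.
    """
    match = re.search(
        r"X509v3 Subject Alternative Name:[^\n]*\n\s*([^\n]+)", cert_text
    )
    if match is None:
        return []
    entries = [entry.strip() for entry in match.group(1).split(",")]
    return [entry[len("URI:"):] for entry in entries if entry.startswith("URI:")]


if __name__ == "__main__":
    sample = """
        X509v3 extensions:
            X509v3 Subject Alternative Name:
                URI:http://romeo.net/#romeo
    """
    print(subject_alt_uris(sample))
```

    A URI that itself contains a comma would be cut short by this simple split, so a real server would be better off reading the extension from the parsed certificate.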

    These three minimal services are very helpful as they allow us to detect and debug each stage in the protocol carefully. I highly recommend this step by step approach (and will therefore have to add this to my own examples!)

    Ian Jacobi from MIT has taken authorization further with his python based server, which also checks your identity in a social network. See his detailed post on this, "TAAC in action". I'd like to point out that Ian was in fact the first to have a running implementation.

    Keep these coming!

    In the meantime I am working on authorization schemes, and am currently reading a complex paper by Vladimir Kolovski, James Hendler, and Bijan Parsia entitled "Formalizing XACML Using Defeasible Description Logics". Kendall Clark is blogging about this under the policy management tag, which contains a less mathematical overview of the paper. I'll report back when I have managed to digest this. Read it if you need an antidote to twitter.

    Thursday Dec 04, 2008

    JavaOne 2009 call for papers

    Picture of JavaOne2008 keynote conference room

    The JavaOne 2009 call for papers is now open (direct link to form). The deadline for paper submissions is December 19th.

    Last year we had three Semantic Web related talks: one panel presentation, an introduction by Dean Allemang, and a small Birds of A Feather session. The talks went very well and were very well attended, surprisingly so given that they were somewhat in the wrong logical order, starting with the panel discussion and ending with theory. Dean Allemang had over 300 attendees at his talk (slides). Compared to most developer conferences, JavaOne is huge. There are usually over 15 thousand attendees, so it is an excellent venue to speak to and convert a very large crowd to something new in one go.

    I don't expect us to grow at the same rate as we did last year (we had a 200% increase in the number of talks). But I think we really should fit in some presentations on Java Semantic Web Frameworks, such as Sesame, Mulgara, Jena, or something that gives an overview of all of them. But I am not here to decide what goes in these talks. The track to look at is probably the services track, which covers a huge swath from cloud computing to web 2.0, SOA and more.

    Remember that JavaOne attendees are practical people most of all. There is also a very large space for businesses to introduce attendees to their products. So we are here at the point where research meets business.

    I know this clashes with the 6th European Semantic Web Conference in Greece, so I myself may have to do the impossible task of being at both simultaneously. On the other hand it is only one week before the Semantic Technology Conference in San Jose, so it can be a good time to visit the Bay Area, and meet the companies here, or vacation in the sun. :-)

    See: JavaOne2008 or JavaOne tagged photos on flickr.

    video on distributed social network platform NoseRub

    I just came across this video on Twitter by pixelsebi explaining Distributed social networks in a screencast, and especially a php application NoseRub. Here is the video.


    Distributed Social Networking - An Introduction from pixelsebi on Vimeo.

    On a "Read Write Web" article on his video, pixelsebi summarizes how all these technologies fit together:

    To sum it up - if I would have to describe it somebody who has no real clue about it at all:
    1. Distributed Social Networking is an architecture approach for the social web.
    2. DiSo and Noserub are implementations of this "social web architecture"
    3. OpenSocial REST API is one of many ways to provide data in this distributed environment.
    4. OpenSocial based Gadgets might run some time at any node/junction of this distributed environment and might be able to handle this distributed social web architecture.

    So I would add that foaf provides semantics for describing distributed social networks, foaf+ssl is one way to add security to the system. My guess is that the OpenSocial Javascript API can be decoupled from the OpenSocial REST API and produce widgets however the data is produced (unless they made the mistake of tying it too closely to certain URI schemes)

    Tuesday Dec 02, 2008

    foaf+ssl: adding security to open distributed social networks

    For the "W3C Workshop on the Future of Social Networking", taking place in Barcelona January 2009

    Attending:
    Henry Story
    Contributors:
    Bruno Harbulot, Ian Jacobi, Toby Inkster
    Enthusiastic:
    Melvin Carvalho

    Semantic Web vocabularies such as foaf permit distributed hyperlinked social networks to exist. We would like to discuss a group of related ways we are exploring (mailing list) to add information and services protection to such distributed networks.

    One major criticism of open networks is that they seem to have no way of protecting the personal information distributed on the web or limiting access to resources. Few people are willing to make all their personal information public; many would like large pieces to be protected, making them available only to a select group of agents. Giving access to information is very similar to giving access to services. There are many occasions when people would like services to only be accessible to members of a group, such as allowing only friends, family members or colleagues to post a blog, photo or comment on a site. How does one do this in a maximally flexible way, without requiring any central point of access control?

    Using an intuition made popular by OpenID we show how one can tie a User Agent to a URI by proving that he has write access to it. foaf+ssl is architecturally a simpler alternative to OpenID (fewer connections), that uses X.509 certificates to tie a User Agent (Browser) to a Person identified via a URI. However, foaf+ssl can provide additional features, in particular some trust management, relying on signing FOAF files in conjunction with a set of locally trusted keys, as well as a bridge with traditional PKIs. By using the existing SSL certificate exchange mechanism, foaf+ssl integrates more smoothly with existing browsers (pictures with Firefox) including mobile devices, and permits automated sessions in addition to interactive ones.

    The steps in the protocol can be summarised simply:

    1. A web page points to a protected resource using an https URL, e.g. https://juliette.net/location
    2. The client fetches the secure http URL.
    3. As part of that exchange the server requests the client certificate. The client returns Romeo's (possibly self signed) certificate, containing the little known X.509 v3 extensions section:
              X509v3 extensions:
                 ...
                 X509v3 Subject Alternative Name: 
                                 URI:http://romeo.net/#romeo
      
      Because the connection is encrypted, Juliet's server knows that Romeo's client knows the private key of the public key that is also passed in the certificate. Something like:
            Subject Public Key Info:
                  Public Key Algorithm: rsaEncryption
                  RSA Public Key: (1024 bit)
                      Modulus (1024 bit):
                          00:b6:bd:6c:e1:a5:ef:51:aa:a6:97:52:c6:af:2e:
                          71:94:8a:b6:da:9e:5a:5f:08:6d:ba:75:48:d8:b8:
                          01:50:d3:92:11:7d:90:13:89:48:06:2e:ec:6e:cb:
                          57:45:a4:54:91:ee:a0:3a:46:b0:a1:c2:e6:32:4d:
                          54:14:4f:42:cd:aa:05:ca:39:93:9e:b9:73:08:6c:
                          fe:dc:8e:31:64:1c:f7:f2:9a:bc:58:31:0d:cb:8e:
                          56:d9:e6:da:e2:23:3a:31:71:67:74:d1:eb:32:ce:
                          d1:52:08:4c:fb:86:0f:b8:cb:52:98:a3:c0:27:01:
                          45:c5:d8:78:f0:7f:64:17:af
                      Exponent: 65537 (0x10001)
      
    4. Juliet's server dereferences the URI found in the certificate, fetching a document.
    5. The document's log:semantics is queried for information regarding the public key contained in the previously mentioned X.509. This can be done in part with a SPARQL query such as:
      PREFIX cert: <http://www.w3.org/ns/auth/cert#>
      PREFIX rsa: <http://www.w3.org/ns/auth/rsa#>
      SELECT ?modulus ?exp
      WHERE { 
         ?key cert:identity <http://romeo.net/#romeo>;
              a rsa:RSAPublicKey;
              rsa:modulus [ cert:hex ?modulus; ];
              rsa:public_exponent [ cert:decimal ?exp ] .   
      }                   
      
      If the public key in the certificate is found to be identical to the one published in the foaf file, the server knows that the client has write access over the http://romeo.net/ resource.
    6. Romeo's identity is then checked as to its position in a graph of relations (including friendship ones) in order to determine trust according to some criteria. Juliet's server can get this information by crawling the web starting from her foaf file, or by other means.
    7. Access is granted or denied.
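
    Steps 5 to 7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not one of the implementations mentioned here: it assumes the modulus and exponent have already been read from the certificate and from the query results, that the graph of foaf:knows links has already been crawled into a dictionary, and that "trust" simply means being within a fixed number of hops of the owner. All function names and the hop limit are hypothetical.

```python
import re
from collections import deque


def normalise_hex(hex_text):
    """Turn hex such as '00:b6:bd' or 'b6bd' into an integer."""
    return int(re.sub(r"[\s:]", "", hex_text), 16)


def keys_match(cert_modulus_hex, cert_exponent, foaf_modulus_hex, foaf_exponent):
    """Step 5: is the certificate's RSA key the one published at the URI?"""
    return (
        normalise_hex(cert_modulus_hex) == normalise_hex(foaf_modulus_hex)
        and int(cert_exponent) == int(foaf_exponent)
    )


def distance(knows, start, target, max_hops):
    """Step 6: breadth-first search over foaf:knows links.

    knows maps a person's URI to the URIs that person declares to know.
    Returns the number of hops from start to target, or None if target
    is not reachable within max_hops.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == target:
            return hops
        if hops < max_hops:
            for friend in knows.get(person, ()):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, hops + 1))
    return None


def access_granted(key_ok, knows, owner, visitor, max_hops=2):
    """Step 7: authenticated visitors close enough to the owner get in."""
    return key_ok and distance(knows, owner, visitor, max_hops) is not None


if __name__ == "__main__":
    juliet = "http://juliette.net/#juliet"
    romeo = "http://romeo.net/#romeo"
    knows = {juliet: ["http://example.org/#nurse"],
             "http://example.org/#nurse": [romeo]}
    ok = keys_match("00:b6:bd:6c", 65537, "b6bd6c", "65537")
    print(access_granted(ok, knows, juliet, romeo))
```

    Comparing the keys as integers sidesteps differences in how the hexadecimal is written (leading zero bytes, colons, line breaks). Other trust criteria than graph distance are of course possible.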

    We have tested this on multiple platforms in a number of different languages (Java™, Python, ...) and across a number of existing web browsers (Firefox, Safari, more to come).

    foaf+ssl is one protocol that we would like to concentrate on due to its simplicity. But there are a number of other ways of achieving the same thing, by using OpenID for example. All of them require some extra pieces:

    • An ontology to describe what can be done with the data (copied, republished,...) or what obligations incur in using a service.
    • An ontology to describe who has access to the service. This would be useful to help people decide if they should bother trying to access it, or what else they need to do such as become friends with someone, or reveal a bug in the software somewhere.
    • Other things that might come up.

    We will discuss our experience implementing this, the problems we have encountered and where we think this is leading us to next.

    Sunday Nov 30, 2008

    personalising my blog

    image of the sidebar of my blog

    Those who read me via news feeds (I wonder how many those are) may not have seen the recent additions I have made to my blog pages. I have added a view onto:

    This is quite a lot of personal info. With my friend of a friend network it should be clear how you have more and more of the type of information you could find in social networking sites such as facebook on my blog. And this could keep growing of course.

    The current personalization is mostly powered by JavaScript (with one flash application for last.fm). Here is the code I added to my blog template, pieces of which I found here and there on the web, often in templates provided by the web services themselves.

     <h2>Recent Photos</h2><!-- see http://veerle.duoh.com/blog/comments/fickr_badge_w3c_valid/ -->
        <div id="flickr"><script type="text/javascript" 
       src="http://www.flickr.com/badge_code_v2.gne?count=6&display=latest&size=s&layout=x&source=user&user=88952050%40N00">
      </script>
        </div>
        <div class="recentposts">
         <script type="text/javascript" 
         src="http://feeds.delicious.com/v2/js/bblfish?title=My%20Recent%20Bookmarks&icon=s&count=5&sort=date&tags&extended">
        </script>   
        </div>
        <h2>Twittering</h2>
        <div id="twitter_div" class="recentposts">
        <a href="http://twitter.com/bblfish">last 5 entries:</a><br/>
    <ul id="twitter_update_list"></ul>
    </div>
    <script src="http://twitter.com/javascripts/blogger.js" type="text/javascript"></script>
    <script src="http://twitter.com/statuses/user_timeline/bblfish.json?callback=twitterCallback2&count=5" type="text/javascript">
    </script>
      <h2>Listening To</h2>
    <!-- I am looking for something lighter than this! -->
    <style type="text/css">table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 td 
      {margin:0 !important;padding:0 !important;border:0 !important;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmHead 
      a:hover
     {background:url(http://cdn.last.fm/widgets/images/en/header/chart/recenttracks_regular_blue.png) 
         no-repeat 0 0 !important;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmEmbed object {float:left;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmFoot td.lfmConfig a:hover 
        {background:url(http://cdn.last.fm/widgets/images/en/footer/blue.png) no-repeat 0px 0 !important;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmFoot td.lfmView a:hover 
        {background:url(http://cdn.last.fm/widgets/images/en/footer/blue.png) no-repeat -85px 0 !important;}
     table.lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484 tr.lfmFoot td.lfmPopup a:hover 
        {background:url(http://cdn.last.fm/widgets/images/en/footer/blue.png) no-repeat -159px 0 !important;}
    </style>
    <table class="lfmWidgetchart_0bbc5b054e26d39362c0a10c7761f484" cellpadding="0" cellspacing="0" border="0" 
       style="width:184px;"><tr class="lfmHead">
       <td><a title="bblfish: Recently Listened Tracks" href="http://www.last.fm/user/bblfish" target="_blank" 
           style="display:block;overflow:hidden;height:20px;width:184px;background:url(http://cdn.last.fm/widgets/images/en/header/chart/recenttracks_regular_blue.png)
             no-repeat 0 -20px;text-decoration:none;border:0;">
       </a></td></tr>
       <tr class="lfmEmbed"><td>
       <object type="application/x-shockwave-flash" data="http://cdn.last.fm/widgets/chart/friends_6.swf" 
         codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=7,0,0,0" 
         id="lfmEmbed_210272050" width="184" height="199"> 
       <param name="movie" value="http://cdn.last.fm/widgets/chart/friends_6.swf" /> 
      <param name="flashvars" value="type=recenttracks&user=bblfish&theme=blue&lang=en&widget_id=chart_0bbc5b054e26d39362c0a10c7761f484" /> 
       <param name="allowScriptAccess" value="always" /> 
        <param name="allowNetworking" value="all" /> 
        <param name="allowFullScreen" value="true" /> 
        <param name="quality" value="high" /> <param name="bgcolor" value="6598cd" /> 
        <param name="wmode" value="transparent" /> <param name="menu" value="true" /> 
        </object></td></tr><tr class="lfmFoot">
        <td style="background:url(http://cdn.last.fm/widgets/images/footer_bg/blue.png) repeat-x 0 0;text-align:right;">
        <table cellspacing="0" cellpadding="0" border="0" style="width:184px;">
        <tr><td class="lfmConfig">
       <a href="http://www.last.fm/widgets/?colour=blue&chartType=recenttracks&user=bblfish&chartFriends=1&from=code&widget=chart" 
        title="Get your own widget" target="_blank" 
       style="display:block;overflow:hidden;width:85px;height:20px;float:right;background:url(http://cdn.last.fm/widgets/images/en/footer/blue.png)
              no-repeat 0px -20px;text-decoration:none;border:0;">
        </a></td><td class="lfmView" 
         style="width:74px;">
        <a href="http://www.last.fm/user/bblfish" title="View bblfish's profile" 
         target="_blank" style="display:block;overflow:hidden;width:74px;height:20px;background:url(http://cdn.last.fm/widgets/images/en/footer/blue.png)
            no-repeat -85px -20px;text-decoration:none;border:0;">
        </a>
        </td><td class="lfmPopup"
         style="width:25px;">
        <a href="http://www.last.fm/widgets/popup/?colour=blue&chartType=recenttracks&user=bblfish&chartFriends=1&from=code&widget=chart&resize=1" 
           title="Load this chart in a pop up" 
           target="_blank" 
           style="display:block;overflow:hidden;width:25px;height:20px;background:url(http://cdn.last.fm/widgets/images/en/footer/blue.png) 
                 no-repeat -159px -20px;text-decoration:none;border:0;" 
           onclick="window.open(this.href + '&resize=0','lfm_popup','height=299,width=234,resizable=yes,scrollbars=yes'); return false;"
    ></a></td>
       </tr></table>
       </td></tr>
       </table>
    

    So that, as you can see, is quite a lot of extra html every time someone wants to download my web page. This would not be too bad, but the above javascript widgets themselves go and fetch a lot of html, javascript and other content, which further slows down the responsiveness of the web pages. This data is served to everyone whether they want to see all that information or not. Well, if they don't they can subscribe to the rss feed by dragging this page into a feed reader, in which case they will just see the blog posts themselves, and not the sidebar.

    Why add this information to my blog? Well it gives people an idea of where they can find out more about me. A lot of people don't know that I have a del.icio.us feed, so they may not know that they can follow what I am reading over there. This gives the initial feeling of what it would be like to have a deeper view on my activities.

    But as mentioned previously, there are a few problems with this.

    • This makes this page heavier.
    • Every page view on my blog will download that information and start those applets. ( A great way for those services to track the number of people directly visiting these pages btw. )
    • This can become tedious. People who want to follow me can do so by coming to this web page from time to time. But with enough sites like that this is going to become a bit difficult to do. One does not want to spend all day reading the different feeds of information of one's friends. This is what Facebook does for people: it is a giant web based feed reader of social information.
    • Difficult to track change: If I switch to a different book marking service, perhaps a semantic one like faviki, I will have to redo this page, and all my friends are going to have to update their feeds.
    • If I add more of the resources I am working on, this page is going to become unmaintainably long.
    • People who read my feed will not notice the changes occurring here.

    So those are the problems that Web 3.0, the semantic web, is going to solve. By just downloading my foaf file, you should have access to my network of friends via linked data, and via pointers to all the other resources on the web that I may be using. Whatever tool you use will then be able to keep all this data easily up to date and, with great search tools, enhance your view of the many linked networks you will be part of and tracking.

    The whole code you see above could then be replaced with one link to my foaf file. That foaf file can itself point to further resources in case it becomes large. To give a list of some of the most interesting accounts I have, I added the following N3 to my foaf file today:

    @prefix : <http://bblfish.net/people/henry/card#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    
    :me foaf:holdsAccount 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's skype account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://www.skype.com/>
                  ],
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's flickr pictures account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://www.flickr.com/>;
                    foaf:accountProfilePage <http://www.flickr.com/people/bblfish>
                  ], 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's last.fm music account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://www.last.fm/>;
                    foaf:accountProfilePage <http://www.last.fm/user/bblfish>
                  ], 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's delicious bookmarking account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://delicious.com/>;
                    foaf:accountProfilePage <http://delicious.com/bblfish>
                  ], 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's java.net developer account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://java.net/>
                  ], 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's twitter micro blogging account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://twitter.com/>;
                    foaf:accountProfilePage <http://twitter.com/bblfish>
                  ], 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's twine semantic aggregation account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://twine.com/>;
                    foaf:accountProfilePage <http://www.twine.com/user/bblfish>
                  ], 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's facebook social networking account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://www.facebook.com/>
                  ], 
                  [ a foaf:OnlineAccount ;
                    rdfs:label "Henry Story's linked in business social network account"@en;
                    foaf:accountName "bblfish";
                    foaf:accountServiceHomepage <http://www.linkedin.com/>;
                    foaf:accountProfilePage <http://www.linkedin.com/pub/0/482/680>
                  ] .
    

    First of all it should be clear that the above is a lot more readable than the javascript code shown earlier in this post. Secondly I listed over twice as many online accounts there than I currently have in my side bar. And finally this is in a file that a client would not need to download unless it had an interest in knowing more about me. This could easily be cached over a period of time, and need not be served up again on each page request.
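
    As a rough illustration of how regular that structure is, here is a minimal Python sketch that generates the same kind of N3 from a plain list of accounts. The function names account_node and holds_accounts are made up for this example. It assumes that the foaf: and rdfs: prefixes are declared elsewhere in the file and that labels contain no double quotes, since no escaping is done.

```python
def account_node(label, name, homepage, profile_page=None):
    """Render one foaf:OnlineAccount blank node in N3/Turtle syntax."""
    pairs = [
        ("a", "foaf:OnlineAccount"),
        ("rdfs:label", '"%s"@en' % label),
        ("foaf:accountName", '"%s"' % name),
        ("foaf:accountServiceHomepage", "<%s>" % homepage),
    ]
    if profile_page is not None:
        pairs.append(("foaf:accountProfilePage", "<%s>" % profile_page))
    # Statements about the same node are separated by ";", none after the last.
    body = " ;\n".join("    %s %s" % pair for pair in pairs)
    return "[\n%s\n  ]" % body


def holds_accounts(subject, accounts):
    """Render 'subject foaf:holdsAccount [...] , [...] .' for all accounts."""
    nodes = " ,\n  ".join(account_node(**account) for account in accounts)
    return "%s foaf:holdsAccount\n  %s ." % (subject, nodes)


accounts = [
    {"label": "Henry Story's skype account", "name": "bblfish",
     "homepage": "http://www.skype.com/"},
    {"label": "Henry Story's flickr pictures account", "name": "bblfish",
     "homepage": "http://www.flickr.com/",
     "profile_page": "http://www.flickr.com/people/bblfish"},
]

if __name__ == "__main__":
    print(holds_accounts(":me", accounts))
```

    Generating the list this way keeps the punctuation between statements consistent, which is the part that is easiest to get wrong when editing such a file by hand.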

    Again for one possible view on the above data it is worth installing the Tabulator Firefox extension and then clicking on my foaf icon. There are of course many more things specialized software could do with that information than present it like that.

    On this topic, you may want to continue by looking at the recently published, excellent and beautiful presentation on the subject of the Social Semantic Web, by John Breslin.

    variation on @timoreilly: hyperdata is the new intel outside

    Context: Tim O'Reilly said "Data is the new Intel Inside".

    Recently in a post "Why I love Twitter":

    What's different, of course, is that Twitter isn't just a protocol. It's also a database. And that's the old secret of Web 2.0, Data is the Intel Inside. That means that they can let go of controlling the interface. The more other people build on Twitter, the better their position becomes.

    The meme was launched in the well known "What is Web 2.0" paper in the section entitled "Data is the next Intel Inside"

    Applications are increasingly data-driven. Therefore: For competitive advantage, seek to own a unique, hard-to-recreate source of data.

    Most of the data is outside your database. It can only be that way, the world is huge, and you are just one small link in the human chain. Linking that data is knowledge and value creation. Hyperdata is the foundation of Web 3.0.

    Tuesday Nov 11, 2008

    REST APIs must be hypertext driven

    Roy Fielding recently wrote in "REST APIs must be hypertext-driven"

    I am getting frustrated by the number of people calling any HTTP-based interface a REST API. Today's example is the SocialSite REST API. That is RPC. It screams RPC. There is so much coupling on display that it should be given an X rating.

    That was pretty much my thought when I saw that spec. In a comment to his post he continues.

    The OpenSocial RESTful protocol is not RESTful. It could be made so with some relatively small changes, but right now it is just wrapping RPC results in common Web media types.

    Clarification of Roy's points

    Roy then goes on to list some key criteria for what makes an application RESTful.

    • A REST API should not be dependent on any single communication protocol, though its successful mapping to a given protocol may be dependent on the availability of metadata, choice of methods, etc. In general, any protocol element that uses a URI for identification must allow any URI scheme to be used for the sake of that identification.

      In section 2.2 of the O.S. protocol we have the following JSON representation for a Person.

      {
          "id" : "example.org:34KJDCSKJN2HHF0DW20394",
          "displayName" : "Janey",
          "name" : {"unstructured" : "Jane Doe"},
          "gender" : "female"
      }
      

      Note that the id is not a URI. Further down, in the XML version of the above JSON, it is made clear that by prefixing this string with "urn:guid:" you can turn it into a URI. By doing this the protocol has in essence tied itself to one URI scheme, since there is no way of expressing another URI type in the JSON. (The JSON is the key representation in this Javascript specific API by the way, the aim of the exercise being to make the writing of social network widgets interoperable.) Furthermore this scheme has some serious limitations: it limits one to 1 social network per internet domain, it is tied to a quite controversial XRI spec that has been rejected by OASIS, and it does not provide a clear mechanism for retrieving information about it. But that is not the point. The definition of the format is tying itself unnecessarily to a URI scheme, and moreover one that ties one to what is clearly a client/server model.

    • A REST API should not contain any changes to the communication protocols aside from filling-out or fixing the details of underspecified bits of standard protocols, such as HTTP's PATCH method or Link header field.
    • A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

      Most of these so called RESTful APIs spend a huge amount of time specifying what response a certain resource should give to a certain message. Note for example section 2.1 entitled Responses

    • A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC's functional coupling].

      In section 6.3 one sees this example:

      /activities/{guid}/@self                -- Collection of activities generated by given user
      /activities/{guid}/@self/{appid}        -- Collection of activities generated by an app for a given user
      /activities/{guid}/@friends             -- Collection of activities for friends of the given user {guid}
      /activities/{guid}/@friends/{appid}     -- Collection of activities generated by an app for friends of the given user {guid}
      /activities/{guid}/{groupid}            -- Collection of activities for people in group {groupid} belonging to given user {uid}
      /activities/{guid}/{groupid}/{appid}    -- Collection of activities generated by an app for people in group {groupid} belonging to given user {uid}
      /activities/{guid}/@self/{appid}/{activityid}   -- Individual activity resource; usually discovered from collection
      /activities/@supportedFields            -- Returns all of the fields that the container supports on activity objects as an array in json and a repeated list in atom.
      

      For some reason it seems that this protocol does require a very precise layout of the patterns of URLs. Now it is true that this is then meant to be specified in an XRDS document. But this document is not linked to from any of the representations as far as I can see. So there is some "out of band" information exchange that has happened and on which the rest of the protocol relies. Furthermore it ties the whole service again to one server. How open is a service which ties you to one server?

    • A REST API should never have "typed" resources that are significant to the client. Specification authors may use resource types for describing server implementation behind the interface, but those types must be irrelevant and invisible to the client. The only types that are significant to a client are the current representation's media type and standardized relation names. [ditto]

      Now clearly one does want to have URIs name resources, things, and these things have types. I think Roy is here warning against the danger that expectations are placed on types that depend on the resources themselves. This seems to be tied to the previous point that one should not have fixed resource names or hierarchies as we saw above. To see how this is possible check out my foaf file:

      
      $ cwm http://bblfish.net/people/henry/card --ntriples | grep knows | head
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://axel.deri.ie/~axepol/foaf.rdf#me> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://b4mad.net/FOAF/goern.rdf#goern> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://bigasterisk.com/foaf.rdf#drewp> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://crschmidt.net/foaf.rdf#crschmidt> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://danbri.org/foaf.rdf#danbri> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://data.boab.info/david/foaf.rdf#me> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://davelevy.info/foaf.rdf#me> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dblp.l3s.de/d2r/page/authors/Christian_Bizer> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/James_Gosling> .
          <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/Roy_Fielding> .
      

      Notice that there is no pattern in the URIs to the right. (As it happens there are no ftp URLs there, but it would work just as well if there were). Yet the Tabulator extension for Firefox knows from the relations above alone that (if it believes my foaf file of course) the URIs to the right refer to people. This is because the foaf:knows relation is defined as

      
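      @prefix : <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix owl: <http://www.w3.org/2002/07/owl#> .
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .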
      
      foaf:knows  a rdf:Property, owl:ObjectProperty;
               :comment "A person known by this person (indicating some level of reciprocated interaction between the parties).";
               :domain <http://xmlns.com/foaf/0.1/Person>;
               :isDefinedBy <http://xmlns.com/foaf/0.1/>;
               :label "knows";
               :range foaf:Person .
      

      This information can then be used by a reasoner (such as the javascript one in the tabulator) to deduce that the resources pointed to by the URIs to the right and to the left of the foaf:knows relation are members of the foaf:Person class.
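
      The deduction itself is simple enough to sketch in a few lines of Python. This is only an illustration of the RDFS domain and range rules over a set of triples held as tuples, not a description of how the Tabulator is implemented. The function name infer_types is made up for the example.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ME = "http://bblfish.net/people/henry/card#me"

# Two statements from the ontology and two from the foaf file shown above.
triples = {
    (FOAF + "knows", RDFS + "domain", FOAF + "Person"),
    (FOAF + "knows", RDFS + "range", FOAF + "Person"),
    (ME, FOAF + "knows", "http://danbri.org/foaf.rdf#danbri"),
    (ME, FOAF + "knows", "http://dbpedia.org/resource/Roy_Fielding"),
}


def infer_types(graph):
    """Return the rdf:type triples implied by rdfs:domain and rdfs:range."""
    inferred = set()
    for prop, rule, cls in graph:
        if rule == RDFS + "domain":
            # Whatever appears as the subject of prop is a member of cls.
            inferred |= {(s, RDF_TYPE, cls) for s, p, o in graph if p == prop}
        elif rule == RDFS + "range":
            # Whatever appears as the object of prop is a member of cls.
            inferred |= {(o, RDF_TYPE, cls) for s, p, o in graph if p == prop}
    return inferred


if __name__ == "__main__":
    for subject, _, cls in sorted(infer_types(triples)):
        print(subject, "is a", cls)
```

      Nothing in the function looks at the shape of the URIs: the types come from the relation alone.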

      Note also that there is no knowledge as to how those resources are served. In many cases they may be served by simple web servers sending resources back. In other cases the RDF may be generated by a script. Perhaps the resources could be generated by java objects served up by Jersey. The point is that the Tabulator does not need to know.

      Furthermore, the ontology information above is not out of band. It is GETable at the foaf:knows URI itself. The name of the relation links to the information about the relation, which gives us enough to be able to deduce further facts. This is hypertext - hyperdata in this case - at its best. Compare that with the JSON example given above. There is no way to tell what that JSON means outside of the context of the totally misnamed 'Open Social RESTful API'. This is a limitation of JSON, or at least of this namespace-less version. One would have to add a mime type to the JSON to make it clear that the JSON had to be interpreted in a particular manner for this application, but I doubt most JSON tools would know what to do with mime typed JSON versions. And do you really want to go through a mime type registration process every time a social networking application wants to add a new feature or interact with new types of data?

      As Roy summarizes in one of the replies to this blog post:

      When representations are provided in hypertext form with typed relations (using microformats of HTML, RDF in N3 or XML, or even SVG), then automated agents can traverse these applications almost as well as any human. There are plenty of examples in the linked data communities. More important to me is that the same design reflects good human-Web design, and thus we can design the protocols to support both machine and human-driven applications by following the same architectural style.

      To get a feel for this it really helps to play with hyperdata applications other than the ones residing in web browsers. The semantic address book, which I spent some time writing, is one such.

    • A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API). From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user's manipulation of those representations. The transitions may be determined (or limited by) the client's knowledge of media types and resource communication mechanisms, both of which may be improved on-the-fly (e.g., code-on-demand). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

      That is the out of band point made previously, and confirms the point made about the danger of protocols that depend on URI patterns or resources that are somehow typed at the protocol level. You should be able to pick up a URI and just go from there. With the tabulator plugin you can in fact do just that on any of the URLs listed in my foaf file, or in other RDF.
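
      To make the contrast with the URL patterns of section 6.3 concrete, here is a small Python sketch of the two styles of client. Everything in it is hypothetical: the shape of the representation, the "links" key, the relation names and the example URLs are assumptions made for illustration, not part of any spec.

```python
# A hypothetical representation of a person as a server might return it.
# The "links" entry maps relation names to URIs chosen freely by the server.
person = {
    "displayName": "Janey",
    "links": {
        "activities": "http://social.example/streams/janey-7f3a",
        "friends": "http://other.example/people/janey/contacts",
    },
}


def coupled_activities_url(base, guid):
    """RPC-style client: the server's URL layout is baked into the client."""
    return "%s/activities/%s/@self" % (base, guid)


def follow(representation, rel):
    """Hypertext-driven client: know the relation name, use the URI given."""
    links = representation.get("links", {})
    if rel not in links:
        raise LookupError("representation offers no %r link" % rel)
    return links[rel]


if __name__ == "__main__":
    # Breaks as soon as the server reorganises its namespace.
    print(coupled_activities_url("http://social.example",
                                 "34KJDCSKJN2HHF0DW20394"))
    # Keeps working wherever the server, or another server, puts the data.
    print(follow(person, "activities"))
    print(follow(person, "friends"))
```

      Note that in the second style the friends list can live on a different server altogether, which the fixed URL patterns cannot express.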

    What's the point?

    Engineers under the spell of the client/server architecture will find some of this very counter intuitive. This is indeed why Roy's thesis, and the work done before that by the people who engineered the web, whose wisdom is distilled in various writings by the Technical Architecture Group, were exceedingly original. These very simple principles, which can feel unintuitive to someone who is not used to thinking at a global information scale, make a lot of sense when you do come to think at that level. When you write such an Open system, one that allows people to access information globally, you want to be able to send people a URI for any resource you are working with, so that both of you can speak about the same resource. What the resource at that URL is about should be discoverable by GETting the meaning of the URL. If the meaning of that URL depends on the way you accessed it, then you will no longer be able to just send a URL, but you will have to send 8 or 9 URLs with explanations on how to jump from one representation to the other. If some out of band information is needed to understand that one has to inspect the URL itself to understand what it is about, then you are not setting up an Open protocol, but a secret one. Secret protocols may indeed be very useful in some circumstances, and so, as Roy points out, may non RESTful ones be:

    That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style.
    But note: it is much more difficult for such systems to make use of the network effect: the value of information grows exponentially with its ability to be linked to other information. In another reply to a comment Roy puts this very succinctly:
    encoding knowledge within clients and servers of the other side’s implementation mechanism is what we are trying to avoid.

    Friday Sep 12, 2008

    RDF: Reality Distortion Field

    Here is Kevin Kelly's presentation on the next 5000 days on the web, in clear easy English that every member of the family can watch and understand. It explains what the semantic web, also known as Web 3.0, is about and how it will affect technology and life on earth. Where is the web going? I can find no fault in this presentation.

    This is a great introduction. He explains how Metcalfe's law brought us to the web of documents and is leading us inexorably to a web of things, in which we will be the eyes and the hands of this machine called the internet that never stops running.
    For those with a more technical mind, who want to see how this is possible, follow this up with a look at the introductory material to RDF.

    Warning: This may change the way you think. Don't Panic! Things will seem normal after a while.

    Thursday Sep 04, 2008

    Building Secure, Open and Distributed Social Network Applications

    Current Social Networks don't allow you to have friends outside their network. When on Facebook, you can't point to your friend on LinkedIn. They are data silos. This audio enhanced slide show explains how a distributed decentralized social network is being built, how it works, and how to make it secure using the foaf+ssl protocol (a list of pointers on the esw wiki).

    It is licenced under a CC Attribution ShareAlike Licence.
    My voice is a bit odd on the first slide, but it gets better I think as I go along.

    Building Secure Open & Distributed Social Networks( Viewing this slide show requires a flash plugin. Sorry I only remembered this limitation after having put it online. If you know of a good Java substitute let me know. The other solution would have been to use Slidy. PDF and Annotated Open Document Format versions of this presentation are available below. (why is this text visible in Firefox even when the plugin works?) )

    This is the presentation I gave at JavaOne 2008 and at numerous other venues in the past four months.

    The slidecast works a lot better as a presentation format than my previous semantic web video RDF: Connecting Software and People, which I published as an h.264 video a couple of years ago, and which takes close to 64MB of disk space. The problem with that format is that it is not easy to skip through the slides to the ones that interest you, or to go back and listen to a passage carefully again. Or at least it feels very clunky. My mp3 sound file only takes 17MB of space in comparison, and the graphics are much better quality in this slide show.

    It is hosted by the excellent slideshare service, which converted my OpenOffice odp document once it was cleaned up a little: I had to make sure it had no pointers to local files remaining accessible from the Edit>Links menu, which otherwise choked their service. I used the Audacity sound editor to create the mp3 file, which I then placed on my bblfish.net server. Syncing the sound and the slides was then very easy using SlideShare's SlideCast application. I found that the quality of the slides was a lot better once I had created an account on their servers. The only thing missing would be a button, in addition to the forward and backward buttons, that would allow one to show the text of the audio for people with hearing problems: something equivalent to the Notes view in Open Office.

    You can download the OpenOffice Presentation which contains my notes for each slide and the PDF created from it too. These are all published under a Creative Commons Attribution, Share Alike license. If you would like some of the base material for the slides, please contact me. If you would like to present them in my absence feel free to.

    Tuesday Sep 02, 2008

    Getting started with RDF

    So you have seen Kevin Kelly's presentation on the next 5000 days of the web? You don't believe in magic, and you want to see how it can really work? This used to be quite difficult, but it has become a lot easier recently. Here are some pointers I will try to keep up to date.

    Introductory Material

    Read Dean Allemang and Jim Hendler's Book "The Semantic Web for the Working Ontologist". While you are waiting for that book to arrive you can already view and listen to Dean Allemang's excellent presentation at JavaOne 2008. If you are interested in Social Networking, then you could follow that up also with my JavaOne presentation that same year which goes more into the RESTful, self describing web, hyperdata side of things, ie the Web in the Semantic Web.

    One should also remember that one does not need to trust everything one finds on the web. A good semantic web engine will allow you to merge different graphs depending on which ones you trust, which will indeed be something partly subjective, but which can also evolve. The semantic web allows you to change your mind. Good reasoning engines to help make that fast are only just appearing though.
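
    A very small Python sketch of that idea, assuming each source's graph is just a set of triples keyed by the URL it was fetched from. The URLs and the statements are invented for illustration, and real RDF stores are of course much richer than this.

```python
def merged_view(graphs, trusted):
    """Union of the triples coming from the sources we currently trust."""
    view = set()
    for source, triples in graphs.items():
        if source in trusted:
            view |= triples
    return view


# Hypothetical sources and statements, keyed by where they were fetched.
graphs = {
    "http://alice.example/foaf": {("alice", "knows", "bob")},
    "http://bob.example/foaf": {("bob", "knows", "alice")},
    "http://spam.example/claims": {("alice", "owes", "mallory")},
}

if __name__ == "__main__":
    trusted = {"http://alice.example/foaf", "http://bob.example/foaf"}
    print(sorted(merged_view(graphs, trusted)))
    # Changing your mind is just a matter of changing the trusted set.
    trusted.discard("http://bob.example/foaf")
    print(sorted(merged_view(graphs, trusted)))
```

    Because the sources are kept apart and only merged on demand, withdrawing trust from one of them does not require untangling its statements from the rest.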

    References

    A lot of the references on the W3C use the original RDF XML syntax, which happens to be somewhat unintuitive to use and leads people to think too syntactically about the semantic web. XML developers may feel tempted to take out their XML tools, which may not get them what they were looking for. Recently a non official Semantic Web primer was put together that uses the much easier to use Turtle notation, the one the SPARQL query language is inspired by.

    Tools

    It is good to use different tools, as each has its own advantages. There are too many ( see the sweet tools listing ) for anyone to try them all out. Here are the ones I use regularly:

    • The cwm python script does a lot of useful things. Just downloading RDF/XML, following redirects, etc, and transforming it to your preferred format (Turtle) can be extremely useful. It also has a reasoning engine, has rules, and can be set up and queried with SPARQL.
    • For my programming tasks, as I am a Java developer, I use Sesame 2. It is a Java framework that has a large following. Its competitor in the Java space is the HP backed Jena, which has better out of the box inferencing support.
    • If you want to quickly view your SQL database as an RDF store I recommend D2RQ. It has not been evolving much recently though.
    • The Tabulator Firefox plugin, turns Firefox into a generic RDF browser. It is a prototype, but is very useful.

    Friday Aug 29, 2008

    excel and rdf

    Scott McNealy rarely had much nice to say about spreadsheet software when it was not web enabled. And indeed there are huge numbers of problems with them. Off the top of my head, some of these are:

    • Hidden formulas that nobody looks at and that get tweaked without alerting people
    • Data that is never synchronized, with parts of it that are out of date
    • Data that cannot be merged
    • Some products even had virus problems...

    And yet they are immensely popular, especially with the people who never see the problems that they lead to.

    As it happens these are problems within the scope of the semantic web. Every spread sheet is like a mini SQL database. As long as you query the information inside of one database owned by one administrator all is fine. But what happens when you want to merge information from different databases? Ouch! That's really tough, because there is usually no clear understanding of which pieces should fit together. Do the columns in each database mean the same thing? Well if you have just a few big databases you can link them tediously together, but what if you have thousands of such databases? And each person wielding one is a complete novice to this problem? What if someone just renames a column in one spread sheet? What does that mean?

    The topic of spreadsheets and the semantic web came to be one of the highlights of the conferences I went to in May. Dean Allemang in his talk at JavaOne ( the sound track enhanced slides are now online! ), used this problem in one of his examples. Eric Miller, talked about a solution that involved using the momentum behind spreadsheets to help build ontologies (I think, it's a while back now). This is not all new of course. In a reply to this post Mike Bergman pointed to his year old article entitled "RDF123 Makes Generating Flexible RDF a Snap".

    But often a demo helps a lot, and the one that made me see the light was given by Lee Feigenbaum of Cambridge Semantics just before the end of the Semantic Tech Conference. Lee, who had been working on semantic web tools at IBM before going to start his own company, gave me a quick summary of the benefits of his SHAPE middleware. Essentially by adding URLs into the spreadsheet you can tie their meaning down a lot more carefully. By writing a plugin for Microsoft Excel ( they had a prototype working for openoffice before deciding to focus on M$ tools) that works together with the middleware, users can keep on behaving as they are used to, whilst helping link all the information together. Instead of working against each other, people in a company can build a web of information together. Here is a highlight from Lee's talk entitled Getting to Web Semantics for Spreadsheets in the U.S. Government:

    • Tight integration into Excel allows semantic concepts to be dragged and dropped from the semantic repository onto data tables
    • The data table's implicit row/column relations are explicitly stored in an RDF semantic database
    • Cells, columns, and regions are tagged with explicit semantics
    • Publish the data tables on the Web
    Intriguing for sure.
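
    To get a feel for why tagging columns with URLs helps, here is a minimal Python sketch. It makes no claim about how the Cambridge Semantics middleware works: the sheets, column names and example.org URLs are invented, the function name sheet_to_triples is made up, and only foaf:mbox and foaf:phone are real properties.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def sheet_to_triples(rows, key_column, column_uris):
    """Turn each tagged cell into a (row subject, column URI, value) triple."""
    triples = set()
    for row in rows:
        subject = row[key_column]
        for column, value in row.items():
            # Columns without a URI carry no agreed meaning and are skipped.
            if column in column_uris:
                triples.add((subject, column_uris[column], value))
    return triples


# Two hypothetical sheets whose authors picked different column names.
sales = [{"who": "http://example.org/staff/jane",
          "mail": "mailto:jane@example.org"}]
support = [{"id": "http://example.org/staff/jane",
            "E-mail address": "mailto:jane@example.org",
            "tel": "tel:+1-555-0100"}]

# Tagging the columns with URIs states what each of them means.
sales_columns = {"mail": FOAF + "mbox"}
support_columns = {"E-mail address": FOAF + "mbox", "tel": FOAF + "phone"}

# Merging is then just a set union: identical statements collapse into one.
merged = (sheet_to_triples(sales, "who", sales_columns)
          | sheet_to_triples(support, "id", support_columns))

if __name__ == "__main__":
    for triple in sorted(merged):
        print(triple)
```

    Renaming a column in one sheet changes nothing here as long as the URI attached to it stays the same.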

    Spreadsheets may yet be back again, but for the good.

    PS. Please send me further links on this so I can flesh out this story better.

    Update

    13 September 2008:

    Tuesday Aug 26, 2008

    Sun Intranet Foaf Experiment

    image of Address Book displaying internal sun foaf

    Building a foaf server from an ldap directory is pretty easy. Rinaldo Di Giorgio put a prototype server together for Sun in less than a week. As a result everyone in Sun now has an experimental temporary foaf id, that we can use to try out some things.

    So what can one do with foaf that one could not so easily do with ldap? Well the semantic web is all about linking and meshing information. So one really simple thing to do is to link an external foaf file with the internal one. I did this by adding an owl:sameAs statement to my public foaf file that links my public and my sun id. (It would be better to link the internal foaf file to the external one, but that would have required a bit more work internally). As a result, by dragging and dropping my foaf file onto today's release of the AddressBook, someone who is inside the Sun firewall can follow both my internal and my external connections. Someone outside the firewall will not be able to follow the internal link.
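
    What a tool has to do with such an owl:sameAs link can be sketched in a few lines of Python. This is an illustration only, not the AddressBook's code: the internal URIs under sun.example are invented, and smush is a name chosen for the example.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
KNOWS = "http://xmlns.com/foaf/0.1/knows"

PUBLIC = "http://bblfish.net/people/henry/card#me"
INTERNAL = "http://foaf.sun.example/people/henry#me"        # hypothetical
COLLEAGUE = "http://foaf.sun.example/people/colleague#me"   # hypothetical

triples = {
    (PUBLIC, OWL_SAME_AS, INTERNAL),                      # public foaf file
    (PUBLIC, KNOWS, "http://danbri.org/foaf.rdf#danbri"),  # public foaf file
    (INTERNAL, KNOWS, COLLEAGUE),                         # internal foaf file
}


def smush(graph):
    """Rewrite each node to one representative of its owl:sameAs group."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    for s, p, o in graph:
        if p == OWL_SAME_AS:
            a, b = find(s), find(o)
            if a != b:
                # Pick the smaller string so the result is deterministic.
                parent[max(a, b)] = min(a, b)
    return {(find(s), p, find(o)) for s, p, o in graph if p != OWL_SAME_AS}


if __name__ == "__main__":
    for triple in sorted(smush(triples)):
        print(triple)
```

    After the merge both foaf:knows statements hang off a single id, which is what lets someone inside the firewall see the internal and external connections together.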

    By extending the internal foaf server a little more one could easily give people inside of Sun a place to link to their external business connection, wherever they might be in the world. To allow other companies to do this too it would of course help if everyone in Sun had a minimally public foaf ID, which would return only minimal information, or whatever the employee was comfortable revealing about themselves. This would allow Sun to present a yet more human face to the world.

    Well that's just a thought, and this is just an experiment. Hopefully it will make the semantic web more real for us here, and allow people to dream up some great ways of bringing all the open source world together, ever closer.

    PS. For people inside of Sun it may be easier to just drag my internal foaf file directly onto the AddressBook (started via jnlp). Otherwise to get the internal foaf file to download you need to click the "fetch" button next to the "same As" combo box when viewing my info. Then you need to switch to "Last Imported" and back to allow "Bernard Traversat" to appear in the second column. He appears as someone I foaf:know after the merger of the internal and the external foaf. I know this is clumsy, and I'll try thinking up a way to make this more user friendly very soon. You are welcome to participate on the Address Book Project.

    PPS. Sun internal users can get more info on the project home page.

    PPPS. We of course use the Firefox Tabulator plugin too for tests. It gives a different interface to my AddressBook. It is more flexible, but less specialised... The Tabulator web application does not work currently because we only produce Turtle output. This is to avoid developers trying to use DOM tools to process these pages, as we don't want to put work into an RDF crystalisation. ( Note: If at some later time you find that the plugin is not compatible with the latest version of Firefox, you can manually disable compatibility checks. )

    Saturday May 17, 2008

    Social Networks and Data Portability at Semantic Tech conference in San Jose

    The upcoming semantic conference in San Jose gets going tomorrow, with an excellent list of speakers and subjects. Here are some highlights of the sessions relating to topics on which I blog regularly.

    Many more interesting talks will make sure I will spend another packed week. The full program is available online.

    Update

    My presentation is now available online with audio as part of the longer Building Secure, Open and Distributed Social Network Applications

    Monday Apr 21, 2008

    FOAF & SSL: creating a global decentralised authentication protocol

    Following on from my previous post RDFAuth: sketch of a buzzword compliant authentication protocol, Toby Inkster came up with a brilliantly simple scheme that builds very neatly on top of the Secure Sockets Layer of https. I describe the protocol briefly here, and will describe an implementation of it in my next post.

    Simple global ( passwordless if using a device such as the Aladdin USB e-Token ) authentication around the web would be extremely valuable. I am currently crumbling under the number of sites asking me for authentication information, and for each site I need to remember a new id and password combination. I am not the only one with this problem as the data portability video demonstrates. OpenId solves the problem but the protocol consumes a lot of ssl connections. For hyperdata user agents this could be painfully slow. This is because they may need access to just a couple of resources per server as they jump from service to service.

    As before we have a very simple scenario to consider. Romeo wants to find out where Juliette is. Juliette's hyperdata Address Book updates her location on a regular basis by PUTing information to a protected resource which she only wants her friends and their friends to have access to. Her server knows from her foaf:PersonalProfileDocument who her friends are. She identifies them via dereferenceable URLs, as I do, which themselves usually (the web is flexible) return more foaf:PersonalProfileDocuments describing them, and pointing to further such documents. In this way the list of people able to find out her location can be specified in a flexible and distributed manner. So let us imagine that Romeo is a friend of a friend of Juliette's and he wishes to talk to her. The following sequence diagram continues the story...

    sequence diagram of RDF+SSL

    The stages of the diagram are listed below:

    1. First Romeo's User Agent HTTP GETs Juliette's public foaf file located at http://juliette.net/. The server returns a representation ( in RDFa perhaps ) with the same semantics as the following N3:

      @prefix : <#> . 
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix todo: <http://eg.org/todo#> .
      @prefix openid: <http://eg.org/openid/todo#> .
      
      <> a foaf:PersonalProfileDocument;
         foaf:primaryTopic :juliette ;
         openid:server <https://aol.com/openid/service> . # see The OpenId Sequence Diagram
      
      :juliette a foaf:Person;
         foaf:name "Juliette";
         foaf:openid <>;
         foaf:blog </blog>;    
         rdfs:seeAlso <https://juliette.net/protected/location>; 
         foaf:knows <http://bblfish.net/people/henry/card#me>,
                    <http://www.w3.org/People/Berners-Lee/card#i> .
      
      <https://juliette.net/protected/location> a todo:LocationDocument .
      

      Romeo's user agent receives this representation and decides to follow the https protected resource because it is a todo:LocationDocument.

    2. The todo:LocationDocument is at an https URL, so Romeo's User Agent connects to it via a secure socket. Juliette's server, which wishes to know the identity of the requestor, sends out a Certificate Request, to which Romeo's user agent responds with an X.509 certificate. This is all part of the SSL protocol.

      In the communication in stage 2, Romeo's user agent also passes along his foaf id. This can be done either by:

      • Sending in the HTTP header of the request an Agent-Id header pointing to the foaf Id of the user. Like this:
        Agent-Id: http://romeo.net/#romeo
        
        This would be similar to the current From: header, but instead of requiring an email address, a direct name of the agent would be required. (An email address is only an indirect identifier of an agent).
      • The Certificate could itself contain the Foaf ID of the Agent in the X509v3 extensions section:
                X509v3 extensions:
                   ...
                   X509v3 Subject Alternative Name: 
                                   URI:http://romeo.net/#romeo
        

        I am not sure if it would be correct use of the X509 Alternative names field. So this would require more standardization work with the X509 community. But it shows a way where the two communities could meet. The advantage of having the id as part of the certificate is that this could add extra weight to the id, depending on the trust one gives the Certificate Authority that signed the Certificate.

    3. At this point Juliette's web server knows of the requestor (Romeo in this case):
      • his alleged foaf Id
      • his Certificate ( verified during the ssl session )

      If the Certificate is signed by a CA that Juliette trusts and the foaf id is part of the certificate, then she will trust that the owner of the User Agent is the entity named by that id. She can then jump straight to step 6 if she knows enough about Romeo that she trusts him.

      Having Certificates signed by CA's is expensive though. The protocol described here will work just as well with self signed certificates, which are easy to generate.

    4. Juliette's hyperdata server then GETs the foaf document associated with the foaf id, namely <http://romeo.net/> . Romeo's foaf server returns a document containing a graph of relations similar to the graph described by the following N3:
      @prefix : <#> . 
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix wot: <http://xmlns.com/wot/0.1/> .
      @prefix wotodo: <http://eg.org/todo#> .
      
      <> a foaf:PersonalProfileDocument;
          foaf:primaryTopic :romeo .
      
      :romeo a foaf:Person;
          foaf:name "Romeo";
          is wot:identity of [ a wotodo:X509Certificate;
                               wotodo:dsaWithSha1Sig """30:2c:02:14:78:69:1e:4f:7d:37:36:a5:8f:37:30:58:18:5a:
                                                   f6:10:e9:13:a4:ec:02:14:03:93:42:3b:c0:d4:33:63:ae:2f:
                                                   eb:8c:11:08:1c:aa:93:7d:71:01""" ;
                             ] ;
          foaf:knows <http://bblfish.net/people/henry/card#me> .
      
    5. By querying the semantics of the returned document with a SPARQL query such as
      PREFIX wot: <http://xmlns.com/wot/0.1/> 
      PREFIX wotodo: <http://eg.org/todo#> 
      
      SELECT ?sig
      WHERE {
          [] a wotodo:X509Certificate;
            wotodo:dsaWithSha1Sig ?sig;
            wot:identity <http://romeo.net/#romeo> .
      }
      

      Juliette's web server can discover the certificate signature and compare it with the one sent by Romeo's user agent. If the two are identical, then Juliette's server knows that the User Agent who has access to the private key of the certificate sent to it, and who claims to be the person identified by the URI http://romeo.net/#romeo, is in agreement as to the identity of the certificate with the person who has write access to the foaf file http://romeo.net/. So by proving that it has access to the private key of the certificate sent to the server, the User Agent has also proven that it is the person described by the foaf file.

    6. Finally, now that Juliette's server knows an identity of the User Agent making the request on the protected resource, it can decide whether or not to return the representation. In this case we can imagine that my foaf file says that
       @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      
       <http://bblfish.net/people/henry/card#me> foaf:knows <http://romeo.net/#romeo> .  
       
      As a result of the policy of allowing all friends of Juliette's friends to be able to read the location document, the server sends out a document containing relations such as the following:
      @prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
      @prefix : <http://juliette.net/#> .
      
      :juliette 
          contact:location [ 
                contact:address [ contact:city "Paris";
                                  contact:country "France";
                                  contact:street "1 Champs Elysees" ]
                           ] .
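
    The server-side checks in steps 2 to 6 can be sketched roughly as follows, in Python. This is only an illustration under assumptions, not the implementation in the so(m)mer repository: the function names are made up for this example, the certificate is assumed to arrive as a dictionary shaped like the one Python's ssl.SSLSocket.getpeercert() returns, the published signature is assumed to have already been extracted from Romeo's foaf file (for instance by the SPARQL query of step 5), and the foaf:knows relations are assumed to have been collected into a plain dictionary.

```python
import hmac

HEX_DIGITS = set("0123456789abcdef")


def foaf_ids_from_cert(cert):
    """Return the URI entries of the Subject Alternative Name field.

    `cert` is assumed to be a dict shaped like the result of
    ssl.SSLSocket.getpeercert(), e.g. {"subjectAltName": (("URI", "..."),)}.
    """
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "URI"]


def normalise_sig(sig):
    """Keep only the hex digits, so that colons, spaces, newlines and
    letter case do not affect the comparison."""
    return "".join(ch for ch in sig.lower() if ch in HEX_DIGITS)


def signatures_match(presented, published):
    """Step 5: compare the signature of the presented certificate with
    the one published in the foaf file."""
    a, b = normalise_sig(presented), normalise_sig(published)
    return bool(a) and hmac.compare_digest(a, b)


def is_friend_or_foaf(knows, owner, requester):
    """Step 6: is `requester` a friend, or a friend of a friend, of `owner`?

    `knows` maps a foaf id to the set of ids it declares foaf:knows for.
    """
    friends = knows.get(owner, set())
    if requester in friends:
        return True
    return any(requester in knows.get(friend, set()) for friend in friends)


def may_read_location(cert, presented_sig, published_sig, knows, owner):
    """Combine the checks: the claimed id must be in the certificate, the
    signatures must agree, and the policy must allow the requester."""
    ids = foaf_ids_from_cert(cert)
    if not ids or not signatures_match(presented_sig, published_sig):
        return False
    return is_friend_or_foaf(knows, owner, ids[0])
```

    With Juliette knowing me, and my foaf file saying that I know Romeo, may_read_location would return True for a certificate carrying http://romeo.net/#romeo, as long as the two signatures agree. A real server would of course fetch the foaf files over the web rather than being handed a dictionary.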
      

    Todo

    • Create an ontology for X509 certificates.
    • Test this. Currently there is some implementation work going on in the so(m)mer repository in the misc/FoafServer directory.
    • Can one use the Subject Alternative name of an X509 certificate as described here?
    • For self signed certificates, what should the X509 Distinguished Name (DN) be? The DN is really being replaced here by the foaf id, since that is where the key information about the user is going to be located. Can one ignore the DN in a X509 cert, as one can in RDF with blank nodes? One could I imagine create a dummy DN where one of the elements is the foaf id. These would at least, as opposed to DN, be guaranteed to be unique.
    • what standardization work would be needed to make this

    Discussion on the Web

    Friday Apr 18, 2008

    The OpenId Sequence Diagram

    OpenId very neatly solves the global identity problem within the constraints of working with legacy browsers. It is a complex protocol though as the following sequence diagram illustrates, and this may be a problem for automated agents that need to jump around the web from hyperlink to hyperlink, as hyperdata agents tend to do.

    The diagram illustrates the following scenario. Romeo wants to find the current location of Juliette. So his semantic web user agent GETs her current foaf file. But Juliette wants to protect information about her current whereabouts and reveal it only to people she trusts, so she configures her server to require the user agent to authenticate itself in order to get more information. If the user agent can prove that it is owned by one of her trusted friends, and Romeo in particular, she will deliver the information to it (and so to him).

    The steps numbered in the sequence diagram are as follows:

    1. A User Agent fetches a web page that requires authentication. OpenId was designed with legacy web browsers in mind, for which it would return a page containing an OpenId login box such as the one to the right. In the case of a hyperdata agent as in our use case, the agent would GET a public foaf file, which might contain a link to an OpenId authentication endpoint. Perhaps with some RDF such as the following N3:
      <> openid:login </openidAuth.cgi> .
      
      Perhaps some more information would indicate which resources were protected.
    2. In current practice a human user notices the login box and types his identifying URL in it, such as http://openid.sun.com/bblfish . This is the brilliant invention of OpenId: getting hundreds of millions of people to find it natural to identify themselves via a URL, instead of an email. The user then clicks the "Login" button.
      In our semantic use case the hyperdata agent would notice the above openid link and would deduce that it needs to login to the site to get more information. Romeo's Id ( http://romeo.net/ perhaps ) would then be POSTed to the /openidAuth.cgi authentication endpoint.
    3. The OpenId authentication endpoint then fetches the web page by GETing Romeo's URL http://romeo.net/. The returned representation contains a link in the header of the page pointing to Romeo's OpenId server URL. If the representation returned is HTML then this would contain the following in the header:
       <link rel="openid.server" href="https://openid.sun.com/openid/service" />
      
    4. The representation returned in step 3, could contain a lot of other information too. A link to a foaf file may not be a bad idea as I described in foaf and openid. The returned representation in step 3 could even be RDFa extended html, in which case this step may not even be necessary. For a hyperdata server the information may be useful, as it may suggest a connection Romeo could have to some other people that would allow it to decide whether it wishes to continue the login process.
    5. Juliette's OpenId authentication endpoint then sends a redirect to Romeo's user agent, directing it towards his OpenId Identity Provider. The redirect also contains the URL of the OpenId authentication cgi, so that in step 8 below the Identity Provider can redirect a message back.
    6. Romeo's user agent dutifully redirects Romeo to the identity provider, which then returns a form with a username and password entry box.
    7. Romeo's user agent could learn to fill the user name password pair in automatically and even skip the previous step 6. In any case given the user name and password, the Identity Provider then sends back some cryptographic tokens to the User Agent to have it redirect to the OpenId Authentication cgi at http://juliette.net/openidAuth.cgi.
    8. Romeo's Hyperdata user agent then dutifully redirects back to the OpenId authentication endpoint.
    9. The authentication endpoint sends a request to the Openid Identity provider to verify that the cryptographic token is authentic. If it is, a conventional answer is sent back.
    10. The OpenId authentication endpoint finally sends a response back with a session cookie, giving access to various resources on Juliette's web site. Perhaps it even knows to redirect the user agent to a protected resource, though that would have required some information concerning this to have been sent in stage 2.
    11. Finally Romeo's user agent can GET Juliette's protected information if Juliette's hyperdata web server permits it. In this case it will, because Juliette loves Romeo.
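
    The discovery part of step 3 can be sketched in a few lines of Python using only the standard library. This is an illustration and not a full OpenId implementation: the class and function names are invented for the example, and it only looks for the openid.server link shown in step 3, ignoring everything else the OpenId specifications may require.

```python
from html.parser import HTMLParser


class OpenIdLinkParser(HTMLParser):
    """Remember the href of the first <link rel="openid.server"> element."""

    def __init__(self):
        super().__init__()
        self.server = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also end up here, because
        # HTMLParser.handle_startendtag calls handle_starttag by default.
        if tag != "link" or self.server is not None:
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "openid.server" in rels:
            self.server = attributes.get("href")


def find_openid_server(html):
    """Return the OpenId server URL advertised in the page, or None."""
    parser = OpenIdLinkParser()
    parser.feed(html)
    return parser.server
```

    Fed the page returned from http://romeo.net/ in step 3, find_openid_server would give back https://openid.sun.com/openid/service, which is where Juliette's authentication endpoint then redirects Romeo's user agent in step 5.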

    All of the steps above could be automated, so from the user's point of view they may not be complicated. The user agent could even learn to fill in the user name and password required by the Identity Provider. But there are still a very large number of connections between the User Agent and the different services. If these connections are to be secure they would need to be protected by SSL (as hinted at by the double line arrows). And SSL connections are not cheap. So the above may be unacceptably slow. On the other hand it would work with a protocol that is growing fast in acceptance.

    It is certainly worth comparing this sequence diagram with the very light weight one presented in "FOAF & SSL: creating a global decentralised authentication protocol".

    Thanks again to Benjamin Nowack for bringing the discussion on RDFAuth to thinking about using the OpenId protocol directly as described above. See his post on the semantic web mailing list. Benjamin also pointed to the HTTP OpenID Authentication proposal, which shows how some of the above can be simplified if certain assumptions about the capabilities of the client are made. It would be worth making a sequence diagram of that proposal too.

    Thursday Apr 17, 2008

    semantic camp paris

    picture of Karima Rafes

    A couple of weeks ago I attended the second Semantic Bar Camp which took place at the Orange research labs at Issy les Moulineaux, near Paris. This was a great opportunity to meet many of the French researchers in the Semantic Web space, to take part in the French debate, and to help convince interested parties of the reality of the technology.

    Jean Rohmer of the large French defense group Thales played the role of the devil's advocate, arguing that the Semantic Web was just pie in the sky theory without practical applications. We delved into various aspects of the theory of the Semantic Web, and I underlined how the biological/evolutionary aspect of language, the Academie Francaise notwithstanding, was a key aspect in understanding the evolution of the web of data. But the best argument was a simple demonstration of the Beatnik Address Book, which showed how hyperdata could solve the serious problem of 2008: the growing number of closed social networks. At the next camp I hope we will be able to delve much more deeply into how to build real practical applications.

    Many thanks to Karima Rafes for organizing this well attended bar camp ( pictures ). Stephane Lauriere, who is from XWiki and on the Nepomuk Semantic Desktop project, also posted some photos. And I would like to recommend Alexandre Passant's blog to all French speaking readers.

    Update

    The talk I gave is now available online with audio as "Building Secure, Open and Distributed Social Network Applications".
