The Sun BabelFish Blog

Don't panic !

Tuesday Aug 28, 2007

My Bloomin' Friends

Closed Social Networks are blossoming all over the place. They provide a semblance of protection, at a price: lock-in. Locked into the social network provider, you get convenience in the form of tools that make conversation easier (video, email, chat boards, ...), some form of privacy protection (if you trust the provider), introductions to 'like minded' people, and other niceties.

Some of us work in the open air: we have to set standards in public view; we stand by what we say; we accept criticism from wherever it comes; and we can't choose our friends based on their social network provider. We describe ourselves in our foaf files where we can specify what we do, how to contact us, our interests, and links to who we know by pointing to their Universal Identifiers. There is no trouble linking between people who are open in this way. We are happy to reference each other: it strengthens the exposure of our work and the quality of the web. This is how I link to Paul Gearon:

:me foaf:knows  [ = <http://web.mac.com/thegearons/people/PaulGearon/foaf.rdf#me>; 
                  a foaf:Person;
                  foaf:name "Paul Gearon" ] .
I could just point to his URL, but the little extra duplicate information can make life easier for people/robots browsing the data web. It can help people notice inconsistencies and help me correct them.

But not everyone lives in the open the same way, and not everyone wants to make the same amount of information about themselves public. There are a number of different ways to deal with this. I want to discuss a few of them here.

Content Negotiation

How much someone says about themselves is up to them, and so is how they protect their information. The same URL that identifies someone could return more or less information depending on who is asking. I could set up my foaf file so only friends who log in via openid can see my friends. Others would just get default information about me. I could be even more clever. I could allow any friend of my friend who logs in via their openid to see my full foaf file; others would see information about me, and a select group of open friends. Closed Social Networks could open up by making it convenient to specify these policies, and by providing the right infrastructure to do so.

Indirect Identification

By directly identifying someone via a URL (as I do) we can leave a lot of the policy of what they make visible up to them. But those who don't have a foaf name need to be identified indirectly. We can do that by identifying them via some property such as their blog, their home page, their email address, or their openid. I am very open about my email addresses. They are published and visible to all.
 <http://bblfish.net/people/henry/card#me>     <http://xmlns.com/foaf/0.1/mbox> <mailto:henry.story@bblfish.net> .
I value it more that people can contact me easily - living as I do in the middle of nowhere and often living nowhere in particular - than the pain of spammers. Too many people are lazy about security, using virus filled Windoze computers, obvious passwords, cracked software for me to be under any illusion that hiding my email is going to prevent the bad guys from getting it.

However I can't assume that everyone else will accept me applying this argument to their email address. For this there is a nice mathematical technique: I can hash their email address using the SHA1 hash function. This creates a close to unique string that cannot be disassembled. You cannot go from the sha1 sum of an email address back to the email, short of guessing the address and hashing the guess. But you can always calculate the same sha1sum from an email. This is how I identify Simon Phipps, Sun's Open Source Officer:

:me foaf:knows [ a foaf:Person;
                 foaf:mbox_sha1sum "4e377376e6977b765c1e78b2d0157a933ba11167";
                 foaf:name "Simon Phipps";
                 rdfs:seeAlso <http://www.webmink.net/foaf.rdf>
               ].
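A minimal sketch of how such a sum can be computed and checked, in Python. FOAF defines `foaf:mbox_sha1sum` over the mailbox URI, so the `mailto:` prefix is part of the hashed string. The lowercasing step is the canonicalisation suggested in note 1, not something the property definition guarantees, and the function names are only illustrative.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Return the hex SHA-1 of the mailto: URI for an email address."""
    # Assumption: trim and lowercase the address before hashing, so that
    # everyone who knows the address derives the same sum.
    uri = "mailto:" + email.strip().lower()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


def matches(email: str, published_sum: str) -> bool:
    """True if a known address corresponds to a published mbox_sha1sum."""
    return mbox_sha1sum(email) == published_sum.strip().lower()


if __name__ == "__main__":
    # Prints a 40 character hex string for one of my published addresses.
    print(mbox_sha1sum("henry.story@bblfish.net"))
```

A mail tool that already holds an address can call `matches` against the sums it finds in foaf files; nothing here lets it go the other way.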

If you know Simon's email, then you will know that I know him. "What use is that?" I can hear someone ask. It's all about Working with People on the Internet. Imagine you are reading email on a newsgroup with a foaf enabled mail tool linked to a foaf enabled Address Book (such as Beatnik). You come across an email by Simon saying something interesting, for example about how Sun has changed its stock ticker to JAVA. My logo, and perhaps that of a couple of other people, appears on the mail reader in a way that indicates to you that we know Simon. The post is no longer anonymous for you, and so has more trust value. You feel part of a community.[1]

So spammers cannot use that information to spam. Either they already know your email address, and so they are probably already spamming you, or they don't, and this won't help them. They can only [2] learn about social network claims: who claims to know who. They could use this, it is true, to introduce themselves as an acquaintance of a friend of yours. A bit of a risky strategy that could quickly get them on a black list. Currently being black listed may not be an expensive proposition. But in a cryptographic web of trust this will be both much easier to notice, and more damaging for the infringers.

Fuzzy Identification

I can directly and indirectly identify a lot of people in my Address Book as described above. This is perfectly acceptable for people who have an open life, like I do, and a large portion of the Open Source community, bloggers, standard setters, etc... But on last count I had over 700 people in my AddressBook. It is a lot of work to identify all of them individually, and to decide how much visibility I should give them. I may not even want people to know how many people I know this way. Also I may want deniability: there are people one may know, but one may not want to highlight that, and one may want to be able to deny that one knows them to some people. The foaf:mbox_sha1sum gives me a way to identify someone, but if some nosy person comes to me and asks me about that person's life after having identified the corresponding email address, there is no escape route other than refusing any conversation, which by itself can easily be taken to be significant. What we need is a way to fuzzily identify a group of one's acquaintances.

Bloom Filters

This is what Bloom Filters enable one to do. Originally used in times when memory was expensive, they allowed the whole vocabulary of a language to be condensed into a reasonably short string. Here we can use one to group all the email addresses of our friends together in one opaque string. I could express this as follows in RDF (bear in mind that the rdf vocabulary has not been settled on):
:me foaf:bloomMbox [ a bloom:Bloom;
                     bloom:base64String """
            IAOgQgSAAAICCAADAoQgDABAAiQKgIABgyAIBEhAAAAIUKBACCYAABAAaEkGQAGIEAHRUAgAAQUw 
            hCgwACJNQxQAAggAgCIgAAAAKgICEKAAAABCQiB0JCAAAIkgDASAYiAAAEIQAAIAABDCEAZACOpA 
            ICEEMAGAEGEAxIA=""";
                     bloom:hashNum 4;
                     bloom:length 1000 ] .

Given the above Bloom someone can query it with an email address, using the same hash functions that were used to build it, and the Bloom will answer either that I may know that person, or that the address is not in the filter, in which case it can't tell you anything more. The loaf project explains some of the advantages of having this in more detail.
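To make the mechanics concrete, here is a small Bloom filter sketch in Python. It is not the algorithm used by the applet or by the pt.tumba.spell.BloomFilter class: deriving the bit positions from salted SHA-1 digests is an assumption made for illustration, so strings produced this way will not be compatible with the Bloom shown above. The parameter names mirror bloom:length and bloom:hashNum.

```python
import base64
import hashlib


class Bloom:
    """A toy Bloom filter over email addresses (illustrative only)."""

    def __init__(self, length=1000, hash_num=4):
        self.length = length              # number of bits in the filter
        self.hash_num = hash_num          # number of bit positions per entry
        self.bits = bytearray((length + 7) // 8)

    def _positions(self, email):
        # Assumption: each hash is SHA-1 over a one byte salt plus the
        # lowercased address, reduced modulo the filter length.
        key = email.strip().lower().encode("utf-8")
        for salt in range(self.hash_num):
            digest = hashlib.sha1(bytes([salt]) + key).digest()
            yield int.from_bytes(digest, "big") % self.length

    def add(self, email):
        for pos in self._positions(email):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, email):
        # True means "possibly in the set"; False means "not in the filter".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(email))

    def to_base64(self):
        return base64.b64encode(bytes(self.bits)).decode("ascii")


if __name__ == "__main__":
    bloom = Bloom()
    for address in ["alice@example.org", "bob@example.org"]:
        bloom.add(address)
    print(bloom.might_contain("Alice@Example.org"))
    print(bloom.to_base64())
```

Every address that was added always tests positive. Addresses that were never added usually test negative but sometimes test positive, and those false positives are where the deniability comes from.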

The best way to get a feel for how it works is to try it. Here I have written a little java applet [3] that allows you to test my Bloom for people I know, and to create your own bloom [4].

Some emails you can try with positive results are tbray at textuality dot C O M, or bill at dehora dot net (suitably transformed of course). The applet lowercases all email addresses when creating and when testing the bloom.

To create your own bloom just click the "Create Bloom" tab. An easy way to extract all your email addresses from an OSX Address Book is to run the following on the command line:

hjs@bblfish:0$ osascript -e 'tell application "Address Book" to get the value of every email of every person' | perl -pe 's/,+ /\n/g' | sort | uniq | pbcopy

You should now be able to paste the list of all your contacts in the applet. To restrict the Addresses to one of your groups named "foaf" for example replace the relevant section above with tell application "Address Book" to get the value of every email of every person in foaf.

You will need to choose the number of hashes and the maximal size of the bucket you wish to fill. The greater the number of hashes and the greater the size of the bucket, the more precision you get and the less deniability.[5]
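To get a rough feel for that trade-off before choosing values, one can use the standard textbook approximation for the false positive rate of a Bloom filter, (1 - e^(-kn/m))^k, where m is the size in bits, k the number of hashes and n the number of addresses stored. The sketch below assumes independent, uniformly distributed hashes, which the applet's hash functions may only approximate; the function name is illustrative.

```python
import math


def false_positive_rate(length, hash_num, entries):
    """Approximate chance that an address never added still tests positive."""
    return (1.0 - math.exp(-hash_num * entries / length)) ** hash_num


if __name__ == "__main__":
    # Same shape as the Bloom above: 1000 bits, 4 hashes, varying contents.
    for entries in (50, 100, 200, 400):
        print(entries, round(false_positive_rate(1000, 4, entries), 3))
```

The higher this rate, the more plausibly one can claim that a positive answer is an accident of the filter.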

Conclusion

None of the above tools is by itself the complete solution for creating an Open Social Network that will satisfy everyone. But for people willing to live in the open, the correct and astute use of them should satisfy most requirements. Access Control on URLs can make it possible to reveal more or less information depending on who is looking; indirect identification can allow one to name people even without direct identification; sha1sums allow one to partially hide sensitive identifying information; and Blooms allow one to make fuzzy statements of set membership. All of these can be combined in different ways. So one can make statements about sha1sum identified people on the open web, or one can do so behind an access controlled file that only friends logged in with OpenId can see. There are bound to be more fun things to be discovered here. But this should make clear just how much can be done in this space.

Notes

  1. For the link from email addresses to sha1sums to work, it helps to canonicalise the emails to all lowercase. This should probably be made more explicit in the foaf:mbox_sha1sum definition.
  2. "They can 'only' learn about social network claims", is quite a lot more than some people are willing to accept. See the article by Mark Wahl "Organizing principles for identity systems: Attacks on anonymized social networks and fudging oracles" which contains some very good pointers. For people who want to retain complete anonymity, and this is what people subscribe to when they answer public surveys, any leakage of information is too much leakage. The problem is that because of Metcalfe's Law it is nearly impossible to stop information combining itself: Information wants to be linked. So I think, when we are not tied to stringent laws, we should accept this rather than fight it, and use it to our advantage when hunting down spammers: the law holds for them too.
  3. You can get the source code for the applet on the so(m)mer repository in the misc/Bloom subdirectory. I used the pt.tumba.spell.BloomFilter class which I adapted a little for my needs. This was just the first one I found out there. It is probably not the most efficient one, as it uses an array of booleans, when it could use a byte array. If you know of other libraries please let me know.
    The code was put together really quickly and may well contain bugs. Feedback and patches and contributions are welcome.
  4. the advantage of Java Applets over server side code is really obvious here:
    • I don't need a server with a fixed port number to show you this
    • someone can't easily start a denial of service attack to bring the server down
    • Your email addresses never leave your computer, so there is no fear of loss of privacy.
    On the last point it would be nice if browser vendors made it easier to get info about the exact restrictions a Java Applet had. I would like to be able to click on an Applet and verify or set it to "no network communication whatever". This would increase trust even more in cases like this.
  5. More info on the loaf site. Apparently one needs more than 1/4 deniability if one is to preserve some measure of privacy, according to the paper "the price of privacy and the limits of LP decoding" by Cynthia Dwork, Frank McSherry and Kunal Talwar (Microsoft Research) who suggest that
    ... any privacy mechanism, interactive or non-interactive, providing reasonably accurate answers to a 0.761 fraction of randomly generated weighted subset sum queries, and arbitrary answers on the remaining 0.239 fraction, is blatantly non-private.
    Thanks again to Mark Wahl for these references.
  6. Thanks a lot to Dan Brickley for working together with me on this last Friday, and pointing me to much of the important work done here. Dan also wrote a little python script to do something similar. Some of the sites I came across during our discussion: Not having studied bloom filters in detail, I am not sure how compatible the blooms of each of these libraries are. The super simple ruby bloom library does not seem to specify the number of hashes that were used to create a Bloom.
  7. Nick Lothian reminded me in a comment to this that he has written a Bloom Filter demo for facebook. I don't have a facebook account (because I am already on LinkedIn, and I can't really be bothered to move all my information, and because I don't like closed networks), so I was not able to use it. Perhaps I should get a facebook account just for this... Let me know.

Wednesday Jul 25, 2007

A Foaf file for Sun!

Sun Microsystems has recently given all its employees an OpenId that is guaranteed to identify each person at Sun. This has allowed me to add the following to my foaf file:

:me foaf:openid <http://openid.sun.com/bblfish> .

Now it would be nice if Sun could make the statement that all of its employees have such ids in a machine readable way. This could then be used by other organisations, say the W3C of which Sun is a member, to identify all of Sun's employees, and so give them access to member only parts of the W3C web site. But with OpenId as it currently stands this is usually thought to be impossible. For at its core OpenId just allows a client service to verify that an EndUser has its identity confirmed by a certain service, which the end user points you to. There is no way to specify what the service is, who it is related to, who owns the id, etc...

Well OpenId does not provide for this out of the box, but it is not difficult to imagine how one could do this. The first thought that comes to mind is to have Sun Microsystems publish a foaf file (for Sun) that listed all its members using the new foaf:openid inverse functional property. I am imagining something like this:

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <http://sun.com/sunw#> .

<> a foaf:PersonalProfileDocument;
    foaf:primaryTopic :sunw.

:sunw a foaf:Organization, foaf:Group;
    foaf:name "Sun Microsystems"@en;
    foaf:homepage <http://www.sun.com/>;
    foaf:member [ foaf:openid <http://openid.sun.com/bblfish> ],
                [ foaf:openid <http://openid.sun.com/jag> ];
                             ....

So Sun would just have to point the W3C to <http://sun.com/sunw> and it could find all the Sun employees' OpenIds and give them special privileges on the W3C web sites. By regularly polling that file, the W3C could keep its list up to date.

But the problem with the above solution is that it is releasing perhaps more information than necessary. After all each of those openids could be linked to a foaf file, as I explained recently, so revealing a lot of information about the employees at Sun. It would also require regular polling to be kept up to date, and so would be leaky. That is, it might not work right after an employee has created his brand new OpenId, thereby leading to some tricky bug reports, bad feelings, etc... It may also end up being a very long file: quite long for companies the size of Sun, a lot longer for companies the size of IBM, too long for the Indian Railways (which has over a million employees) and certainly not imaginable for countries such as the USA were it to want to list all its citizens.

What is really needed is a service that can verify the belonging of an id to a group. Wait! That is what OpenId 1.1 provides! The OpenId Server URL names a resource that does two things:

  • It can verify OpenId URLs as being ones that are part of the group it can identify
  • It can identify User Agents as being ones that know a secret tied to that OpenId (own it).

So to take the Sun example, all that is needed is to specify that https://openid.sun.com/openid/service is an openid group identifier, and that all IDs that can be identified via that service are identifiers for members of that group. So let us create such a relation now, and place it in some temporary openid namespace:

@prefix openid: <http://openid.org/tmp/ont#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

openid:memberIdService a  owl:ObjectProperty;
    rdfs:label "openid member identification service";
    rdfs:comment """Any agent that can identify with an openid ID to this service is the agent who 
is the subject of the foaf:openid relation to that ID, and that agent is  a member of this group."""@en;
    rdfs:domain foaf:Group;
    rdfs:range openid:IDAuthService .

openid:IDAuthService a owl:Class;
    rdfs:label "OpenID Authentication Service";
    rdfs:comment "Members of this class are resources that can authenticate agents who present an OpenID."@en .
This would then allow us to write our information about Sun Microsystems like this:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <http://sun.com/sunw#> .
@prefix openid: <http://openid.org/tmp/ont#> .

<> a foaf:PersonalProfileDocument;
     foaf:primaryTopic :sunw.

:sunw a foaf:Organization, foaf:Group;
    foaf:name "Sun Microsystems"@en;
    foaf:homepage <http://www.sun.com/>;
    openid:memberIdService <https://openid.sun.com/openid/service>.

So now when Sun wishes to become a member of a prestigious organisation like the W3C, all we need to do is send them Sun's foaf file URL. This will give them our openid:memberIdService which they can use to identify all of our members. That way they or any other service can tell who our employees are without us ever giving them a list.

Let's look at this the other way around. A web service such as DZone asks me to identify myself and I give them my OpenId http://openid.sun.com/bblfish. That OpenId may have links to a number of different OpenId Servers. Which one should DZone use? Well it may recognise one of them, and just use that. But would it not be nice if the OpenId services could say something about themselves? One very useful thing they could say is what group they identified. This could be done in a nice RESTful way by simply asking for an RDF representation of the service for which we could get the easier to read N3 representation like this:

hjs@bblfish$ cwm https://openid.sun.com/openid/service

@prefix openid: <http://openid.org/tmp/ont#> .

<> a openid:IDAuthService;
   openid:serviceFor <http://sun.com/sunw#sunw> .
So this would allow a service to follow its nose from openids to the groups they belong to, and assess the trust it has in those groups. The serviceFor relation above could simply be defined as
openid:serviceFor owl:inverseOf openid:memberIdService .
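The "follow your nose" check can be sketched in a few lines of Python. The triples are hard-coded tuples rather than fetched and parsed RDF, and the function name `groups_for_service` is hypothetical; the property URIs are the temporary ones proposed above. The idea is that a consumer should only accept a group when the service's own description and the group's foaf file point at each other.

```python
MEMBER_ID_SERVICE = "http://openid.org/tmp/ont#memberIdService"
SERVICE_FOR = "http://openid.org/tmp/ont#serviceFor"


def groups_for_service(service, service_doc, group_doc):
    """Return the groups that the service claims and that claim it back."""
    # Groups the OpenID service says it identifies members for.
    claimed = {o for s, p, o in service_doc
               if s == service and p == SERVICE_FOR}
    # Groups whose own description names this service.
    confirmed = {s for s, p, o in group_doc
                 if p == MEMBER_ID_SERVICE and o == service}
    return sorted(claimed & confirmed)


if __name__ == "__main__":
    service = "https://openid.sun.com/openid/service"
    group = "http://sun.com/sunw#sunw"
    service_doc = [(service, SERVICE_FOR, group)]        # from the service
    group_doc = [(group, MEMBER_ID_SERVICE, service)]    # from Sun's foaf file
    print(groups_for_service(service, service_doc, group_doc))
```

How much the consumer then trusts the group itself is a separate question.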

Now you may ask: How does anyone know to trust Sun's foaf file or the Sun OpenId memberIdService? Here we can work a network of trust model as described by Daniel Weitzner in "Whose name is it anyway". To illustrate this imagine the following: If the W3C's foaf file lists its member organisations, by pointing to each of their foaf files, and if the NASDAQ lists its member companies that way using the same foaf file, and Sun itself points back to both of them, then that would be a way of having a distributed reinforcement of the confidence one can have in OpenId servers. After all, if one trusts NASDAQ and the W3C's foaf file, then one should be able to trust that they point to the Sun foaf file correctly. A company listing its members or related organisations is a bit like a person linking to their friends. This is what creates a network of trust.

Friday Jul 20, 2007

foaf and openid

My Sun OpenId is helping me use many services I would not have used before. For example I have started using DZone which is a service like DIGG in that it allows one to vote for interesting stories on the web. But unlike DIGG, I don't have to go through the rigmarole of setting up a new account, waiting for an email, replying to the email, remembering one more password which I have to look up in my keychain anyway, etc, etc...

From my short experience I have identified some simple ways one can improve the user experience. Currently for example all the server knows about me is my openId URL. That makes for an impersonal experience, as you can see from this comment I posted:

I am identified as "openid.sun.com/bblfish" and there is no icon to represent me. If I want a more personal experience I need to register! Which means just entering my name, an email address and a few passwords. Ouch! So we are back to pre-openid land. One more password to enter, and to remember...

Luckily there is an obvious and easy fix to this. My openid http://openid.sun.com/bblfish should not just return a representation that contains a link to the openid server

<link rel="openid.server" href="https://openid.sun.com/openid/service" />
but also a link to a representation that contains more information about me, which would be my foaf file. This could be done very simply by growing the header of my openid html by one line, as specified by the foaf FAQ:
<link rel="openid.server" href="https://openid.sun.com/openid/service" />
<link rel="meta" type="application/rdf+xml" title="FOAF" href="http://bblfish.net/people/henry/card"/>
which is what videntity.org has been doing since 2005 [1], and openid.org has been providing since early July [2]. All that would then be needed is for dzone to read the foaf file pointed to, and extract the name relation, email and logo from the person described in the foaf file with the same openid. This could be done with a simple SPARQL query such as
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?mbox ?logo ?nick
WHERE {
        ?p foaf:openid <http://openid.sun.com/bblfish>.
     OPTIONAL { ?p foaf:mbox ?mbox } .
     OPTIONAL { ?p foaf:logo ?logo } .
     OPTIONAL { ?p foaf:nick ?nick } .
}
If you save the above to a file - say openid.sparql - you can run it on the command line using the python cwm script like this:
hjs@bblfish:2$ cwm http://bblfish.net/people/henry/card --sparql=./openid.sparql 
#   Base was: http://bblfish.net/people/henry/card
     @prefix : <http://www.w3.org/2000/10/swap/sparqlCwm#> .
    {
        "bblfish"     :bound "nick" .
        </pix/bfish.large.jpg>     :bound "logo" .
        <mailto:henry.story@bblfish.net>     :bound "mbox" .

        }     a :Result .
    {
        "bblfish"     :bound "nick" .
        </pix/bfish.large.jpg>     :bound "logo" .
        <mailto:henry.story@gmail.com>     :bound "mbox" .

        }     a :Result .
    {
        "bblfish"     :bound "nick" .
        </pix/bfish.large.jpg>     :bound "logo" .
        <mailto:henry.story@sun.com>     :bound "mbox" .

        }     a :Result .

That's how simple it is! [3]
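The step that comes before the query, finding the two links in the OpenID page, can be sketched with nothing more than Python's standard `html.parser`. The class and function names are illustrative, and the sketch only looks at `link` elements with the `rel` and `type` values shown above; a real consumer would also have to fetch the page and follow the OpenID specification's discovery rules.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the attributes of every <link> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> elements also end up here.
        if tag == "link":
            self.links.append(dict(attrs))


def discover(html):
    """Return (openid server URL, foaf file URL); either may be None."""
    collector = LinkCollector()
    collector.feed(html)
    server = next((link.get("href") for link in collector.links
                   if link.get("rel") == "openid.server"), None)
    foaf = next((link.get("href") for link in collector.links
                 if link.get("rel") == "meta"
                 and link.get("type") == "application/rdf+xml"), None)
    return server, foaf


if __name__ == "__main__":
    page = """<html><head>
    <link rel="openid.server" href="https://openid.sun.com/openid/service" />
    <link rel="meta" type="application/rdf+xml" title="FOAF" href="http://bblfish.net/people/henry/card"/>
    </head><body></body></html>"""
    print(discover(page))
```

With the second URL in hand, the consumer can fetch the foaf file and run the query shown above.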

For those who are still trying to keep their info private, one could add some content negotiation mechanism to the serving of the foaf file, such that depending on the authentication level of the requestor (dzone in this case), the server would return more or less information. If dzone could somehow show on requesting my foaf file that I had authenticated them, and that should not be difficult to do, since I just gave them some credentials, I could give them more information about me. How much information exactly could be decided in the same box that pops up when I have to enter the password for the service... A few extra checkboxes on that form could ask me if I want to allow full, partial or minimal view of my foaf relations. Power users with more time on their hands could even decide on a relation by relation basis.

Notes

[1]
Videntity.org works nicely, and can even import all the information from an existing foaf file! I would rather they give me the option to link to my original foaf file, which I am maintaining, rather than create yet another one on their server. Their foaf creates bnode urls, which makes me a little nervous (The only bnode url that makes me smile is Benjamin Nowack's). Also there is a bug in their foaf file, in that they have given me a URL which makes me both a foaf:Person and a foaf:Document. foaf does specify that there is nothing in the intersection of those sets. Does this make me a Buddhist?
[2]
Sadly I have not been able to use that openid.org account to log into anything yet. There seems to be a bug in their windows service. Their foaf file returns nearly no information at present and is incomplete. But the idea is good.
[3]
Here cwm returns an N3 representation. SPARQL servers usually can return both a simple XML and a simple JSON representation. Those working with a programming library will skip the serialization step and end up directly with a collection of solution objects that can be iterated through.

Monday May 14, 2007

Metamorphosis: RDF for Veterans

Yesterday evening I decided to walk from Market westwards. I walked all the way past the San Francisco Opera, through to Hayes Street when I noticed a crowd at an Opening Exhibition of the works of final year industrial design students called Metamorphosis. It was open to all, so I entered.

Looking around I noticed an exhibit with an icon that struck me as amazingly similar to the official RDF icons. More surprising even was that this icon was clearly meant to represent relationships, the foundation of the semantic web. So I looked around for the creator and found Trishatah Hunter, who explained her work to me in more detail. She had never heard of rdf or the semantic web!

Trishatah's device is designed to help war veterans find support when in need, feel part of a community, of a larger social network on which they can rely. Is this unintentionally the first piece of FOAF jewelry?

PS. Not sure what exactly the name for that type of jewelry is...
PPS. Another very nice work was Reflections, a work on the importance of objects to memory. It is a space to place objects in. The lights dim very slowly until the object is invisible behind its mirrored glass container. To see it again one has to touch the object, as if to call it back to memory.

Wednesday Feb 28, 2007

OpenId for blogs.sun.com ?

The volume of posts on OpenId is clearly growing, and big players such as AOL and Microsoft are joining the party. The technical introduction for web developers on the openid wiki will help make more sense of the following discussion.

Given that Web 2.0 is so very much about Micro Killer Apps, single sign on is an absolute necessity. As Paul Diamond notes, web 2.0 has created a huge number of services that need to be integrated. Indeed, there are services (eg Convinceme) I have not used recently, or blogs I have not responded to, just because I did not want to go through yet another sign on service.

Having OpenId on blogs.sun.com would allow many nice features. Once someone had been allowed to answer a comment on a blog, they could be enabled for every other comment they make without requiring any further approval. One could generalize this to allow anyone who had ever been allowed by someone on blogs.sun.com to comment, or to all of one's friends as specified in a foaf file.
Danbri points to Doxory.com (tag line: life by committee) as being one such service that uses both the openid information with a foaf file to provide some interesting service. Danny Ayers points to videntity.org as one of the many open id identity registrars that offers you a foaf file. Open Data Spaces, which is built on Virtuoso uses the same url for the openid and the foaf file, and furthermore that URL is editable using WebDav!

Having read the technical introduction carefully, I think the meshing with foaf is simply accomplished like this:
The Foaf url can simply be the open id. According to current OpenId specs the id would have to be able to return a text/html representation, so that the consumer (the blog that is requiring authentication for example) can search the html for the openid.server link relation. The foaf id would then also be able to return an RDF/XML (application/rdf+xml) representation when a client requests it. This would save the end user from having to learn two different ids, and it would be a way of authenticating a foaf file on top of it. In this scenario the html representation should have a foaf link relation pointing back to the same url.
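A minimal sketch of the server side decision, in Python, assuming the choice is driven by the HTTP Accept header. The function name is hypothetical, only the two media types discussed here are considered, and wildcards such as `*/*` simply fall back to HTML so that OpenId consumers keep working.

```python
def choose_representation(accept_header):
    """Pick the media type to serve for the shared OpenId/foaf URL."""
    supported = ("text/html", "application/rdf+xml")
    best, best_q = None, 0.0
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type, q = parts[0].lower(), 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0   # treat a malformed quality value as "unwanted"
        # Strict comparison: on a tie the type listed first wins.
        if media_type in supported and q > best_q:
            best, best_q = media_type, q
    # Default to HTML, which is what an OpenId consumer expects.
    return best or "text/html"


if __name__ == "__main__":
    print(choose_representation("application/rdf+xml"))
    print(choose_representation("text/html,application/xhtml+xml"))
```

A foaf aware client asks for application/rdf+xml and gets the graph; everyone else gets the page with the openid.server link.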

Otherwise it would probably be useful to have a sioc property to link to an open id.

Wednesday Feb 07, 2007

foaf enabling an enterprise

Developing the Beatnik Address Book (BAB), I have found that the software needs to implement the following capabilities:

  1. being able to read rdf files and understand foaf and vcard information at the very least. (on vcard see also the Notes on the vcard format).
  2. being able to store the information locally, keeping track of the source of the information so as to be able to merge and unmerge information on user demand
  3. being able to write semantic files out at some unchanging location
  4. an easy to use GUI (see the article on F3).

Today I would like to look at point 3, the aspects of writing out a foaf file. At its simplest this is really easy. But in an enterprise environment, if one wants to give every employee a foaf file so as to allow Universal Drag and Drop of people between applications inside the firewall, some questions need to be answered.

General Solution

The main solution and the obvious one is just to write a foaf file out to a server using ftp, scp, WebDav or the nascent Atom Protocol. Ftp and scp are a little tricky for the end user as he would have to understand the relation between the directory structure of the ftp server and its mapping to the web server urls, as well as what is required to specify the mime types of the foaf file, which is very much dependent on the setup of the web server. (see what I had to do to enable my personal foaf file) This may end up being a lot of work with a steep learning curve for someone who wishes to just publish their contact information. WebDav on the other hand, being a RESTful protocol, makes it much easier to specify the location of the file. Wherever the file is PUT, that is its name. Similarly with the Atom Protocol, though I am not sure for either of them how good they are when confronted with arbitrary mime types. My guess is that WebDav should do much better here.
In any case, using either of the above methods one can always later overwrite a previous version if one's address book changes. This is indeed the solution that will satisfy most use cases.

Professional Solution

In a professional setting though, things get to be a little more complicated. Consider for example any large Fortune 500 company. These companies already have a huge amount of information on their employees in their ldap directory. This is reliable and authoritative information, and should be used to generate foaf files for each employee. These companies usually have some web interface to the ldap server which aggregates information about the person in human readable form. Such a web interface - call it a Namefinder - could easily point to the machine generated foaf file.

Now the question is: should this foaf file be read only or read/write? If it is read/write then an agent such as the Beatnik Address Book, could overwrite the file with different information from that stored in ldap, which could cause confusion, and be frowned upon. Well of course the WebDav server could be clever and parse the graph in such a way as to enforce the existence of a certain subgraph. So given the following graph generated from the ldap database

<#hjs> a foaf:Person;
             foaf:mbox <mailto:henry.story@sun.com>;
             foaf:name "Henry Story";
             org:manager <12345#hjs> .

an Address Book that wants to PUT a graph containing the following subgraph

<#hjs> a foaf:Person;
             foaf:mbox <mailto:henry.story@sun.com>;
             foaf:name "Henry Story";
             org:manager <#hjs> .

might

  • get rejected, because the server decides it owns some of the relations, especially the org:manager one in this case. (What HTTP return code should be returned on failure?)
  • or it may decide to rewrite the graph and remove the elements it does not approve of and replace them. That is, replace the triple <#hjs> org:manager <#hjs> with <#hjs> org:manager <12345#bt> for example. (Again what should the HTTP return code be?)
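The two server behaviours just listed can be sketched in a few lines of Python. This is only an illustration under my own assumptions: graphs are modelled as plain sets of (subject, predicate, object) strings rather than parsed RDF, and the server is assumed to own only the org:manager relation.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Assumption: the server claims ownership of these predicates only.
SERVER_OWNED = {"org:manager"}


def owned(graph: Set[Triple]) -> Set[Triple]:
    """The part of a graph the server considers its own."""
    return {t for t in graph if t[1] in SERVER_OWNED}


def should_reject(official: Set[Triple], submitted: Set[Triple]) -> bool:
    """First option: refuse the PUT if server-owned triples were changed."""
    return owned(submitted) != owned(official)


def rewrite(official: Set[Triple], submitted: Set[Triple]) -> Set[Triple]:
    """Second option: keep the client's triples but restore the server-owned ones."""
    return (submitted - owned(submitted)) | owned(official)


official = {
    ("#hjs", "rdf:type", "foaf:Person"),
    ("#hjs", "foaf:name", "Henry Story"),
    ("#hjs", "org:manager", "12345#hjs"),
}
submitted = {
    ("#hjs", "rdf:type", "foaf:Person"),
    ("#hjs", "foaf:name", "Henry Story"),
    ("#hjs", "org:manager", "#hjs"),  # the employee names himself as manager
}

rejected = should_reject(official, submitted)
repaired = rewrite(official, submitted)
```

Neither function says anything about which HTTP return code to use; that question stays open.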

Both of those solutions are valid, but they end up creating a file of mixed ownership. Perhaps it would be better to have the file be read only, officially owned by the company, and have it contain a relation pointing to some other file owned by the user himself. Perhaps something like the following would appear in the file at http://foaf.sun.com/official/294551 :

<>   a foaf:PersonalProfileDocument;
       foaf:maker  <http://www.nasdaq.com/SUNW>;
       lnk:moreInfo </personal/294551> .

</personal/294551> a rights:EditableFile;
                    rights:ownedBy <#hjs> .

That is, in plain English the resource would say that it is a PersonalProfileDocument and that Sun Microsystems is the maker of the file and that more information is available at the resource </personal/294551>. It would also give ownership permissions on that resource. A PROPFIND on each of those files could easily confirm the access rights of each of them.

Now from there it should be possible for the user agent (BAB in this case) to deduce that it has space to write information at </personal/294551>. There it can then write out all the personal information it likes: adding relations to DOAP files, to a personal home page, to interests and to other people known, etc... It could even add a pointer to a public foaf file with a statement such as

<http://foaf.sun.com/official/294551#hjs> owl:sameAs <http://bblfish.net/people/henry/card#me> .
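How a user agent might deduce where it is allowed to write can be sketched as follows. Again this is a rough illustration: the graph is a set of string triples, no URI resolution is done, and the lnk: and rights: relations are the hypothetical ones used in the example above, not an existing vocabulary.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def writable_resources(graph: Set[Triple], document: str, user: str) -> List[str]:
    """Resources the document links to that are editable and owned by the user."""
    linked = {o for s, p, o in graph if s == document and p == "lnk:moreInfo"}
    editable = {s for s, p, o in graph
                if p == "rdf:type" and o == "rights:EditableFile"}
    owned = {s for s, p, o in graph if p == "rights:ownedBy" and o == user}
    return sorted(linked & editable & owned)


doc = "http://foaf.sun.com/official/294551"
official = {
    (doc, "rdf:type", "foaf:PersonalProfileDocument"),
    (doc, "foaf:maker", "http://www.nasdaq.com/SUNW"),
    (doc, "lnk:moreInfo", "/personal/294551"),
    ("/personal/294551", "rdf:type", "rights:EditableFile"),
    ("/personal/294551", "rights:ownedBy", "#hjs"),
}

targets = writable_resources(official, doc, "#hjs")
```

A real agent would still want to confirm the access rights, for example with the PROPFIND mentioned earlier, before writing anything.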

Multiple Agent Problems

Having solved the problem of a writable user agent file, there remains one more distant problem: in a more semantically enabled future the same person may end up with multiple user agents, all capable of writing foaf files, but each perhaps with slightly different interests. How would these user agents write files while making sure that they don't contradict each other or overwrite important information that another one requires? The Netbeans user agent may want to write out some relations in the foaf file using the doap ontology to point to the projects the employee is working on. Perhaps it is as easy as just adding those triples to the file; or, if the same problem of ownership arises as above, it may be worth placing each agent's triples into a separate user agent space. This seems a bit far out for the moment; I'll look at that problem when I build the next Semantic application. If people have already come across this problem please let me know.

Questions

The above are just some initial thoughts on how to do this. Are there perhaps already relations out there to help divide the responsibility for writing out these files between different agents, be they political or software ones? Are there other solutions I am missing?

Tuesday Dec 05, 2006

15 million foaf files

Live Journal produces 15 million foaf files, Anil Dash told me at the GNoTE conference yesterday.

That's 15 million rdf files ready to be used by a Semantic Address Book. I am thinking of writing one. Just drag and drop a foaf url (like mine) onto such an address book, and presto, all the fields would get filled in, including images, geo location information, and friends. The advantage over a simple vcard? An address book could poll a foaf file at regular intervals to keep up to date with changes to your friends' foaf files: find out if they have started writing any new blogs, started some new projects, changed house or telephone number, moved, married, ... That should be a killer app. It would also be easy to update one's file: a simple ftp or HTTP PUT to a web server, and your friends and business partners would be able to keep in sync with your contact info.
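The polling idea boils down to comparing what a friend's foaf file said last time with what it says now. Here is a small Python sketch of that comparison, assuming the address book has already fetched and parsed both versions into sets of triples; the fetching, the parsing and the phone numbers are all left out or made up.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def diff_snapshots(old: Set[Triple], new: Set[Triple]) -> Dict[str, Set[Triple]]:
    """What appeared and what disappeared between two polls of a foaf file."""
    return {"added": new - old, "removed": old - new}


# Two hypothetical polls of the same friend's file.
last_week = {
    ("#me", "foaf:name", "Alice"),
    ("#me", "foaf:phone", "tel:+1-555-0100"),
}
today = {
    ("#me", "foaf:name", "Alice"),
    ("#me", "foaf:phone", "tel:+1-555-0199"),
    ("#me", "foaf:weblog", "http://example.org/blog"),
}

changes = diff_snapshots(last_week, today)
```

An address book could then show only the added and removed statements, rather than asking the user to look through the whole entry again.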

Thursday Aug 03, 2006

I have a web 3.0 name !

Given the time I have been speaking about foaf, and given that I even went to the Foaf Galway conference a couple of years ago, it's long been overdue for me to put together my foaf file.

So today I have. I have now left behind my rusty old human name, and assigned myself a Web 3.0 name. I am http://bblfish.net/people/henry/card#me :-). Friends can call me bblfish:me .

Notes
  • that file is served by default as rdf/xml. But if you ask for it to be served as N3 then you will get my original hand written version:
    
    curl -L -H "Accept: text/rdf+n3" http://bblfish.net/people/henry/card
    
    
  • I should point out that you don't have to give yourself a URL. That is just something Tim Berners-Lee recommended recently, to make it easier to find information on the semantic web. You can also choose to be identified by one of the inverse functional properties, such as your email address or home page. This is how I identify Simon Phipps in the foaf file. He has the inverse functional relation foaf:mbox_sha1sum to the string "ee513cd82fea84825b803a44228fd9b765baf6d5".
  • A slightly tricky thing is knowing how the directory structure of your ftp server relates to the mapping of your web server. In my case for example I had to place my file at /usr/home/hjs/www/htdocs/people/henry on bblfish.net for it to appear at http://bblfish.net/people/henry/card. Clearly if this is going to get popular it will be important to use a RESTful protocol such as WebDav or Atom APP to hide all this complexity from the end user.
  • Oh, and of course to get the nice HTTP magic, I just followed the Best Practice Recipes for Publishing RDF Vocabularies, though perhaps it was not strictly necessary to work so hard at that. (I just did not want to have a name such as ...foaf.rdf#me, which would have tied my name a little too closely to the rdf/xml representation.)

    Again, as mentioned in the previous point, using WebDav or Atom APP would really simplify the task of publishing such files. One just needs to specify the mime type of the application during the HTTP PUT or POST operation, instead of having to do the following...

    Below is the .htaccess file I am using inspired by the above best practices guide.

    > cat .htaccess 
    # Turn off MultiViews
    Options -MultiViews
    
    # Directive to ensure *.rdf files served as appropriate content type,
    # if not present in main apache config
    AddType "application/rdf+xml" .rdf
    AddType "text/rdf+n3; charset=utf-8" .n3
    
    
    # Rewrite engine setup
    RewriteEngine On
    RewriteBase /people/henry
    
    # Rewrite rule to serve HTML content from the vocabulary URI if requested
    #RewriteCond %{HTTP_ACCEPT} text/html [OR]
    #RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml [OR]
    #RewriteCond %{HTTP_USER_AGENT} ^Mozilla/.*
    #RewriteRule ^card$ card.rdf [R=303]
    
    # Rewrite rule to serve N3 content from the vocabulary URI if requested
    RewriteCond %{HTTP_ACCEPT} text/rdf\+n3
    RewriteRule ^card$ card.n3 [R=303]
    
    
    # Rewrite rule to serve RDF/XML content from the vocabulary URI if requested
    RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
    RewriteRule ^card$ card.rdf [R=303]
    
    
    # Rewrite rule to redirect foaf to card. timbl has me down as card
    RewriteRule ^foaf$ card [R=303]
    
    # Choose the default response
    # ---------------------------
    
    # Rewrite rule to serve the RDF/XML content from the vocabulary URI by default
    RewriteRule ^card$ card.rdf
    
    # Rewrite rule to serve HTML content from the vocabulary URI by default (disabled)
    # (To enable this option, uncomment the rewrite rule below, and comment
    # out the rewrite rule directly above)
    # RewriteRule ^example3$ example3-content/2005-10-31.html [R=303]
    
    
  • Notice that my name is http://bblfish.net/people/henry/card#me, but the document you get when you click on the link is either http://bblfish.net/people/henry/card.n3 or http://bblfish.net/people/henry/card.rdf. I am not a document.
  • Sorry, I have not placed as many relations to people I know as I should in my foaf file. That's quite a lot of work. I'll be doing that next.
  • Now for the fun. My URL works with Tim Berners-Lee's Tabulator (after setting the Firefox security preferences, as explained in the "Help" section). By highlighting the latitude and longitude columns, then clicking save current query, one can get locations to appear on the Maps tab! Neat! I have placed an image online here for those who just want a quick impression of what it does.
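For those wondering where a foaf:mbox_sha1sum string like the one in the notes above comes from: it is the SHA1 hash of the mailbox URI, including the mailto: prefix. A minimal Python sketch, using a made-up address (the helper name is mine, not part of any library):

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """SHA1 hex digest of the mailto: URI, as used for foaf:mbox_sha1sum."""
    uri = email if email.startswith("mailto:") else "mailto:" + email
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


# Hypothetical address; the result is a 40 character hexadecimal string.
digest = mbox_sha1sum("alice@example.org")
```

Two foaf files that carry the same digest for a person can be taken to describe the same person, without either of them publishing the email address itself.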
