The Sun BabelFish Blog
Don't panic !
Java Annotations & the Semantic Web
Intro
The topic of annotations has been making headlines in the blogosphere [1] [2], so this is probably a good time to write these thoughts down. As a side note, I have already implemented something along these lines in the current version of BlogEd; this is a generalisation of, and an improvement on, that initial work. What I want to show here is how annotations can make the relation between Java and the Semantic Web obvious to any Java programmer.
A very quick intro to the Semantic Web
If I can put it really concisely, the semantic web is best thought of as a mathematical structure based on graph theory. It is really very easy to understand. It reduces everything to triples [Subject Relation Object]. The Relation is always identified by a URI. The Subject is identified either by a URI or by an unnamed (blank) node. The Object can be a URI, a blank node or a literal, which is a string or an XSD-typed value; dates, for example, are good candidates for literals. You can express anything with this, so mathematicians have proven, apparently. A quick example:
_X rdf:type foaf:Person.
_X foaf:name "Henry Story".
_X foaf:mbox mailto:hjs@bblfish.net.
The above uses the very useful foaf
vocabulary to say that there is an entity _X (an anonymous node) that
is a Person, has a name "Henry Story" and a particular mailbox. Notice
that vocabularies can be easily mixed. So I could use some geolocation
vocabulary to specify where I am at a particular time. Simple. Even more
so when one presents it as a graph.
The semantic web can be serialised in many ways. The unfortunate RDF/XML format is one way. Much closer to the structure of the model is N-Triples; easier still for humans to read are Turtle and N3. But for those of us who program in Java, a Java serialisation would be best of all. All of them are ways of describing the above graph.
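To make the triple structure concrete, here is a minimal sketch of the three foaf statements above held as plain Java objects. The class names Triple and TripleGraph are made up for this illustration, and blank nodes, URIs and literals are all kept as strings rather than modelled properly.

```java
import java.util.ArrayList;
import java.util.List;

/** One statement: subject, relation and object, all kept as plain strings here. */
final class Triple {
    final String subject;   // a URI or a blank node label such as "_:x"
    final String relation;  // always a URI
    final String object;    // a URI, a blank node label, or a quoted literal

    Triple(String subject, String relation, String object) {
        this.subject = subject;
        this.relation = relation;
        this.object = object;
    }

    /** Renders the statement as one N-Triples-like line. */
    @Override
    public String toString() {
        return subject + " " + relation + " " + object + " .";
    }
}

/** A graph is nothing more than a collection of triples. */
final class TripleGraph {
    private final List<Triple> triples = new ArrayList<>();

    void add(String subject, String relation, String object) {
        triples.add(new Triple(subject, relation, object));
    }

    int size() {
        return triples.size();
    }

    /** Returns the objects of every triple matching the given subject and relation. */
    List<String> objectsOf(String subject, String relation) {
        List<String> result = new ArrayList<>();
        for (Triple t : triples) {
            if (t.subject.equals(subject) && t.relation.equals(relation)) {
                result.add(t.object);
            }
        }
        return result;
    }

    /** Builds the three foaf statements from the example above. */
    static TripleGraph example() {
        String foaf = "http://xmlns.com/foaf/0.1/";
        TripleGraph g = new TripleGraph();
        g.add("_:x", "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>", "<" + foaf + "Person>");
        g.add("_:x", "<" + foaf + "name>", "\"Henry Story\"");
        g.add("_:x", "<" + foaf + "mbox>", "<mailto:hjs@bblfish.net>");
        return g;
    }
}
```

Asking TripleGraph.example() for the objects of "_:x" under the foaf name relation returns the single literal "Henry Story". Mixing in another vocabulary is just a matter of adding triples whose relation URIs come from a different namespace.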
Java as a serialisation of the Semantic Web
I'll argue, perhaps a little provocatively, that Java, or at least a superset of JavaBeans, can be thought of as a serialisation of RDF. RDFS [3] gives us a vocabulary to describe object oriented structures such as those given by Java Beans. It has predicates to specify that one URI is a class and that another is a relation. You can specify the domain and the range of a relation. This is all that is needed to describe a Java Bean.
Here is a first attempt [6] at annotating the AtomPerson class (I have just found a bug in BlogEd so I can't write the less than and greater than symbols, which I have replaced by ≤ and ≥):
@RDF(AtomPerson.BASE+"Person")
public interface AtomPerson {
String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";
@RDF(BASE+"name")
public void setName(String name);
public String getName();
@RDF(BASE+"email")
public void addEmail(URI email);
public Collection≤URI≥ getEmails();
@RDF(BASE+"dateOfBirth")
public void setDateOfBirth(Date birth);
public Date getDateOfBirth();
}
So here we annotate the class and the bean methods with URIs. Having done
this we could then automatically serialise the above Java Bean in any of
the other RDF formats (once one agrees to the correct serializations for
some of the primitive java types, such as int, or Date). That is really
it. That's how easy RDF is.
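As a rough sketch of what "automatically serialise" could mean, the following declares an @RDF annotation type and walks an annotated bean with reflection. The declaration of @RDF is an assumption, since the post only shows its use; SimplePerson and BeanSerialiser are hypothetical names, and every value is written as a plain string literal with no escaping and no typed literals.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

/** Assumed shape of the annotation: it ties a class or bean method to a URI. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface RDF {
    String value();
}

/** A cut-down, concrete variant of the AtomPerson bean with only a name property. */
@RDF(SimplePerson.BASE + "Person")
class SimplePerson {
    static final String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";
    private String name;

    @RDF(BASE + "name")
    public void setName(String name) { this.name = name; }
    public String getName() { return name; }
}

final class BeanSerialiser {
    /** Writes an rdf:type line for the class, then one line per annotated property that has a value. */
    static String toTriples(Object bean, String nodeLabel) {
        Class<?> type = bean.getClass();
        StringBuilder out = new StringBuilder();
        RDF classUri = type.getAnnotation(RDF.class);
        if (classUri != null) {
            out.append(nodeLabel)
               .append(" <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <")
               .append(classUri.value()).append("> .\n");
        }
        // Collect values sorted by property URI so the output does not depend on reflection order.
        Map<String, Object> values = new TreeMap<>();
        try {
            for (Method setter : type.getMethods()) {
                RDF property = setter.getAnnotation(RDF.class);
                if (property == null || !setter.getName().startsWith("set")) {
                    continue;
                }
                // setName -> getName: the getter supplies the value for the annotated setter.
                Method getter = type.getMethod("get" + setter.getName().substring(3));
                Object value = getter.invoke(bean);
                if (value != null) {
                    values.put(property.value(), value);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("bean does not follow the get/set convention", e);
        }
        for (Map.Entry<String, Object> e : values.entrySet()) {
            out.append(nodeLabel).append(" <").append(e.getKey()).append("> \"")
               .append(e.getValue()).append("\" .\n");
        }
        return out.toString();
    }
}
```

Calling BeanSerialiser.toTriples(person, "_:x") on a SimplePerson whose name has been set yields one type statement and one name statement. A real serialiser would also have to settle how int, Date and the other Java types map to literals, which is exactly the agreement mentioned above.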
The above gives us the semantics for RDFS [3]. OWL [4] gives us more power, and brings us closer to (or even beyond I am not sure) what we can get with UML class diagrams. OWL allows us to specify the following properties of relations:
- functional: if A rel B and A rel C then B == C
- inverse functional: if A rel C and B rel C then A == B
- the max number of values for an instance and the min number. This is useful for the add... type beans (the setXXX methods clearly have at most one value).
- transitivity: if A rel B and B rel C then A rel C
- symmetry: if A rel B then B rel A
@RDF(AtomPerson.BASE+"Person")
public interface AtomPerson {
String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";
@RDF(BASE+"name")
public void setName(String name);
public String getName();
@RDF(BASE+"email") @InverseFunctional
public void addEmail(URI email);
public Collection≤URI≥ getEmails();
@RDF(BASE+"dateOfBirth") @Cardinality(max=1,min=1)
public void setDateOfBirth(Date birth);
public Date getDateOfBirth();
@RDF(BASE+"sibling") @Symmetric @Transitive
public void addSibling(AtomPerson sibling);
public Collection≤AtomPerson≥ getAllSiblings();
}
So the sibling relation (uniquely identified by the https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/sibling
URI and by the addSibling and getAllSiblings methods) is clearly
- symmetric: because if Christina is my sibling then I am her sibling
- transitive: because if Alex is a sibling of Christina then Alex is also my sibling
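Here is a small sketch of how a framework might act on such markers. The annotation declarations are assumptions (the post only shows them in use), HasSiblings and RelationStore are hypothetical names, and people are identified by plain strings to keep the example short.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

@Retention(RetentionPolicy.RUNTIME) @interface Symmetric {}
@Retention(RetentionPolicy.RUNTIME) @interface Transitive {}

/** Hypothetical bean fragment carrying only the sibling relation. */
interface HasSiblings {
    @Symmetric @Transitive
    void addSibling(String sibling);
}

/** Stores asserted pairs of one relation and answers queries according to its declared properties. */
final class RelationStore {
    private final Map<String, Set<String>> edges = new HashMap<>();
    private final boolean symmetric;
    private final boolean transitive;

    /** Reads the marker annotations from the method that declares the relation. */
    RelationStore(Method declaringMethod) {
        this.symmetric = declaringMethod.isAnnotationPresent(Symmetric.class);
        this.transitive = declaringMethod.isAnnotationPresent(Transitive.class);
    }

    static RelationStore forSiblings() {
        try {
            return new RelationStore(HasSiblings.class.getMethod("addSibling", String.class));
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Records "a rel b", and "b rel a" as well when the relation is symmetric. */
    void assertPair(String a, String b) {
        edges.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        if (symmetric) {
            edges.computeIfAbsent(b, k -> new HashSet<>()).add(a);
        }
    }

    /** True if "a rel b" was recorded or, for a transitive relation, follows from a chain of records. */
    boolean holds(String a, String b) {
        if (!transitive) {
            return edges.getOrDefault(a, Collections.emptySet()).contains(b);
        }
        // Breadth-first search: b must be reachable from a in one or more steps.
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(a);
        while (!queue.isEmpty()) {
            for (String next : edges.getOrDefault(queue.poll(), Collections.emptySet())) {
                if (next.equals(b)) {
                    return true;
                }
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return false;
    }
}
```

After asserting only that Henry is a sibling of Christina and that Alex is a sibling of Christina, the store also answers yes for (Christina, Henry) and (Alex, Henry). One thing the sketch makes visible: taken literally, symmetry plus transitivity also makes Henry his own sibling, so a real vocabulary would need to say something more about that case.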
What does it give us?
- For one we can see how easy it is to understand OWL :-) Please mail me if you still have not understood at henry.story at bblfish.net. I need to know how I can improve this overview. :-)
- This should provide a much more generalised way of mapping JavaBeans to a database. Using URIs for beans is a lot better than using table names. URIs are *Universal*. SQL table layouts are different from one database to the next. It should be up to the database administrator alone to set the mapping from RDF into his private internal scheme. We have done here all the mapping that needs to be done. This has been argued very clearly by the Model Driven Architecture crowd [7]. But I think the Semantic Web adds a generality to the whole concept that simplifies the problem, standardises it with web technologies, and thereby makes it far more accessible than the Model Driven Architecture could make their framework. Furthermore the above annotation scheme seems a lot simpler than the JMI spec.
- We can implement our beans by using dynamic proxies or other aspect oriented techniques to have our classes mirror an RDF or any normal relational (SQL) database.
- The above now maps very easily into UML class diagrams (which one can argue are just another notation for OWL)
- We can annotate normal classes too (with some questions as to how the behavior is to be understood)
- others?
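The dynamic proxy point can be sketched with the standard java.lang.reflect.Proxy class. Everything else here is an assumption made for illustration: the @RDF declaration, the names NamedThing, RdfBeanHandler and BeanFactory, and the choice to annotate both accessors so that the handler needs no name matching. The "database" is just an in-memory map keyed by property URI.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

/** Assumed shape of the annotation used throughout this post. */
@Retention(RetentionPolicy.RUNTIME)
@interface RDF {
    String value();
}

/** Cut-down bean interface; both accessors carry the property URI. */
interface NamedThing {
    String BASE = "https://bloged.dev.java.net/Ontologies/Atom/2005-01-03/";

    @RDF(BASE + "name") void setName(String name);
    @RDF(BASE + "name") String getName();
}

/** Backs every annotated accessor with a map from property URI to value. */
final class RdfBeanHandler implements InvocationHandler {
    private final Map<String, Object> valuesByUri = new HashMap<>();

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        // equals, hashCode and toString also arrive here, so give them sensible defaults.
        if (method.getDeclaringClass() == Object.class) {
            switch (method.getName()) {
                case "equals": return proxy == args[0];
                case "hashCode": return System.identityHashCode(proxy);
                default: return "RdfBean" + valuesByUri;
            }
        }
        RDF property = method.getAnnotation(RDF.class);
        if (property == null) {
            throw new UnsupportedOperationException(method.getName());
        }
        if (method.getName().startsWith("set")) {
            valuesByUri.put(property.value(), args[0]);
            return null;
        }
        return valuesByUri.get(property.value());
    }
}

final class BeanFactory {
    /** Creates an object implementing the given annotated interface without any hand-written class. */
    static <T> T create(Class<T> beanInterface) {
        Object proxy = Proxy.newProxyInstance(
                beanInterface.getClassLoader(),
                new Class<?>[] {beanInterface},
                new RdfBeanHandler());
        return beanInterface.cast(proxy);
    }
}
```

With this in place, BeanFactory.create(NamedThing.class) returns a working bean: setName stores the value under the name URI and getName reads it back. Swapping the map for calls into an RDF store or an SQL mapping would change only the handler, not the interface.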
Backward compatibility
I'll argue that all java beans by default already work this way. All we need to do is give every java class a URI. Luckily that is easy, as a unique package naming mechanism was built right into java from the outset. Let us invent a URI scheme for java classes to make this more obvious. Under such a scheme
java:com.sun.labs.tools.blog.AtomPerson,1.2/
would be the URI of the current class [6] and the name relation could be identified via the URI
java:com.sun.labs.tools.blog.AtomPerson,1.2/name
So any java bean can already be transformed into RDF. The RDF annotation we developed above could then be seen to enable us to override the default URI for the class, interface or property.
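A minimal sketch of such a default mapping follows. The helper class JavaUris is hypothetical, and so is the decision to pass the version in as a parameter, since nothing above says where the "1.2" would come from.

```java
/** Derives default URIs in the invented java: scheme from class and accessor names. */
final class JavaUris {
    /** Builds "java:<fully qualified class name>,<version>/". */
    static String classUri(Class<?> type, String version) {
        return "java:" + type.getName() + "," + version + "/";
    }

    /** Strips the bean prefix and lowercases the first letter: setName or getName gives "name". */
    static String propertyName(String accessorName) {
        // "getAll" is tried before "get" so that getAllMbox gives "mbox" and not "allMbox".
        for (String prefix : new String[] {"getAll", "set", "get", "add"}) {
            if (accessorName.startsWith(prefix) && accessorName.length() > prefix.length()) {
                String rest = accessorName.substring(prefix.length());
                return Character.toLowerCase(rest.charAt(0)) + rest.substring(1);
            }
        }
        return accessorName;
    }

    /** Builds the default URI of a property from its class and one of its accessors. */
    static String propertyUri(Class<?> type, String version, String accessorName) {
        return classUri(type, version) + propertyName(accessorName);
    }
}
```

For a class named com.sun.labs.tools.blog.AtomPerson and version "1.2", propertyUri with the accessor setName gives the name URI shown above. The sketch does nothing about plural accessor names, so addEmail and getEmails would map to two different properties unless the naming is kept consistent.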
Possible simplification
In the above I have only annotated the setter method. One could also annotate the getter, adder, getAll methods or even a field. This ends up creating too many places for annotations, I think. Is there a standard solution for this? One option [8] would be to annotate just the field:
@RDF(AtomPerson.BASE+"Person")
public class AtomPerson {
@Access(read | write) @RDF(BASE+"name")
@Cardinality(max=1,min=1) String name;
@Access(read | write) @RDF(BASE+"email")
@InverseFunctional URI email;
@Access(read | write) @RDF(BASE+"dateOfBirth")
@Cardinality(max=1,min=1)
Date birth;
@Access(read | write) @RDF(BASE+"sibling")
@Symmetric @Transitive AtomPerson sibling;
}
And apparently we can then use apt
to create the getters, setters, adders, etc. Getters would be
produced only for fields with read access and setters only for fields with
write access, clearly. This is nice and terse. It will probably be used to
generate the more verbose version we had previously, so this would be
mostly something one could do for convenience. Still, there may be many
cases when this is all that is needed.
More Advanced Java Beans
In the above I have been working with a slightly more developed version of Java Beans, for which we clearly need to develop a . Java Beans don't distinguish very well between a thing that is related to a collection
a R (c,d,e,f)
and a thing having numerous relations of the same type
a R c, a R d, a R e, a R f
In Java Beans there are just setters and getters. RDF allows one to distinguish between those cases. So our more advanced java beans would need
addRelation(Object) and getAllRelation() type methods
to allow us to add single relation instances to our graph. (Perhaps we can
distinguish these cases simply by using an annotation.) There is in fact a
pragmatic reason why one may want to have an adder method. Sometimes I am
sure it would be a lot faster to do an add operation to a database rather
than having to fetch all the elements in order to add one more element to
the collection. So there may also be some efficiency reasons for doing
this apart from allowing us to map better to RDF and UML.
Combined Inverse Functional Properties
One of the uses of Inverse Functional Properties is that they serve as hints to the database that the value is what the well established relational database world (note that RDF is also all about relations) calls a primary key. Given an e-mail address, for example, you can search the [email inverse(mbox) person] table to find the Person that is identified by the email. inverse(X) is a function that gives us the inverse of a relation. And so inverse(mbox) is a functional relation, since mbox is an inverse functional one (duh!). But how do we deal with compound keys? I found these popping up all over the place in my work on BlogEd.
Compound keys are what I have named Combined Inverse Functional Properties (CIFP). They state that two objects, when known together, identify something uniquely. Let us imagine a world, a kind of Swiss cloud cuckoo land, where by design no two people ever have the same first name and surname. The first name and surname combination always identifies one and only one person. In such a world one could say that the relation from a person to the pair (first name, surname) is inverse functional. Call that relation the fullname relation. How can we now use annotations to specify such a relation?
Well perhaps the following will do the trick. As hinted above the relation we are looking for is towards an ordered pair. And how do we specify an ordered pair in java? With arguments! The arguments of a method are just a bit of syntactic sugar to help us specify a relation to a pair.
@RDF(FoafPerson.BASE + "Person")
interface FoafPerson {
String BASE = "http://xmlns.com/foaf/0.1/";
@RDF(BASE+"weblog") void addWeblog(URI uri);
@RDF(BASE+"weblog") Collection≤URI≥ getWeblogs();
@RDF(BASE+"surname")
@Functional String getSurname();
@RDF(BASE+"surname") void setSurname(String surname);
@RDF(BASE+"firstName") String getFirstName();
@RDF(BASE+"firstName") void setFirstName(String firstName);
@RDF(BASE+"fullName") String[] getFullName();
@RDF(BASE+"fullName") @InverseFunctional
void setFullName(@RDF(BASE+"firstName") String firstName,
@RDF(BASE+"surname") String surname);
@RDF(BASE+"mbox")
@InverseFunctional void addMbox(URI mbox);
@RDF(BASE+"mbox") Collection≤URI≥ getAllMbox();
}
So here we have the setFullName relation, which is a relation to a pair. In
the arguments we specify further how each element of the pair is itself
related to the object in question, by annotating the arguments themselves.
This I believe gives us exactly what I was looking for.
So as this is getting just a little complex, let us put together an example of how this might work in practice. Let us imagine that we have a framework that maps our annotated interface to a deductive database. We might then have the following behavior.
FoafPerson me = factory.create(FoafPerson.class);
me.setFirstName("Henry");
me.setSurname("Story");
me.addMbox(URI.create("mailto:hjs@bblfish.net"));
//let us create another person
FoafPerson someone = factory.create(FoafPerson.class);
someone.setFirstName("Henry");
assert(!someone.equals(me)); //yes. They should be different. There could be another person named "Henry"
//now let us add the family name
someone.setSurname("Story");
assert(someone.equals(me));
Well perhaps we don't even need an inferencing backend layer to deduce the
above. That could be done with some very simple java. We might find the
inferencing helpful if we wanted the following behavior to follow:
assert("mailto:hjs@bblfish.net".equals(someone.getAllMbox().iterator().next().toString()));
The idea is that as soon as the object knows both my names it would deduce
that the two variables me and someone refer
to the same person and therefore know that all the other properties are
the same too. Notice that this does not require some black magic
inferencing layer, just some pretty standard, simple inferencing on the
equality of objects.
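One way to get this behavior with plain Java is sketched below. It is not how BlogEd does it: PersonState, PersonFactory and PersonHandle are hypothetical names, the first name and surname are set separately rather than through setFullName, and the key is never updated if a name is later changed.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** The state of one real-world person; several handles may end up sharing it. */
final class PersonState {
    String firstName;
    String surname;
    final Set<URI> mboxes = new LinkedHashSet<>();
}

/** Hypothetical factory that also remembers which state each full name identifies. */
final class PersonFactory {
    private final Map<String, PersonState> byFullName = new HashMap<>();

    PersonHandle create() {
        return new PersonHandle(this);
    }

    /** Once both names are known, returns the state already registered under them, if any. */
    PersonState resolve(PersonState state) {
        if (state.firstName == null || state.surname == null) {
            return state;  // the combined key is incomplete, nothing can be deduced yet
        }
        String key = state.firstName + "\u0000" + state.surname;
        PersonState known = byFullName.putIfAbsent(key, state);
        if (known == null || known == state) {
            return state;
        }
        known.mboxes.addAll(state.mboxes);  // keep what the newcomer already knew
        return known;
    }
}

/** What the application holds on to; equality is decided by the shared state. */
final class PersonHandle {
    private final PersonFactory factory;
    private PersonState state = new PersonState();

    PersonHandle(PersonFactory factory) {
        this.factory = factory;
    }

    void setFirstName(String firstName) {
        state.firstName = firstName;
        state = factory.resolve(state);
    }

    void setSurname(String surname) {
        state.surname = surname;
        state = factory.resolve(state);
    }

    void addMbox(URI mbox) {
        state.mboxes.add(mbox);
    }

    Set<URI> getAllMbox() {
        return state.mboxes;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof PersonHandle && ((PersonHandle) o).state == state;
    }

    @Override
    public int hashCode() {
        return System.identityHashCode(state);
    }
}
```

Replaying the example above against this sketch, someone and me are unequal while only the first name is known and become equal as soon as the surname is set, at which point someone.getAllMbox() also contains my mailbox. Note that the hash code of a handle changes at that moment, so handles should not sit in hash-based collections while their names are still being filled in.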
Prior Work
Frank McCabe made me aware that I had not mentioned similar work in the field. Here is a list of some of the other projects that I am aware of:
- There is Jastor, which works with HP's Jena framework to create Java Beans from OWL files. The Beans created are implemented classes that make calls into the Jena framework.
- There is a very simple Elmo library that works with the Sesame framework, though I am not sure how extensible it is.
- rdfreactor [9] like bloged currently creates interfaces and uses dynamic proxy objects to implement the behavior of the interfaces at run time. The interfaces are generated from OWL/XML files. I learnt a lot from this framework.
- BlogEd 0.7 uses static final variables to annotate methods and interfaces. These interfaces are then used as with rdfreactor by a factory object to create dynamic proxy objects that wrap the behavior.
- Annotations allow one to express OWL directly in Java, which makes the relation between RDF and Java much easier and clearer for Java programmers to understand, as well as making it much easier to work with.
- Just as important, annotations allow one to separate the implementation of the behavior from its declaration. The behavior can now be coded directly into a class as with Jastor, or it can use Dynamic Proxies as rdfreactor and BlogEd are currently doing, or it could even use aspect oriented programming languages. This will allow one to write mappings from annotated java classes into RDF/OWL and vice versa without specifying the implementation. Other libraries will then more usefully specialise in interpreting the annotations in the way best suited to the database or framework used by the application. This should help interoperability between the RDF frameworks.
Conclusion
Mapping between the triples of the Semantic Web and Java Beans is really easy with annotations, which is quite weird if you think about it, since annotations are a way to add metadata to java, and the semantic web is the most general metadata framework in existence. In any case, once this relationship is understood we have gone most of the way towards laying the foundations of an open, standards based, model driven architecture, I believe. The Semantic Web builds on the most fundamental part of the Web: its universal naming scheme, exemplified by URIs. This gives us something fundamental that all the previous systems lacked or had to invent in an ad hoc manner. This should therefore get us a lot further than any previous system could. I'll be exploring this advantage further on this blog, and in my BlogEd code.
Does anyone have any feedback on this? Don't hesitate to mail your questions. I believe that if properly explained this is really not very difficult for Java programmers to take on board. So if you don't quite understand, it's my fault!
Update: thanks to feedback by Pete (UK) Kirkham and Danny Ayers.
[1] Annotations: Don't Mess with Java
[2] Annotations are the best thing that has happened to Java in a long time
[3] RDF Vocabulary Description Language 1.0: RDF Schema
[4] Web Ontology Working Group papers
[5] In the spirit of the article "Using Annotations to add Validity Constraints to JavaBeans Properties". The type of constraints given by that article, though, would seem best encapsulated in Java by classes. A Date, a social security number, and any other object is probably best identified by a class, which then will have many different serialisations depending on the locale, serialisation format, etc. So though I think the article makes for some very good reading, I don't think that it gives the best example for using annotations in java. RDF is a much more powerful and useful tool.
[6] It seems clear to me that setters and getters don't give us quite all that we want. It would be really nice if java beans also had addXXX and getAllXXX. See the section More Advanced Java Beans above.
[7] See especially the paper "Model-driven architecture: Vision, standards, and Emerging technologies" on the JMI page.
[8] See section "Constraints as Part of a Property Annotation" of [5].
[9] I just discovered that the RdfReactor team have submitted a paper to ISWC2005 where they mention that they want to use annotations in their next version. That is a very good read btw.
[10] These thoughts are now (Summer 2006) being developed on the so(m)mer project on dev.java.net.
Posted at 11:13AM Aug 25, 2005 [permalink/trackback] by Henry Story in SemWeb | Comments[6]
