The Sun BabelFish Blog
Don't panic !
REST without RDF is only half as bad as SOAP
I have been a very strong proponent of REST since I came across Roy Fielding's thesis two years ago. His thesis is an abstract description of the architecture of the web, which explains why it is so important to work with the 4 HTTP verbs GET, PUT, POST and DELETE as methods that work on resources named by URIs.
Of all the xml data formats (as opposed to markup formats such as xhtml or OpenDocument, which are not my topic here), RDF is the one that takes the lessons of REST to heart. In RDF every concept, every relation, every object has a permalink, so to speak: a fixed URI that identifies it. The R in RDF stands for Resource after all. So if you don't know the meaning of a name you can GET it.
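To make "you can GET its meaning" concrete, here is a minimal sketch in Python. It only builds the request and does not send it. The helper name `build_lookup_request` is made up for this illustration, and whether a given server actually answers with RDF is an assumption that depends on the publisher.

```python
from urllib.request import Request


def build_lookup_request(term_uri: str) -> Request:
    """Build (but do not send) a GET request asking for an RDF description of a term.

    Hypothetical helper for illustration only.
    """
    # The fragment (#Category) is never sent over HTTP: the client fetches
    # the document and then looks the fragment up inside it.
    document_uri = term_uri.split("#", 1)[0]
    return Request(
        document_uri,
        headers={"Accept": "application/rdf+xml"},
        method="GET",
    )


request = build_lookup_request(
    "http://bblfish.net/work/atom-owl/2006-06-06/#Category"
)
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

The same few lines work for any term in any RDF vocabulary, which is the point: the name itself tells you where to ask.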
But what does taking REST seriously, the way RDF does, give one? Well, it helps one avoid the minefield of complexity that is popping the SOAP bubble, and that is slowly also going to kill, or at least severely restrict interest in, the Atom Protocol. Not only does RDF take REST seriously, as mentioned; it is also underpinned by mathematical logic, in such a way that one can build formats that are extensible in a distributed way.
Being able to work in parallel in a distributed way is exactly what has
made the web such an amazing success. And here again the URI is the key
to understanding this. A URI is made up in such a way that everyone on
the web can coin their own by following a few simple rules, without
encroaching on someone else's name. For example, you can buy the domain name mydomain.com
and every string starting with http://mydomain.com/ will be under your
authority. Since RDF allows one to name everything with URIs, it
partakes of the same advantages. Couple that with a clear and simple
semantics, and everything just comes together like magic.
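The naming rule above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names `mint` and `is_under_authority` are made up, and "authority" is reduced here to comparing the host part of an http URI.

```python
from urllib.parse import urlsplit


def mint(domain: str, path: str) -> str:
    """Coin a new name under a domain you control (hypothetical helper)."""
    return f"http://{domain}/{path.lstrip('/')}"


def is_under_authority(uri: str, domain: str) -> bool:
    """True if the URI is an http URI whose host is exactly the given domain."""
    parts = urlsplit(uri)
    return parts.scheme == "http" and parts.hostname == domain.lower()


name = mint("mydomain.com", "vocab/Pig")
print(name)
print(is_under_authority(name, "mydomain.com"))
print(is_under_authority("http://mydomain.com.example.org/vocab/Pig", "mydomain.com"))
```

Nobody has to be asked for permission, and nobody else's names can collide with yours: two people minting names under two different domains can work in parallel without ever talking to each other.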
So let's look at the Atom protocol. Atom is a good example of a project that understood the mistakes of SOAP and xml rpc, and took the first stage of REST seriously: it does not wrap HTTP headers in xml, and it uses the HTTP verbs correctly. As far as that goes it is a huge improvement. But there were never enough people on that working group who understood RDF for their voice to be heard by the group. As a result we have a RESTful protocol without RDF.
And the obvious problems follow:
- An explosion of complexity:
  - mime type explosion: the atom xml format is served with the mime type "application/atom+xml", and the new protocol needs another mime type, "application/atomserv+xml". In RDF pretty much all information can be expressed with one mime type, "application/rdf+xml". If you transform atom feeds to AtomOwl you can serve the result with the default rdf mime type, or also with a more human readable one such as "text/rdf+n3".
    Where does this mime type cluttering stop? Do we need a new mime type for a document that describes pigs? And another one for a document that describes dogs? And perhaps yet another one for a document that describes Animals? And what about body parts? Or bank statements? Why does every xml format need its own mime type? Is it not all just xml? Is that not the universal format? Well, perhaps the fact that xml is pure syntax without semantics may have something to do with it...
  - xml format explosion: the atom spec needs a way to list categories, as the metaweblog api does. So it wants to create a new xml format, <categories>...</categories>, and then has to explain in English not only all the terms (as rdf would too) but also how they fit together, the special cases, and how the format can be extended. In RDF all one needs to do is explain the meaning of the terms. One does not also have to invent a semantics each time round. Because each xml format needs pretty much its own interpretation engine (or else it is just meaningless angle brackets), mime types start taking on more and more importance. After all, you want to know what you are going to GET if you can't interpret most of what is there.
    With RDF there would be no need to invent a new format for listing categories. The vocabulary already exists for it. It would be as simple as using it. Here we go:

        @prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .

        [ a :Category;
          :scheme "http://eg.com/cat/";
          :label "philosophy";
          :term "humanities/philosophy" ].

        [ a :Category;
          :scheme "http://eg.com/cat/";
          :label "blogging";
          :term "technologies/publishing" ].

- Difficulty in finding information: The atom namespace at least has pointers to the spec, which is good. But that is nowhere near as automatable as getting the meaning of an rdf name, which I described above. The meaning is spread between the English document and the RelaxNG grammar. Information about inheritance especially is left in English.
- Non automatability: Since things are less well defined, less distributed and more ad hoc, reading and interpreting a format has to be done on a case by case basis, with unnecessary human involvement at each turn (unnecessary, since clearly RDF, which also requires human involvement, won't need that kind).
- The spec is not easily extensible: there is some talk in the atom format about extensibility, but it is local to that format. So you have to learn the extensibility possibilities of atom. Other formats can and will do things differently. And as a result there will be misunderstandings for each format.
- Bureaucratic: Working with XML feels like working in a vast bureaucracy. Everything feels like a form, with special slots that can be filled in, and others that can't. The thing is unwieldy, and so one finds more and more odd behaviors appearing as people try to get around the hoops. In Atom, I'd say the new introspection document has that general smell.
- Difficult to learn: As a result of the complexity and lack of generality, people moving between even simple xml formats will find it difficult to learn each particular one. In fact it becomes increasingly impossible to master even a small fraction of them. This is one of the things that is killing SOAP. If some have similar general dreams about Atom, then they should be clear that they have exactly the same problems.
All of the above acts as a serious brake on what will be doable with the Atom Protocol. It certainly makes developing the protocol a lot harder than it should be, and as a result there are some indications that it will be less powerful than the MetaWeblog API, which I have seriously criticised.
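To illustrate what "extensible in a distributed way" buys you, here is a rough sketch in Python that treats a graph as a plain set of (subject, predicate, object) triples. It is not a real RDF library. The two categories are the ones from the N3 example above; the second graph, the `http://example.org/ext#` vocabulary and the helper names are invented for the illustration. It also assumes the blank node labels are already distinct across the two graphs, which a real merge would have to ensure by renaming.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ATOM_OWL = "http://bblfish.net/work/atom-owl/2006-06-06/#"
EXT = "http://example.org/ext#"  # hypothetical third-party vocabulary


def category(node, scheme, label, term):
    """Return the four triples that describe one AtomOwl category."""
    return {
        (node, RDF_TYPE, ATOM_OWL + "Category"),
        (node, ATOM_OWL + "scheme", scheme),
        (node, ATOM_OWL + "label", label),
        (node, ATOM_OWL + "term", term),
    }


def labels_of_categories(graph):
    """Generic query: labels of everything typed as a Category."""
    nodes = {s for (s, p, o) in graph
             if p == RDF_TYPE and o == ATOM_OWL + "Category"}
    return sorted(o for (s, p, o) in graph
                  if s in nodes and p == ATOM_OWL + "label")


# The two categories from the N3 example.
blog_graph = (
    category("_:a", "http://eg.com/cat/", "philosophy", "humanities/philosophy")
    | category("_:b", "http://eg.com/cat/", "blogging", "technologies/publishing")
)

# A graph from some other (made-up) publisher, using one extra property
# that the code above has never heard of.
other_graph = category(
    "_:x", "http://example.org/cat/", "logic", "humanities/logic"
) | {("_:x", EXT + "seeAlso", "http://example.org/logic")}

# Merging two graphs is just set union; no new format, no new parser.
merged = blog_graph | other_graph
print(labels_of_categories(merged))  # ['blogging', 'logic', 'philosophy']
```

The unknown `ext` property does no harm: code that does not understand it simply ignores that triple, and code that does can use it. No new mime type or schema was needed for the extension.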
The world is an incredibly complex place. If you don't start off with the simplest possible structures, things will very quickly get out of hand. Much more quickly than even the richest corporations can possibly afford to deal with. Mathematicians and logicians are those whose art over the years has been directed towards describing more and more complex structures in the simplest of ways. I prefer to build my house on the rock of mathematical logic rather than on the promises of a SOAP bubble.
Update
- Danny Ayers has a good post on Neo's view of xml, in which he correctly argues that all these formats present no problem for the Semantic Web, since it is dead easy to transform those xml formats into rdf. And he is totally right about that. AtomOwl provides a number of tools (xslt and xquery) to transform atom xml into AtomOwl rdf. That's not the problem. It's just that it is so tiring to see people wrangle so hard with self-imposed problems.
- Mark Baker makes a similar point but from a clearer architectural perspective in Standards as Axioms. That post also links to another post where he is more explicit about the problem of schema explosion that rdf solves.
Posted at 02:42PM Jul 10, 2006 by Henry Story in SemWeb
