The Sun BabelFish Blog

Don't panic !

Tuesday Aug 29, 2006

What Atom is all about: resource metadata

Close to a year ago I posted a quick note on the atom-syntax mailing list highlighting the very strong relation between Atom and our well-known file systems. I thought I might as well republish it here, as it has aged well.

REST does not have the notion of a directory hierarchy. You may think it does because URLs have slashes in them, and because you can browse a file system in a web browser. But really you are just lucky that most people use hierarchical file systems to construct their URLs. The page you get back when you point your browser to something like http://bblfish.net/bloged/ is a human-readable HTML representation of the contents of a directory.

What is a feed? It is a machine readable representation of the RESTful equivalent of a directory listing.
What is an entry? It is the machine readable representation of the RESTful equivalent of file metadata.

Directories contain files. Feeds contain entries. Files contain 'content'. Entries contain content too. When you type 'ls -ali' in a directory

bash-2.03$ ls -ali
total 61
2705932 drwxr-xr-x   3 hjs  vuser    512 Nov 22 16:10 ./
2662433 drwxr-xr-x  34 hjs  vuser   1024 Nov 22 16:10 ../
2705933 -rw-r--r--   1 hjs  vuser   2059 Nov 22 16:10 BlogEd.jnlp
2705934 -rw-r--r--   1 hjs  vuser  49674 Nov 22 16:10 BlogEd.tiff
2705935 drwxr-xr-x   2 hjs  vuser   1024 Nov 22 16:10 lib/

you get all kinds of metadata about the directory '.', and about the files in the directory. The first number is the inode, which is a little like the src url of the content.

The equivalence between the above and a basic Atom feed is striking:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
       <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

  <entry>
       <title>Atom-Powered Robots Run Amok in France</title>
       <link href="http://example.org/2003/12/13/atom03"/>
       <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
       <updated>2005-11-13T18:30:02Z</updated>
       <summary>Some text.</summary>
       <content type="xhtml" xml:lang="en"
          xml:base="http://diveintomark.org/">
          <div xmlns="http://www.w3.org/1999/xhtml">
             <p><i>After a crash course in french cultural integration...</i></p>
          </div>
       </content>
  </entry>
</feed>

The information is just presented differently. The XML makes it easier for machines to parse the content, as it deals with all the problems of Unicode and of separating the data. Yes, you can use sed and grep in Unix, but you will always come across weird corner cases, such as a file name that contains a blank space or non-ASCII characters.

Entries and Feeds are therefore both containers. In Unix we have known this all along: everything is a file. It is just that directories are special kinds of files that themselves contain files. So, as I have argued for a long time, Feeds are really types of Entries.
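The "Feeds are types of Entries" claim can be sketched as a small type hierarchy. This is only an illustrative model, not something the Atom format itself defines: the class names, fields, and `urn:example:` ids below are assumptions made for the example. A Feed is an Entry that also lists other entries, so a nested feed plays the role of a subdirectory and one recursive walk covers both.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Entry:
    """Metadata about one resource, like the stat record of a file."""
    id: str
    title: str
    updated: str
    content: str = ""


@dataclass
class Feed(Entry):
    """An Entry that also lists other entries, as a directory lists files."""
    entries: List[Entry] = field(default_factory=list)


def walk(entry: Entry, depth: int = 0) -> Iterator[Tuple[int, Entry]]:
    """Yield (depth, entry) pairs depth-first, much as `find` walks a tree."""
    yield depth, entry
    if isinstance(entry, Feed):
        for child in entry.entries:
            yield from walk(child, depth + 1)


if __name__ == "__main__":
    stamp = "2005-11-22T16:10:00Z"  # made-up timestamp for the example
    lib = Feed("urn:example:lib", "lib/", stamp)
    root = Feed("urn:example:root", "./", stamp, entries=[
        Entry("urn:example:jnlp", "BlogEd.jnlp", stamp),
        Entry("urn:example:tiff", "BlogEd.tiff", stamp),
        lib,
    ])
    for depth, item in walk(root):
        print("  " * depth + item.title)
```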

Feeds can have images (icons) associated with them. In most end-user file systems, so can folders. The Atom group is working on adding the ability to attach icons and images to entries too; then the symmetry will be complete.

In sum, Entries are really containers for file metadata. Just as OS X is slowly moving to enable file system metadata (see the Ars Technica article),

% xattr --set name John file
% xattr --set color red file

% xattr --list file
file
        color   red
         name   John

so, using the link relation, one can add all types of metadata to feeds or entries:

<entry>
   <title>Atom draft-07 snapshot</title>
   <link rel="alternate" type="text/html"
         href="http://example.org/2005/04/02/atom"/>
   <link rel="enclosure" type="audio/mpeg" length="1337"
         href="http://example.org/audio/ph34r_my_podcast.mp3"/>
   <id>tag:example.org,2003:3.2397</id>
    ...
</entry>

The value of the rel attribute is in fact a URL, as befits a format for web documents, so people can add whatever metadata they like there.
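The reason short names such as "alternate" and "enclosure" still count as URLs is spelled out in RFC 4287: a registered name is equivalent to the IRI obtained by appending it to http://www.iana.org/assignments/relation/, and a missing rel attribute is read as "alternate". The sketch below applies that rule to the links of an entry. The function names and the third link, with its `http://example.org/rel/color` relation, are hypothetical and only there to show a custom relation passing through unchanged.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
IANA_REL_BASE = "http://www.iana.org/assignments/relation/"


def expand_rel(rel=None):
    """Return the full IRI for a rel value, following RFC 4287 section 4.2.7.2."""
    if not rel:
        rel = "alternate"  # a missing rel attribute means "alternate"
    if ":" in rel:
        return rel  # already an IRI, for example a custom relation
    return IANA_REL_BASE + rel  # a registered short name


def links_by_relation(entry_xml):
    """Map each expanded relation IRI to the list of hrefs that use it."""
    result = {}
    for link in ET.fromstring(entry_xml).findall(ATOM + "link"):
        result.setdefault(expand_rel(link.get("rel")), []).append(link.get("href"))
    return result


# The first two links come from the entry above; the third is a made-up
# custom relation, playing the role of `xattr --set color red file`.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link rel="alternate" type="text/html"
        href="http://example.org/2005/04/02/atom"/>
  <link rel="enclosure" type="audio/mpeg" length="1337"
        href="http://example.org/audio/ph34r_my_podcast.mp3"/>
  <link rel="http://example.org/rel/color"
        href="http://example.org/colors/red"/>
</entry>"""

if __name__ == "__main__":
    for relation, hrefs in links_by_relation(SAMPLE_ENTRY).items():
        print(relation, hrefs)
```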

By adding a "feed" link relation one allows feeds to point to feeds, which is the equivalent of directories inside directories.

One improvement of Atom over directories is that a feed can contain not just the current version of an entry but all previous versions as well, a feature that, if I remember correctly, the VMS file system supported.
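As a rough sketch of how a reader of such a feed might recover the version history: RFC 4287 says that entries sharing the same atom:id within one feed document represent the same entry, so grouping by id and sorting by atom:updated gives the versions oldest first. The sample feed, its second timestamp, and its titles are made up for the example, and the function names are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime

ATOM = "{http://www.w3.org/2005/Atom}"


def parse_updated(text):
    """Parse an atom:updated value into a timezone-aware datetime."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def versions_by_id(feed_xml):
    """Map each atom:id to its (updated, title) versions, oldest first."""
    history = defaultdict(list)
    for entry in ET.fromstring(feed_xml).findall(ATOM + "entry"):
        updated = parse_updated(entry.findtext(ATOM + "updated"))
        history[entry.findtext(ATOM + "id")].append(
            (updated, entry.findtext(ATOM + "title"))
        )
    for versions in history.values():
        versions.sort()
    return dict(history)


# Two versions of one entry; the later version is listed first on purpose.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Robots Run Amok (revised)</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2005-11-14T09:00:00Z</updated>
  </entry>
  <entry>
    <title>Robots Run Amok</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2005-11-13T18:30:02Z</updated>
  </entry>
</feed>"""

if __name__ == "__main__":
    for entry_id, versions in versions_by_id(SAMPLE_FEED).items():
        print(entry_id)
        for updated, title in versions:
            print("  ", updated.isoformat(), title)
```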

Of course, the URI id construct, together with the fact that the web does not have partitions, allows one to keep track of the identity of a file over the whole web, which simple inodes just cannot do.
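To see why inodes fall short, note that an inode number is only unique within one file system, so even local identity needs the device number alongside it. The sketch below (the function name is made up for the example) shows that the pair recognizes one file under two spellings of its path and tells two files apart, but it says nothing about a file on another machine, which is where a URI id helps.

```python
import os
import tempfile


def file_identity(path):
    """Return (device, inode), the pair that identifies a file on one machine."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "a.txt")
        second = os.path.join(tmp, "b.txt")
        for path in (first, second):
            with open(path, "w") as handle:
                handle.write("x")
        # Same file reached through a different spelling of its path.
        alias = os.path.join(tmp, ".", "a.txt")
        print(file_identity(first) == file_identity(alias))
        print(file_identity(first) == file_identity(second))
```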

Not surprisingly, then, someone has created an app filesystem. You can now mount your Atom collection on the command line!
