Andy Tucker's Weblog
Andy Tucker's Weblog

20040803 Tuesday August 03, 2004

Door API details

I keep answering this question (or variations) in email, so I thought it might have wider interest. Plus this way I can point to the blog entry rather than repeating myself endlessly. One of the things I've worked on in the past is Solaris Doors. Doors are an inter-process communication mechanism with an RPC-like client/server interface. They differ from "standard" RPC by being (a) fast, (b) relatively simple, and (c) restricted to a single system. In addition, there are some features (particularly the ability to pass door references, and the unreferenced notification) that lend themselves well to implementing complicated distributed system semantics (in fact, the Sun Cluster 3.x product uses a CORBA-style ORB for inter-process and inter-node communication, part of which is implemented using doors). Doors are used fairly extensively within Solaris daemons and other system-level software that is shipped as part of the OS.

A door is created when a process (known as the door server) calls door_create(3DOOR) with a server function and gets a file descriptor back. That descriptor then can be passed to other processes or attached to the file system using fattach(3C). Once another process (the door client) has the descriptor, it can "invoke" the door by calling door_call(3DOOR). The client can also pass data and descriptors (including other door descriptors). As a result of the call to door_call, the client thread blocks and a thread in the door server wakes up and starts running the server function. When the server function is complete, it calls door_return(3DOOR) to pass (optional) data and descriptors back to the client. door_return also switches control back to the client; the server thread blocks in the kernel and never returns from the door_return call.

This leads to a problem: if I allocate data to return to the client via door_return, how do I free it? I can't free it before calling door_return, obviously, and control never returns to me after calling door_return (unless there's an error), so I can't just free it after the call. There are a few ways to handle this (in increasing order of complexity):

  • Copy the data to the stack. On each door call, the server thread's stack is "rewound" to the base. This implicitly frees any data on the stack, so any data that needs to be returned to the client can be first copied onto the stack (using a local variable or alloca(3C)), then freed before calling door_return with a pointer to the stack data.

  • Use thread-specific data. When a server thread starts running the server function, we know that any data previously by returned a call to door return from the same thread has already been copied into the client's address space. This means you can use thread-specific data to track previously returned data; for example, the server function could check and free any per-thread data stored due to prior door calls before continuing to execute. Note that, if the server thread is never re-used, the data will still be allocated. Other threads can't free this data since there's no way to make sure the data has been copied back to the client.

  • Use a door reference. When returning data that needs to be freed, create a door with the DOOR_UNREF flag and associate the door's unique ID (see door_info(3DOOR)) with the data in a hash table. Then, pass the door back to the client. The client should call close(2) on the door as soon as it receives it; this will send an unreferenced notification to the server (as long as the server still has one reference, since the notification happens when the reference count goes from 2 to 1). An unreferenced notification is just like a normal door call from the server's point of view, except that the data pointer is set to a special value (see the door_create(3DOOR) man page for details). When the unreferenced notification happens, the server can look up the unique id in the hash table and free the referenced data.

I've also considered extending the doors API to include something like a door_reply() function, which could be used (optionally) to specify reply data without losing execution control. On return from door_reply, the reply data will have been copied back to the client (or into the kernel), and the server can free the data from its address space. The control transfer back to the client would happen with a subsequent door_return() call (the arguments of which would be ignored). This is a bit slower than the standard door_return semantics (since two trips into the kernel are required), but makes freeing reply data and other server-side cleanup much simpler. Unfortunately, I haven't had time to actually implement this, or convince someone else to do it.

For those wishing more information on doors (particularly if the above didn't make any sense), there's a good introductory chapter in the second edition of Unix Network Programming, Volume 2: Interprocess Communication by the late Richard Stevens. The original idea for doors came from Spring OS, a research operating system developed in Sun Labs. The details were changed significantly in the transition to Solaris. There is also a Linux implementation based on the Solaris API, though it isn't part of the standard kernel.

(2004-08-03 14:39:43.0) Permalink Comments [4]


archives
links