Tuesday June 14, 2005
Liane Praza's Weblog

(Open)Solaris: getting started

OpenSolaris has arrived. It is an amazing thing to be able to share with the world what we've all been pouring our lives into for so long. I realized that a great place to start would be my first putback to Solaris as part of the kernel group.

I'd had more than a passing familiarity with Solaris as part of the team that released Sun Cluster 3.0. As the cluster software was intricately tied to Solaris, I had a number of opportunities to do minor modifications to Solaris to make it interoperate better with the clustering software. But, by 2001 I was in the big leagues -- I had joined the larger team primarily responsible for the code released today in OpenSolaris.

Almost everyone new to Solaris starts by fixing a few bugs, and I expect that will be common for new people contributing to OpenSolaris too. Fixing a small-ish bug is the best way to figure out what's really involved in putting code back into (Open)Solaris.

My first bug was 4314534: NFS cannot be controlled by SRM. Essentially, our resource management tools work on LWPs, not kernel threads. NFS ran as a bunch of kernel threads, so administrators were unable to have NFS as a managed resource; it often took priority over other applications on the system. A senior engineer had already suggested an approach:
A better way to solve this would be to have nfsd (or lockd) create the
lwps. nfsd can park a thread in the kernel (in some nfssys call) that
blocks until additional server threads are needed. It can then return to
user level, call thread_create (with THR_BOUND) for however many lwps are
needed, and park itself again. Since this will only happen when growing the
NFS server thread pool, the performance impact should be negligible. The
newly created lwps will similarly make an nfssys call to park themselves in
the kernel waiting for work to do. The threads parked in the kernel should
still be interruptible so that signals and /proc control works correctly.
If the server pool needs to shrink, an appropriate number of lwps simply
return to user level and exit.
The userland code was pretty simple, and is shared between nfsd and lockd in
thrpool.c.
/*
 * Thread to call into the kernel and do work on behalf of NFS.
 */
static void *
svcstart(void *arg)
{
	int id = (int)arg;
	int err;

	while ((err = _nfssys(SVCPOOL_RUN, &id)) != 0) {
		/*
		 * Interrupted by a signal while in the kernel.
		 * this process is still alive, try again.
		 */
		if (err == EINTR)
			continue;
		else
			break;
	}

	/*
	 * If we weren't interrupted by a signal, but did
	 * return from the kernel, this thread's work is done,
	 * and it should exit.
	 */
	thr_exit(NULL);
	return (NULL);
}
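
The park-and-create pattern from the suggested approach can be tried out away from the kernel. What follows is a minimal sketch using POSIX threads, not the actual thrpool.c or kernel code. Every name in it (fake_pool, fake_pool_wait, fake_pool_close, fake_worker, fake_creator, simulate_pool, MAX_WORKERS) is hypothetical, and a mutex plus a condition variable stand in for the nfssys calls that the real threads park in.

```c
#include <pthread.h>

#define	MAX_WORKERS	64	/* arbitrary cap for this sketch */

/* Hypothetical stand-in for kernel pool state; not a Solaris structure. */
struct fake_pool {
	pthread_mutex_t lock;
	pthread_cond_t cv;	/* shared by every waiter, for brevity */
	int wanted;		/* workers requested but not yet created */
	int parked;		/* workers that have parked so far */
	int closing;		/* nonzero once the pool is shutting down */
};

static void
fake_pool_close(struct fake_pool *p)
{
	pthread_mutex_lock(&p->lock);
	p->closing = 1;
	pthread_cond_broadcast(&p->cv);
	pthread_mutex_unlock(&p->lock);
}

/* Stands in for the call the creator parks in; returns 0 at shutdown. */
static int
fake_pool_wait(struct fake_pool *p)
{
	int n;

	pthread_mutex_lock(&p->lock);
	while (p->wanted == 0 && !p->closing)
		pthread_cond_wait(&p->cv, &p->lock);
	n = p->closing ? 0 : p->wanted;
	p->wanted = 0;
	pthread_mutex_unlock(&p->lock);
	return (n);
}

/* A worker parks until the pool closes, then returns and exits. */
static void *
fake_worker(void *arg)
{
	struct fake_pool *p = arg;

	pthread_mutex_lock(&p->lock);
	p->parked++;
	pthread_cond_broadcast(&p->cv);
	while (!p->closing)
		pthread_cond_wait(&p->cv, &p->lock);
	pthread_mutex_unlock(&p->lock);
	return (NULL);
}

/* The creator: park, wake, create the requested workers, park again. */
static void *
fake_creator(void *arg)
{
	struct fake_pool *p = arg;
	pthread_t tids[MAX_WORKERS];
	int created = 0, n, i;

	while ((n = fake_pool_wait(p)) > 0) {
		for (i = 0; i < n; i++, created++)
			if (created == MAX_WORKERS || pthread_create(
			    &tids[created], NULL, fake_worker, p) != 0) {
				fake_pool_close(p);	/* give up and unwind */
				break;
			}
	}
	for (i = 0; i < created; i++)
		(void) pthread_join(tids[i], NULL);
	return (NULL);
}

/* Request workers in batches, close the pool, return how many parked. */
int
simulate_pool(const int *batches, int nbatches)
{
	struct fake_pool p = { PTHREAD_MUTEX_INITIALIZER,
	    PTHREAD_COND_INITIALIZER, 0, 0, 0 };
	pthread_t creator;
	int target = 0, i;

	if (pthread_create(&creator, NULL, fake_creator, &p) != 0)
		return (-1);
	for (i = 0; i < nbatches; i++) {
		if (batches[i] <= 0)
			continue;	/* ignore nonsensical requests */
		pthread_mutex_lock(&p.lock);
		p.wanted += batches[i];
		target += batches[i];
		pthread_cond_broadcast(&p.cv);	/* wake the parked creator */
		while (p.parked < target && !p.closing)
			pthread_cond_wait(&p.cv, &p.lock);
		pthread_mutex_unlock(&p.lock);
	}
	fake_pool_close(&p);
	(void) pthread_join(creator, NULL);
	return (p.parked);
}
```

Calling simulate_pool with batches of 2 and 3 should leave five workers parked before the pool closes. The creator only wakes when the pool needs to grow, which is the property the suggested approach relies on to keep the performance impact small. The sketch leaves out signals and /proc control, which the real threads have to stay interruptible for.
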
I also had to make the additions and modifications to
/*
 * Userspace thread creator variables.
 * Thread creation is actually done in userland, via a thread
 * that is parked in the kernel. When that thread is signaled,
 * it returns back down to the daemon from whence it came and
 * does the lwp create.
 *
 * A parallel "creator" thread runs in the kernel. That is the
 * thread that will signal for the user thread to return to
 * userland and do its work.
 *
 * Since the thread doesn't always exist (there could be a race
 * if two threads are created in rapid succession), we set
 * p_signal_create_thread to FALSE when we're ready to accept work.
 *
 * p_user_exit is set to true when the service pool is about
 * to close. This is done so that the user creation thread
 * can be informed and cleanup any userland state.
 */

Of course, much to my chagrin, the change had unforeseen implications, which caused bug 4528299. Fixing that was fun and required changes to
None of this is particularly sexy or subtle, but hopefully now you see the kind of place we all start with Solaris (and now OpenSolaris).

A disclaimer is also required. I'm by no means an NFS expert -- those who were simply allowed me into their code to accomplish a specific task. Check out blogs by actual NFS experts like Spencer Shepler and David Robinson for more detailed NFS information.