Trond Norbye's Weblog

http://blogs.sun.com/trond/date/20091017 Saturday October 17, 2009

Bazaar shared repositories and Hudson build slaves

I created a Bazaar plugin for Hudson a while back, and we have been using that with great success on our Hudson build farm to build Drizzle, libmemcached and Gearman.

If you look at our build farm you will see that we compile the same source project multiple times on the same slaves, only with a variation in the configuration (compiler, 32/64 bit etc). With the normal configuration you will check out the complete repository for each of the projects, using a lot of bandwidth and disk space. Disk is cheap these days so I don't care that much about that, but my bandwidth is limited so I would like to reduce that if possible.

If we look at the disk layout on the build slaves, it is perfectly suited for using Bazaar shared repositories:

$SLAVE_HOME/workspace/

So if we create a Bazaar shared repository in $SLAVE_HOME/workspace, all of our Bazaar projects will store the upstream information in $SLAVE_HOME/workspace/.bzr. Not only will this dramatically reduce the disk footprint, it will also reduce the amount of data being downloaded. So how do you set up a shared repository? There is no magic: simply log in to the build slave and run the following command:

trond@storm > bzr init-repository $SLAVE_HOME/workspace

Unfortunately there is a "bug" in Hudson causing Hudson to nuke the directory every day when it is doing housekeeping of the workspace directory. Luckily for us there is a simple workaround for the issue: create a project named ".bzr".

Update: Hudson will also nuke the directory if it hasn't been modified for a period of time, and it seems that Bazaar doesn't update the modification date on the .bzr directory. A simple workaround for this is to install the following cron job:

1 1 * * 0 /usr/bin/touch /home/hudson/hudson/workspace/.bzr

http://blogs.sun.com/trond/date/20090915 Tuesday September 15, 2009

Trailing whitespace

If you are an Emacs user and don't want your source code to include trailing whitespace, try adding the following to your ~/.emacs file:

(add-hook 'before-save-hook 'delete-trailing-whitespace)

This will automatically remove all trailing whitespace from your buffer before saving your file!

http://blogs.sun.com/trond/date/20090911 Friday September 11, 2009

Don't log, dump core!

A common trend as a profession grows and diversifies is the loss of good, old craftsmanship; software development is no exception. It seems to me that developers who use a debugger are a dying breed, and those who do more than "rm" on a corefile are really hard to find. So what's wrong with logging? Well, I suggest that you start off by reading Don't Log, Debug!

I think Tor has some really good points in his blog post. When you write the program you don't know where the bug is, so you will most likely not include enough information to track down the bug anyway. You will most likely have to provide the customer with an instrumented version if you cannot reproduce the problem locally.

I have heard people trying to excuse themselves by saying: "I can't use a debugger to find this issue because it is a timing issue". Well, I don't buy that, because if they have to add more logging from their code the timing will change as well (and possibly mask out the error).

Tor works in "the Java world", whereas I spend my time developing C++/C programs. We have an option as well: coredumps. When you load the corefile into your debugger you can inspect every variable in your application at the time you generated the coredump, and you can look at the callstacks from all of the threads in your program. Personally I find it much more fun to use the debugger to inspect the corefile instead of reading through miles of logfiles...

With this in mind, rethink the excuse with timing issues. Wouldn't it be better to just dump core when you encounter the problem and load the corefile into your favorite debugger? :-)

You may think that dumping core is brutal to your users, because not all failures are fatal errors. You may be able to recover gracefully from some of the errors, but when I don't know what leads to the error I still generate a coredump so I can dig into the problem. To avoid shutting down the service, I'll just fork off a copy of the program to generate a dump from:

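#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define recoverable_assert(ev) do_recoverable_assert(ev, #ev, __FILE__, __LINE__)

/* Returns 1 if the assertion failed (a forked child dumps core), 0 otherwise */
int do_recoverable_assert(int eval, const char *expression, const char *file, int lineno)
{
   if (eval == 0) {
      if (fork() == 0) {
         /* child: report the failed expression and dump core */
         fprintf(stderr, "%s:%d: %s\n", file, lineno, expression);
         abort();
      }
      /* parent: tell the caller to start recovering */
      return 1;
   }

   return 0;
}

... cut ...

if (recoverable_assert((address % 8) == 0)) {
   /* the address for the client buffer isn't aligned, start recover */
}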

Unless you have modified your environment, you should get coredumps when your program performs an illegal operation. Unfortunately some engineers/managers think it's inappropriate for the user to get a coredump from a program, so they add logic into their program to trap such signals and exit cleanly. Personally I don't think that this is a good idea, because the engineers lose valuable information when a problem occurs at a customer's site. With the corefile available you may investigate a problem you fail to reproduce locally (and if the customer doesn't want to release the corefile for security reasons, it is still possible to debug on-site). If the users don't want the coredump, they should turn this feature off in their shell/startup script before starting the program.
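
If the environment has been modified so that coredumps are switched off through a resource limit, a program can try to raise its own soft limit before it needs a dump. The following is only a minimal sketch, assuming a POSIX system with getrlimit/setrlimit; the name enable_core_dumps is made up for this example, and where the corefile ends up (or whether the system allows one at all) still depends on the system configuration.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit so that abort() (or a crash) can leave a corefile behind.
 * Returns 0 on success and -1 on failure (errno is set by the system). */
int enable_core_dumps(void)
{
   struct rlimit rl;

   if (getrlimit(RLIMIT_CORE, &rl) != 0) {
      return -1;
   }

   /* An unprivileged process may raise its soft limit up to the hard limit */
   rl.rlim_cur = rl.rlim_max;
   return setrlimit(RLIMIT_CORE, &rl);
}
```

Calling this once at startup keeps the decision in one place; a hard limit of zero set by the user is still respected.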

http://blogs.sun.com/trond/date/20090810 Monday August 10, 2009

Hudson Bazaar plugin released

I released the first version of my Bazaar plugin for Hudson a few days ago, so you should be able to install it by logging into your Hudson server and selecting: Manage Hudson, Manage Plugins, Available.

To use the Bazaar plugin, press Configure for the desired project, scroll down to the Source Code Management section and select Bazaar. Fill in the Repository URL, and if you would like to nuke the repository each time, press the advanced button and check the "Clean build" check-box.

http://blogs.sun.com/trond/date/20090607 Sunday June 07, 2009

Compiling Drizzle on OpenSolaris 2009.06

With the release of OpenSolaris 2009.06, I thought a new and updated blog post on how to compile Drizzle would be appropriate. To make the blog more copy'n'paste friendly I have removed the prompt from all of the commands I am displaying :-)

The first thing we need to do is to install a compiler and all of the common tools used to build open source projects. Drizzle also requires libevent and gperf, and precompiled packages exist for both. So let's go ahead and install the software with the following command:

   pfexec pkg install ss-dev SUNWlibevent SUNWgnu-gperf

I like to put the software I compile in separate ZFS filesystems, so let's go ahead and create:

  • /opt/dscm - To hold the scm systems
  • /opt/drizzle - This is where we want our Drizzle installation
  • /opt/gearman - This is where we want our Gearman installation
  • /opt/google - This is where we want our Google Protocol Buffers installation

"Why not just put everything in /usr/local?" you may ask. Well, I don't like that because then I have a hard time figuring out which files to remove when I want to uninstall a package. "This must turn into a long and complex path?" would probably be your next question. The answer is no. Just create the appropriate symbolic links and you are good to go :-)

So let's go ahead and create the ZFS filesystems:

for f in dscm drizzle gearman google
do
   pfexec zfs create -o mountpoint=/opt/$f rpool/$f
   pfexec chown `/usr/bin/id -u`:`/usr/bin/id -g` /opt/$f
done
  

Drizzle, Gearman and libmemcached all use Bazaar for development, and there isn't a package available for OpenSolaris, so we need to install it ourselves. The Bazaar team is really active and uses the "release early, release often" model, and I want an easy way to keep up with the versions. Instead of having zombie files / versions lying around, I ended up with a model where I install each version into its own directory and have a symbolic link pointing to the version I want to use. Because we install in a "nonstandard" location, we need to create a startup script so that Python can find the modules. So let's go ahead and install Bazaar (1.15 is the latest stable version right now):

wget --no-check-certificate http://launchpad.net/bzr/1.15/1.15final/+download/bzr-1.15.tar.gz
gtar xfz bzr-1.15.tar.gz
cd bzr-1.15
python setup.py install --prefix=/opt/dscm/bazaar-1.15
mkdir /opt/dscm/bin
cat > /opt/dscm/bin/bzr <<EOF
#! /bin/ksh
export PYTHONPATH=/opt/dscm/bazaar/lib/python2.4/site-packages
exec /opt/dscm/bazaar/bin/bzr "\$@"
EOF
chmod a+x /opt/dscm/bin/bzr
ln -s bazaar-1.15 /opt/dscm/bazaar
cd ..
rm -rf bzr-1.15.tar.gz bzr-1.15

The next time you want to upgrade Bazaar, all you need to do is to move the symbolic link /opt/dscm/bazaar to point to the new version. You can now either put /opt/dscm/bin into your path, or you can create something like /opt/local/bin and create a symbolic link to /opt/dscm/bin/bzr from there (and then put /opt/local/bin in your path). To avoid path problems, I'll keep on referring to bzr by its absolute path throughout the example.

For some reason OpenSolaris doesn't contain a prebuilt 64-bit version of GNU readline, so we need to compile it ourselves (it is scheduled for an upcoming build AFAIK). To keep the example simple, I'll just install the readline library into /opt/drizzle. So just execute the following commands to download, build and install:

wget http://ftp.gnu.org/gnu/readline/readline-6.0.tar.gz
gtar xfz readline-6.0.tar.gz
cd readline-6.0
./configure --disable-static --prefix=/opt/drizzle 
gmake all install
gmake clean
./configure --disable-static --prefix=/opt/drizzle --libdir=/opt/drizzle/lib/`isainfo -k` CFLAGS="-m64"
gmake all install
ln -s `isainfo -k` /opt/drizzle/lib/64
ln -s . /opt/drizzle/lib/32
cd ..
rm -rf readline-6.0.tar.gz readline-6.0

"Stop! Why do you build it two times?" If you look at the options, you will see that I compile one version with "-m64", and that option will create a 64-bit binary. Most people would probably not care for the 32-bit binary, but I like to build both versions when I build a library (so that I don't have problems later on if I want to build a 32-bit or 64-bit binary using the library). The reason for the two symbolic links I create at the end is explained in chapter 32-bit and 64-bit Libraries.

Drizzle uses Google Protocol Buffers in the communication protocol, so let's go ahead and compile them. I don't use the latest version, because there is a compilation error in that version (and I haven't had the time to look at that yet):

wget http://protobuf.googlecode.com/files/protobuf-2.0.3.tar.gz
gtar xfz protobuf-2.0.3.tar.gz
cd protobuf-2.0.3
./configure --disable-static --with-zlib --prefix=/opt/google CPPFLAGS="-fast -m32" LDFLAGS="-fast" \
            --bindir=/opt/google/bin/i86
gmake all install
gmake clean
./configure --disable-static --with-zlib --prefix=/opt/google CPPFLAGS="-fast -m64" LDFLAGS="-fast -m64" \
            --libdir=/opt/google/lib/`isainfo -k` --bindir=/opt/google/bin/`isainfo -k`
gmake all install
cd ..
ln -s `isainfo -k` /opt/google/lib/64
ln -s . /opt/google/lib/32
cp /usr/lib/isaexec /opt/google/bin/protoc
rm -rf protobuf-2.0.3.tar.gz protobuf-2.0.3

With all the dependencies installed, we can go ahead and grab the source for libmemcached, libdrizzle, Gearman and Drizzle:

for f in libdrizzle gearmand libmemcached drizzle 
do
   /opt/dscm/bin/bzr branch lp:$f
done

So let's go ahead and start building them. libdrizzle is first up:

cd libdrizzle
./config/autorun.sh
./configure --disable-static --prefix=/opt/drizzle CFLAGS="-fast -m32" LDFLAGS="-fast"
gmake all install
./configure --disable-static --prefix=/opt/drizzle --libdir=/opt/drizzle/lib/`isainfo -k` CFLAGS="-fast -m64" LDFLAGS="-fast"
gmake clean
gmake all install
cd ..

The next one on the list is libmemcached:

cd libmemcached
./config/bootstrap
PATH=$PATH:/usr/perl5/bin ./configure --disable-static --prefix=/opt/drizzle CFLAGS="-fast -m32" LDFLAGS="-fast" \
    --without-memcached --bindir=/opt/drizzle/bin/i86
gmake all install
PATH=$PATH:/usr/perl5/bin ./configure --enable-64bit --disable-static --prefix=/opt/drizzle \
    --libdir=/opt/drizzle/lib/`isainfo -k` CFLAGS="-fast" LDFLAGS="-fast" --without-memcached --bindir=/opt/drizzle/bin/`isainfo -k`
gmake clean
gmake all install
for f in memcat memrm memcp memerror memflush memslap memstat
do
cp /usr/lib/isaexec /opt/drizzle/bin/$f
done
cd ..

There is a problem with the configure script for Gearman: it is not able to create a 32-bit binary on a machine capable of running in 64-bit mode. From now on we will therefore only create 64-bit binaries (I will work on a patch for this):

cd gearmand
./config/bootstrap
./configure --prefix=/opt/gearman --disable-static --sbindir=/opt/gearman/sbin/`isainfo -k` --libdir=/opt/gearman/lib/`isainfo -k` \
            --bindir=/opt/gearman/bin/`isainfo -k` CFLAGS="-fast -I/opt/drizzle/include -m64" \
            LDFLAGS="-L/opt/drizzle/lib/64 -R/opt/drizzle/lib/64"
gmake clean
gmake all install
cd ..
cp /usr/lib/isaexec /opt/gearman/sbin/gearmand
cp /usr/lib/isaexec /opt/gearman/bin/gearman

Before we can start compiling Drizzle we need to make sure that Drizzle can detect our PCRE installation. OpenSolaris ships with a version that is too new for the Drizzle configure script, so we need to create a symbolic link to make sure it is detected properly:

pfexec ln -s pcre/pcre.h /usr/include/pcre.h

Now all is set for compiling Drizzle:

cd drizzle
PATH=$PATH:/opt/dscm/bin ./config/autorun.sh
PATH=$PATH:/opt/google/bin ./configure CPPFLAGS="-I/opt/google/include -I/opt/gearman/include -I/opt/drizzle/include" \
   LDFLAGS="-L/opt/google/lib/64 -L/opt/gearman/lib/64 -L/opt/drizzle/lib/64 -R/opt/drizzle/lib/64:/opt/gearman/lib/64:/opt/google/lib/64" \
   --prefix=/opt/drizzle --libdir=/opt/drizzle/lib/`isainfo -k` 
PATH=$PATH:/opt/google/bin gmake all install

Now you should have Drizzle installed in /opt/drizzle. If you look in some of my previous blog posts you should be able to find out how to install it as an SMF service :-)

Cheers

http://blogs.sun.com/trond/date/20090525 Monday May 25, 2009

Manage Gearmand and Drizzle with SMF

If you are running Gearman or Drizzle on Solaris, you may want to let SMF start and monitor the services. I pushed service definitions and scripts to install them a couple of days ago.

If you look in the scripts directory in Gearman, you will see a script named smf_install.sh. If you run this script it will define a user and group named gearmand, create the Gearman authorizations and profile before a service named gearman is defined. To start the Gearman service all you need to do is to run:

trond@storm ~> svcadm enable gearman

For Drizzle you need to look in the support-files/smf directory for a script named install.sh. To start Drizzle all you need to do is to run:

trond@storm ~> svcadm enable drizzle

http://blogs.sun.com/trond/date/20090325 Wednesday March 25, 2009

Bazaar plugin for Hudson

I have been using Hudson to build various software projects I am working on for some time and I really like it, so one of the first things I did when I started compiling Drizzle on OpenSolaris was to configure a new build target on my server.

Google pointed me to a Bazaar plugin, but unfortunately it didn't work well in the "master-slave" configuration. I tried to look at the source of the plugin to make it work, but I soon found out that it was a lot faster to just clone the Mercurial plugin and adapt it to Bazaar.

I pushed the plugin earlier today, so you may browse the source code and compile it yourself if you like (or wait for it to be listed in the available plugins for Hudson).

We are using this plugin on Drizzle to monitor the builds on one server while compiling Drizzle on other machines.

http://blogs.sun.com/trond/date/20090305 Thursday March 05, 2009

Adding debugging functions to your dbx session

Have you ever been sitting in a debugging session thinking: "arg, why don't I have a function doing xyz"? I know I have! Luckily for us we don't have to terminate the debugging session, build a new version of the application with the function available and try to recreate the debugging session. There is an easy way to extend our debugging session with new functions :-)

Talking is one thing, but developers don't believe anything before they see the code. So let's go ahead and create an example.

trond@opensolaris> nl -ba main.c
     1	#include <stdio.h>
     2	
     3	struct item {
     4	   /* the interesting data */
     5	   struct item* next;
     6	};
     7	
     8	int main(int argc, char** argv) {
     9	   struct item items[10];
    10	   for (int ii = 0; ii < 10; ++ii) {
    11	      items[ii].next = items + ii + 1;
    12	   }
    13	   /* terminate the list */
    14	   items[9].next = NULL;
    15	
    16	   /* let's create a loop */
    17	   items[9].next = &items[7];
    18	
    19	   return 0;
    20	}
trond@opensolaris> cc -o testprogram -g main.c -ldl

So let's start a debugging session, and set a breakpoint at line 17 in main.c:

trond@opensolaris> dbx testprogram
Reading testprogram
Reading ld.so.1
Reading libdl.so.1
Reading libc.so.1
(dbx) stop at 17
(2) stop at "main.c":17
(dbx) run
Running: testprogram 
(process id 17090)
stopped in main at line 17 in file "main.c"
   17      items[9].next = &items[7];
(dbx)

So how do we verify that we don't have a loop in this list? The first thing we need to do is to create a small C function to do loop detection and compile it into a shared object:

trond@opensolaris> cat looptest.c
#include <stdio.h>

struct item {
   /* the interesting data */
   struct item* next;
};

int looptest(struct item* root) {
   struct item* lookahead = root;

   while (root != NULL) {
      if (lookahead != NULL && lookahead->next != NULL) {
         lookahead = lookahead->next->next;
      } else {
         lookahead = NULL;
      }
      if (root == lookahead) {
         /* loop detected */
         return 1;
      } else {
         root = root->next;
      }
   }

   /* no loop */
   return 0;
}
trond@opensolaris> cc -o looptest.so -G -KPIC -g looptest.c

The trick is that we can use the call command in dbx to call a function from within the process we are debugging, and the function we want to call is dlopen. Why? dlopen will load the functions in the shared object into the address space of the process so that we can call them. Let's jump back to the debugging session:

(dbx) call dlopen("./looptest.so", 0x102)
Reading looptest.so
stopped in main at line 17 in file "main.c"
   17      items[9].next = &items[7];

So what is 0x102? Well, that is the result of RTLD_NOW | RTLD_GLOBAL (check the dlopen manual page for more info). Now we can call the looptest function from our debugging session:

(dbx) print looptest(items)
looptest(items) = 0

So let's continue the debugging and execute the next line that creates a loop in the list:

(dbx) next
stopped in main at line 19 in file "main.c"
   19      return 0;
(dbx) print looptest(items)
looptest(items) = 1

One small caveat is that you have to link your application with -ldl for this to work.
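
The number 0x102 handed to dlopen above has to be typed as a raw value, because the macro names from <dlfcn.h> aren't available at the (dbx) prompt. A small sketch of how the value can be computed from the headers instead of memorised follows; the function name debug_dlopen_mode is invented for this example, and other systems may define the two flags with different values.

```c
#include <dlfcn.h>

/* Hypothetical helper: the mode to pass as the second argument to dlopen().
 * RTLD_NOW resolves all symbols at load time; RTLD_GLOBAL makes the
 * symbols of the loaded object available to later symbol lookups. */
int debug_dlopen_mode(void)
{
   return RTLD_NOW | RTLD_GLOBAL;
}
```

Printing the result with printf("0x%x\n", debug_dlopen_mode()) on the machine you are debugging on gives the number to use in the call command.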

http://blogs.sun.com/trond/date/20090301 Sunday March 01, 2009

Socket connection timeout

I have been struggling to get a patch sent to the libmemcached mailing list to work as expected on my system. The first thing I normally do when I receive a new patch is to apply it and run the test suite, and if that doesn't work I normally notify the author of the patch without digging too deep into the patch. This time I didn't get a test failure caused by one of the assert() statements in the patch; instead the test program terminated unexpectedly. In order to give the author some more information on where it failed, I had to at least figure out how the program terminated (a call to exit(), a signal, etc.). truss is a good tool for that:

trond@opensolaris> truss -o /tmp/truss -E ./tests/testapp consistent_not
[... cut ...]
trond@opensolaris> tail -10 /tmp/truss
 0.0000 so_socket(PF_INET, SOCK_STREAM, IPPROTO_IP, "", SOV_DEFAULT) = 9
 0.0000 setsockopt(9, SOL_SOCKET, SO_LINGER, 0xFFBFF570, 8, SOV_DEFAULT) = 0
 0.0000 fcntl(9, F_GETFL)                               = 2
 0.0000 fcntl(9, F_SETFL, FWRITE|FNONBLOCK)             = 0
 0.0001 connect(9, 0x0004D3F0, 16, SOV_DEFAULT)         Err#150 EINPROGRESS
 0.0001 pollsys(0xFFBFF5E8, 1, 0xFFBFF578, 0x00000000)  = 1
 0.0000 fcntl(9, F_GETFL)                               = 130
 0.0000 fcntl(9, F_SETFL, FWRITE)                       = 0
 0.0000 write(9, " g e t   t H m Z u B 7 T".., 107)     Err#32 EPIPE
 0.0000     Received signal #13, SIGPIPE [default]

Hmm... So we see that the program terminates when it receives the SIGPIPE signal, and we don't have a signal handler for this. If you look in the man page for write, you will see that write will generate SIGPIPE if the stream isn't connected to the peer.

It took me some time to figure out the bug here (I created a scaled-down example and posted it to a mailing list, and a friendly soul out there pointed me to the error!). If you look closely at the truss output above, you will see that connect is only called once. libmemcached took it for granted that if poll returned with a POLLOUT event, the connection was established. The correct thing would however be to call connect once more and check the return value (and possibly errno). The connect man page could have spent a paragraph describing this behavior ;-)

To end this blog, I'll create a new version of the connect call taking a timeout value as a parameter:

#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>

int my_connect(int s, const struct sockaddr *name, int namelen, int timeout)
{
   /* We may need to toggle on non-blocking mode */
   int flags = fcntl(s, F_GETFL, 0);
   if (flags == -1) {
      flags = 0;
   } else if (!(flags & O_NONBLOCK)) {
      (void)fcntl(s, F_SETFL, flags | O_NONBLOCK);
   }

   /* Assume success; the loop body only runs while connect() keeps failing */
   int ret = 0;
   while (connect(s, name, namelen) < 0) {
      if (errno == EISCONN) {
         /* an earlier attempt completed, so we are connected */
         ret = 0;
         break;
      } else if (errno == EINPROGRESS || errno == EALREADY) {
         struct pollfd fds[1] = {[0].fd = s,
                                 [0].events = POLLOUT};
         if (poll(fds, 1, timeout) == 0) {
            /* poll timed out, so let's set that in errno ;-) */
            errno = ETIMEDOUT;
            ret = -1;
            break;
         } else if (fds[0].revents & POLLERR) {
            ret = -1;
            break;
         }
      } else if (errno != EINTR) {
         /* a real error; errno is left as set by connect() */
         ret = -1;
         break;
      }
   }

   /* Restore the old setting on the socket if we updated it above */
   int error = errno;
   if (!(flags & O_NONBLOCK)) {
      (void)fcntl(s, F_SETFL, flags);
   }
   errno = error;
   return ret;
}
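
A commonly used alternative to calling connect a second time is to ask the socket for its pending error with getsockopt and SO_ERROR once poll has reported POLLOUT. This is not what my_connect does; it is a sketch of the other approach, and the name pending_connect_result is invented here.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>

/* Hypothetical helper: report the outcome of a non-blocking connect() after
 * poll() has signalled POLLOUT. Returns 0 if the socket has no pending
 * error; otherwise returns -1 with errno set to the pending error. */
int pending_connect_result(int s)
{
   int error = 0;
   socklen_t len = sizeof(error);

   if (getsockopt(s, SOL_SOCKET, SO_ERROR, &error, &len) < 0) {
      return -1; /* getsockopt itself failed, errno is already set */
   }

   if (error != 0) {
      errno = error;
      return -1;
   }

   return 0;
}
```

Either way the point is the same: a POLLOUT event alone does not prove that the connection was established.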

http://blogs.sun.com/trond/date/20090227 Friday February 27, 2009

Adding debugging functionality into your app!

In this blog I will just toss out some small ideas you may use if you would like to make your application a little bit more debug friendly while you are developing it. The first little example is a function that you may call from your application to set a watchpoint on a memory area, so that you can trap "dangling" pointers writing to your data structures. When do you want to use this? Well, let's say that while you develop your application it seems that someone changes your data structure, and you cannot figure out how this is happening. One solution would be to run the program in the debugger and set a watchpoint on the memory area, but let's say that it happens only 1 out of 1000 times. A better solution would probably be to set the watchpoint from your application code. The following little function does exactly that:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <procfs.h>
#include <sys/fault.h>

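/* Set a watchpoint on the memory area (mode 0 clears it again).
 * Bit 0 of mode (value 1) traps writes, bit 1 (value 2) traps reads. */
void memory_watchpoint(const void *ptr, size_t size, int mode)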
{
   int fd = open("/proc/self/ctl", O_WRONLY);

   if (fd != -1) {
      typedef struct {
         long cmd;
         prwatch_t prwatch;
      } ctl_t;

      ctl_t ctls = { .cmd = PCWATCH, 
                     .prwatch.pr_vaddr = (uintptr_t)ptr,
                     .prwatch.pr_size = size };

      if (mode & 1) {
         ctls.prwatch.pr_wflags = WA_WRITE;
      }

      if (mode & 2) {
         ctls.prwatch.pr_wflags |= WA_READ;
      }

      (void)write(fd, &ctls, sizeof(ctls));
      (void)close(fd);
   }
}

I would now turn on the watchpoint for the memory area when I'm done using it, and disable the watchpoint right before I intend to use it again. Solaris would then trap anyone who tries to access the memory without disabling the watchpoint first.

Another thing you could do is to stop your process when you detect a problem in your application instead of terminating the program. Personally I hate to read log files to try to figure out why the process exited (99.9% of the time you don't have all the information you want), and getting a core file is a big step in the right direction. My favorite is however when I can attach a debugger to the process instead:

#include <sys/types.h>
#include <fcntl.h>
#include <procfs.h>
#include <unistd.h>

void stop_process(void)
{
   int fd = open("/proc/self/ctl", O_WRONLY);
   if (fd != -1) {
      long cmds[2] = {PCSTOP, 0};
      (void)write(fd, &cmds[0], sizeof(cmds));
      (void)close(fd);
   }
} 

If you can't stop your server (let's say it's a database server), you could always call fork() first to let the parent continue and debug the child process ;-)

These are just two small examples you may use to make your application more debug friendly. You can take a look at proc(4) for more information on what you may use the /proc filesystem for. You will see that the /proc filesystem on Solaris differs from the one on Linux: the Linux /proc filesystem is more optimized for the human eye (you may just cat the files), but Solaris provides a lot of tools to operate on the /proc filesystem. See pmap, pstack, pfiles and pldd, to mention a few (some of the tools work on core files as well, and you may specify the thread you want to look at). I encourage you to look at the man pages.

http://blogs.sun.com/trond/date/20090225 Wednesday February 25, 2009

Creating a relocatable binary

When you create a large project you would most likely want to arrange your code into logical parts and create various libraries containing logical units of code. In the "good old days" this would normally be an archive of object files created by using ar. For example:

trond@opensolaris> ar -r myarchive.a foo1.o foo2.o foo3.o
ar: creating myarchive.a

You would pass the archive to the linker when you linked your program, like this:

trond@opensolaris> cc -o myprog main.o myarchive.a

The linker would now search myarchive.a for a definition of each of the undefined symbols in main.o (please note that if you named your archive libmyarchive.a you could also pass it to the linker as -lmyarchive). This is what we call static linking, but static linking has some obvious drawbacks:

  • If you find a bug in the code in a library, you need to relink all applications that link with the library to fix them.
  • At runtime all programs that include functions from the library will have their own copy of the function loaded into memory.

A better solution is to create relocatable objects (aka shared libraries, DLLs etc.) and link the application with those instead (if you look at a default installation of Solaris you will not find any of the system libraries as static archives). Unfortunately relocatable objects have their own problems, and that is what I'll address in the rest of this blog post.

So let's go ahead and create a small example to look at the problems and how to solve them. I have created two small source files: lib.c contains the function I want in my library, and main.c contains my application:

trond@opensolaris> cat lib.c
#include <stdio.h>

void my_print(int val) {
   fprintf(stdout, "The value is %d\n", val);
}

We compile this into a relocatable object by using the -Kpic option to cc, and create a shared object by using the -G option.

trond@opensolaris> cc -c -Kpic lib.c
trond@opensolaris> cc -o libfoo.so -G lib.o

The next thing we need to do is to compile our main program and link it with the library.

trond@opensolaris> cat main.c
extern void my_print(int);

int main(int argc, char **argv) {
   my_print(argc);

   return 0;
}

trond@opensolaris> cc -c main.c
trond@opensolaris> cc -o myprog main.o -lfoo
ld: fatal: library -lfoo: not found
ld: fatal: file processing errors. No output written to myprog

This doesn't work! Why? By default the linker will only look for shared libraries in /lib[/64] and /usr/lib[/64] (see Directories Searched by the Runtime Linker for more information). To instruct the linker to search the current directory we need to pass -Lpath to the link step (or we could use LD_LIBRARY_PATH, but you don't want to use LD_LIBRARY_PATH... Why? Ask Google: LD_LIBRARY_PATH evil):

trond@opensolaris> cc -o myprog main.o -L. -lfoo

It compiled just fine, so let's try to start it:

trond@opensolaris> ./myprog
ld.so.1: myprog: fatal: libfoo.so: open failed: No such file or directory

I see multiple solutions to this problem:

  • Copy libfoo.so to /usr/lib, but it really doesn't belong in there...
  • Create a wrapper script that exports LD_LIBRARY_PATH and invokes my binary (but I would prefer to avoid using LD_LIBRARY_PATH if possible).
  • Use crle (this is the Solaris version of ldconfig) to include the directory containing my library in the list the runtime linker will search for libraries. I don't think this is a good idea either, because I don't think all of the other programs want my library (and this feels a bit like a really global LD_LIBRARY_PATH hack...)
  • Use -R to insert a runtime search path into my binary. Now this looks like a really good idea! Let's try it!

trond@opensolaris> cc -o myprog main.o -L. -R. -lfoo
trond@opensolaris> ./myprog
The value is 1

So I guess I'm done now.. Let's install my binary in my $HOME/bin directory and try it :-)

trond@opensolaris> cp myprog libfoo.so ~/bin
trond@opensolaris> myprog
ld.so.1: myprog: fatal: libfoo.so: open failed: No such file or directory

WHAT! It doesn't work! So what happens when we try to run the binary? Let's look:

trond@opensolaris> truss /home/trond/bin/myprog
execve("/home/trond/bin/myprog", 0x08047704, 0x0804770C)  argc = 1
mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFB0000
resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
resolvepath("/home/trond/bin/myprog", "/home/trond/bin/myprog", 1023) = 30
stat64("/home/trond/bin/myprog", 0x08047328) = 0
open("/var/ld/ld.config", O_RDONLY)		Err#2 ENOENT
sysconfig(_CONFIG_PAGESIZE)			= 4096
stat64("./libfoo.so", 0x08046B08)		Err#2 ENOENT
stat64("/lib/libfoo.so", 0x08046B08)		Err#2 ENOENT
stat64("/usr/lib/libfoo.so", 0x08046B08)	Err#2 ENOENT
ld.so.1: myprog: fatal: libfoo.so: open failed: No such file or directory
write(2, " l d . s o . 1 :   m y p".., 74)	= 74
lwp_self()					= 1

It tries to look for libfoo.so in the current directory! Let's take a peek in the binary (the interesting pieces are the RUNPATH and RPATH entries):

trond@opensolaris> dump -Lv myprog 

myprog:

  **** DYNAMIC SECTION INFORMATION ****
.dynamic:
[INDEX]	Tag         Value
[1]	NEEDED          libfoo.so
[2]	NEEDED          libc.so.1
[3]	INIT            0x8050a20
[4]	FINI            0x8050a3c
[5]	RUNPATH         .
[6]	RPATH           .
[7]	HASH            0x8050118
[8]	STRTAB          0x805042c
[9]	STRSZ           0x37f
[10]	SYMTAB          0x805028c
[11]	SYMENT          0x10
[12]	SUNW_SYMTAB     0x80501fc
[13]	SUNW_SYMSZ      0x230
[14]	SUNW_SORTENT    0x4
[15]	SUNW_SYMSORT    0x8050810
[16]	SUNW_SYMSORTSZ  0x38
[17]	CHECKSUM        0x83d2
[18]	VERNEED         0x80507ac
[19]	VERNEEDNUM      0x1
[20]	PLTSZ           0x30
[21]	PLTREL          0x11
[22]	JMPREL          0x8050850
[23]	REL             0x8050848
[24]	RELSZ           0x38
[25]	RELENT          0x8
[26]	DEBUG           0
[27]	FEATURE_1       PARINIT
[28]	SUNW_CAP        0x8050108
[29]	FLAGS           0
[30]	FLAGS_1         0
[31]	SUNW_STRPAD     0x200
[32]	SUNW_LDMACH     EM_386
[33]	PLTGOT          0x8060a5c
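
To make the lookup order from the truss output concrete, here is a small shell sketch that prints the candidate paths in the order the runtime linker tried them: first the RUNPATH entries, then the default directories /lib and /usr/lib. The function name search_candidates is made up for this illustration, and the real linker also consults things like LD_LIBRARY_PATH and the crle configuration, which I have left out.

```shell
#!/bin/sh
# Hypothetical helper (not a Solaris tool): print the paths that would be
# probed for a library, given a colon-separated runpath.
# Simplification: LD_LIBRARY_PATH and crle configuration are ignored.
search_candidates() {
    lib=$1
    runpath=$2
    old_ifs=$IFS
    IFS=:
    # Unquoted on purpose so the runpath is split on ':'
    for dir in $runpath /lib /usr/lib; do
        printf '%s/%s\n' "$dir" "$lib"
    done
    IFS=$old_ifs
}

# A runpath of "." is relative to the current working directory of the process
search_candidates libfoo.so .
```

With a runpath of "." the first candidate is ./libfoo.so, which only exists if you happen to start the program from the directory holding the library.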

Well, let's re-link our application with the correct directory specified to -R:

trond@opensolaris> cc -o myprog main.o -L. -R/home/trond/bin -lfoo
trond@opensolaris> cp myprog /home/trond/bin
trond@opensolaris> myprog
The value is 1

This doesn't feel very user-friendly. I have hard-coded in my binary where it should search for its shared libraries. If all users install the binaries in a standard location such as /opt/foo/bin/myfoo it would work, but it isn't flexible. Luckily for us the runtime linker in Solaris can help us out here: we can instruct it to search for shared libraries relative to where the binary is installed. If we use the special token $ORIGIN, the runtime linker will replace it with the directory containing the binary. So let's relink our application once more, but this time we move the library to $HOME/lib and keep the binary in $HOME/bin.

trond@opensolaris> mv /home/trond/bin/libfoo.so /home/trond/lib
trond@opensolaris> cc -o myprog main.o -L. -R\$ORIGIN/../lib -lfoo
trond@opensolaris> cp myprog /home/trond/bin
trond@opensolaris> myprog
The value is 1

So let's use truss to see what's happening :-)

trond@opensolaris> truss /home/trond/bin/myprog
execve("/home/trond/bin/myprog", 0x08047708, 0x08047710)  argc = 1
mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFB0000
resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
resolvepath("/home/trond/bin/myprog", "/home/trond/bin/myprog", 1023) = 25
stat64("/home/trond/bin/myprog", 0x0804732C)	= 0
open("/var/ld/ld.config", O_RDONLY)		Err#2 ENOENT
sysconfig(_CONFIG_PAGESIZE)			= 4096
stat64("/home/trond/bin/../lib/libfoo.so", 0x08046B0C) = 0
resolvepath("/home/trond/bin/../lib/libfoo.so", "/home/trond/lib/libfoo.so", 1023) = 28
open("/home/trond/bin/../lib/libfoo.so", O_RDONLY) = 3
mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEFA0000
mmap(0x00010000, 69632, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEF80000
mmap(0xFEF80000, 1489, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEF80000
mmap(0xFEF90000, 1812, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 0) = 0xFEF90000
munmap(0xFEF81000, 61440)			= 0
memcntl(0xFEF80000, 1300, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3)					= 0
stat64("/home/trond/bin/../lib/libc.so.1", 0x08046B0C) Err#2 ENOENT
stat64("/lib/libc.so.1", 0x08046B0C)		= 0
resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
open("/lib/libc.so.1", O_RDONLY)		= 3
mmap(0xFEFA0000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0xFEFA0000
mmap(0x00010000, 1409024, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE20000
mmap(0xFEE20000, 1305977, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE20000
mmap(0xFEF6F000, 28320, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 1306624) = 0xFEF6F000
mmap(0xFEF76000, 6328, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEF76000
munmap(0xFEF5F000, 65536)			= 0
memcntl(0xFEE20000, 188300, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3)					= 0
mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE10000
munmap(0xFEFA0000, 32768)			= 0
getcontext(0x0804718C)
getrlimit(RLIMIT_STACK, 0x08047184)		= 0
getpid()					= 10032 [10031]
lwp_private(0, 1, 0xFEE12A00)			= 0x000001C3
setustack(0xFEE12A60)
sysi86(SI86FPSTART, 0xFEF76FCC, 0x0000133F, 0x00001F80) = 0x00000001
ioctl(1, TCGETA, 0x080467A0)			= 0
fstat64(1, 0x08046700)				= 0
The value is 1
write(1, " T h e   v a l u e   i s".., 15)	= 15
_exit(0)
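
The stat64 call on /home/trond/bin/../lib/libfoo.so in the truss output shows what the runtime linker did with $ORIGIN: it substituted the directory containing the binary and left the rest of the path alone. Here is a rough shell sketch of that substitution. The name expand_origin is something I made up for the illustration, and the real linker works on the resolved path of the executable rather than on whatever string you hand it.

```shell
#!/bin/sh
# Hypothetical helper: mimic how $ORIGIN in a runpath is replaced with the
# directory holding the binary. Assumes the directory contains no '|' or '&',
# since those are special in the sed replacement below.
expand_origin() {
    binary=$1
    runpath=$2
    origin=$(dirname "$binary")
    printf '%s\n' "$runpath" | sed "s|\\\$ORIGIN|$origin|g"
}

# Single quotes keep the shell from expanding $ORIGIN itself
expand_origin /home/trond/bin/myprog '$ORIGIN/../lib'
# prints /home/trond/bin/../lib
```

This is also why the link command needs the backslash in -R\$ORIGIN/../lib: without it the shell would expand $ORIGIN (most likely to an empty string) before the compiler ever saw it.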

In the beginning of the blog I said that one of the benefits of using shared objects was that we didn't have to relink our application if we found a bug in the library we needed to fix, so I thought we should show that as well. So let's modify the library:

trond@opensolaris> cat lib.c
#include <stdio.h>

void my_print(int val) {
   fprintf(stdout, "The end...\n");
}
trond@opensolaris> cc -c -Kpic lib.c
trond@opensolaris> cc -o libfoo.so -G lib.o
trond@opensolaris> cp libfoo.so /home/trond/lib
trond@opensolaris> myprog
The end...

So that's it for this time :-)

http://blogs.sun.com/trond/date/20090224 Tuesday February 24, 2009

coreadm

If your program tries to execute an illegal instruction (or receives an unexpected signal), Solaris will terminate the program and write a dump of the program image and state into a file named core in the current directory of the running application (just like any other Unix out there). This sounds great, but there are some caveats:

  • You don't want a daemon to use a network-mounted filesystem as its working directory, so the common pattern in daemons is to use "/" as the current directory. "/" should not be writable for normal users, and you don't want all your daemons running as root. This means that your daemons cannot dump core :-(
  • All processes running in the same directory will potentially dump core into the same file, overwriting (or corrupting) the content of the corefile.

To solve this (and other problems) Solaris provides a tool named coreadm that allows you to tune the behavior to fit your needs :-)

To set the core filename to “core.program-name.pid” for all programs started from this shell, just execute the following command:

trond@opensolaris> coreadm -p "core.%f.%p" $$

You may set this as the default corefile name by executing the following command:

trond@opensolaris> pfexec coreadm -i "core.%f.%p" 

Check out the documentation for a full description and examples for what you may do with coreadm.
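
To see what a pattern like core.%f.%p turns into, here is a tiny shell sketch of the substitution. It only handles the two tokens used in this post (%f for the executable file name and %p for the process ID); coreadm knows several more, and expand_core_pattern is just a name I picked for the illustration.

```shell
#!/bin/sh
# Hypothetical helper: expand the %f and %p tokens of a coreadm pattern.
# Assumes the program name contains no '|' or '&' (special to sed here).
expand_core_pattern() {
    pattern=$1
    program=$2
    pid=$3
    printf '%s\n' "$pattern" | sed -e "s|%f|$program|g" -e "s|%p|$pid|g"
}

expand_core_pattern 'core.%f.%p' myprog 10032
# prints core.myprog.10032
```

Because the process ID is part of the name, two processes crashing in the same directory no longer overwrite each other's corefile.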

http://blogs.sun.com/trond/date/20090219 Thursday February 19, 2009

Using OpenSolaris as a development platform

OpenSolaris is a great platform for software development. OpenSolaris is like a Swiss army knife: it contains a lot of great tools that will make your life as a developer much easier. In my previous job I worked on a cross-platform product, so I had to develop and debug the application on a wide range of platforms (Solaris, Trusted Solaris, Linux, Microsoft Windows, SCO, HP-UX). All of the platforms provide a set of debugging tools with their strengths and weaknesses, but when it comes to nailing down a bug out in the customer's datacenter, none of the other environments could compare to Solaris. The only problem is that you need to know of the existence of the tool before you can use it! I have been using Solaris (and now OpenSolaris) as my primary development platform for a lot of years now, so I thought I would start a little series of blogs covering different tools available in OpenSolaris.

You will find more information about all the tools I am going to mention in the manual page for each tool, and http://docs.sun.com contains a lot of information. If I could give you just one piece of advice, it would be: "Know your tools". If you know how to use your tools and what they are capable of, you know how to choose the right tool to solve your problem!

I am running OpenSolaris on my development machines, so I will create my examples and descriptions on OpenSolaris. You will also find most of the tools in Solaris, but the installation procedure (and installed location) may be different. If you want to try them yourself I would recommend installing OpenSolaris 2008.11, and if you don't have a real computer to install it on, you could just download VirtualBox and install it in a virtual machine.

A default installation of OpenSolaris does not install development tools, so the first thing we have to do is to install all of the software development tools (compiler, debugger, SCM systems, autoconf etc). Luckily for us there is a package containing all of the tools we need: ss-dev. So let's start by installing the Sun Studio development package:

trond@opensolaris:~$ pfexec pkg install ss-dev

But hang on a second, why not just use gcc and gdb? After all, developers stick with the tools they are used to. There are a lot of developers out there who still think the combination of vi and make is the killer combination :-)

For me the answer is easy:

  • The Sun Studio tools are better integrated with the features Solaris provides
  • I don't see any "value added" from the other tools
  • The gcc and gdb provided in OpenSolaris are quite old. If you want to use them, you should compile newer versions yourself.

A lot of developers like to do their development inside an IDE. You can start Sun Studio from the application menu, or by typing sunstudio on the command line:

trond@opensolaris:~$ sunstudio

You can of course continue to develop with vi/emacs/gedit/... and make if you want. If you are going to write your own configure scripts or Makefiles, you will probably find the following link handy; it contains a nice translation of the gcc command-line options to their Sun Studio equivalents: http://developers.sun.com/sunstudio/documentation/techart/gc_options.html#Compiler_Linker_Option_Translati

http://blogs.sun.com/trond/date/20090211 Wednesday February 11, 2009

Enable corefiles in Drizzle

In my previous blog post I created an SMF manifest to let you manage Drizzle like every other service running on your system. For some odd reason Drizzle will catch all signals and exit cleanly instead of creating a core dump, making finding bugs much harder (who wants to read logfiles instead of poking around in the corefile with a debugger?). I was thinking of filing a bug for this behavior until Monty Taylor pointed out to me that there is a command line option I can add to enable the creation of core dumps.

With this new information I think we should modify the configuration I created in the previous blog post. There we stored the database in /var/drizzle. I would like to move the database to /var/drizzle/data, and store all corefiles in /var/drizzle/crash. If you want to preserve your old database you should move the files in /var/drizzle to /var/drizzle/data (or if you don't care you can just go ahead and delete them):

trond@opensolaris:~$ svcadm disable drizzle
trond@opensolaris:~$ svccfg delete drizzle
trond@opensolaris:~$ pfexec mkdir /var/drizzle/data /var/drizzle/crash
trond@opensolaris:~$ pfexec chown -R drizzle:drizzle /var/drizzle

The next thing we should do is to modify the SMF manifest so that we may modify the corefile pattern later on:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type="manifest" name="drizzle">
    <service name="application/database/drizzle" type="service" version="1">
        <single_instance/>

        <dependency name="multi-user-server" grouping="require_all" restart_on="none" type="service">
            <service_fmri value="svc:/milestone/multi-user-server" />
        </dependency>

        <property_group name="general" type="framework">
            <propval name="action_authorization" type="astring" 
                     value="solaris.smf.manage.drizzle" />
            <propval name="value_authorization" type="astring" 
                     value="solaris.smf.value.drizzle" />
        </property_group>

        <property_group name="drizzle" type="application">
            <propval name="datadir" type="astring" 
                     value="/var/drizzle/data" />
            <propval name="corepattern" type="astring" 
                     value="/var/drizzle/crash/core.%f.%p" />
            <propval name="port" type="astring" 
                     value="4427" />
        </property_group>

        <instance name="drizzle" enabled="false">
            <exec_method type="method" name="start" exec="/lib/svc/method/drizzle start" timeout_seconds="30" >
                <method_context>
                    <method_credential user="drizzle" group="drizzle" />
                </method_context>
            </exec_method>
            <exec_method type="method" name="stop" exec="/lib/svc/method/drizzle stop  %{restarter/contract}" timeout_seconds="60" >
                <method_context>
                    <method_credential user="drizzle" group="drizzle" />
                </method_context>
            </exec_method>
        </instance>

        <stability value="Unstable" />
        <template>
            <common_name>
                <loctext xml:lang="C">Drizzle database server</loctext>
            </common_name>
            <documentation>
                <manpage title="drizzle" section="1" manpath="/opt/drizzle/share/man" />
            </documentation>
        </template>
    </service>
</service_bundle>

With the new configuration option corepattern specified, we can modify the startup script:

#!/sbin/sh
. /lib/svc/share/smf_include.sh

case "$1" in
   'start')
      coreadm -p "`svcprop -p drizzle/corepattern $SMF_FMRI`" $$
      /opt/drizzle/sbin/drizzled --datadir=`svcprop -p drizzle/datadir $SMF_FMRI` --port=`svcprop -p drizzle/port $SMF_FMRI`  --skip-stack-trace &
   ;;

   'stop')
      smf_kill_contract $2 TERM 1
   ;;

   *)
      echo "Usage: $0 {start|stop}"
      exit 1
   ;;
esac

exit $SMF_EXIT_OK

To install the files and create the SMF service, execute the following commands:

trond@opensolaris:~$ pfexec install -f /lib/svc/method drizzle
trond@opensolaris:~$ pfexec install -f /var/svc/manifest/application -m 0444 drizzle.xml
trond@opensolaris:~$ svccfg import /var/svc/manifest/application/drizzle.xml
trond@opensolaris:~$ svcadm enable drizzle

If you try to kill the drizzled process with signal 6 (SIGABRT) you should see a corefile in /var/drizzle/crash:

trond@opensolaris:~$ pfexec pkill -6 -x drizzled
trond@opensolaris:~$ pfexec ls -l /var/drizzle/crash
total 65823
-rw------- 1 drizzle drizzle 67312364 2009-02-11 22:05 core.drizzled.2189

http://blogs.sun.com/trond/date/20090207 Saturday February 07, 2009

Manage Drizzle with SMF

In my previous post I described how easy it is to compile Drizzle on OpenSolaris. Most people don't just compile software for fun (there are some geeks like me out there that actually do that..), people want to use the software. In this blog I will show you how to integrate Drizzle with SMF, so that you may administer drizzle just like any other service.

I will continue to use the VirtualBox image I created in the previous blog post, but I like to run the latest bits of OpenSolaris so we start by installing the latest development version. To do this we need to change the location of our preferred package repository:

trond@opensolaris:~$ pfexec pkg set-authority -O http://pkg.opensolaris.org/dev/ opensolaris.org

With the new package repository in place, we can upgrade to the latest bits with the following commands:

trond@opensolaris:~$ pfexec pkg install SUNWipkg
trond@opensolaris:~$ pfexec pkg image-update
trond@opensolaris:~$ pfexec init 6

The last thing we did in the previous blog post was running "gmake install" to install the binaries into /opt/drizzle. gmake install created a user named drizzle, but we want to run drizzle as a role. I guess that there is a command that can convert a user into a role, but I normally just delete the user and recreate it as a role with the same user id:

trond@opensolaris:~$ id drizzle
uid=102(drizzle) gid=100(drizzle) groups=100(drizzle)
trond@opensolaris:~$ pfexec userdel drizzle
trond@opensolaris:~$ pfexec roleadd -d /opt/drizzle/var -g drizzle -u 102 -s /bin/bash drizzle

We don't want to store our database in /opt/drizzle/var, so let's go ahead and create /var/drizzle instead:

trond@opensolaris:~$ pfexec zfs create -o mountpoint=/var/drizzle rpool/drizzle-data
trond@opensolaris:~$ pfexec chown drizzle:drizzle /var/drizzle
trond@opensolaris:~$ pfexec chmod 0700 /var/drizzle
trond@opensolaris:~$ pfexec rmdir /opt/drizzle/var

We need to create authorizations to let normal users start and stop Drizzle through SMF. We create two new authorizations: one to change the configuration variables for our Drizzle service, and one for managing (start / stop) the service. Use pfexec vi /etc/security/auth_attr to add the following lines:

solaris.smf.value.drizzle:::Change Drizzle value properties::
solaris.smf.manage.drizzle:::Manage Drizzle service states::

To make it easier for ourselves, we can also create a new profile containing these authorizations. Use pfexec vi /etc/security/prof_attr to add the following line:

Drizzle Administration::::auths=solaris.smf.manage.drizzle,solaris.smf.value.drizzle
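
Both files are plain colon-separated text, so it is easy to check what you just typed. As far as I know the fields of a prof_attr entry are name:res1:res2:desc:attr, where the two res fields are reserved and left empty. The sketch below pulls the profile name and the list of authorizations out of the line above using nothing but shell parameter expansion; the variable names are my own.

```shell
#!/bin/sh
# Illustration only: split the prof_attr entry from this post into its parts.
# Assumes the attr field holds a single auths=... key, as in the line above.
entry='Drizzle Administration::::auths=solaris.smf.manage.drizzle,solaris.smf.value.drizzle'

profile=${entry%%:*}    # everything before the first ':'
attr=${entry##*:}       # everything after the last ':'
auths=${attr#auths=}    # drop the "auths=" key

echo "profile: $profile"
# One authorization per line
echo "$auths" | tr ',' '\n'
```

The two authorization names printed here must match the names added to auth_attr exactly, otherwise the profile grants nothing useful.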

With the authorizations and profile defined, it is time to create the SMF descriptor. The SMF descriptor is an XML file describing our service. I will not comment on everything here, just the interesting sections. You should store the following in a file named drizzle.xml:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type="manifest" name="drizzle">
    <service name="application/database/drizzle" type="service" version="1">
        <single_instance/>

        <dependency name="multi-user-server" grouping="require_all" restart_on="none" type="service">
            <service_fmri value="svc:/milestone/multi-user-server" />
        </dependency>

        
        <!-- We need to map the name of the authorizations we defined to this service -->
        <property_group name="general" type="framework">
            <propval name="action_authorization" type="astring" 
                     value="solaris.smf.manage.drizzle" />
            <propval name="value_authorization" type="astring" 
                     value="solaris.smf.value.drizzle" />
        </property_group>


        <property_group name="drizzle" type="application">
            <propval name="datadir" type="astring" 
                     value="/var/drizzle" />
            <propval name="port" type="astring" 
                     value="4427" />
        </property_group>

        <!-- Define the instance and how to start / stop it -->
        <instance name="drizzle" enabled="false">
            <exec_method type="method" name="start" exec="/lib/svc/method/drizzle start" timeout_seconds="30" >
                <method_context>
                    <method_credential user="drizzle" group="drizzle" />
                </method_context>
            </exec_method>
            <exec_method type="method" name="stop" exec="/lib/svc/method/drizzle stop  %{restarter/contract}" timeout_seconds="60" >
                <method_context>
                    <method_credential user="drizzle" group="drizzle" />
                </method_context>
            </exec_method>
        </instance>
        

        <stability value="Unstable" />
        <template>
            <common_name>
                <loctext xml:lang="C">Drizzle database server</loctext>
            </common_name>
            <documentation>
                <manpage title="drizzle" section="1" manpath="/opt/drizzle/share/man" />
            </documentation>
        </template>
    </service>
</service_bundle>

As you see above we execute the script /lib/svc/method/drizzle to start or stop the server. Save the following lines into a file named drizzle:

#!/sbin/sh
. /lib/svc/share/smf_include.sh

case "$1" in
   'start')
      /opt/drizzle/sbin/drizzled --datadir=`svcprop -p drizzle/datadir $SMF_FMRI` --port=`svcprop -p drizzle/port $SMF_FMRI` &
   ;;

   'stop')
      smf_kill_contract $2 TERM 1
   ;;

   *)
      echo "Usage: $0 {start|stop}"
      exit 1
   ;;
esac

exit $SMF_EXIT_OK

To install the files and create the SMF service, execute the following commands:

trond@opensolaris:~$ pfexec install -f /lib/svc/method drizzle
trond@opensolaris:~$ pfexec install -f /var/svc/manifest/application -m 0444 drizzle.xml
trond@opensolaris:~$ svccfg import /var/svc/manifest/application/drizzle.xml

Well, that's it. You can now manage your Drizzle server like any other service managed by SMF:

trond@opensolaris:~$ svcs drizzle
STATE          STIME    FMRI
disabled       20:03:55 svc:/application/database/drizzle:drizzle
trond@opensolaris:~$ svcadm enable drizzle
trond@opensolaris:~$ svcs drizzle
STATE          STIME    FMRI
online         20:04:07 svc:/application/database/drizzle:drizzle
trond@opensolaris:~$ ps -ef |grep drizzle
 drizzle  1260     1   1 20:04:08 ?           0:00 /opt/drizzle/sbin/drizzled --datadir=/var/drizzle --port=4427

We can modify the configuration (data dir and port number) through svccfg:

trond@opensolaris:~$ svccfg
svc:> select drizzle
svc:/application/database/drizzle> listprop
general                       framework
general/action_authorization  astring  solaris.smf.manage.drizzle
general/entity_stability      astring  Unstable
general/single_instance       boolean  true
general/value_authorization   astring  solaris.smf.value.drizzle
multi-user-server             dependency
multi-user-server/entities    fmri     svc:/milestone/multi-user-server
multi-user-server/grouping    astring  require_all
multi-user-server/restart_on  astring  none
multi-user-server/type        astring  service
drizzle                       application
drizzle/datadir               astring  /var/drizzle
drizzle/port                  astring  4427
tm_common_name                template
tm_common_name/C              ustring  "Drizzle database server"
tm_man_drizzle                template
tm_man_drizzle/manpath        astring  /opt/drizzle/share/man
tm_man_drizzle/section        astring  1
tm_man_drizzle/title          astring  drizzle
svc:/application/database/drizzle> setprop drizzle/port=9999
svc:/application/database/drizzle> quit
trond@opensolaris:~$ svcadm refresh drizzle
trond@opensolaris:~$ svcadm restart drizzle
trond@opensolaris:~$ ps -ef | grep drizzle
 drizzle  1301     1   1 20:07:52 ?           0:00 /opt/drizzle/sbin/drizzled --datadir=/var/drizzle --port=9999


This is a personal weblog, I do not speak for my employer.