Mostly Harmless

John Alderson's Blog
Tuesday Apr 03, 2007

D'you want -r with that?

Some vintage (thankfully) goofs in no particular order and not all by me. There is nothing like the curious abdominal tingle which accompanies the dawning realisation that an rm command has broken free of its moorings...

1. All C'ed up

    This is my favourite, probably because it is another class of unconscious assumption.

    I had just commented a large number of C source files and then moved on from that to edit a Bourne shell script to be run as root. One of the lines I added was:

      rm /tmp/scratch$$.*       /* Clean up scratch files */
      
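    This was such a narrow escape! The shell, knowing nothing of C comments, expanded /* to every entry in "/" and */ to every directory in the current one, then handed the lot to rm. Without -r, rm declined the directories, and there are almost no interesting regular files in "/". The main thing to put back on that machine was the symlink from /bin to /usr/bin, without which all of the shell scripts beginning "#!/bin/sh" cannot work.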

    Still, the error message rm: Clean: no such file or directory was enough to kick off the old rm frissance.
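
    A harmless way to see what such a line hands to a command is to swap rm for a stand-in that only prints its arguments. The sketch below assumes a POSIX shell; show_args is a name made up for illustration, and the list it prints depends on the machine and the current directory.

```shell
#!/bin/sh
# show_args: hypothetical stand-in for rm. It prints each argument it
# receives on its own line and removes nothing.
show_args() {
    printf '%s\n' "$@"
}

# The same line as in the script, with rm replaced by the stand-in.
# "/*" expands to every entry in "/", "*/" to every directory in the
# current one, and the words of the "comment" are passed as file names.
show_args /tmp/scratch$$.*       /* Clean up scratch files */
```

    In a shell script the comment belongs after a "#", which the shell really does ignore.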

2. Dotty

    The need to remove several large source trees accessible only by superuser and sitting below directories with names like: .old .070996 .arch etc, led me (one fateful evening) to type:
      rm -rf .*
      
    just above them. I feel clammy just thinking about it.

    The trees were very large so it did not concern me that this command had not returned after some minutes - but slowly I became aware that all was not well... It didn't help that this machine was in a kind of cluster with several others - each of which shared its entire filesystem via amd mountpoints with root allow privs.

    It was a long night...
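
    The trap here is presumably that, in the Bourne shell, the pattern .* matches "." and ".." as well as the hidden directories, so the removal climbs out of the directory it was typed in. A minimal sketch of a safer listing, assuming a POSIX shell; list_dot_entries is a hypothetical helper, not something from the original incident.

```shell
#!/bin/sh
# list_dot_entries DIR: print the hidden entries of DIR, one per line,
# without ever producing "." or "..".
list_dot_entries() {
    (
        cd -- "$1" || exit 1
        # .[!.]* matches names starting with one dot and a non-dot character;
        # ..?*   matches names starting with two dots plus at least one more.
        for f in .[!.]* ..?*; do
            # An unmatched pattern is left as literal text; skip it.
            [ -e "$f" ] || [ -L "$f" ] || continue
            printf '%s\n' "$f"
        done
    )
}
```

    Looking at the output of something like this (or simply of echo .*) before handing the same pattern to rm -rf costs a few seconds.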

3. Cron only too HTH

    A customer complained that his machine had hung and then corrupted huge chunks of /usr when he forced a panic. He submitted a manifest of what was left in /usr as evidence.

    We already had an idea that there was some rm-ery afoot and asked him to check what the machine had been doing at the time. Following a hunch, I took the source of rm, made a modified version which output file names instead of unlinking them, and ran it on a local system with the same Solaris release.

    The rationale was that rm's file walk removes files and descends directories in directory entry order. Since most of the files in /usr are put there during installation and never subsequently deleted, this file walk will tend to be the same on machines installed with the same release. I split the output of my "neutered" rm at ld.so.1, whose removal probably hung the machine. The remaining list of files was an almost exact match with the manifest supplied by the customer...

    ... who, in the meantime, had found a root cronjob which did something like:

      cd $SCRATCHDIR
      rm -rf *
      
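    This is a well known danger spot. If $SCRATCHDIR is unset or empty, or is not something that one could cd to, the script ends up somewhere other than the scratch directory and rm -rf * runs there instead. One should guard against both possibilities.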
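
    One way to write such a guard, as a sketch rather than the customer's actual fix: refuse an empty name, and only run rm if the cd succeeded. The function name clean_scratch is an assumption made for this example.

```shell
#!/bin/sh
# clean_scratch DIR: remove the (non-hidden) contents of DIR, but only if
# DIR is a non-empty name and we really managed to cd into it.
clean_scratch() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "clean_scratch: directory name is empty" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory untouched.
    # "&&" means rm never runs when cd fails; "./*" keeps odd file
    # names from being read as options.
    ( cd -- "$dir" && rm -rf ./* )
}
```

    In the cron job this would be called as clean_scratch "$SCRATCHDIR" || exit 1, with the quotes doing their share of the work.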

4. Death by wrapper

    Many years ago a friend of mine became so concerned by his propensity for causing collateral damage with rm that he aliased it to rm -i. I have since encountered whole sites where this lunacy is standard practice.

    The funny thing in the case of my friend was the speed with which the inevitable disaster overtook him. I think it was mere days before he typed rm * on some machine or other, expecting the friendly dialogue but getting only a prompt and a directory more echoey than he had intended.

    Thurber in The Bear Who Let It Alone concludes "You might as well fall flat on your face as lean over too far backward".

Friday Nov 10, 2006

truss -d -u on syscall stubs

Using truss -d (or -D) gives timing information which may help identify where delays are occurring in a process. However, it does not timestamp both the exit and entry points on a system call so it is not always obvious if the delay is really in the system call or in the program's cogitations.

DTrace or TNF probes with microstate accounting will get accurate per-system-call duration information, but there is also a cheap and cheerful way using truss itself where these other methods are not available or appropriate and where you don't mind the relatively high probe effect of truss.

Since system calls are made from within libc stubs you can use the library call tracing facility of truss to trace entry and exit of the stub. So for open[64](), close() and various read flavours you could use:

    truss -d -u "libc::*_open*,*_close,*_*read*" command_or_-p_PID

For example:

$ truss -d -u "libc::*_open*,*_close,*_*read*" wc -l /etc/group

Base time stamp:  1163198817.8709  [ Fri Nov 10 15:46:57 MST 2006 ]
16824:   0.0000 execve("/usr/bin/wc", 0xFFBFF144, 0xFFBFF154)  argc = 3
16824:   argv: wc -l /etc/group
16824:   0.0097 resolvepath("/usr/bin/wc", "/usr/bin/wc", 1023) = 11
16824:   0.0099 resolvepath("/usr/lib/ld.so.1", "/usr/lib/ld.so.1", 1023) = 16
16824:   0.0101 stat("/usr/bin/wc", 0xFFBFEF18)                 = 0
...
16824:   0.0496 -> libc:_open64(0xffbff296, 0x0, 0x1b6, 0x0)
16824:   0.0500   -> libc:__open64(0xffbff296, 0x0, 0x1b6, 0x0)
16824:   0.0504 open64("/etc/group", O_RDONLY)                  = 3
16824:   0.0505   <- libc:__open64() = 3
16824:   0.0508 <- libc:_open64() = 3
16824:   0.0511 fstat64(3, 0xFFBFEF10)                          = 0
16824:   0.0512 brk(0x00024AC8)                                 = 0
16824:   0.0513 brk(0x00026AC8)                                 = 0
16824:   0.0514 fstat64(3, 0xFFBFEDB8)                          = 0
16824:   0.0515 ioctl(3, TCGETA, 0xFFBFEE9C)                    Err#25 ENOTTY
16824:   0.0516 -> libc:_read(0x3, 0x23b44, 0x2000, 0xff28b5f4)
16824:   0.0520 read(3, " r o o t : : 0 : r o o t".., 8192)     = 8192
16824:   0.0521 <- libc:_read() = 8192
...
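
Subtracting the "->" stamp from the matching "<-" stamp gives the time spent inside each stub, system call included. A small awk filter can do the subtraction; this is a sketch that assumes the line layout shown above (pid, time stamp, arrow, stub name), and stub_times is a name invented for the example. It pairs entries and exits by pid and stub name only, so recursive calls to the same stub would confuse it.

```shell
#!/bin/sh
# stub_times: read "truss -d -u ..." output on stdin and print, for each
# stub return, the stub name and the seconds elapsed since its entry.
stub_times() {
    awk '
        # Entry line:  PID:  TIME  -> libc:name(args...)
        $3 == "->" {
            name = $4
            sub(/\(.*/, "", name)        # strip the argument list
            start[$1, name] = $2         # remember the entry stamp
        }
        # Exit line:   PID:  TIME  <- libc:name() = result
        $3 == "<-" {
            name = $4
            sub(/\(.*/, "", name)
            key = $1 SUBSEP name
            if (key in start) {
                printf "%s %.4f\n", name, $2 - start[key]
                delete start[key]
            }
        }
    '
}
```

Fed the wc trace above, this reports 0.0005 for libc:__open64, 0.0012 for libc:_open64 and 0.0005 for libc:_read. Note that truss writes its trace to stderr unless given -o, so in practice the filter would read a saved file.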

Wednesday Sep 20, 2006

Dingy corners of file i/o

ufs bears in its wake a rat-pack of low-life ioctls. One of these, DIRECTIO_ON, allows you to force directio per file and may be used by databases if their application tunable is so set. The idea is that with a properly sized memory area the db can manage its own buffering and avoid the overhead of copying to kernel buffers.

I was asked about dd with a small blocksize being extremely slow on some database files but fast on others. The question of why anyone would want to dd a database file, and the dangers of un-sparsing such files etc., are another subject. The interesting thing is that the directio setting persisted long after the db had closed the file.

The flag is in the inode's i_flag and hangs around in the cached inodes hanging off the dnlc ready to surprise the next program which opens it. Basically only databases should perform io on database files. But if people use per-file directio on other files for other reasons it is worth remembering to clean up afterwards.

Here are the noddies I tested this with:

# cat dio.c
#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <unistd.h>     /* close() */
#include <sys/types.h>
#include <sys/fcntl.h>  /* DIRECTIO_ON, DIRECTIO_OFF */

int directio(int fildes, int advice);
static char *Usage = "Usage: dio <filename> 1|0";

void
bomb(char *mess)
{
        fprintf(stderr,"%s\n",mess);
        exit(1);
}

int
main(int argc, char **argv)
{
        int fd, adv;

        if (argc != 3) {
                bomb(Usage);
        }

        if ((fd = open64(argv[1],O_RDWR)) < 0) {
                bomb("Can't open file");
        }

        switch ((int)(*argv[2])) {
            case '0':
                adv = DIRECTIO_OFF;
                break;
            case '1':
                adv = DIRECTIO_ON;
                break;
            default:
                bomb(Usage);
        }

        if (directio(fd,adv) < 0) {
                bomb("could not set advice");
        }

        close(fd);
        exit(0);
}



# cat flu.c
#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <unistd.h>     /* close(), ioctl() */
#include <sys/types.h>
#include <sys/fcntl.h>
#include <sys/filio.h>  /* _FIOISBUSY */

int directio(int fildes, int advice);
static char *Usage = "Usage: flu <filename>";

void
bomb(char *mess)
{
        fprintf(stderr,"%s\n",mess);
        exit(1);
}

int
main(int argc, char **argv)
{
        int fd;

        if (argc != 2) {
                bomb(Usage);
        }

        if ((fd = open64(argv[1],O_RDWR)) < 0) {
                bomb("Can't open file");
        }

        if (ioctl(fd,_FIOISBUSY) < 0) {
                bomb("could not ioctl");
        }

        close(fd);
        exit(0);
}

_FIOISBUSY flushes a file from the dnlc. That's more informative than just doing "dio foo 0".

Tests

# ls -l
total 10032
-rwxrwxrwx   1 root     other    5120015 Mar 24 16:22 foo
# timex dd if=foo of=/dev/null count=10000
10000+0 records in
10000+0 records out

real        0.09
user        0.04
sys         0.05

# dio foo 1
# timex dd if=foo of=/dev/null count=10000
10000+0 records in
10000+0 records out

real        2.41     ## Whoa! 10000 latencies = 26 times slower
user        0.03
sys         0.31

# flu foo            ## Simulate eventual purging of the inode
# timex dd if=foo of=/dev/null count=10000
10000+0 records in
10000+0 records out

real        0.10     ## Back to normal
user        0.01
sys         0.09
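
For anyone setting per-file directio from a script, the cleaning up can be made hard to forget by wrapping the job. This is only a sketch: with_directio and the DIO variable are names made up here, and the helper is assumed to be the dio program listed above, found on the PATH. If the script is killed part-way the flag stays set, so a trap would be needed for anything more serious.

```shell
#!/bin/sh
# DIO names the helper that toggles directio on a file. It defaults to the
# dio program from this post (an assumption: it must be built and on PATH).
DIO=${DIO:-dio}

# with_directio FILE COMMAND [ARGS...]
# Turn directio on for FILE, run COMMAND, then turn directio off again
# whether or not COMMAND succeeded. Returns COMMAND's exit status.
with_directio() {
    file=$1
    shift
    "$DIO" "$file" 1 || return 1
    status=0
    "$@" || status=$?
    "$DIO" "$file" 0 || echo "with_directio: could not reset $file" >&2
    return "$status"
}
```

Usage would look like with_directio foo dd if=foo of=/dev/null bs=1024k, after which later readers of foo get buffered i/o again.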

