Wednesday July 09, 2008 Today's dtrace one-liner is how do you get dtrace to exit with the same exit code as the program you are tracing so you can use it in a script?
There are going to be many ways to solve this. Mine is simply to include a probe for the exit system call and then pass arg0 of the exit system call to the dtrace exit() function:
while dtrace -n 'syscall::*exit:entry /pid == $target / { trace(arg0); exit(arg0) } fbt:autofs::return / pid == $target / { stack(5); trace(arg1) }' -c 'ls -d /home/cjg'
do
umountall -F nfs
svcadm restart autofs
done
Simple and allows the test to run and run until the problem is seen.
Thursday July 03, 2008 Today's dtrace “one liner” is to dig into why a process that is failing when a call to gethostbyname_r(). Now if the application reported all that was going on this would not be needed, meanwhile in the real world we are lucky enough to have dtrace. To test it out I'm tracing the getent(1) command. In real life it would use the -p flag to dtrace and the process ID of the daemon.
/usr/sbin/dtrace -n 'pid$target::gethostbyname_r:entry {
self->name = arg0;
}
pid$target::gethostbyname_r:return / arg1 == 0 / {
ustack(5);
}
pid$target::gethostbyname_r:return / self->name / {
printf("%s", copyinstr(self->name));
self->name = 0
}' -c "getent hosts xxxxxdredd"
dtrace: description 'pid$target::gethostbyname_r:entry ' matched 3 probes
dtrace: pid 4748 has exited
CPU ID FUNCTION:NAME
2 48759 gethostbyname_r:return
libnsl.so.1`gethostbyname_r+0xc4
getent`dogethost+0x54
getent`main+0x7c
getent`_start+0x108
2 48759 gethostbyname_r:return xxxxxdredd
It would be nice to be able to get the h_errno value as well but so far I've not managed that.
Update:
Jon Haslem kindly explained to me the subtleties of copyin() so that I can get the h_errno value.
/usr/sbin/dtrace -Zn 'pid$target::gethostbyname_r:entry {
self->name = arg0;
self->errno = arg4;
}
pid$target::gethostbyname_r:return / arg1 == 0 / {
ustack(5);
printf("%d %s h_errno %x", pid,
copyinstr(self->name),
*(int *)copyin(self->errno,sizeof(int)));
}' -c "getent hosts xxxxxdredd"
dtrace: description 'pid$target::gethostbyname_r:entry ' matched 2 probes
dtrace: pid 5087 has exited
CPU ID FUNCTION:NAME
2 48764 gethostbyname_r:entry errno: d4220008
2 48765 gethostbyname_r:return
libnsl.so.1`gethostbyname_r+0xc4
getent`dogethost+0x54
getent`main+0x7c
getent`_start+0x108
5087 xxxxxdredd h_errno 1
#
Wednesday July 02, 2008 This is not quite a one liner as I'm reusing the code from a previous post to print out the devices in a human readable form other wise it is just a one liner and was when I typed it in.
The question posed here was what is the maximum number of commands sent to a LUN at any one time? Clearly this will max out at the throttle for the device however what was interesting, since the customer had already tuned the throttle down and the problem had gone away was what was their configuration capable of sending to the LUN:
#!/usr/sbin/dtrace -qCs
#define SD_TO_DEVINFO(un) ((struct dev_info *)((un)->un_sd->sd_dev))
#define DEV_NAME(un) \
stringof(`devnamesp[SD_TO_DEVINFO(un)->devi_major].dn_name) /* ` */
#define DEV_INST(un) (SD_TO_DEVINFO(un)->devi_instance)
fbt:*sd:*sd_start_cmds:entry { @[DEV_NAME(args[0]),DEV_INST(args[0])] = max(arg
s[0]->un_ncmds_in_driver) }
END {
printa("%s%d %@d\n", @);
}
This produces a nice list of disk devices and the maximum number of commands that have been sent to them at anyone time:
# dtrace -qCs /var/tmp/max_sd.d -n 'tick-5sec { exit(0) }'
sd2 1
sd0 70
#
Combine that with the dscript from the latency bubble posting earlier and you can drill down on where your IO is waiting.
Tuesday July 01, 2008 Todays dtrace one liner is part of a case investigating why messages are not making it into the messages file. Using the divide and concur priciple the first question you need to answer is: “Is the process that is supposed to generate the messages calling into syslog?”
$ dtrace -n 'pid$target::syslog:entry { printf("%d %s", arg0, copyinstr(arg1)) }' -p $(pgrep xxxx)
dtrace: description 'pid$target::syslog:entry ' matched 1 probe
CPU ID FUNCTION:NAME
1 43227 syslog:entry 5 %s
^C
That is enough to answer the first question. You can get all flash and pull out the string passed in as the second argument but then it is not a one liner and is answering a different question. However it is neat so here it is:
$ dtrace -qn 'pid$target::syslog:entry { printf("%d %s", arg0, (this->arg1 = copy
instr(arg1))) }
pid$target::syslog:entry / this->arg1 == "%s"/ { printf(" %s\n", copyinstr(arg2))
}' -p $(pgrep xxxx)
How about varargs and vsprintf for dtrace....
Except where otherwise noted, this site is
licensed under a Creative Commons License 2.0
This is a personal weblog, I do not speak for my employer.