DTrace, the PID Provider, and LD_PRELOAD
A problem I encountered a few weeks ago was the following:
$ svcs -l sendmail
fmri svc:/network/smtp:sendmail
svcs: svcs.c:335: Unexpected libscf error: invalid argument. Exiting.
"svcs -a" ran without error and produced the correct output, whereas "svcs -xv" failed with the same error as above. It didn't matter which SMF service we chose: the problem appeared whenever the '-l' or '-xv' flags were used. The question we needed to answer was: which function within libscf was returning this error? The source code showed several places where SCF_ERROR_INVALID_ARGUMENT could be set.
We can see that /usr/bin/svcs links with libscf.so.1:
$ ldd /usr/bin/svcs
libcontract.so.1 => /usr/lib/libcontract.so.1
libscf.so.1 => /usr/lib/libscf.so.1
libuutil.so.1 => /usr/lib/libuutil.so.1
libumem.so.1 => /usr/lib/libumem.so.1
libc.so.1 => /usr/lib/libc.so.1
libnvpair.so.1 => /usr/lib/libnvpair.so.1
libgen.so.1 => /usr/lib/libgen.so.1
libnsl.so.1 => /usr/lib/libnsl.so.1
libmp.so.2 => /usr/lib/libmp.so.2
libmd.so.1 => /usr/lib/libmd.so.1
libm.so.2 => /usr/lib/libm.so.2
As a first step in troubleshooting, we'd like to use the PID provider to see which functions within libscf are called during execution and which of them return SCF_ERROR_INVALID_ARGUMENT. However, the first attempt produced an error:
$ dtrace -n 'pid$target:libscf.so.1::return /pid == $target/ {printf("= %d\n", arg1);}' -c "svcs -l sendmail" -o /var/tmp/dtrace.svcs.out
dtrace: invalid probe specifier pid$target:libscf.so.1::entry /pid == $target/ {printf("= %d\n", arg1);}: probe description pid1153:libscf.so.1::entry does not match any probes
The issue here is that libscf.so.1 has not yet been loaded into the target process at the point where DTrace tries to match the probe description, even though /usr/lib is in the LD_LIBRARY_PATH. The workaround is to use the LD_PRELOAD variable so that the library is already loaded when DTrace attaches.
$ export LD_PRELOAD_32=/usr/lib/libscf.so.1
Note that we use LD_PRELOAD_32 for 32-bit libraries and LD_PRELOAD_64 for 64-bit libraries. In theory, setting plain LD_PRELOAD should allow the runtime linker to select the correct ELF library, but in this case it didn't, which is why we use LD_PRELOAD_32 to force 32-bit.
Running the same DTrace command on a working host, we now get:
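Exporting the variable affects every later command started from that shell. As a minimal sketch (with_libscf_preload is a hypothetical helper, not something from the original investigation), the preload can instead be scoped to a single command. The variable is set only in the environment of the command being run, so dtrace and the process it starts with -c inherit it, while the calling shell stays unchanged:

```shell
# Hypothetical helper: run one command with libscf preloaded for 32-bit
# processes, without exporting LD_PRELOAD_32 into the calling shell.
with_libscf_preload() {
    env LD_PRELOAD_32=/usr/lib/libscf.so.1 "$@"
}

# Example use (the D program is abbreviated here as '...'):
#   with_libscf_preload dtrace -n '...' -c "svcs -l sendmail"
```

After the command returns, LD_PRELOAD_32 is still unset in the shell, so there is nothing to remember to clean up.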
$ dtrace -n 'pid$target:libscf.so.1::return /pid == $target/ {printf("= %d\n", arg1);}' -c "svcs -l sendmail" -o /var/tmp/dtrace.svcs.out
dtrace: description 'pid$target:libscf.so.1::return ' matched 429 probes
fmri svc:/network/smtp:sendmail
name sendmail SMTP mail transfer agent
enabled true
state online
next_state none
state_time 15 February 2009 01:49:26 GMT
logfile /var/svc/log/network-smtp:sendmail.log
restarter svc:/system/svc/restarter:default
contract_id 84
dependency require_all/refresh file://localhost/etc/mail/sendmail.cf (online)
dependency require_all/refresh file://localhost/etc/nsswitch.conf (online)
dependency optional_all/none svc:/system/filesystem/autofs (online)
dependency require_all/none svc:/system/filesystem/local (online)
dependency require_all/none svc:/network/service (online)
dependency require_all/refresh svc:/milestone/name-services (online)
dependency optional_all/refresh svc:/system/identity:domain (online)
dependency optional_all/none svc:/system/system-log (online)
dtrace: pid 1406 has exited
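This produced a 651k output file. A better way might be to report only the functions that return -1, the value many libscf functions return on failure, and then work out which of those set SCF_ERROR_INVALID_ARGUMENT. Because arg1 is a 64-bit value and svcs is a 32-bit process here, the return value is cast to int so that a 32-bit -1 still compares equal:
$ dtrace -n 'pid$target:libscf.so.1::return /(pid == $target) && ((int)arg1 == -1)/ {printf("= %d\n", (int)arg1);}' -c "svcs -l sendmail" -o /var/tmp/dtrace.svcs.out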
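The file written by -o can also be boiled down after the fact. The following is a minimal sketch rather than part of the original investigation: summarize_returns is a hypothetical helper, and it assumes the default dtrace output layout, where each probe line carries the CPU, ID and FUNCTION:NAME columns followed by the printf text "= value". It counts how often each function returned each value.

```shell
# Hypothetical helper: summarise a dtrace -o output file by function and
# return value. Assumes default-format lines of the shape
#   <cpu> <id> <function>:return = <value>
summarize_returns() {
    awk '
        # Keep only probe lines: third field ends in ":return", fourth is "=".
        $3 ~ /:return$/ && $4 == "=" {
            split($3, part, ":")          # part[1] is the function name
            count[part[1] " " $5]++
        }
        END {
            for (key in count) print count[key], key
        }
    ' "$1" | sort -rn                     # most frequent first
}

# Example: summarize_returns /var/tmp/dtrace.svcs.out | head
```

Each output line has the form "count function value", so the noisiest functions appear at the top and unusual return values are easier to spot than in the raw file.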
Unfortunately, the original problem was never fully investigated, as the host was rebuilt before the root cause could be established.