Solaris Quiz - answer
The correct answer was that ksh has 'true' as a built in function to the korn shell whereas the bourne shell execs a separate binary /usr/bin/true. So in vmstat/mpstat we see a lot of time attributed to 'sys' even though our instinct is that 'while true do' should be userland only. At least this sounds like a good theory. How to prove it though? Today Matthew our I'm going to be using truss(1). Im going to create 2 scripts#!/bin/sh while true do : done
and
#!/bin/ksh while true do : done
So now we have a couple of scripts to trace. Im going to use truss with -f so that we 'follow' any forked processes and -c to 'count' the number of system calls. We use '-f' because otherwise we'll only see the forked sub-shell and not the script that we are actually trying to measure. So what do we actually see? First of all using korn shell, then bourne shell.
garyli@podtwo[/var/tmp] $truss -fc ./x.ksh
^Csyscall seconds calls errors
read .000 3
open .000 12 1
close .000 16 2
time .000 2
brk .000 6
stat .000 1
getpid .000 2
getuid .000 2
fstat .000 11
getgid .000 2
ioctl .000 3 2
execve .000 1
fcntl .000 6
sigprocmask .000 1
sigaction .000 19
mmap .002 18
munmap .001 5
memcntl .000 4
llseek .000 2
resolvepath .001 8
stat64 .000 3
fstat64 .000 3
open64 .000 2
-------- ------ ----
sys totals: .012 132 5
usr time: .011
elapsed: 5.400
So here we see only 1 execve call. Now when we run the same script but under bourne shell...
garyli@podtwo[/var/tmp] $truss -fc ./x.sh
^Csyscall seconds calls errors
_exit .020 119
fork .079 120
read .040 477
open .080 721 121
close .020 959 120
brk .160 1081
stat .010 121
getpid .030 240
getuid .000 241
fstat .000 600
access .000 1
getsid .010 120
getpgid .010 120
getgid .000 240
ioctl .047 716 358
execve .310 121
fcntl .010 240
sigaltstack .020 120
sigaction .141 6360
waitid .053 238
mmap .181 1321
munmap .060 360
memcntl .040 240
sysconfig .010 240
resolvepath .100 722
stat64 .002 5 4
open64 .010 120
-------- ------ ----
sys totals: 1.450 15963 603
usr time: .461
elapsed: 5.370
We see 121 in the execve row, meaning that we execec'd 121 sub processes well 120 (less the exec of the subshell). Now the number of execs is dwarfed by say, the number of sigaction calls, so why concentrate on exec? Well the sigactions are just a part of what happens when some processes are exec'd. An individual truss -fc against /usr/bin/true shows 53 calls to sigaction (120 * 53 = 6360)
So, I think there's just one more question to be answered here. Why do we need to make 53 calls to sigaction for a binary that just returns 'true'. The answer is here...
garyli@podtwo[/var/tmp] $file /usr/bin/true /usr/bin/true: executable /usr/bin/sh script
So.... 'true' is actually a script...
garyli@podtwo[/var/tmp] $cat /usr/bin/true #!/usr/bin/sh # Copyright (c) 1984, 1986, 1987, 1988, 1989 AT&T # All Rights Reserved # THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF AT&T # The copyright notice above does not evidence any # actual or intended publication of such source code. #ident "@(#)true.sh 1.6 93/01/11 SMI" /* SVr4.0 1.4 */
So we are starting an entire shell just to exit with '0' and thus give us the 'true' we want. Shells need to have signal handlers to correctly handle CTRL-Z and so on. That's all a bit superfluous for 'true'. We can do something a lot simpler than that such as...
int main()
{
return 0;
}
We can then compile that code and call it newtrue. Now when we run the script and get it to call 'newtrue' we get a truss output similar to this...
garyli@podtwo[/var/tmp] $truss -fc ./x1.sh
^Csyscall seconds calls errors
_exit .050 146
fork .101 147
read .000 1
open .090 594 148
close .010 448 1
brk .120 447
stat .050 148
getpid .040 148
getuid .000 2
fstat .020 446
getsid .000 1
getpgid .000 1
getgid .000 2
ioctl .019 294 1
execve .530 148
fcntl .000 2
sigaltstack .000 1
sigaction .001 53
waitid .063 292
mmap .141 1187
munmap .030 297
memcntl .050 149
sysconfig .000 2
resolvepath .220 741
open64 .000 1
-------- ------ ----
sys totals: 1.541 5698 150
usr time: .232
elapsed: 4.760
Now we see 148 execs but only 53 calls to sigaction. It's all a bit trivial, but shows quite nicely how much information that we can get out of an old-school tool like truss. If only everything were that simple.
And Finally. Congratulations to David Morris from the UK on getting the first post. ( Jul 27 2004, 03:15:36 PM BST ) Permalink
Comments are closed for this entry.

