Alan Hargreaves' Weblog
The ramblings of an Australian SaND TSC* Principal Field Technologist
* Solaris and Network Domain Technology Support Centre - The group I work forTags
(update 1) acoustic bind birthday blues bugs cec cec2007 cec2008 china cmt contention cringley debugging dogs dtrace earthquake encumbered-binaries extra flash funny google guitar halloween huron install kids linux liveupgrade locking mdb music mysql newyear niagra openjava opensolaris oracle patches patents percussion performance redhat secondlife security solaris sru sun support sxcr t2 t2000 timeslider ufs upgrade virtualbox windows youtube zfs
Friday Sep 10, 2004
dtracing on the train - gnome-vfs-daemon
There was some discussion internally recently about using gnome-vfs-daemon as a process to start demonstrating dtrace with. As a result I thought it would be interesting to have a look at it on my notebook while coming to work on the train (about a 90 minute trip).
OK, first off, let's have a look at the system calls. We could do this with
$ /usr/sbin/dtrace -n 'syscall:::entry/execname=="gnome-vfs-daemon"'
But that really generates a lot of output. One of the things that I do notice though is the following sequence repeating itself every three seconds.
0 397 open64:entry 0 397 open64:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 103 ioctl:entry 0 309 llseek:entry 0 17 close:entry 0 317 pollsys:entry
This peaked my interest. How about an aggregation over a minute?
$ /usr/sbin/dtrace -q -n '
syscall:::entry/execname=="gnome-vfs-daemon"/{
@p[probefunc] = count();
}
tick-1m {
printf("\nCount Syscall\n----- -------\n");
printa(" %@3d %s\n",@p); exit(0);
}'
Count Syscall
----- -------
20 pollsys
20 close
20 llseek
40 open64
360 ioctl
So, for each pollsys() and close(), it looks like I am seeing two open64()s and 18 ioctl()s, and from earlier we know that we see two open64()s followed by the ioctl()s. Let's have a closer look at these system calls. Also, while we are at it, let's try and find out what the ioctl is. We now move to scripting files.
#!/usr/sbin/dtrace -s
syscall::ioctl:entry
/execname == "gnome-vfs-daemon"/
{
self->interest = timestamp;
self->fd = arg0;
self->cmd1 = (char)(arg1>>8);
self->cmd2 = arg1&0xff
}
syscall::ioctl:return
/self->interest/
{
printf("ioctl(%d, '%c'<<8|%d, buf) took %u us",
self->fd,
self->cmd1,
self->cmd2,
(timestamp - self->interest)/1000
);
ustack();
self->interest = 0;
}
syscall::open64:entry
/execname == "gnome-vfs-daemon"/
{
self->file = copyinstr(arg0);
self->interest = 1;
}
syscall::open64:return
/self->interest/
{
printf("%s returns fd %d\n", self->file, (int)arg1);
ustack();
self->interest = 0;
}
An excerpt of the output shows
CPU ID FUNCTION:NAME
0 398 open64:return /etc/fstab returns fd -1
libc.so.1`__open64+0x15
libc.so.1`_endopen+0xa8
libc.so.1`fopen64+0x26
libgnomevfs-2.so.0.600.0`_gnome_vfs_get_unix_mount_table+0x3e
libglib-2.0.so.0.400.1`g_source_callback_funcs
libgnomevfs-2.so.0.600.0`poll_fstab
0 398 open64:return /etc/mnttab returns fd 44
libc.so.1`__open64+0x15
libc.so.1`_endopen+0xa8
libc.so.1`fopen64+0x26
libgnomevfs-2.so.0.600.0`_gnome_vfs_get_current_unix_mounts+0x37
libglib-2.0.so.0.400.1`g_source_callback_funcs
libgnomevfs-2.so.0.600.0`poll_mtab
0 104 ioctl:return ioctl(44, 'm'<<8|7, buf) took 232 us
libc.so.1`ioctl+0x15
libgnomevfs-2.so.0.600.0`_gnome_vfs_get_current_unix_mounts+0x55
0 104 ioctl:return ioctl(44, 'm'<<8|7, buf) took 4 us
libc.so.1`ioctl+0x15
libgnomevfs-2.so.0.600.0`_gnome_vfs_get_current_unix_mounts+0xfe
OK, so we first attempt to open /etc/fstab, which obviously does not exist in solaris, then we open /etc/mnttab, which does. We then go and do lots of ioctl()s on it.
The 'm'<<8|7 (as well as being MTIOCGUARANTEEDORDER in sys/mtio.h) is apparantly MNTIOC_GETMNTENT (common/sys/mntio.h).
This has only gone back into Solaris 10 in the last few months. It's generally called by getmntent(3C).
So basically, it looks like gnome-vfs-daemon is reading the mount table of the system every three seconds. This certainly seems like a lot of overkill. Indeed on the notebook I ran it on, if we try the following script
#!/usr/sbin/dtrace -s
syscall::ioctl:entry
/execname == "gnome-vfs-daemon"/
{
self->interest = timestamp;
}
syscall::ioctl:return
/self->interest/
{
@ioctl[probefunc] = quantize((timestamp - self->interest)/1000);
self->timestamp = 0;
}
syscall::open64:entry
{
self->interest = timestamp;
self->file = copyinstr(arg0);
}
syscall::open64:return {
@open64[self->file] = quantize((timestamp - self->interest)/1000);
self->timestamp = 0;
}
tick-1min {
printa (@ioctl);
printa (@open64);
exit(0)
}
We can look at just how much time is being spent (we could always actually sum the numbers, but I like looking at graphs).
ioctl
value ------------- Distribution ------------- count
1 | 0
2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 340
4 | 0
8 | 0
16 | 0
32 | 0
64 | 0
128 | 0
256 |@@ 20
512 | 0
/etc/fstab
value ------------- Distribution ------------- count
8 | 0
16 |@@@@@@@@@@@@@@@@@@@@ 10
32 |@@@@@@@@@@@@@@@@@@@@ 10
64 | 0
/etc/mnttab
value ------------- Distribution ------------- count
32 | 0
64 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 20
128 | 0
So, worst case on this system is 340*2 + 10*16 + 10*32 + 20*64 = 2440 microseconds every minute, or 122 microseconds every iteration. This might not initially sound like an awful lot to worry about. However, in this iteration, I am only seeing 81 mount points on my system and I only have one gnome-vfs-daemon running. The truly scary concept is when we start looking at this issue in the light of large sunray servers, say with 250-300 entries on the mount table and around 80 users (instances of gnome-vfs-daemon).
Fortunately there is an answer. If we lookat the man page on mnttab(4) we see
NOTES
The snapshot of the mnttab information is taken any time a
read(2) is performed at offset 0 (the beginning) of the
mnttab file. The file modification time returned by stat(2)
for the mnttab file is the time of the last change to
mounted file system information. A poll(2) system call
requesting a POLLRDBAND event can be used to block and wait
for the system's mounted file system information to be dif-
ferent from the most recent snapshot since the mnttab file
was opened.
So we actually only need to stat the file to see if the mount table has changed. This is a much more efficient way of doing things.
After logging a bug on this, the engineer who took it determined that the code to do the stat() was actually in there, but due to some faulty logic, it would never get executed. This is now logged into the gnome bugzilla as 152114 and 152115 and with luck should be fixed shortly.
Posted at 06:15PM Sep 10, 2004 by Alan Hargreaves in Solaris Express |

