#!/usr/sbin/dtrace -s
#pragma D option quiet
proc:::exec-success
/execname == "gnome-session"/
{
start = timestamp;
go = 1;
}
io:::start
/go/
{
printf("%10d { -> I/O %d %s %s %s }\n",
(timestamp - start) / 1000000, pid, execname,
args[0]->b_flags & B_READ ? "reads" : "writes",
args[2]->fi_pathname);
}
io:::done
/go/
{
printf("%10d { <- I/O to %s }\n",
(timestamp - start) / 1000000, args[2]->fi_pathname);
}
io:::start
/go && ((struct buf *)arg0)->b_file != NULL &&
((struct buf *)arg0)->b_file->v_path == NULL/
{
printf("%10s (vp %p)\n", "", ((struct buf *)arg0)->b_file);
}
io:::start
/go/
{
@apps[execname] = count();
@files[args[2]->fi_pathname] = count();
@appsfiles[execname, args[2]->fi_pathname] = count();
}
proc:::exec
/go/
{
self->parent = execname;
}
proc:::exec-success
/self->parent != NULL/
{
printf("%10d -> %d %s (from %d %s)\n",
(timestamp - start) / 1000000, pid, execname,
curpsinfo->pr_ppid, self->parent);
self->parent = NULL;
}
proc:::exit
/go/
{
printf("%10d <- %d %s\n",
(timestamp - start) / 1000000, pid, execname);
}
profile-101hz
/go && arg1 != NULL/
{
printf("%10d [ %d %s ]\n",
(timestamp - start) / 1000000, pid, execname);
}
profile-101hz
/go && arg1 == NULL &&
(curlwpsinfo->pr_flag & PR_IDLE)/
{
printf("%10d [ idle ]\n",
(timestamp - start) / 1000000);
}
END
{
printf("\n %-72s %s\n", "APPLICATION", "I/Os");
printa(" %-72s %@d\n", @apps);
printf("\n %-72s %s\n", "FILE", "I/Os");
printa(" %-72s %@d\n", @files);
printf("\n %-16s %-55s %s\n", "APPLICATION", "FILE", "I/Os");
printa(" %-16s %-55s %@d\n", @appsfiles);
}
This script uses a combination of CPU sampling and I/O tracing to determine
roughly what's going on over the course of logging in. I ran the above
script on on my
Acer Ferrari 3400
laptop running
OpenSolaris by dropping to the
command-line login
after a reboot and running:
# nohup ./login.d > /var/tmp/login.out &I then logged in, and made sure that the first thing I did was launch a gnome-terminal to pkill dtrace. (This could obviously be made quite a bit more precise, but it works as a crude way of indicating the completion of login.) Here is the output from performing this experiment. The first column is the millisecond offset from the start of gnome-session. CPU samples are contained within square brackets, launched programs are contained withing curly braces, and I/O is explicitly noted. An I/O summary is provided at the end. A couple of observations:
- Nearly one third of all I/O is to shared object text and read-only data.
This is classic death-of-a-thousand-cuts, and it's hard to see that there's
an easy way to fix this. But perhaps text could be reordered to save some
I/O? More investigation is probably merited here.
- Over a quarter of all I/O is from
GConf
-- and many of these are from
wandering around an expansive directory hierarchy looking for configuration
information. It is well-known that the XML backend is a big performance
problem, and a better backend is apparently being worked on. At any rate,
solving this problem is clearly in
GConf's future.
- Of the remaining I/O, a bunch
is to icon files.
Glynn Foster
pointed out that this looks to be addressed by the
GTK+ gtk-update-icon-cache, new in
GTK+ 2.6 and contained within
GNOME 2.10, so this experiment will obviously need to
be repeated on
GNOME 2.10.
- CPU utilization doesn't look like a big issue -- or it's one that is at
least dwarfed by the cost of performing I/O. That said, gconfd-2 looks to
be a bit piggy in this department as well. We're only using CPU sampling --
and we're using a pretty coarse grained sample -- and gconfd-2 still
showed up. More precise investigation into CPU utilization can be
performed with the sched provider and its on-cpu and
off-cpu probes.
# mdb -k Loading modules: [ unix krtld genunix specfs dtrace ufs ip sctp usba uhci s1394 random nca lofs nfs audiosup sppp ptm ipc ] > ffffffff8f6851c0::vnode2path /usr/sfw/lib/libXrender.so.1 >Yes, having to do this sucks a bit, and it's a known issue. And rumor has it that Eric even has a workspace with the fix, so stay tuned...
Update: Eric pointed me to a prototype with the fix, and I reran the script on a GNOME cold start; here is the output from that run. Interestingly, because the symlinks now show up, a little postprocessing reveals that we chased nearly eighty symlinks on startup! From the output, reading many of these took 10 to 20 milliseconds; we might be spending as much as one second of GNOME startup blocked on I/O just to chase symlinks! Ouch! Again, it's not clear how one would fix this; having an app link to libpango-1.0.so.0 and not libpango-1.0.so.400.1 is clearly a Good Thing, and having this be a symlink instead of a hardlink is clearly a Good Thing -- but all of that goodness leaves you with a read dependency that's hard to work around. Anyway, be looking for Eric's fix in OpenSolaris and then in an upcoming Solaris Express release; it makes this kind of analysis much easier -- and thanks Eric for the quick prototype!
Technorati tags: OpenSolaris Solaris DTrace GNOME
Posted by 24.152.137.250 on July 11, 2005 at 10:26 PM PDT #
Posted by Nico on July 27, 2005 at 12:11 PM PDT #
Posted by 注册香港公司 on July 16, 2007 at 11:16 PM PDT #
Posted by 注册香港公司 on July 16, 2007 at 11:18 PM PDT #
Posted by 注册香港公司 on July 16, 2007 at 11:18 PM PDT #
Posted by 注册香港公司 on July 16, 2007 at 11:19 PM PDT #
Posted by 注册香港公司 on July 16, 2007 at 11:21 PM PDT #
Posted by 注册香港公司 on July 16, 2007 at 11:21 PM PDT #