Wednesday November 21, 2007 Perl DTrace and complex data structures
Recently I had to solve the following problem. Suppose that you have a bunch of strings, consisting of various tokens (think of sentences and words). For example. For example, the first string may be 'a b c' and the second can be 'a b d'. Some strings can have common prefixes - sets of tokens that start the string. I'd like to be able to see what tokens are common and what are different, so for two strings above I'd like to print something like
a b c d
This tree shows all unique strings that I have in my input in a compressed way with all common prefixes joined together. If I add another string 'a e f' to it, the tree would become
a b c d e f
I started by writing a little prototype in Lisp, using Emacs lisp interpreter. First of all I needed a little helper function which tranlates a single string into a deep tree:
;;
;; Convert list l to a one-brunch tree
;; For example, (list2tree nil '(a b c)) becomes
;; (a (b (c nil)))
(defun list2tree (l)
(when l
(list (car l)
(list2tree (cdr l)))))
The main code is a bit tricky because it uses double recursion - one for the already constructed tree and another for the tokens in the string.
;;
;; Given the tree and a list of args construct a new tree finding the common
;; prefix in the args. For example:
;;
;; (linsert (linsert nil '(a b c)) '(a b d)) gives
;;
;; ((a
;; (b
;; (c nil)
;; (d nil))))
;;
(defun linsert (tree args)
(if tree
;; Add new elements ot an existing tree
(let* ((front (car tree)) ; first brunch
(key (car front)) ; key of the brunch
(el (car args)))
(if (eq key el) ; if key matches the first element
;;
;; Replace the matching brunch of a tree with a tree including rest
;; of args
;;
(cons (cons key
(linsert (cdr front)
(cdr args)))
(cdr tree))
;; No match - try next brunch
(cons front
(linsert (cdr tree) args))))
;; Tree is empty, construct a tree using list2tree
(list (list2tree args))))
A simple test shows that we got what we wanted:
(setq x (linsert nil '(a b c))) ((a (b (c nil)))) (setq y (linsert x '(d e f))) ((a (b (c nil))) (d (e (f nil)))) (setq z (linsert y '(d e h))) ((a (b (c nil))) (d (e (f nil) (h nil)))) (pp z) ((a (b (c nil))) (d (e (f nil) (h nil))))
So far so good. Now I tried to translate the above code into Perl using array references and got thoroughly confused. I just could not get the same thing working in perl. List references are rather ugly once you try to do something non-trivial with them. So I decided to follow another root and searched on CPAN for available implementation of trees and found the Tree::Simple module which seemed to provide the functionality needed. Here is the perl version:
#!/usr/bin/perl
use Tree::Simple;
# Insert a list into a tree
sub tinsert
{
my $tree = shift;
return $tree unless scalar @_;
my $el = shift;
my @rest = @_;
my @match = grep { $_->getNodeValue() eq $el} $tree->getAllChildren();
if (scalar @match) {
my $t = $match[0];
tinsert($t, @rest);
} else {
tinsert($tree->addChild(Tree::Simple->new($el)), $el, @rest);
}
}
As a side benefit I got everything I needed to pretty-print the result:
# Print tree node
sub print_node
{
my $tree = shift;
print ' ' x $tree->getDepth(), $tree->getNodeValue(), "\n";
}
# Print the whole tree
sub tprint
{
my $tree = shift;
$tree->traverse(\&print_node);
}
my $tree = Tree::Simple->new("root");
tinsert($tree, 'a', 'b', 'c');
tinsert($tree, 'd', 'e', 'f');
tinsert($tree, 'd', 'e', 'h');
tprint($tree);
This produces
a b c d e f h
So, although direct list manipulation turned out to be pretty ugly, using Perl objects and a great library from Stevan Little, the resulting code is pretty simple.
This wasn't just an exercise in recursive functions. I used this to post-process a huge file with DTrace data describing Solaris build collected by a little D script:
#!/usr/sbin/dtrace -Cs /* * Provide information about dmake targets and directories. */ #include/* * Use process p_mstart time instead of pid since pids roll over */ proc:::exec-success /curpsinfo->pr_projid == $1 && execname == "dmake"/ { this->proc = curthread->t_procp; this->parent = curthread->t_procp->p_parent->p_parent; this->pcwd = this->parent->p_user.u_cdir->v_path == NULL ? " " : stringof(this->parent->p_user.u_cdir->v_path); printf("I %d\t%d\t%d\t%d\t%d\t%s\t%s\t%s [%s]\n", walltimestamp, pid, this->proc->p_mstart, this->parent->p_pid, this->parent->p_mstart, cwd, this->pcwd, curpsinfo->pr_psargs, this->parent->p_user.u_psargs); @dirs[cwd] = count(); } syscall::rexit:entry /curpsinfo->pr_projid == $1 && execname == "dmake"/ { printf("O %d\t%d\t%d\n", walltimestamp, pid, curthread->t_procp->p_mstart); } END { printf("---directories--- \n"); printa("%@d\t%s\n", @dirs); }
Combining the DTrace and Perl magic together into a single mix I got some interesting build timelines for sparc and x86. For example:
--------------------------------------------------------------------------------
Directory Time Spent Target
--------------------------------------------------------------------------------
...
usr/src ksh usr/src/tools/scripts/nightly.sh /export/onnv-76//etc/env
usr/src 51s 1h37m30s -e install
usr/src/uts 2m15s 1h23m43s install
usr/src/uts/common/sys 2m16s all_h
usr/src/uts/common/rpc 2m16s all_h
usr/src/uts/common/rpcsvc 2m16s all_h
usr/src/uts/common/gssapi 2m16s all_h
usr/src/uts/common/idmap 2m16s all_h
usr/src/uts/sun4v 2m18s 16m16s install
usr/src/uts/sun4v/genassym 2m18s 7s install
usr/src/uts/sun4v/genassym 2m18s 6s def.targ
usr/src/uts/sun4v/unix 2m24s 5m6s install
usr/src/uts/sun4v/unix 2m25s 5m4s install.targ
usr/src/uts/sun4v/genassym 2m26s 1s all.targ
usr/src/uts/sun4v/genunix 3m20s 3m51s all.targ
usr/src/uts/sparc/ip 3m21s 1m25s ipctf.debug64
usr/src/uts/sparc/ip 3m22s 1m24s debug64/ipctf.a
usr/src/uts/sun4v/platmod 7m11s 3s all.targ
usr/src/uts/sun4v/genunix 7m30s 1m35s install
usr/src/uts/sun4v/genunix 7m31s 1m33s install.targ
usr/src/uts/sparc/ip 7m32s 11s ipctf.debug64
usr/src/uts/sparc/ip 7m33s 10s debug64/ipctf.a
usr/src/uts/sun4v/generic 10m17s 1m55s install
usr/src/uts/sun4v/generic 10m18s 1m55s install.targ
usr/src/uts/sun4v/generic 10m18s 1m54s def.targ
usr/src/uts/sun4v/unix 10m22s 1m50s symcheck
usr/src/uts/sun4v/genassym 10m23s 1s all.targ
usr/src/uts/sun4v/genunix 10m39s 1m31s all.targ
usr/src/uts/sparc/ip 10m40s 10s ipctf.debug64
usr/src/uts/sparc/ip 10m41s 10s debug64/ipctf.a
usr/src/uts/sun4v/niagara 12m13s 1m56s install
usr/src/uts/sun4v/niagara 12m13s 1m56s install.targ
usr/src/uts/sun4v/niagara 12m13s 1m55s def.targ
usr/src/uts/sun4v/unix 12m19s 1m49s symcheck
usr/src/uts/sun4v/genassym 12m20s 1s all.targ
usr/src/uts/sun4v/genunix 12m36s 1m31s all.targ
usr/src/uts/sparc/ip 12m37s 10s ipctf.debug64
usr/src/uts/sparc/ip 12m37s 10s debug64/ipctf.a
usr/src/uts/sun4v/niagara2 14m9s 1m54s install
usr/src/uts/sun4v/niagara2 14m9s 1m54s install.targ
usr/src/uts/sun4v/niagara2 14m9s 1m53s def.targ
usr/src/uts/sun4v/unix 14m15s 1m48s symcheck
usr/src/uts/sun4v/genassym 14m15s 1s all.targ
usr/src/uts/sun4v/genunix 14m31s 1m30s all.targ
usr/src/uts/sparc/ip 14m32s 10s ipctf.debug64
usr/src/uts/sparc/ip 14m32s 10s debug64/ipctf.a
usr/src/uts/sun4v/vfalls 16m3s 1m54s install
usr/src/uts/sun4v/vfalls 16m3s 1m54s install.targ
usr/src/uts/sun4v/vfalls 16m4s 1m53s def.targ
usr/src/uts/sun4v/unix 16m9s 1m48s symcheck
usr/src/uts/sun4v/genassym 16m10s 1s all.targ
usr/src/uts/sun4v/genunix 16m26s 1m30s all.targ
usr/src/uts/sparc/ip 16m27s 10s ipctf.debug64
usr/src/uts/sparc/ip 16m27s 10s debug64/ipctf.a
usr/src/uts/sun4v/ontario 18m19s 10s install
usr/src/uts/sun4v/ontario/platmod 18m19s 4s install
usr/src/uts/sun4v/ontario/platmod 18m19s 3s install.targ
usr/src/uts/sun4v/ontario/tsalarm 18m23s 6s install
usr/src/uts/sun4v/ontario/tsalarm 18m23s 5s install.targ
usr/src/uts/sun4v/montoya 18m28s 4s install
usr/src/uts/sun4v/montoya/platmod 18m28s 4s install
usr/src/uts/sun4v/montoya/platmod 18m29s 3s install.targ
usr/src/uts/sun4v/huron 18m32s install
usr/src/uts/sun4v/maramba 18m32s install
usr/src/uts/sun4u 18m32s 53m18s install
usr/src/uts/sun4u/genassym 18m33s 7s install
usr/src/uts/sun4u/genassym 18m33s 7s def.targ
usr/src/uts/sun4u/unix 18m40s 3m46s install
usr/src/uts/sun4u/unix 18m40s 3m45s install.targ
usr/src/uts/sun4u/genassym 18m41s 1s all.targ
usr/src/uts/sun4u/genunix 19m35s 2m34s all.targ
usr/src/uts/sparc/ip 19m36s 10s ipctf.debug64
usr/src/uts/sparc/ip 19m36s 10s debug64/ipctf.a
usr/src/uts/sun4u/platmod 22m8s 2s all.targ
usr/src/uts/sun4u/genunix 22m25s 1m32s install
usr/src/uts/sun4u/genunix 22m26s 1m31s install.targ
usr/src/uts/sparc/ip 22m27s 10s ipctf.debug64
usr/src/uts/sparc/ip 22m28s 10s debug64/ipctf.a
usr/src/uts/sun4u/cheetah 25m54s 2m3s install
usr/src/uts/sun4u/cheetah 25m55s 2m3s install.targ
usr/src/uts/sun4u/cheetah 25m55s 2m2s def.targ
usr/src/uts/sun4u/unix 26m7s 1m50s symcheck
usr/src/uts/sun4u/genassym 26m8s 1s all.targ
usr/src/uts/sun4u/genunix 26m24s 1m31s all.targ
usr/src/uts/sparc/ip 26m25s 10s ipctf.debug64
usr/src/uts/sparc/ip 26m25s 10s debug64/ipctf.a
usr/src/uts/sun4u/cheetahplus 27m58s 2m2s install
usr/src/uts/sun4u/cheetahplus 27m58s 2m2s install.targ
usr/src/uts/sun4u/cheetahplus 27m58s 2m1s def.targ
usr/src/uts/sun4u/unix 28m11s 1m48s symcheck
usr/src/uts/sun4u/genassym 28m12s 1s all.targ
usr/src/uts/sun4u/genunix 28m27s 1m30s all.targ
usr/src/uts/sparc/ip 28m28s 10s ipctf.debug64
usr/src/uts/sparc/ip 28m29s 10s debug64/ipctf.a
usr/src/uts/sun4u/jalapeno 30m 2m1s install
usr/src/uts/sun4u/jalapeno 30m 2m1s install.targ
usr/src/uts/sun4u/jalapeno 30m 2m def.targ
usr/src/uts/sun4u/unix 30m12s 1m48s symcheck
usr/src/uts/sun4u/genassym 30m13s 1s all.targ
usr/src/uts/sun4u/genunix 30m29s 1m30s all.targ
usr/src/uts/sparc/ip 30m30s 10s ipctf.debug64
usr/src/uts/sparc/ip 30m30s 10s debug64/ipctf.a
usr/src/uts/sun4u/serrano 32m1s 2m1s install
usr/src/uts/sun4u/serrano 32m1s 2m1s install.targ
usr/src/uts/sun4u/serrano 32m2s 2m def.targ
usr/src/uts/sun4u/unix 32m14s 1m48s symcheck
usr/src/uts/sun4u/genassym 32m15s 1s all.targ
usr/src/uts/sun4u/genunix 32m30s 1m30s all.targ
usr/src/uts/sparc/ip 32m31s 10s ipctf.debug64
usr/src/uts/sparc/ip 32m32s 10s debug64/ipctf.a
usr/src/uts/sun4u/spitfire 34m2s 1m58s install
usr/src/uts/sun4u/spitfire 34m3s 1m58s install.targ
usr/src/uts/sun4u/spitfire 34m3s 1m57s def.targ
usr/src/uts/sun4u/unix 34m12s 1m48s symcheck
usr/src/uts/sun4u/genassym 34m13s 1s all.targ
usr/src/uts/sun4u/genunix 34m28s 1m30s all.targ
usr/src/uts/sparc/ip 34m29s 10s ipctf.debug64
usr/src/uts/sparc/ip 34m30s 10s debug64/ipctf.a
usr/src/uts/sun4u/hummingbird 36m 1m59s install
usr/src/uts/sun4u/hummingbird 36m1s 1m58s install.targ
usr/src/uts/sun4u/hummingbird 36m1s 1m57s def.targ
usr/src/uts/sun4u/unix 36m11s 1m48s symcheck
usr/src/uts/sun4u/genassym 36m11s 1s all.targ
usr/src/uts/sun4u/genunix 36m27s 1m30s all.targ
usr/src/uts/sparc/ip 36m28s 10s ipctf.debug64
usr/src/uts/sparc/ip 36m28s 9s debug64/ipctf.a
usr/src/uts/sparc 1h11m49s 14m7s install
usr/src/uts/sun4v 9m5s install
usr/src/uts/sun4v/bge 9m5s 1m6s install
usr/src/uts/sun4v/bge 9m6s 1m5s install.targ
...
The table above shows when a directory was entered during the build and how much time was actually spent building it.
( Nov 21 2007, 03:06:47 PM PST ) Permalink