Views on software from Bryan Cantrill's deck chair The Observation Deck

Sunday Aug 21, 2005

It's been an exciting few weeks for DTrace. The party got started with Wez Furlong's new PHP DTrace provider at OSCON. Then Devon O'Dell announced that he was starting to work in earnest on a DTrace port to FreeBSD. And now, Rich Lowe has made available a prototype Ruby DTrace provider. To install this, grab Ruby 1.8.2, apply Rich's patch, and run ./configure with the --enable-dtrace option. When you run the resulting ruby, you'll see two probes: function-entry and function-return. The arguments to these probes are as follows:
  • arg0 is the name of the class (a pointer to a string within Ruby)

  • arg1 is the name of the method (also a pointer to a string within Ruby)

  • arg2 is the name of the file containing the call site (again, a pointer to a string within Ruby)

  • arg3 is the line number of the call site.

So if, for example, you'd like to know the classes and methods that are called in a particular Ruby script, you could do it with this simple D script:

#pragma D option quiet

ruby$target:::function-entry
{
        @[copyinstr(arg0), copyinstr(arg1)] = count();
}

END
{
        printf("%15s %30s   %s\n", "CLASS", "METHOD", "COUNT");
        printa("%15s %30s   %@d\n", @);
}

To run this against the cal.rb that ships in the sample directory of Ruby, call the above script whatmethods.d and run it this way:

# dtrace -s ./whatmethods.d -c "../ruby ./cal.rb"
    August 2005
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

          CLASS                         METHOD   COUNT
          Array                             <<   1
          Array                        compact   1
          Array                        reverse   1
          Array                          shift   1
          Array                           size   1
          Array                        unshift   1
          Array                      values_at   1
            Cal                          group   1
            Cal                     initialize   1
            Cal                        monthly   1
            Cal                          opt_c   1
            Cal                          opt_j   1
            Cal                          opt_m   1
            Cal                          opt_t   1
            Cal                          opt_y   1
            Cal                           pict   1
            Cal                     set_params   1
            Cal                        unlines   1
          Class                            now   1
          Class                   valid_civil?   1
          Class                    valid_date?   1
           Date                              -   1
             IO                          write   1
         Module                append_features   1
         Module                        include   1
         Module                       included   1
         Module                   undef_method   1
         Object                         detect   1
         Object                           nil?   1
         Object                          print   1
         Object     singleton_method_undefined   1
         String                             ==   1
         String                         center   1
         String                         empty?   1
         String                           scan   1
         String                          split   1
           Time                     initialize   1
           Time                           to_a   1
          Array                             ==   2
          Class                      jd_to_ajd   2
         Fixnum                             >=   2
           Hash                           each   2
           Hash                           keys   2
         Module                           attr   2
         Module               method_undefined   2
         Module                         public   2
       Rational                         coerce   2
          Array                              +   3
          Class                    civil_to_jd   3
           Hash                             []   3
         Object                        collect   3
          Array                        collect   4
          Class                      inherited   4
          Range                           each   4
         String                           size   4
         Module           private_class_method   5
         Object                           eval   5
         Object                        require   5
         String                           gsub   5
          Class                     jd_to_wday   7
          Class                           once   7
           Date                       __8713__   7
           Date                           wday   7
         Fixnum                              %   8
          Array                           join   10
           Hash                            []=   10
         String                              +   10
          Array                           each   11
       NilClass                           to_s   11
         Module                   alias_method   22
         Module                        private   22
         Symbol                           to_s   26
         Module                    module_eval   28
           Date                           mday   31
         Object                           send   31
           Date                            mon   42
           Date                      __11105__   43
          Class                    jd_to_civil   45
           Date                           succ   47
          Class                            os?   48
           Date                              +   49
         Fixnum                             <=   49
         String                          rjust   49
          Class                      ajd_to_jd   50
          Class                        clfloor   50
           Date                      __10417__   50
        Integer                           to_i   50
         Object                        Integer   50
       Rational                         divmod   50
       Rational                           to_i   50
           Date                            <=>   51
           Date                            ajd   51
           Date                             jd   51
       Rational                            <=>   51
          Class                           new0   52
           Date                     initialize   52
        Integer                           to_r   52
         Object         singleton_method_added   67
           Date                          civil   75
         Symbol                           to_i   91
          Float                              *   96
          Float                         coerce   96
         Fixnum                              /   97
         Object                        frozen?   100
       Rational                              -   104
         Fixnum                           to_s   123
          Array                             []   141
         Object                          class   150
         Module                   method_added   154
          Float                              /   186
         Module                            ===   200
       Rational                              /   204
       Rational                              +   248
          Float                          floor   282
         Fixnum                             <<   306
          Class                         reduce   356
        Integer                            gcd   356
         Object                       Rational   356
         Fixnum                              +   414
          Class                           new!   610
       Rational                     initialize   610
          Class                            new   612
         Fixnum                            abs   712
         Fixnum                            div   762
         Fixnum                              *   1046
         Fixnum                              <   1069
         Fixnum                              >   1970
         Fixnum                              -   2398
         Object                       kind_of?   2439
         Fixnum                             >>   4698
         Fixnum                             []   7689
         Fixnum                             ==   11436

This may leave us with many questions. For example, there are a couple of calls to construct new objects -- where are they coming from? To answer that question:

#pragma D option quiet

ruby$target:::function-entry
/copyinstr(arg1) == "initialize"/
{
        @[copyinstr(arg0), copyinstr(arg2), arg3] = count();
}

END
{
        printf("%-10s %-40s %-10s %s\n", "CLASS",
            "INITIALIZED IN FILE", "AT LINE", "COUNT");
        printa("%-10s %-40s %-10d %@d\n", @);
}
Calling the above whereinit.d, we can run it in a similar manner:
# dtrace -s ./whereinit.d -c "../ruby ./cal.rb"
    August 2005
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

CLASS      INITIALIZED IN FILE                      AT LINE    COUNT
Cal        ./cal.rb                                 144        1
Date       /usr/local/lib/ruby/1.8/date.rb          593        1
Date       /usr/local/lib/ruby/1.8/date.rb          703        1
Date       /usr/local/lib/ruby/1.8/date.rb          916        1
Time       /usr/local/lib/ruby/1.8/date.rb          702        1
Date       /usr/local/lib/ruby/1.8/date.rb          901        49
Rational   /usr/local/lib/ruby/1.8/rational.rb      374        610

Looking at the Date class, it's interesting to look at line 901 of file /usr/local/lib/ruby/1.8/date.rb:

   897   # If +n+ is not a Numeric, a TypeError will be thrown.  In
   898   # particular, two Dates cannot be added to each other.
   899   def + (n)
   900     case n
   901     when Numeric; return self.class.new0(@ajd + n, @of, @sg)
   902     end
   903     raise TypeError, 'expected numeric'
   904   end

This makes sense: we're initializing new Date instances in the + method for Date. And where are those coming from? It's not hard to build a script that will tell us the file and line for the call site of an arbitrary class and method:

#pragma D option quiet

ruby$target:::function-entry
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
        @[copyinstr(arg2), arg3] = count();
}

END
{
        printf("%-40s %-10s %s\n", "FILE", "LINE", "COUNT");
        printa("%-40s %-10d %@d\n", @);
}

For this particular example (Date#+()), call the above wherecall.d and run it this way:

# dtrace -s ./wherecall.d "Date" "+" -c "../ruby ./cal.rb"
    August 2005
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

FILE                                     LINE       COUNT
./cal.rb                                 102        2
./cal.rb                                 60         6
./cal.rb                                 63         41

And looking at the indicated lines in cal.rb:

    55   def pict(y, m)
    56     d = (1..31).detect{|d| Date.valid_date?(y, m, d, @start)}
    57     fi = Date.new(y, m, d, @start)
    58     fi -= (fi.jd - @k + 1) % 7
    59
    60     ve  = (fi..fi +  6).collect{|cu|
    61       %w(S M Tu W Th F S)[cu.wday]
    62     }
    63     ve += (fi..fi + 41).collect{|cu|
    64       if cu.mon == m then cu.send(@da) end.to_s
    65     }
    66

So this is doing exactly what we would expect, given the code. Now, if we were interested in making this perform a little better, we might be interested to know the work that is being induced by Date#+(). Here's a script that reports the classes and methods called by a given class/method -- and its callees:

#pragma D option quiet

ruby$target:::function-entry
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
        self->date = 1;
}

ruby$target:::function-entry
/self->date/
{
        @[strjoin(strjoin(copyinstr(arg0), "#"),
            copyinstr(arg1))] = count();
}

ruby$target:::function-return
/copyinstr(arg0) == $$1 && copyinstr(arg1) == $$2/
{
        self->date = 0;
        ndates++;
}

END
{
        normalize(@, ndates);
        printf("Each call to %s#%s() induced:\n\n", $$1, $$2);
        printa("%@8d call(s) to %s()\n", @);
}

Calling the above whatcalls.d, we can answer the question about Date#+() this way:

# dtrace -s ./whatcalls.d "Date" "+" -c "../ruby ./cal.rb"
    August 2005
 S  M Tu  W Th  F  S
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

Each call to Date#+() induced:

       1 call(s) to Class#new0()
       1 call(s) to Class#reduce()
       1 call(s) to Date#+()
       1 call(s) to Date#initialize()
       1 call(s) to Fixnum#+()
       1 call(s) to Fixnum#<<()
       1 call(s) to Integer#gcd()
       1 call(s) to Module#===()
       1 call(s) to Object#Rational()
       1 call(s) to Object#class()
       2 call(s) to Class#new()
       2 call(s) to Class#new!()
       2 call(s) to Fixnum#abs()
       2 call(s) to Fixnum#div()
       2 call(s) to Rational#+()
       2 call(s) to Rational#initialize()
       3 call(s) to Fixnum#*()
       3 call(s) to Fixnum#<()
       8 call(s) to Object#kind_of?()
      10 call(s) to Fixnum#-()
      10 call(s) to Fixnum#>()
      23 call(s) to Fixnum#>>()
      37 call(s) to Fixnum#[]()
      52 call(s) to Fixnum#==()

That's a lot of work for something that should be pretty simple! Indeed, it's counterintuitive that, say, Integer#gcd() would be called from Date#+() -- and it certainly seems suboptimal. I'll leave further exploration into this as an exercise to the reader, but suffice it to say that this has to do with the use of a rational number in the Date class -- the elimination of which would elminate most of the above calls and presumably greatly improve the performance of Date#+().

Now, Ruby aficionados may note that some of the above functionality has been available in Ruby by setting the set_trace_func function (upon which the Ruby profiler is implemented). While that's true (to a point -- the set_trace_func seems to be a pretty limited mechanism), the Ruby DTrace provider is nonetheless a great lurch forward for Ruby developers: it allows developers to use DTrace-specific constructs like aggregations and thread-local variables to hone in on a problem; it allows Ruby-related work performed lower in the stack (e.g., in the I/O subsystem, CPU dispatcher or network stack) to be connected to the Ruby code inducing it; it allows a running Ruby script to be instrumented (and reinstrumented) without stopping or restarting it; and it allows multiple, disjoint Ruby scripts to be coherently observed and understood as a single entity. More succinctly, it's just damned cool. So thanks to Rich for developing the prototype provider -- and if you're a Ruby developer, enjoy!