Darryl Gove's blog
Performance visualization discussion group
Saw this on Neil Gunther's blog: a discussion group for performance visualization. I've no idea whether it will be an interesting group or not, but there are certainly many possibilities for making performance data more readily understandable. So far there look to be about 12 folks signed up to the group.
One visualization tool that I think shows promise is chime, which is built upon dtrace.
Posted at 12:08PM Mar 27, 2008 by Darryl Gove in Sun |
Adding dtrace probes to user code (part 3)
I've previously discussed how to add dtrace USDT probes into user code. The critical step is to run the object files through dtrace, for dtrace to record the instrumentation points and to modify the object files prior to linking. The output of this step is an object file that also needs to be linked into the executable. Here's an example:
$ cc -O -c app.c
$ cc -O -c app1.c
$ dtrace -G -32 -s probes.d app.o app1.o
$ cc -O probes.o app.o app1.o
The results from running the example code under a suitable dtrace script are:
$ sudo dtrace -s script.d -c a.out
dtrace: script 'script.d' matched 10 probes
a=1, b=2
a=1, b=2
a=1, b=2
a=2, b=3
dtrace: pid 20655 has exited
2 3 1
1 2 3
One question that has come up is whether it's necessary to run a single call to dtrace which instruments all the object files, or whether it's possible to use multiple calls.
The object file that dtrace produces, probes.o, is overwritten by each call to dtrace, so it's no surprise that the naive approach of making multiple calls to dtrace, each generating the same object file, does not work:
$ dtrace -G -32 -s probes.d app.o
$ dtrace -G -32 -s probes.d app1.o
$ cc -O app.o app1.o probes.o
$ sudo dtrace -s script.d -c a.out
dtrace: script 'script.d' matched 9 probes
a=1, b=2
a=1, b=2
a=1, b=2
a=2, b=3
dtrace: pid 20725 has exited
2 3 1
1 2 2
The next thing to try is whether changing the generated object file works:
$ dtrace -G -32 -s probes.d -o probe0.o app.o
$ dtrace -G -32 -s probes.d -o probe1.o app1.o
$ cc -O probes.o app.o app1.o probe1.o
$ sudo dtrace -s script.d -c a.out
dtrace: script 'script.d' matched 9 probes
a=1, b=2
a=1, b=2
a=1, b=2
a=2, b=3
dtrace: pid 20673 has exited
2 3 1
1 2 2
And if we wanted more proof, swapping the order of the object files generates the following:
$ cc -O app.o app1.o probe1.o probe0.o
$ sudo dtrace -s script.d -c a.out
dtrace: script 'script.d' matched 1 probe
a=1, b=2
a=1, b=2
a=1, b=2
a=2, b=3
dtrace: pid 20683 has exited
1 2 1
So the conclusion is that the only way it will work is by putting all the object files onto the commandline of a single call to dtrace.
Posted at 04:49PM Mar 25, 2008 by Darryl Gove in Sun |
Call stack visualisation using dtrace
Nice demonstration of using dtrace to extract callstack information.
Posted at 11:12AM Dec 17, 2007 by Darryl Gove in Sun |
Adding dtrace probes to user code (part 2)
Adam Leventhal pointed out in the comments to my post on adding dtrace userland probes that there is an improved approach to adding userland dtrace probes. He describes this approach on his blog.
The approach solves two problems. First, that C++ name mangling makes it hard to add dtrace probes for that language. Second, that code with dtrace probes inserted in it will not compile on systems that do not have the necessary dtrace support.
So going back to the example code, I'll try to show the problem and the solution. Here's app.cc:
#include <stdio.h>
#include <sys/sdt.h>
void func(int a, int b)
{
DTRACE_PROBE2(myapp,func_call,a,b);
printf("a=%i, b=%i\n",a,b);
}
void main()
{
func(1,2);
func(2,3);
}
When compiled with the C compiler the following symbols get defined:
$ cc -c app.cc
$ nm app.o
app.o:
[Index] Value Size Type Bind Other Shndx Name
...
[10] | 0| 0|FUNC |GLOB |0 |UNDEF |__dtrace_myapp___func_call
...
When compiled with the C++ compiler the following happens:
$ CC -c app.cc
$ nm app.o
app.o:
[Index] Value Size Type Bind Other Shndx Name
...
[7] | 0| 0|FUNC |GLOB |0 |UNDEF |__1cbA__dtrace_myapp___func_call6FLL_v_
...
Because the call to the dtrace probe is not declared as being extern "C", the compiler mangles the C++ function name.
The new approach that Adam describes involves having dtrace preprocess the probe description file to generate a header file, and then including that header file in the source code. The big advantage of having the header file is that it's now possible to declare the dtrace probes to have extern "C" linkage, and so avoid the name mangling issue. The syntax for preprocessing the probe description file is:
$ dtrace -h -s probes.d
This generates the following header file:
/*
* Generated by dtrace(1M).
*/
#ifndef _PROBES_H
#define _PROBES_H
#include <unistd.h>
#ifdef __cplusplus
extern "C" {
#endif
#if _DTRACE_VERSION
#define MYAPP_FUNC_CALL(arg0, arg1) __dtrace_myapp___func_call(arg0, arg1)
#define MYAPP_FUNC_CALL_ENABLED() __dtraceenabled_myapp___func_call()
extern void __dtrace_myapp___func_call(int, int);
extern int __dtraceenabled_myapp___func_call(void);
#else
#define MYAPP_FUNC_CALL(arg0, arg1)
#define MYAPP_FUNC_CALL_ENABLED() (0)
#endif
#ifdef __cplusplus
}
#endif
#endif /* _PROBES_H */
The other advantage is that the header file can protect the definitions of the dtrace probes with #if _DTRACE_VERSION, which enables the same source to be compiled on systems that do not support dtrace.
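The generated header also defines MYAPP_FUNC_CALL_ENABLED(), which can be used to skip the work of preparing probe arguments when nothing is tracing the probe. The following is a minimal sketch of that idea, not part of the original example. It copies the fallback definitions from the #else branch of the header inline so that it builds without dtrace; the helper expensive_arg() and the counter args_computed are hypothetical names added only for illustration.

```c
#include <stdio.h>

/* Fallback definitions, copied from the #else branch of the generated
 * probes.h so that this sketch builds on a system without dtrace.
 * Real code would get these from #include "probes.h" instead. */
#ifndef MYAPP_FUNC_CALL
#define MYAPP_FUNC_CALL(arg0, arg1)
#define MYAPP_FUNC_CALL_ENABLED() (0)
#endif

/* Hypothetical counter: how many times the probe argument was computed. */
int args_computed = 0;

/* Hypothetical helper standing in for an expensive argument computation. */
int expensive_arg(int a)
{
    args_computed++;
    return a * a;
}

void func(int a, int b)
{
    /* Only pay for the argument computation when the probe is enabled. */
    if (MYAPP_FUNC_CALL_ENABLED()) {
        MYAPP_FUNC_CALL(expensive_arg(a), b);
    }
    printf("a=%i, b=%i\n", a, b);
}
```

With the fallback definitions the test is a constant zero, so expensive_arg() is never called; with the real header the test becomes true only while the probe is enabled.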
The source code needs to be modified to support this syntax:
#include <stdio.h>
#include "probes.h"
void func(int a, int b)
{
MYAPP_FUNC_CALL(a,b);
printf("a=%i, b=%i\n",a,b);
}
void main()
{
func(1,2);
func(2,3);
}
The rest of the process is the same as before.
Posted at 01:56PM Nov 27, 2007 by Darryl Gove in Sun |
Adding dtrace probes to user code
The process of adding dtrace probes to userland code is described in the dynamic tracing guide. However, there's no better way of learning how to do it than trying it out on a snippet of code.
Here's a short bit of code that calls a function twice, each time with different parameters. The plan is to insert a probe that can report the passed parameters.
#include <stdio.h>
void func(int a, int b)
{
printf("a=%i, b=%i\n",a,b);
}
void main()
{
func(1,2);
func(2,3);
}
The first change is to add the <sys/sdt.h> header file. This file has definitions for the DTRACE_PROBE<N> macros, where N is the number of parameters that are to be reported by the probe. In this case we are going to pass two parameters (a and b) to the probe. As well as the parameters that are to be passed to the dtrace probe, the macro takes the name to be used for the application provider (in this case the name will be myapp) and the name of the probe (in this case func_call). The modified source code looks as follows:
#include <stdio.h>
#include <sys/sdt.h>
void func(int a, int b)
{
DTRACE_PROBE2(myapp,func_call,a,b);
printf("a=%i, b=%i\n",a,b);
}
void main()
{
func(1,2);
func(2,3);
}
The next step is to write a probe description file which dtrace will use to produce the probes. A full file would describe the stability of the probe in more detail, but a lightweight file just describes the probes defined by the provider application:
provider myapp
{
probe func_call(int, int);
};
Having completed this, it's necessary to compile and link the application. Initially each source file needs to be compiled, and then before the application is linked, dtrace needs to be invoked to modify the object files, removing the calls to the probes, but leaving space for them to be reinserted. dtrace also needs to compile the probe description file into an object file. Finally the modified object files and the probe description object file can be linked to produce the executable. The steps are as follows:
$ cc -c app.c
$ dtrace -G -32 -s probes.d app.o
$ cc probes.o app.o
The resulting code in the application looks like:
func()
113a0: 9d e3 bf a0 save %sp, -96, %sp
113a4: f0 27 a0 44 st %i0, [%fp + 68]
113a8: f2 27 a0 48 st %i1, [%fp + 72]
113ac: d0 07 a0 44 ld [%fp + 68], %o0
113b0: 01 00 00 00 nop
113b4: d2 07 a0 48 ld [%fp + 72], %o1
113b8: 11 00 00 45 sethi %hi(0x11400), %o0
113bc: 90 12 22 60 bset 608, %o0 ! 0x11660
113c0: d2 07 a0 44 ld [%fp + 68], %o1
113c4: 40 00 42 c7 call printf ! 0x21ee0
113c8: d4 07 a0 48 ld [%fp + 72], %o2
113cc: 81 c7 e0 08 ret
113d0: 81 e8 00 00 restore
The nop at 0x113b0 is there for dtrace to dynamically patch with a call instruction that will enable the dtrace probe.
Finally, the following is an example of using the new probe:
$ more script.d
myapp$target:::func_call
{
@[arg0,arg1]=count();
}
$ dtrace -s script.d -c a.out
dtrace: script 'script.d' matched 1 probe
a=1, b=2
a=2, b=3
dtrace: pid 22355 has exited
1 2 1
2 3 1
The script just aggregates the parameters used in the function call. When the application terminates the aggregation is printed out - showing the expected result of two calls to the routine each call with different parameters.
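To make the aggregation concrete, the following C fragment is a rough model of what @[arg0,arg1]=count(); computes: a table keyed by the argument pair, holding one count per distinct key. It only illustrates the semantics and says nothing about how dtrace implements aggregations; the names agg_count and agg_print and the fixed table size are hypothetical.

```c
#include <stdio.h>

#define MAX_KEYS 16   /* arbitrary limit, enough for this illustration */

struct agg_entry {
    long arg0;
    long arg1;
    long count;
};

static struct agg_entry agg[MAX_KEYS];
static int agg_used = 0;

/* Model of @[arg0,arg1]=count(): bump the counter for this key,
 * creating the entry on first use. Returns the new count for the key,
 * or -1 if the table is full. */
long agg_count(long arg0, long arg1)
{
    for (int i = 0; i < agg_used; i++) {
        if (agg[i].arg0 == arg0 && agg[i].arg1 == arg1) {
            return ++agg[i].count;
        }
    }
    if (agg_used == MAX_KEYS) {
        return -1;
    }
    agg[agg_used].arg0 = arg0;
    agg[agg_used].arg1 = arg1;
    agg[agg_used].count = 1;
    agg_used++;
    return 1;
}

/* Print each key followed by its count, one entry per line. */
void agg_print(void)
{
    for (int i = 0; i < agg_used; i++) {
        printf("%ld %ld %ld\n", agg[i].arg0, agg[i].arg1, agg[i].count);
    }
}
```

Calling agg_count(1, 2) and agg_count(2, 3) once each, as the two calls to func() do, leaves two entries with a count of 1 each, which corresponds to the two lines of aggregation output shown above.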
Posted at 01:13PM Nov 20, 2007 by Darryl Gove in Sun | Comments[4]
Identifying misaligned loads in 32-bit code using dtrace
A previous blog entry talks about handling and detecting misaligned memory accesses. For 64-bit code this is easy to achieve using the Performance Analyzer; for 32-bit code the analysis is a bit more tricky. Fortunately it is possible to do the 32-bit analysis with dtrace.
Consider the following program, which has a misaligned memory access. The default mode of the compiler (since Sun Studio 9) is to compile the binary so that it traps on the misaligned access, fixes it up, and continues:
% more align.c
void main()
{
volatile char a[10];
int i;
for (i=0; i<100000000; i++) {(*(int*)(&a[1]))++;}
}
The following dtrace script instruments the misaligned data access trap handler and reports all the pids that trigger it:
% more tr.d
fbt::do_unaligned:entry
{
@p[pid]=count();
}
It can be run with:
% sudo dtrace -s tr.d
dtrace: script 'tr.d' matched 1 probe
^C
14873 260932
The script returns the pid of the process that is having misalignment issues. This information is useful, in that it is trivial to recompile the binary with a different setting for -xmemalign and avoid the behaviour. But it would be very useful to know where the traps are occurring in the binary: perhaps most of the traps happen in only one place, and that place can be fixed in the source. The following version of the script aggregates on the user stack instead:
% more tr.d
fbt::do_unaligned:entry
{
@[ustack()]=count();
}
This script produces output that identifies the locations in the binary where the traps are being generated. For the simple test code there are two locations - the load and the store.
% sudo dtrace -s tr.d
dtrace: script 'tr.d' matched 1 probe
^C
align`main+0x10
align`_start+0x108
130466
align`main+0x18
align`_start+0x108
130466
The disassembly for the loop is as follows:
main()
10b80: 9d e3 bf 90 save %sp, -112, %sp
...
10b90: d0 07 bf f7 ld [%fp - 9], %o0 <<<<<< misaligned
10b94: 90 02 20 01 inc %o0
10b98: d0 27 bf f7 st %o0, [%fp - 9] <<<<<< misaligned
10b9c: ba 07 60 01 inc %i5
10ba0: 80 a7 40 09 cmp %i5, %o1
10ba4: 06 bf ff fb bl main+0x10 ! 0x10b90
10ba8: 01 00 00 00 nop
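Once the stack traces have identified a misaligned access, one possible fix in the source is to replace the cast through int* with memcpy, which makes no alignment assumption about its arguments. The fragment below is a sketch of that idea rather than the only remedy (recompiling with a different -xmemalign setting is the other option mentioned above); the helper names is_misaligned and increment_unaligned are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the address p is not a multiple of align.
 * align must be a power of two, for example sizeof(int). */
int is_misaligned(const volatile void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) != 0;
}

/* Increment the int stored at p without assuming that p is int-aligned.
 * memcpy has no alignment requirement, so the source no longer contains
 * a load or store through a misaligned int pointer. */
void increment_unaligned(char *p)
{
    int v;
    memcpy(&v, p, sizeof v);   /* alignment-safe load */
    v++;
    memcpy(p, &v, sizeof v);   /* alignment-safe store */
}
```

In align.c this would mean replacing (*(int*)(&a[1]))++ with a call such as increment_unaligned(&a[1]), with the volatile qualifier on the array dropped or cast away as appropriate.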
Posted at 01:00PM Jun 12, 2007 by Darryl Gove in Sun |
Calling dtrace from Java
I've recently been playing with calling dtrace scripts from Java. It's a nice API. The initial start page for learning about this is here. However, it may be more effective to look at the documentation that is included in the source. The source for some examples is also available.
Posted at 08:00AM May 25, 2007 by Darryl Gove in Sun |
