The musings of Jon Haslam

All | General
Main | Next page »
20080512 Monday May 12, 2008

DTrace @ JavaOne

Well, I can truly say that I had a great time at this years JavaOne. There were several highlights for me but all of them weren't technical (Smashmouth were absolutely excellent on Thursday night). I guess what I enjoyed the most was seeing what's going off in technology areas that are so different to the one that I do for my day to day activities and I picked up some ideas for pet projects that may (or may not) see the light of day this year...

Down to DTrace and Java - it was exciting to see how much interest there is in the marriage of these two technologies. The demo booths in the pavilion seemed to be permanently rammed with people wanting to see what's going off in this area and the hands on labs were very well attended (hats off to Angelo et. al). Simon Ritter and I got the chance to take up the baton from Adam and Jarod this year to present two sessions on DTrace and Java (well, we did the same session twice anyway!). We had a fantastic time doing this and may thanks to everyone who came to see the show and also to those that asked so many interesting and well thought out questions afterwards.

DTrace tends top be a hands-on technology and demo's and talks are best when code is cranked live so that is what we (mostly) did. One of the issues though when demo'ing DTrace is having a workload to observe that isn't too contrived and simplistic but one that also isn't overly complicated. For our JavaOne talk I wanted some Java code that everyone could feel at home with and, standing on the shoulders of giants , I opted for a game from my youth, Space Invaders. I used Kevin Glass's excellent Space Invader game as it was designed as a tutorial and is therefore simple and perfect for a demo like this. As a bunch of people asked for the scripts for this session I thought I'd put down here what I did. What follows are three examples of what we showed on the day in our session. Apologies if what I write is over prescriptive at times for the DTrace long timers but hopefully a bit of detail may help any newcomers to DTrace.

The main aim of this session was to show how you can use the dvm and hotspot providers to obtain new views on your Java code. I then wanted to show how the great work that is being done by Keith McGuigan and team on Java Statically Defined Tracing can be used to enhance that view. All the Java focussed demo's of the session (there were some plain old DTrace ones as well) were done using the hotspot provider which comes with Java SE6 and the SE7 JVM for the JSDT demo's. I wanted to show the approach that I'd taken when first analysing the behaviour of this game so that the attendees could get a view of the kind of approaches you can take when looking at such an application (which may not be the most optimal way BTW!). Anyway, onto the code and analysis.

Example 1 - Looking at the methods

I had the code in one workspace in a netbeans project so that I could easily modify it with and rebuild it quickly. First off I just wanted to look at the methods that were being called to see if anything of interested turned up:

$ cat meth.d
#!/usr/sbin/dtrace -s

hotspot$target:::method-entry
{
	this->str = strjoin(".", copyinstr(arg3, arg4));
	@[strjoin(copyinstr(arg1, arg2), this->str)] = count();
}

Note that the strjoin()'s that I use are just there to make the output pretty. We grab the pid of the relevant jvm:

$ jps
806 Main
813 Game
814 Jps

...run the script and play the game for a few seconds:

$ dtrace -q -s ./meth.d -p 813
^C
<chop>
  java/util/concurrent/locks/AbstractQueuedSynchronizer.release             5153
  java/util/concurrent/locks/ReentrantLock$NonfairSync.lock             5153
  java/util/concurrent/locks/ReentrantLock$Sync.tryRelease             5153
  sun/java2d/SunGraphics2D.drawImage                             7466
  sun/java2d/pipe/DrawImage.copyImage                            7466
  java/awt/Rectangle.intersects                                108993
  org/newdawn/spaceinvaders/Entity.collidesWith                108994
  java/awt/image/BufferedImage.getHeight                       221725
  java/awt/image/BufferedImage.getWidth                        221725
  java/util/ArrayList.RangeCheck                               227932

So, from the look of the method names this is collision detection code which may or may not be interesting. Let's just take a look at where these methods are being executed by getting stack traces of the RangeCheck method:

$ cat range.d
#!/usr/sbin/dtrace -s

hotspot$target:::method-entry
/copyinstr(arg3, arg4) == "RangeCheck"/
{
	@[jstack(30, 16384)] = count();
}
$ dtrace -q -s ./range.d -p 813
^C
<chop>
              libjvm.so`__1cNSharedRuntimeTdtrace_method_entry6FpnKJavaThread_pnNmethodOopDesc__i_+0x7b
              java/util/ArrayList.RangeCheck(I)V*
              org/newdawn/spaceinvaders/Game.gameLoop()V*
              org/newdawn/spaceinvaders/Game.main([Ljava/lang/String;)V
              StubRoutines (1)
              libjvm.so`__1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_+0x1a3
              libjvm.so`__1cCosUos_exception_wrapper6FpFpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v2468_v_+0x27
              libjvm.so`__1cJJavaCallsEcall6FpnJJavaValue_nMmethodHandle_pnRJavaCallArguments_pnGThread__v_+0x2f
              libjvm.so`__1cRjni_invoke_static6FpnHJNIEnv__pnJJavaValue_pnI_jobject_nLJNICallType_pnK_jmethodID_pnSJNI_ArgumentPusher_pnGThread__v_+0x1df
              libjvm.so`jni_CallStaticVoidMethod+0x154
              java`JavaMain+0xd30
              libc.so.1`_thr_setup+0x52
              libc.so.1`_lwp_start
             5153

We see that most of them are coming out of the Game.gameLoop() method which probably makes sense if it's collision detection. Looking at the code for this we see:

        for (int p=0;p<entities.size();p++) {
                for (int s=p+1;s<entities.size();s++) {
			Entity me = (Entity) entities.get(p);
			Entity him = (Entity) entities.get(s);
					
			if (me.collidesWith(him)) {
					me.collidedWith(him);
					him.collidedWith(me);
			}
		}
        }

The entities object is an ArrayList so the RangeCheck calls are coming from there (looks like the get() is being optimised out of the stacktrace). Now that we have the area of code in view we can easily see that there is an inefficiency here. This loop goes along the list of entities (aliens, missiles) and sees if the current entity (me) intersects with any other entities. However, note that the current entity, me, is being retreived on every iteration of the inner loop when it only needs to be fetched on the outer loop. We can hoist that out as shown below, rebuild and see what the effect is:

        for (int p=0;p<entities.size();p++) {
		Entity me = (Entity) entities.get(p);
                for (int s=p+1;s<entities.size();s++) {
			Entity him = (Entity) entities.get(s);
					
			if (me.collidesWith(him)) {
					me.collidedWith(him);
					him.collidedWith(me);
			}
		}
        }

$ dtrace -q -s ./meth.d -p 833
<chop>
  java/util/concurrent/locks/AbstractQueuedSynchronizer.release             4747
  java/util/concurrent/locks/ReentrantLock$NonfairSync.lock             4747
  java/util/concurrent/locks/ReentrantLock$Sync.tryRelease             4747
  sun/java2d/SunGraphics2D.drawImage                             7144
  sun/java2d/pipe/DrawImage.copyImage                            7144
  java/awt/Rectangle.intersects                                104874
  org/newdawn/spaceinvaders/Entity.collidesWith                104874
  java/util/ArrayList.RangeCheck                               117608
  java/awt/image/BufferedImage.getHeight                       213326
  java/awt/image/BufferedImage.getWidth                        213326

And we see that the RangeCheck method has dropped down the list accordingly.

Example 2 - Shots gone Wild!

I wanted to provide an example where we could write some simple D to drill down on a particular aspects of the games behaviour that may be of interest. In a game like this there are really only a few things that are of great interest and any interaction a user has is one of these areas. The example I took next was to look at detecting when a missile has gone off the screen (as opposed to hitting an alien) to demonstrate the kind of things we do to differentiate events in D scripts. The code for moving a missile is the ShotEntity.move method:

	public void move(long delta) {
		// proceed with normal move
		super.move(delta);
		
		// if we shot off the screen, remove ourselfs
                
		if (y < -100) {
			game.removeEntity(this);
		}
	}

So, we just need to look for the case where we calling game.removeEntity when we are in the ShotEntity.move method. The script that I created to show this is:

$ cat shotmiss.d
#!/usr/sbin/dtrace -s

hotspot$target:::method-entry
/basename(copyinstr(arg1,arg2)) == "ShotEntity" &&
	copyinstr(arg3, arg4) == "move"/
{
	self->in = 1;
}

hotspot$target:::method-entry
/self->in && copyinstr(arg3, arg4) == "removeEntity"/
{
	printf("Missile Missed!\n");
}

hotspot$target:::method-return
/basename(copyinstr(arg1,arg2)) == "ShotEntity" &&
	copyinstr(arg3, arg4) == "move" && self->in/
{
	self->in = 0;
}

So, we probe method-entry and build up the predicate to show our interest in ShotEntity.move. Note that I use the D basename() subroutine to pull the class name out of the fully qualified package name. By setting the thread local variable, in, when we enter the move method we can then predicate upon that in the following clause to ensure that we only show interest in removeEntity calls that originate from ShotEntity.move. The third clause makes sure that we unset the in thread local variable when we return from the ShotEntity.move method as we're done.

Java Statically Defined Tracing

Java Statically Defined Tracing is the Java equivalent of USDT and should be arriving in Java SE7. Augmenting the SpaceInvader game to use JSDT was simple. I first defined my provider by extending the Provider interface:

@ProviderName("Invader")
interface MyProvider extends Provider {
    @ProbeName("boom") void death(String type);
    void miss(int nothing);
}

Note that although I have named the provider, MyProvider, I have used an annotation to override this name so that it will appear as Invader. You'll see I also did the same with the probe named death. This is for no other reason than to demonstrate the annotations. As you see above, the provider has two probes defined, death which takes a String as its argument and, miss which takes an integer as its argument.

In the Game class's constructor I created the instance of the ProviderFactory from which I then create my provider (read Keith's blog referenced above for why it's done this way):

        provider = ProviderFactory.getDefaultFactory().createProvider(MyProvider.class);

Note that provider is defined as public MyProvider provider in the Game classes public data. Adding probes is then trivial as we just insert calls in our code where suitable. I had two probe placements, one which is taken from the last hotspot example, to show missiles going off screen. The ShotEntity.move code now becomes:

 public void move(long delta) {
    // proceed with normal move
    super.move(delta);
    
    // if we shot off the screen, remove ourselfs
    if (y < -100) {
      game.provider.miss(1);
      game.removeEntity(this);
    }
  }

I also added the death probe into where Entities (aliens and missiles) are destroyed in Game.removeEntity. I also added in here some code to demonstrate the JSDT mechanism for checking whether a probe is enabled or not. We can use this technique of the code taken to construct probe arguments is costly and therefore we only want to do it when a probe is enabled. The code for this looks like:

	public void removeEntity(Entity entity) {
                Method m = null;
                String ent = null;
                
                try {
                        m = MyProvider.class.getMethod("death", String.class);
                }
                catch (NoSuchMethodException e) {
                        System.out.println("Can't find probe");
                }
                Probe p = provider.getProbe(m);
                
                if (p.isEnabled() ) {
                    // Do argument prep in here
                    if (entity instanceof AlienEntity)
                        ent = "Alien";
                    else if (entity instanceof ShotEntity)
                        ent = "Missile";
                    else
                        ent = "Unknown Object";
                }
 
                provider.death(ent);
		removeList.add(entity);
	}

We obtain a reference to the provider from which we can then verify whether or not a probe is currently enabled. Smart! If it is I then check what type of object is being removed and prepare a suitable string to pass as an argument to the death probe. Building the code and running it we then see the following probes available:

$ dtrace -l -P Invader*
   ID   PROVIDER            MODULE                          FUNCTION NAME
70483 Invader899      java_tracing                       unspecified miss
70484 Invader899      java_tracing                       unspecified boom

I then cranked the following script which shows us how many aliens were destroyed, how many missiles were fired and how many of those went off the screen (i.e. missed their targets)

$ cat alien.d
#!/usr/sbin/dtrace -s

Invader*:::miss
{
	@[probename] = count();
}

Invader*:::boom
{
	@[copyinstr(arg0)] = count();
}

Running this against the game and playing a level we saw:

$ dtrace -q -s ./alien.d    
^C

  miss                                                             18
  Alien                                                            60
  Missile                                                          78
$ 

18 misses out of 78 shots! Poor indeed! One thing that was immediately obvious to those watching this was the much, much reduced probe effect of the JSDT version. With the hotspot techniques we have visible probe effect owing to the mechanisms being used by that provider but the JSDT version is slick and fast.

The last thing I demo'd for fun was the integration of these newly created JSDT based probes with a visualisation technique. I took the Chime Visualisation tool and created a display so we could graph the killing in real time! I did cheat and create the display before the session but it only took a few minutes to do. Here's a screenshot of some realtime destruction:

As you can see from the screenshot, I was a real killing machine until near the end (missile fired in blue , aliens killed in green) when I couldn't hit anything at all for and my miss count (in red) rises rapidly! The point of this demo was not to prove my lack or prowess at SpaceInvaders but rather to show that we have a set of Java API's that you can hook into easily so it's not much effort to plug DTrace into a Java management framework or visualisation tools (just as two examples).

Anyway, that's about it for now. I did do some more hotspot demo's which were interesting but they'll have to save for another time. If you came along to our sessions at JavaOne then I hope you enjoyed it as much as we did.

( May 12 2008, 04:35:50 PM GMT+00:00 ) Permalink Comments [0]

20080318 Tuesday March 18, 2008

dtrace.conf(08)

I had the fortune of attending the first dtrace.conf last Friday in San Francisco and I have to say that it was a fantastic event. There are many aspects that I enjoyed but what really made it for me was the chance to have great technical and social exchanges with such a diverse bunch of engineers. It didn't matter who we all worked for; DTrace was the great leveller. I really had some of the most thought provoking technical conversations I've had in a long time. By the end of the day my head was spinning with the amount and quality of ideas that I'd heard (well, it was either the ideas or Richard McDougall's driving...).

Many thanks to the DTrace Patriarchs for organising the event, Forsythe for funding it and for the many attendees who willingly shared and listened. Absolutely top day. ( Mar 18 2008, 06:40:20 AM GMT+00:00 ) Permalink Comments [1]

20070926 Wednesday September 26, 2007

DTrace @ LOSUG

The London OpenSolaris User Group meets once a month at the Sun Customer Briefing Center on King William Street (the North end of London Bridge). If you ever get the chance to go along to one of the meetings then I can heartily recommend it as a there is always a nice amount of pretty good food and drink supplied by a generous, anonymous benefactor. In addition to this there is a regular group of between 30-50 people who make for good chat during the evening.

Ages ago I had agreed to present something on DTrace at the September event. However, everyone who attends has no doubt been exposed to the Haslam DTrace tour so I couldn't force that upon those poor souls. What I came up with was a quick walk through of some of the new features that have been introduced into DTrace since Solaris 10 first shipped and the slides are given below. The slides are in no way meant to be a complete set of features but just a selection of ones that I thought may be of interest. On the night I actually demo'd various pieces but I have actually included some example code in the slides so you can (hopefully) see what is going off. Features that don't have examples meant that I ran out of time by then!

( Sep 26 2007, 01:10:47 PM GMT+00:00 ) Permalink Comments [3]

20070712 Thursday July 12, 2007

Grab your Irn-Bru: Accessing a programs local variables using DTrace

A question that has raised its head several times on the dtrace-discuss@opensolaris.org forum has been, "how do I access the local variables in my program"? Unfortunately, the answer to this isn't particularly pretty so I thought I'd provide a simple (and contrived...) example of how to this. What we're doing here relies on getting just a teeny bit low down and dirty so you may want to grab your Irn-Bru laddy! The example used here was done on an x86 based system but the principle is exactly the same for any other architecture. Take the extremely simple piece of code below:

void
foo(void)
{
        int i;
        const char *string = "Foo bar test string";

        for (i = 0; i < strlen(string); i++) {
                printf("%c", string[i]);
        }

        printf("%\n");
}
We're going to use DTrace to tell us the value of the loop counter i when we encounter a space in the string that is being iterated over.

Here's where it starts to become slightly unpleasant. The storage for a programs local variables will either be in registers or somewhere in memory, depending upon where in the programs execution you want to view them. In the above code we want to see the value of the i variable when a space character has been found (i.e. when string[i] contains a space). To do this we need to first disassemble the code and we can do that using mdb(1):
> foo::dis
foo:                            pushl  %ebp
foo+1:                          movl   %esp,%ebp
foo+3:                          subl   $0x8,%esp
foo+6:                          leal   0x8050ac8,%eax
foo+0xc:                        movl   %eax,-0x8(%ebp)
foo+0xf:                        movl   $0x0,-0x4(%ebp)
foo+0x16:                       movl   -0x8(%ebp),%eax
foo+0x19:                       pushl  %eax
foo+0x1a:                       call   -0x163   < PLT:strlen >
foo+0x1f:                       addl   $0x4,%esp
foo+0x22:                       movl   -0x4(%ebp),%edx
foo+0x25:                       cmpl   %eax,%edx
foo+0x27:                       jae    +0x32     < foo+0x5b >
foo+0x29:                       movl   -0x8(%ebp),%eax
foo+0x2c:                       addl   -0x4(%ebp),%eax
foo+0x2f:                       movsbl 0x0(%eax),%eax
foo+0x33:                       pushl  %eax
foo+0x34:                       pushl  $0x8050adc
foo+0x39:                       call   -0x172   < PLT:printf >
foo+0x3e:                       addl   $0x8,%esp
foo+0x41:                       movl   -0x4(%ebp),%eax
foo+0x44:                       incl   %eax
foo+0x45:                       movl   %eax,-0x4(%ebp)
foo+0x48:                       movl   -0x8(%ebp),%eax
foo+0x4b:                       pushl  %eax
foo+0x4c:                       call   -0x195   < PLT:strlen >
foo+0x51:                       addl   $0x4,%esp
foo+0x54:                       movl   -0x4(%ebp),%edx
foo+0x57:                       cmpl   %eax,%edx
foo+0x59:                       jb     -0x32     < foo+0x29 >
foo+0x5b:                       pushl  $0x8050ae0
foo+0x60:                       call   -0x199   < PLT:printf >
foo+0x65:                       addl   $0x4,%esp
foo+0x68:                       leave  
foo+0x69:                       ret    

So, from the above instructions we need to figure out where we can insert a probe with DTrace so we can then get our hands on the relevant data! Well, it's not as bad as it looks really. Our two variables i and string have space created for them on the stack early on:

foo+6:                          leal   0x8050ac8,%eax
foo+0xc:                        movl   %eax,-0x8(%ebp)
foo+0xf:                        movl   $0x0,-0x4(%ebp)
So, i is at %ebp-0x4 and string is at %ebp-0x8. There are a couple places in the code where we could insert a probe to inspect whether or not the current index into the string is a space character. The place I'll choose here is where we load up the current character being indexed for the printf() call. You'll notice that for the call to printf() we place on the stack the character to print and the printf format modifier. Here are the instructions which do that:
foo+0x33:                       pushl  %eax
foo+0x34:                       pushl  $0x8050adc
foo+0x39:                       call   -0x172    < PLT:printf >
And we'll just verify that foo+0x34 is loading up the the printf modifier string:
> 0x8050adc/s
0x8050adc:      %c
So, we're in the right place. The character to be printed out (i.e. the current value of string[i]) will be found in the %eax register at foo+0x33 so this is where we'll place a probe with the pid provider. What we can do is to use the DTrace regs[] array to access the %eax register and predicate on whether or not we are currently looking at a space character (the space character is hex value 0x20). If the register contains the value 0x20 we can then extract the current value of the loop counter i which is found on the stack at %ebp-0x4. So our script looks like this:
#cat local1.d
#pragma D option quiet

inline int SPACE = 0x20;

pid$target:a.out:foo:33
/uregs[R_EAX] == SPACE/
{
        this->pos = *(int *)(copyin(uregs[R_EBP]-0x4, sizeof(int)));
        printf("Space found at char %d\n", this->pos + 1);
}
You'll note that to access the memory location for the loop counter i (the programs stack in this case), we need to use the copyin() DTrace subroutine. Executing the above script on this code we get the following output:
# dtrace -s ./local1.d -c ./local
Foo bar test string
Space found at char 4
Space found at char 8
Space found at char 13
Voila! There you have it. It's a bit grim I know and forces you to rely on knowing a bit about the assembly language of the architecture you're executing on. However, it can prove to be invaluable technique for debugging those hard to get at problems.

Good luck!

( Jul 12 2007, 12:46:49 PM GMT+00:00 ) Permalink Comments [1]

20070521 Monday May 21, 2007

The DTrace deadman mechanism [Read More] ( May 21 2007, 03:25:00 PM GMT+00:00 ) Permalink

20070515 Tuesday May 15, 2007

DTrace, JavaU and the BSOD

Last week, my old mucker Phil Harman and I were fortunate enough to present a 3 hour session at JavaU on DTrace . We had a great time doing this session and I hope that those of you who attended got something out of it. We try and do our sessions with as few slides as possible and as much live demo as possible - this makes for a pretty hectic 3 hours for whoever is doing the typing (as the younger of the dynamic duo I don't mind!). I promised to post the scripts that were produced from this session and I will do that shortly (or at least pointers to where you can get better versions of, essentially, the same scripts :-) ).

One of the things I like to cover in my talks are what I call "fun things" to do with DTrace. By "fun" I don't mean side splitting laughter fun but more geek orientated antics that can be got up to with DTrace. One of the examples I give is a bit of a cheat but amusing nonetheless. One day, no doubt tired of hearing me proclaim that DTrace can solve virtually any problem, a Sun colleague challenged me with this one: he bet me that I couldn't make Solaris generate a BSOD . At first this may seem a strange challenge but it's one that DTrace can (kind of) address...

I set about it like this; when a binary is executed the kernel reads the first few bytes of the executable to determine what type of file it is - check out the gexec() code for how we do this. I'm going to use the findexec_by_hdr() function as the buffer that contains the executables magic is passed in as the first argument . By enabling a probe on this function I can look at the first two bytes of the file that have been read and determine whether or not a windows executable has been executed (the first two bytes of a windows 9x executable are 'MZ'). If the predicate evaluates to true then we can then use the system() action to change my GNOME desktop background to that of a BSOD jpeg! See, I said it was a dubious cheat but it usually gets a laugh. The script is:

#!/usr/sbin/dtrace -ws

#pragma D option quiet

fbt::findexec_by_hdr:entry
/ *args[0] == 'M' && *(args[0]+1) == 'Z'/ 
{
        raise(SIGKILL);
        system("/usr/bin/gconftool-2 --type=string --set /desktop/gnome/background/picture_options stretched");
        system("/usr/bin/gconftool-2 --type=string --set /desktop/gnome/background/picture_filename /local/JavaU/bsod/BSOD.jpg");
}
Before the script is ran my desktop looks like:

If I then run the above D script and try to execute a random Windows executable my desktop changes to look like:

Who says you can't do everything with DTrace!

( May 15 2007, 12:58:22 PM GMT+00:00 ) Permalink

20060718 Tuesday July 18, 2006

DTrace at LUGRadio Live 2006 They don't let me out much nowadays but this weekend is a little different. I'll be off to sunny Wolverhampton on Sunday to speak about DTrace at LUGRadio Live 2006. Looks like I've got the death slot just before lunch so I'll try and make it as interesting as I can for those that can come along. Who knows, I may even bring some BrandZ along with me (if time allows).

Hopefully see you there! ( Jul 18 2006, 01:20:00 PM GMT+00:00 ) Permalink

20060623 Friday June 23, 2006

Determining Opteron processor revisions on Solaris

For a while now I've been doing quite a bit of work on the Opteron processor. For some of this work it's important to know which revision of processor you currently have in the machine you've cobbled together. This should be a doddle, right? Well, not quite.
So, how do you know if you have a rev B or a rev E processor then? The processor revision is determined by a bunch of values returned by the CPUID instruction - these include the family, model and step of the processor. These values can then be used in a lookup table which is given in the Opteron revision guide which can be found here. On a Solaris system the family, model and step can be found using psrinfo(1M):

# psrinfo -pv
The physical processor has 2 virtual processors (0 1)
  x86 (AuthenticAMD family 15 model 33 step 2 clock 2190 MHz)
        Dual Core AMD Opteron(tm) Processor 275
The physical processor has 2 virtual processors (2 3)
  x86 (AuthenticAMD family 15 model 33 step 2 clock 2190 MHz)
        Dual Core AMD Opteron(tm) Processor 275
So, armed with these numbers I can go and look up the processor revision. However, if you take a look at the revision guide, you'll see it's not quite as straight forward as you may hope (to say the least).
However, you can cheat a bit. In the amd memory controller driver in Solaris the saliant information contained in the previously mentioned table is represented in the mc_rev_map[] array:
static const mc_rev_map_t mc_rev_map[] = {
        { 0xf, 0x00, 0x0f, MC_REV_PRE_D, "B/C/CG" },
        { 0xf, 0x10, 0x1f, MC_REV_D_E, "D" },
        { 0xf, 0x20, 0x3f, MC_REV_D_E, "E" },
        { 0xf, 0x40, 0x5f, MC_REV_F, "F" },
        { 0, 0, 0, MC_REV_UNKNOWN, NULL }
};
Where a mc_rev_map_t looks like:
typedef struct mc_rev_map {
        uint_t rm_family;
        uint_t rm_modello;
        uint_t rm_modelhi;
        uint_t rm_rev;
        const char *rm_name;
} mc_rev_map_t;
Using the contents of this array you can see which revision you have by simply seeing which range (using the rm_modello and rm_modelhi members) your processors model falls within. We can see from the psrinfo output given earlier, that this machine has Rev E processors.
However, instead of doing this we can use mdb to do our work for us. Every memory controller on the system is represented in a struct mc and these, conveniently, contain a member, mc_revname, which identifies the processor revision of it's associated processor. We can look at all the memory controllers on the system using the mc_list global variable which is a pointer to the front of the list of memory controller structure:
 > *mc_list::list mc_t mc_next | ::print mc_t mc_revname
mc_revname = 0xfffffffff01eb550 "E"
mc_revname = 0xfffffffff01eb550 "E"
It's always more fun with mdb!! ( Jun 23 2006, 09:00:00 AM GMT+00:00 ) Permalink Comments [3]

20060117 Tuesday January 17, 2006

Small sampling interval with cpustat(1M) It never ceases to amaze me that, even with tools that I use quite frequently, I can still learn new things about those tools. I recently had one of those experiences with cpustat(1M), a tool used to read performance counters from CPU's. Like most other observational tools of this ilk, cpustat allows you to specify a sampling interval. However, unlike most other tools, it allows you to specify intervals with a sub-second granularity - this is something I never knew.

The following example is on an Opteron based system. I'm using 2 of it's 4 pic counters (a pic counter is a hardware counter used to record events of interest) to measure user mode level 1 icache misses in pic0 and system mode level 1 icache misses in pic1. The sampling interval is every half a second:

# cpustat -c pic0=IC_miss,pic1=IC_miss,sys1,nouser1 0.5
   time cpu event      pic0      pic1 
  0.507   1  tick       796      3126 
  0.507   0  tick       574      3785 
  1.007   0  tick       263      3564 
  1.007   1  tick     21584      4851 
  1.507   0  tick      1416      5542 
  1.507   1  tick      6116     12318 
  2.007   0  tick       753      3817 
  2.007   1  tick       869      2507 
  2.507   1  tick      1345      3150 
  2.507   0  tick       185      2791 
Simple stuff but something which had evaded me until now. The most excellent Sriram brought this to my attention - thanks Sriram. ( Jan 17 2006, 05:00:00 PM GMT+00:00 ) Permalink

20060109 Monday January 09, 2006

DTrace and mdb(1) Just before Christmas, I was unfortunate enough to do a slot at the London Opensolaris User Group. I say unfortunate because my session was supposed to be about DTrace Internals. Actually, the word Internals is far too grand a word for what I aimed to do. Really I should replace Internals with "a bit more under the hood than you'd normally go...". Anyway, I digress.

I was also unfortunate because Adam Leventhal was in London at the time and came along to lend a hand! My presentation itself (the material) was pretty bad really owing to the lack of time I'd had to put into, what is essentially, a very large topic. However, I hope I wasn't too dull and I hope that I wasn't too innacurate in my description of how little bits of DTrace work.

The vast topic of how DTrace actually works is hugely interesting and I'll do my best to put some of my pitted understanding up here in the not-too-distant future. One thing that I put as an addendum to my presentation was some mdb(1) incantations that allow you to extract a bit of information about what is going on around DTrace itelf.

DTrace has several mdb(1) dcmds of interest that allow us to shine some light on the workings of DTrace. Firstly, let's take a couple of the simple ones starting with the ::dtrace_state dcmd:

# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace ufs ip sctp usba uhci s1394 fctl nca lofs audiosup zfs random nfs sppp ptm ipc crypto logindmux ]
> ::help dtrace_state

NAME
  dtrace_state - print active DTrace consumers
The ::dtrace_state dcmd actually has a very good description which says it all really. You can use it to display all or individual consumers that are running on the system. Used without any arguments we see:

> ::dtrace_state
            ADDR MINOR             PROC NAME                         FILE
ffffffff8155c800     2 ffffffff856be048 dtrace           ffffffff857a84d0
ffffffff8155c000     3 ffffffff856af190 dtrace           ffffffff857a8498
So, we have two consumers running. The intersting bit is the address given on the ADDR column. This is a pointer to a dtrace_state structure. The dtrace_state structure is of huge interst to us when figuring out what a consumer is doing as it holds kernel state for a consumer.

We can then use the ::dtrace dcmd to extract any data that is being held in kernel for that consumer (data that hasn't yet been read by the consumer). Let's take the second consumers dtrace_state address and feed in into the dtrace dcmd:

> ffffffff8155c000::dtrace
  fsflush                                                           1
  Xorg                                                              2
  battstat-applet-                                                  3
  gnome-netstatus-                                                  3
  nautilus                                                          3
  dtrace                                                            4
  mixer_applet2                                                     4
  soffice.bin                                                       4
  mdb                                                               5
  powernowd                                                         6
  java                                                             34
  sched                                                           100
It would be very interesting to know what probes were enabled and what the script looked like that generated that output. Well, that's where things get a bit tougher and I'll do that one in another entry.

I'll leave you with a small piece of mdb that shows you what providers are loaded on your system at the moment. Now, you could do this with dtrace(1M) and a simple bit of awk but that would be dull :-) . If I remember, next time I'll try to explain what it does...

> 1::id2probe | ::print -t dtrace_probe_t dtpr_provider | ::list dtrace_provider
_t dtpv_next | ::print -t dtrace_provider_t dtpv_name 
char *dtpv_name = 0xffffffff83959078 "dtrace"
char *dtpv_name = 0xffffffff8527ed00 "pid689"
char *dtpv_name = 0xffffffff83db07c0 "syscall"
char *dtpv_name = 0xffffffff83db0790 "sdt"
char *dtpv_name = 0xffffffff83db07d0 "mib"
char *dtpv_name = 0xffffffff83db07b0 "io"
char *dtpv_name = 0xffffffff83db0660 "proc"
char *dtpv_name = 0xffffffff83db0690 "sched"
char *dtpv_name = 0xffffffff83db0650 "fpuinfo"
char *dtpv_name = 0xffffffff83db08f8 "vminfo"
char *dtpv_name = 0xffffffff83db0920 "sysinfo"
char *dtpv_name = 0xffffffff83db0928 "vtrace"
char *dtpv_name = 0xffffffff8316cb78 "profile"
char *dtpv_name = 0xffffffff83da8a28 "lockstat"
char *dtpv_name = 0xffffffff83db0900 "fbt"
char *dtpv_name = 0xffffffff831a5658 "fasttrap"
( Jan 09 2006, 05:00:00 PM GMT+00:00 ) Permalink Comments [3]

20050415 Friday April 15, 2005

DTrace'ing at SUPerG

With a spring in my step I'm off to SUPerG on monday. The "spring" here really refers to the fact that I'm really looking forward to this conference. Last year was my first SUPerG and I must say that it is probably the best conference that I have ever been to. A whole lot of really excellent Sun and external engineers come and share their insights with delegates who are genuinely interested in hearing what is being said. The people attending are not only interested but they are also interesting. I met many excellent people from all over the globe who were willing to not only listen but also share their experiences around Sun and the surrounding technologies. I learnt a whole lot and I'm looking forward to much of the same this year.

Last year Phil Harman (author of the infamous Multithreading in the Solaris Operating Environment paper) and I did a TechBrief on DTrace and we had a great time. The format 1 was one of Phil waxing lyrical about real scenarios/problems that you will encounter when dealing with systems in Data Centres whilst I create the D scripts on the fly and execute them, demonstrating how DTrace makes easy meat of all these problems. To really appreciate DTrace you've got to see it and feel it and this is what we aimed to do; slides are off the menu and smokin' keyboards and caffeine are on!

This year we aim to do the same but maybe with a few different laughs along the way. If you're coming along then I look forward to seeing you there. If you're not then try and get there next year!

A whole raft of talent will be presenting and a sample include Richard McDougall , Jim Mauro and Richard Elling .


1 This format was extremely well used by Messrs Cantrill and Shapiro in their "/proc-tologist" show. If they do this in a venue near you, sell your kidney if you have to but make sure you go. Great fun. ( Apr 15 2005, 09:05:00 PM GMT+00:00 ) Permalink Comments [0]

20050404 Monday April 04, 2005

The simple things in life are still cool
I've been messing around with DTrace for some time now and, as with most things in life, a certain level of intimacy means that I've began take the old boy a bit for granted. What was incredibly cool at first becomes normal after a while even though it is still mind boggling. A simple example that came up today reminded me of how the simple things are still incredibly cool.

A colleague of mine was wanting to see the http interaction in a web infrastructure he's developing. On a Solaris 10 system he has multiple zones with portal server, access manager and various other JES pieces. He had never touched DTrace before but being a bright chap he went and got the manual and knocked some stuff up to just look at the write() calls being made. He wanted to just look for http GETs at first but was having a slight problem getting his predicates right. To look for all http GETs being issued by firefox (for example) he just needed this:

#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option strsize=2048

syscall::write:entry
/execname == "firefox-bin" && stringof(copyin(arg1, 3)) == "GET"/
{
	printf("%s\n\n", copyinstr(arg1));
}


All we are saying here is that when a write system call is made, print out the write buffer if firefox made the write and if the first 3 characters being written are "GET". Really simple but a great demonstration of the ability to prune data at source provided by the predicate mechanism. Firefox makes thousands of write calls but I can specify exactly which ones I'm interested in with no post processing at all. Still cool.

For those that like to see output, the above script produces the following sample bumph when loading up www.sun.com:

GET http://www.sun.com/ HTTP/1.1
Host: www.sun.com
User-Agent: Mozilla/5.0 (X11; U; SunOS i86pc; en-US; rv:1.7) Gecko/20041221
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Cookie: ...


GET http://www.sun.com/2005-0329/images/b3_staroffice8.jpg HTTP/1.1
Host: www.sun.com
User-Agent: Mozilla/5.0 (X11; U; SunOS i86pc; en-US; rv:1.7) Gecko/20041221
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Referer: http://www.sun.com/
Cookie: ...


( Apr 04 2005, 03:00:00 PM GMT+00:00 ) Permalink Comments [1]

20050321 Monday March 21, 2005

uname lies ... Well, it's been a while to say the least but I think it's about time to put fingers to keys again and see what comes forth ...

I've mentioned before some of the great things that can be achieved with destructive actions. Indeed many good examples are popping up - check out this cool example from Chris Gerhards blog. Today we'll just take a quick look at another one (I actually have a bunch which I'll try and write-up over the next few weeks).

In the UK we have run a series of events called Make-It-Fly which I've been involved with. Last week I did a session on DTrace which I hope everyone who was present enjoyed (I did anyway!). At the events I usually do quite a bit of hands on demo with most of it being hand cranked. However, one of the scripts I use that always gets a laugh is the following one which I don't hand crank as I can never remember the offsets ... If you've ever wanted to get uname(1) to return something different to normal then this is what you need:

#!/usr/sbin/dtrace -s

#pragma D option destructive

syscall::uname:entry
{
        self->addr = arg0;
}

syscall::uname:return
{
        copyoutstr("SunOS", self->addr, 257);
        copyoutstr("PowerPC", self->addr+257, 257);
        copyoutstr("5.5.1", self->addr+(257*2), 257);
        copyoutstr("gate:1996-12-01", self->addr+(257*3), 257);
        copyoutstr("PPC", self->addr+(257*4), 257);
}
Before we have:
# uname -a
SunOS homer 5.10 SunOS_Development sun4u sparc SUNW,Ultra-5_10
and like magic we morph into something else when the above script is ran:
# uname -a
SunOS PowerPC 5.5.1 gate:1996-12-01 PPC sparc SUNW,Ultra-5_10
Here at Sun we often test pieces of software on versions of Solaris that return something different to that which the software is expecting. Previously I would LD_PRELOAD a library in with my own uname hack. Now I can not only do this without bothering the application but I can present different uname information to different applications/users/whatever as I can predicate accordingly!

Note, that the above script isn't quite complete as it returns the incorrect ISA information. This is me being idle and a bit of twiddling with sysinfo() is all that's needed. Maybe another day. ( Mar 21 2005, 05:20:00 PM GMT+00:00 ) Permalink Comments [1]

20040910 Friday September 10, 2004

Who's at the Door? Doors are an IPC facility that exist within Solaris which allow one process to execute functions within another process. A process that exports a function using the door_create() interface is termed a door server and a process that invokes this function with the door_call() interface is known as a client (suprisingly enough).

I was on the train the other day just tinkering with a few DTrace scripts that I've been developing (see Alans Blog for other DTrace train tinkering) when I happened to notice a reasonable amount of door_call() calls being made by my X server (the Xsun process). Nowadays it's really easy to see what process the door_call() is being made to as the pfiles(1) utility even tells us what process is the door server for this door. Taking a quick look at my X server we see:

# pgrep -fl Xsun
  497 /usr/openwin/bin/Xsun :0 -nobanner -defdepth 24 -auth /var/dt/A:0-PDaa9a
# pfiles 497 | grep -i door
   4: S_IFDOOR mode:0777 dev:214,0 ino:0 uid:0 gid:0 size:0
      O_RDWR  door to svc.configd[9]
  26: S_IFDOOR mode:0777 dev:214,0 ino:0 uid:0 gid:0 size:0
      O_RDWR  door to stfontserverd[695]
How spoilt we are nowadays - it's all so easy to see what is happening! However, I really want to see a systemic view a what door calls are being made so inbetween Surbiton and Waterloo I put the following simple script together:
#!/usr/sbin/dtrace -s

#pragma D option quiet

door_call:entry
{
        self->in = 1;
}

door_lookup:return
/self->in/
{
        self->called = stringof(args[1]->door_target->p_user.u_comm);
}

door_call:return
/self->in/
{
        @[execname, self->called] = count();
        self->in = 0;
        self->called = 0;
}

END
{
        printa("%s called door in %s %@d times\n", @);
}
The above script just takes advantage of the fact that we do a door_lookup() from within a door_call(). We can then extract the name of the door server from the proc structure that is stored in the struct door_node that door_lookup() returns (see sys/door.h).
# ./door.d
^C
kcfd called door in nscd 1 times
zoneadm called door in zoneadmd 1 times
devfsadm called door in devfsadm 1 times
zoneadmd called door in devfsadm 1 times
snmpd called door in nscd 2 times
sshd called door in kcfd 2 times
sshd called door in nscd 2 times
syslogd called door in nscd 2 times
syseventd called door in devfsadm 2 times
chown called door in nscd 2 times
devfsadm called door in syseventd 2 times
ttymon called door in nscd 3 times
dtlogin called door in nscd 3 times
id called door in nscd 4 times
deallocate called door in nscd 4 times
snmpXdmid called door in nscd 4 times
lpstat called door in syslogd 4 times
cron called door in nscd 5 times
svc.startd called door in nscd 6 times
inetd called door in nscd 16 times
svcprop called door in svc.configd 26 times
svccfg called door in svc.configd 54 times
rpcbind called door in svc.configd 117 times
lsvcrun called door in svc.configd 550 times
svc.configd called door in nscd 2034 times
mfstscan called door in svc.configd 2708 times
Xsun called door in stfontserverd 5046 times
svcs called door in svc.configd 7918 times
inetd called door in svc.configd 28908 times
svc.startd called door in svc.configd 37110 times
You'll have to excuse the poor formatting as it's always a weakness of mine but isn't it cool to see how easy we can see the various interactions? That's quite a few doors being opened in the smf(5) world as I was doing a bit of svcs work at the time (see Stephen Hahn's Blog for smf(5) goodness). ( Sep 10 2004, 10:32:00 PM GMT+00:00 ) Permalink Comments [2]

20040826 Thursday August 26, 2004

DTrace and Visualisation As I talk to people about DTrace one of the most frequent requests that arises is for scripts and tools. On the scripts front people want cookbook type stuff as it is always easier to learn a new technology such as DTrace by "re-using" and building upon pre-prepared material. This is an excellent request that, hopefully, Sun and the ever growing band of DTrace addicts will fulfill in the near future.

On the tools front a request that is made is for a DTrace GUI. In the early days of DTrace (about a year or so ago...) my usual base reaction to such a request was to reply that, whilst I never say never, I doubt that this is high on the priority list but I'll feed it back ... However, age and thought has softened me tremendously and I now feel that the union of DTrace and Visualisation techniques is a source of terrific innovation and will be of immense benefit to all.

With DTrace we have the infrastructure to ask almost arbitrary questions about the behaviour of our software stacks from the top of an application through to the bottom of a bus nexus driver. However, some of the data that these complex subsystems generate can be rich in nature. This is not a size issue particularly as DTrace has great primitives for data reduction but more referring to the relationships that exist within the data. Sometimes a bit of visual stimulation is required to get our synapses firing and to force our grey cells into seeing patterns and relationships in the data. I'm not knocking DTrace here; on the contrary I'm saying that DTrace is the nearly infinitely rich architecture with which we can feed the visualisation engines. I'm convinced that in the near future we will see some real innovation in the tools around systemic and application observability as the data is so easily at hand.

For example, take something that a colleague ( Simon Ritter ) and and I have been working on in our copious spare time, the area of call graphs. With DTrace we can trivially discover what functions a given function is calling and in it's simplest form the script looks like:

#usr/sbin/dtrace -s

#pragma D option flowindent

fbt::$1:entry
{
    self->in = 1;
}

fbt:::entry,
fbt:::return
/self->in && probefunc != "$1"/
{
}

fbt::$1:return
/self->in/
{
    self->in = 0;
    exit(1);
}
So, a trivial example would be to look at what activity the clock routine generates:
# ./fbt.d clock
dtrace: script './fbt.d' matched 31751 probes
CPU FUNCTION                                 
  0  -> clock                                 
  0    -> set_anoninfo                        
  0    <- set_anoninfo                        
  0    -> set_freemem                         
  0    <- set_freemem                         
  0    -> wakeup_pcgs                         
  0    <- wakeup_pcgs                         
  0    -> clock_tick                          
  0      -> sys_nullsys                       
  0      <- sys_nullsys                       
  0      -> mutex_owned                       
  0      <- mutex_owned                       
  0      -> rctl_test                         
  0      <- rctl_test                         
  0      -> rctl_test_entity                  
  0        -> mutex_owned                     
  0        <- mutex_owned                     
  0        -> rctl_set_find                   
  0          -> mutex_owned                   
  0          <- mutex_owned                   
  0        <- rctl_set_find                   
  0      <- rctl_test_entity                  
  0      -> rctl_test                         
  0      <- rctl_test                         
  0      -> rctl_test_entity                  
  0        -> mutex_owned                     
  0        <- mutex_owned                     
  0        -> rctl_set_find                   
  0          -> mutex_owned                   
  0          <- mutex_owned                   
  0        <- rctl_set_find                   
  0      <- rctl_test_entity                  
  0      -> rm_asrss                          
  0      <- rm_asrss                          
  0    <- clock_tick                          
  0    -> callout_schedule                    
  0      -> callout_schedule_1                
  0      <- callout_schedule_1                
  0      -> softcall                          
  0        -> mutex_vector_enter              
  0        <- mutex_vector_enter              
  0        -> mutex_vector_exit               
  0        <- mutex_vector_exit               
  0      <- softcall                          
  0      -> siron                             
  0      <- siron                             
  0      -> setsoftint                        
  0      <- setsoftint                        
  0      -> callout_schedule_1                
  0      <- callout_schedule_1                
  0    <- callout_schedule                    
  0   | clock:return                          
  0  <- clock                                 

However, as great as it is to see the above, the shape of the call sequence isn't very apparent and that is a simple routine. Using something such as a ufs_write() you soon see how this gets very complex. It is much better to draw a directed graph of the calls and you can immediately see what is being called by what. An example of a single ufs_write() call can be seen in this image . The image was generated using DTrace to feed a public domain piece of software called JzGraph which is a great Java implementation of the wonderful Graphviz algorithms.

However, the above is very static in its nature and we wanted to generate something much more dynamic. What we have done is to create a Java application which is a dtrace consumer (i.e it links a JNI library which uses libdtrace(3LIB)). Therefore we can interact with DTrace at will and be much more dynamic in the way we present and manipulate the data. The current version of our application allows you to dynamically generate graphs for any kernel function, step through the call sequence, zoom in and out of graphs, collapse and expand nodes, display timing data, display source code for the nodes in a graph (amongst other things).

We have many other ideas of what to do and hopefully we'll get around to them soon. Sun is developing some Java bindings for DTrace and they're looking really good at the moment (see here for Bryans comment on this).

I for one am looking forward to a future in which complex systemic interactions will be visible to all as the data is now readily available. It has to be seen to be believed. ( Aug 26 2004, 01:00:29 PM GMT+00:00 ) Permalink Comments [3]

Calendar

RSS Feeds

Blogroll

Navigation