Darryl Gove's blog

Tuesday Jun 12, 2007

Identifying misaligned loads in 32-bit code using dtrace

A previous blog entry talks about handling and detecting misaligned memory accesses. For 64-bit code this is easy to achieve using the Performance Analyzer; for 32-bit code the analysis is a bit trickier. Fortunately, it is possible to do the 32-bit analysis with dtrace.

Consider the following program, which has a misaligned memory access. By default, the compiler (since Sun Studio 9) produces a binary that traps on the misaligned access, fixes it up, and continues.

% more align.c
void main()
{
  volatile char a[10];
  int i;
  for (i=0; i<100000000; i++) {(*(int*)(&a[1]))++;}
}
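Whether &a[1] really is misaligned depends on where the compiler places the array, so it can be worth checking the address directly. The following is a minimal sketch and not part of the original program: is_aligned is a hypothetical helper name, and it assumes the alignment being tested is a power of two (4 bytes for an int on 32-bit SPARC).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post): returns 1 if p is a
   multiple of align bytes, 0 otherwise. align must be a power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

In align.c, a call such as is_aligned((const void *)&a[1], sizeof(int)) returning 0 would indicate that the cast to int* produces a pointer that the hardware cannot load from directly.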

The following dtrace script instruments the kernel's misaligned data access trap handler and counts the traps triggered by each pid.

% more tr.d
fbt::do_unaligned:entry
{
  @p[pid]=count();
}

It can be run with:

% sudo dtrace -s tr.d
dtrace: script 'tr.d' matched 1 probe
^C


    14873           260932

The script returns the pid of the process that is having misalignment issues, together with the number of traps it caused. This information is useful because it is trivial to recompile the binary with a different setting for -xmemalign and avoid the behaviour. But it would be even more useful to know where in the binary the traps are occurring: perhaps most of the traps happen in one place, and that place can be fixed in the source. Changing the aggregation key from the pid to the user stack gives this information.

% more tr.d
fbt::do_unaligned:entry
{
  @[ustack()]=count();
}

This script produces output that identifies the locations in the binary where the traps are being generated. For the simple test code there are two locations, the load and the store, and each accounts for half of the traps.

% sudo dtrace -s tr.d
dtrace: script 'tr.d' matched 1 probe
^C
              align`main+0x10
              align`_start+0x108
           130466

              align`main+0x18
              align`_start+0x108
           130466

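The disassembly for the loop is as follows. main() starts at 0x10b80, so the two trap locations, main+0x10 and main+0x18, are the ld at 0x10b90 and the st at 0x10b98.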

main()
        10b80:  9d e3 bf 90  save       %sp, -112, %sp
...
        10b90:  d0 07 bf f7  ld         [%fp - 9], %o0  <<<<<< misaligned
        10b94:  90 02 20 01  inc        %o0
        10b98:  d0 27 bf f7  st         %o0, [%fp - 9]  <<<<<< misaligned
        10b9c:  ba 07 60 01  inc        %i5
        10ba0:  80 a7 40 09  cmp        %i5, %o1
        10ba4:  06 bf ff fb  bl         main+0x10       ! 0x10b90
        10ba8:  01 00 00 00  nop
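Once the offending load and store have been located, one possible fix in the source is to stop dereferencing the misaligned int pointer and copy the bytes through an aligned temporary instead. The following is a sketch under that assumption, not the fix used in the original post: inc_unaligned_int is a hypothetical helper name, and it takes a plain char pointer rather than the volatile array from align.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the original post): increments the int
   stored at p, where p may have any alignment. */
static void inc_unaligned_int(char *p)
{
    int tmp;

    assert(p != NULL);
    memcpy(&tmp, p, sizeof tmp);   /* byte-wise copy in, no alignment needed */
    tmp++;
    memcpy(p, &tmp, sizeof tmp);   /* byte-wise copy back out */
}
```

With a declared as a plain char array, the loop body in align.c would become inc_unaligned_int(&a[1]);. Because the accesses no longer go through a misaligned int pointer, they should not reach the trap handler. Whether this is faster than recompiling with a different -xmemalign setting would need to be measured.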
