Debugging on Sparc
While debugging x86/x64 crash dumps has been discussed fairly extensively in various places, I haven't come across any resources that talk about debugging sparc dumps (other than numerous bug reports). Now that OpenSolaris is live, it will be easier for developers outside Sun to debug problems. Thus, the motivation behind this entry.
Most of the time when you get a crash dump from a kernel panic, it's either during development (in which case it's easy to debug because you *know* exactly what caused the code to fail) or while the code is in production. A dump obtained from a production machine is harder to debug, primarily because you first need to find out what caused the code to fail and then need to simulate the failure in the lab.
Often, finding the root cause entails figuring out what parameters were passed to functions and what the local variables looked like at a certain point in time. I'll walk through an example to demonstrate how function arguments and local variables can be excavated from a dump.
Parameter passing on sparc - A brief overview
Unlike x86, which passes function arguments on the stack, and x64, which passes function arguments (at least most of them) in registers, sparc uses register windows to pass parameters. The caller places the first six arguments in %o0 .. %o5; once the callee executes its save instruction, the window shifts and the callee sees them as %i0, %i1 .. %i5, with %i0 holding the first parameter and so on. If there are more than six input parameters to a function, parameters after the sixth are passed on the stack. %i6 contains the frame pointer (%fp). Local variables are allocated at an offset to the frame pointer.
Stack Format
The frame structure is defined in the system header usr/include/sys/frame.h and looks as follows -
struct frame {
    long fr_local[8];        /* saved locals */
    long fr_arg[6];          /* saved arguments [0 - 5] */
    struct frame *fr_savfp;  /* saved frame pointer */
    long fr_savpc;           /* saved program counter */
#if !defined(__sparcv9)
    char *fr_stret;          /* struct return addr */
#endif /* __sparcv9 */
    long fr_argd[6];         /* arg dump area */
    long fr_argx[1];         /* array of args past the sixth */
};
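When a register window is saved to the stack, %l0 .. %l7 land in fr_local, %i0 .. %i5 in fr_arg, %i6 in fr_savfp and %i7 in fr_savpc. So the input parameters are in the fr_arg array.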
Excavating arguments with an NFSv4 bug
Using bug 6268686 as an example and referencing OpenSolaris, let's look at the stack trace that resulted in the panic -
> $C
000002a1012203d1 vpanic(1295800, 7aabd868, 7aabd880, 851, 2400, 2a1012210fc)
000002a101220481 assfail+0x74(7aabd868, 7aabd880, 851, 18c6000, 1295800, 0)
000002a101220531 nfs4_make_dotdot+0x4f4(2a101220df8, 2388873b24c20, fffffffffffffff8, 301412eb920, 2a101221238, 1)
000002a101220941 nfs4lookupnew_otw+0x7d8(301ef0d4dc0, 2a101221530, 2a101221528, 301412eb920, df8475800, 38285c955c0)
000002a101220a71 nfs4_lookup+0x114(301ef0d4dc0, 2a101221530, 2a101221528, 301412eb920, 0, 391f52d44a8)
000002a101220b41 fop_lookup+0x28(301ef0d4dc0, 2a101221530, 2a101221528, 7aa69c2c, 0, 600045703c0)
000002a101220c01 lookuppnvp+0x344(2a1012217f0, 0, 600045703c0, 2a101221528, 2a101221530, 6000008dbc0)
000002a101220e41 lookuppnat+0x120(301ef0d4dc0, 0, 1, 0, 2a101221930, 0)
000002a101220f01 lookupnameat+0x5c(0, 0, 1, 0, 2a101221930, 0)
000002a101221011 vn_openat+0x164(1, 400, 1, 1, 0, 1)
000002a1012211d1 copen+0x260(ffffffffffd19553, 87aa3, 0, 50400, 0, 1)
000002a1012212e1 syscall_trap32+0x1e8(87aa3, 0, 50400, 0, 0, 0)
To set the context for this bug, we were trying to look up a directory and it so happened that we ended up calling nfs4_make_dotdot to get an rnode. The comments in the code explain fairly well under what circumstances this function is called -
/*
 * nfs4_make_dotdot() - find or create a parent vnode of a non-root node.
 *
 * Our caller has a filehandle for ".." relative to a particular
 * directory object. We want to find or create a parent vnode
 * with that filehandle and return it.
 .. snip
Like the comments say, we had a filehandle for ".." relative to the directory object we were trying to look up. So, to start off, what was the pathname we were trying to look up? To determine this, we'd like to know what arguments were passed into the nfs4_make_dotdot function. Checking the source, the function is defined in uts/common/fs/nfs/nfs4_subr.c as -
int nfs4_make_dotdot(nfs4_sharedfh_t *fhp, hrtime_t t, vnode_t *dvp, cred_t *cr, vnode_t **vpp, int need_start_op)
The interesting bit is the passed-in directory vnode pointer, dvp. It is the third argument, so it arrives in the %i2 register. If we can find dvp, we'll also know the path we're playing with here.
64-bit sparc has a notion of stack bias: the values kept in %sp and %fp are offset by 0x7ff (2047) from the real address, so you need to add the stack bias to the frame pointer in order to get at the actual data of the stack frame. Applying that to the frame pointer for nfs4_make_dotdot and dumping out the frame, we have -
> 000002a101220531+0x7ff::print struct frame
{
    fr_local = [ 0, 0, 0x381a4c49000, 0x2a101221108, 0x2a101220ee0, 0x2a101221118, 0x7aabd800, 0x7aabd800 ]
    fr_arg = [ 0x2a101220df8, 0x2388873b24c20, 0xfffffffffffffff8, 0x301412eb920, 0x2a101221238, 0x1 ]
    fr_savfp = 0x2a101220941
    fr_savpc = 0x7aa6b474
    fr_argd = [ 0x1, 0x5bc679f3060, 0, 0x2a101221250, 0x200000000, 0x5bc679f3178 ]
    fr_argx = [ 0 ]
}
%i2 (fr_arg[2], 0xfffffffffffffff8) looks bogus here, darn! Let's back up one function to nfs4lookupnew_otw and see if we can fish dvp out of its frame easily.
A quick look at the source in uts/common/fs/nfs/nfs4_vnops.c shows -
static int nfs4lookupnew_otw(vnode_t *dvp, char *nm, vnode_t **vpp, cred_t *cr)
The same dvp we're looking for should be in %i0, provided it hasn't been overwritten. Dump out the frame -
> 000002a101220941+0x7ff::print struct frame
{
    fr_local = [ 0x391f52d4450, 0x381a4c49000, 0x2388873c342a0, 0x600074432a8, 0x2388873b24c20, 0, 0x1, 0x391f52d44e8 ]
    fr_arg = [ 0x301ef0d4dc0, 0x2a101221530, 0x2a101221528, 0x301412eb920, 0xdf8475800, 0x38285c955c0 ]
    fr_savfp = 0x2a101220a71
    fr_savpc = 0x7aa69d40
    fr_argd = [ 0x38f6901b110, 0x311fe247dc0, 0x2a101221704, 0x2a1012216fc, 0x2a101220c71, 0x7aa7ad20 ]
    fr_argx = [ 0x2a101221264 ]
}
A quick check of the disassembly to see if %i0 has been overwritten -
> nfs4lookupnew_otw::dis!grep i0
[ .. elided ]
It's not overwritten, we're in luck! Double check to see if it's a vnode.
> 0x301ef0d4dc0::whatis
301ef0d4dc0 is 301ef0d4dc0+0, bufctl 301ecba50c8 allocated from vn_cache
It sure is a vnode. Dumping out the path is now easy -
> 301ef0d4dc0::print vnode_t v_data |::print rnode4_t r_svnode.sv_name
r_svnode.sv_name = 0x391d8a24090
> 0x391d8a24090::print nfs4_fname_t fn_parent fn_name
fn_parent = 0x3b6565ba920
fn_name = 0x3292727cbe0 "uts"
> 0x3b6565ba920::print nfs4_fname_t fn_parent fn_name
fn_parent = 0x353987b1550
fn_name = 0x4f42a102fe0 "src"
> 0x353987b1550::print nfs4_fname_t fn_parent fn_name
fn_parent = 0
We're operating on ./src/uts and this isn't handled correctly in the lookup handling routine (it's fixed now).
As I mentioned earlier, local variables are stored at an offset to the frame pointer. Now that we have the frame pointer, we can dig out the local variables. The variable of interest in this case was the error structure declared on the stack in nfs4_make_dotdot. Looking closely at the disassembly of the function, we can see -
nfs4_make_dotdot+0x27c: mov %l2, %o0
nfs4_make_dotdot+0x280: mov 0xc, %o3
nfs4_make_dotdot+0x284: call +0x15c58 <nfs4_end_fop>
nfs4_make_dotdot+0x288: mov %l7, %o5
nfs4_make_dotdot+0x28c: ba -0xd0 <nfs4_make_dotdot+0x1bc>
nfs4_make_dotdot+0x290: cmp %i5, 0
nfs4_make_dotdot+0x294: add %fp, 0x797, %o4
nfs4_make_dotdot+0x298: mov %l2, %o0
nfs4_make_dotdot+0x29c: mov 0xc, %o3
nfs4_make_dotdot+0x2a0: mov %l1, %i1
nfs4_make_dotdot+0x2a4: call +0x15c38 <nfs4_end_fop>
nfs4_make_dotdot+0x2a8: mov %l1, %o5
nfs4_make_dotdot+0x2ac: ld [%fp + 0x7bb], %i3 <------
nfs4_make_dotdot+0x2b0: cmp %i3, 0
The marked load shows that it's stored at %fp + 0x7bb (%fp is the fr_savfp in nfs4_make_dotdot's frame, 0x2a101220941). Dump it out -
> 0x2a101220941+0x7bb::print nfs4_error_t
{
    error = 0
    stat = 0t10006 (NFS4ERR_SERVERFAULT)
    rpc_status = 0 (RPC_SUCCESS)
}
This reveals a secondary problem in the code which is that there are no
checks for errors like NFS4ERR_SERVERFAULT (again, now fixed).
( Jun 14 2005, 12:53:42 PM EDT / Jun 14 2005, 11:50:01 AM EDT )