I'm in the process of making sure the MDS cleans up memory when a DS reboots. As such, I've been trying to figure out where a refcnt is held or released for a rfs4_dbe. I added two circular buffers to keep track of these events.
I just did 10 back-to-back cthon test runs and turned my head away. I was trying to figure out if mds_gather_devs leaked ds_addrlist_ts. The answer is no, but I did get a panic.
[root@pnfs-17-24 ~]> panic[cpu1]/thread=ffffff01d16dee80: assertion failed: e->refcnt > 1, file: ../../common/fs/nfs/nfs4_db.c, line: 193 ffffff00094836d0 genunix:assfail+7e () ffffff0009483700 nfssrv:rfs4_dbe_rele+98 () ffffff0009483720 nfssrv:rfs41_session_rele+1a () ffffff0009483740 nfssrv:rfs41_compound_state_free+8a () ffffff00094837d0 nfssrv:rfs41_dispatch+1bd () ffffff0009483820 nfssrv:rfs4_minor_version_dispatch+5c () ffffff0009483b20 nfssrv:common_dispatch+7a6 () ffffff0009483b40 nfssrv:rfs_dispatch+2d () ffffff0009483c30 rpcmod:svc_getreq+20d () ffffff0009483ca0 rpcmod:svc_run+197 () ffffff0009483cd0 rpcmod:svc_do_run+81 () ffffff0009484eb0 nfs:nfssys+9f1 () ffffff0009484f00 unix:brand_sys_syscall32+292 () panic: entering debugger (continue to save dump)
And this is probably something I added - not something already existing.
So I have these arrays, how do I figure out what is going on?
[1]> $C fffffffffbc65740 kmdb_enter+0xb() fffffffffbc65760 debug_enter+0x38(fffffffffb961ea0) fffffffffbc65830 panicsys+0x40e(fffffffffbf50518, ffffff0009483660, fffffffffbc65840, 1) ffffff00094835a0 vpanic+0x15d() ffffff0009483690 panic+0x94() ffffff00094836d0 assfail+0x7e(fffffffff81e4910, fffffffff81e48f0, c1) ffffff0009483700 nfssrv`rfs4_dbe_rele+0x98(ffffff01e678fad8) ffffff0009483720 nfssrv`rfs41_session_rele+0x1a(ffffff01e678fb68) ffffff0009483740 nfssrv`rfs41_compound_state_free+0x8a(ffffff027ffb7c20) ffffff00094837d0 nfssrv`rfs41_dispatch+0x1bd(ffffff0009483bb0, ffffff01f18ab940 , ffffff0009483880) ffffff0009483820 nfssrv`rfs4_minor_version_dispatch+0x5c(ffffff0009483bb0, ffffff01f18ab940, ffffff0009483880) ffffff0009483b20 nfssrv`common_dispatch+0x7a6(ffffff0009483bb0, ffffff01f18ab940 , 2, 4, fffffffff81e1e40, ffffffffc011e3e0) ffffff0009483b40 nfssrv`rfs_dispatch+0x2d(ffffff0009483bb0, ffffff01f18ab940) ffffff0009483c30 rpcmod`svc_getreq+0x20d(ffffff01f18ab940, ffffff01d63f2f60) ffffff0009483ca0 rpcmod`svc_run+0x197(ffffff01e4a0bbb8) ffffff0009483cd0 rpcmod`svc_do_run+0x81(1) ffffff0009484eb0 nfs`nfssys+0x9f1(e, fe580fc0) ffffff0009484f00 sys_syscall32+0x1fc()
The rfs4_dbe_t is at ffffff01e678fad8:
[1]> ffffff01e678fad8::print struct rfs4_dbe
{
lock = [
{
_opaque = [ 0xffffff01d16dee80 ]
}
]
refcnt = 0x1
skipsearch = 0x1
invalid = 0x1
reserved = 0x3addcafe
time_rele = 2009 May 5 16:17:41
inval_hint = nfssrv`mds_session_inval+0x9e
id = 0x64
cv = [
{
_opaque = 0
}
]
data = 0xffffff01e678fb68
rtr = {
rtr_count = 0x3
rtr_rele_idx = 0x1
rtr_hold_idx = 0x2
rtr_rele = 0xffffff01e7a9c1b0
rtr_hold = 0xffffff01e7a9c180
}
table = 0xffffff01d9494700
indices = [
{
next = 0
prev = 0
entry = 0xffffff01e678fad8
}
]
}
The rtr structure holds the refcnt tracking data. We are tracking 3 items per buffer and we can see the last ones recorded would have been rele=0 and hold=1. (I.e., we are going to write at rele=1 and hold=2, so we subtract 1 to get the previous record.)
I can use a dcmd to see each array:
[1]> 0xffffff01e7a9c1b0::array caddr_t 3 |::print caddr_t nfssrv`mds_destroysession+0x3e nfssrv`rfs41_compound_state_free+0x8a nfssrv`mds_session_inval+0xa7 [1]> 0xffffff01e7a9c180::array caddr_t 3 |::print caddr_t nfssrv`rfs4_dbsearch+0x154 nfssrv`rfs4_dbsearch+0x154 nfssrv`rfs4_dbsearch+0x154
I need to script this up to run from the command line, but for now this will suffice. Hmm, the rfs4_dbsearch information isn't too useful. It would be better to have recorded the caller of it. I have that capability, I'll just need to make sure to call it in rfs4_dbsearch.
Okay, the buffer size is too small here to help. But I could bump it up and rerun the tests. I'm pretty confident that it will trip again. But, I also happen to know I have made changes here and I can go revisit those changes to see where I made a mistake.