Saturday August 01, 2009
One of those memory leaks is still there on the DS

Looks like my new code is not complete:


> ::findleaks
CACHE             LEAKED           BUFCTL CALLER
ffffff01c682a860       1 ffffff01d79523d0 dserv_mds_do_reportavail+0x210
ffffff01c68262e0       4 ffffff01ee2a9118 mds_compound+0x54
ffffff01c682b2e0       3 ffffff01e8721738 mds_compound+0x193
ffffff01c6828020       1 ffffff01f274dc00 mds_get_server_impl_id+0x30
ffffff01c68262e0       1 ffffff01e96acb40 mds_get_server_impl_id+0x58
ffffff01c6826b20       1 ffffff01e128de70 mds_get_server_impl_id+0x8a
ffffff01c6828860       1 ffffff01d87c66d8 modinstall+0x129
ffffff01c682b2e0       1 ffffff01ddb51748 modinstall+0x129
ffffff01c6828860       1 ffffff01d7f8f9b0 modinstall+0x129
ffffff01c6828020       1 ffffff01dd7db798 rpc_init_taglist+0x25
ffffff01c6828020   12741 ffffff01f180fdf8 rpc_init_taglist+0x25
ffffff01c6828020       1 ffffff01e3ede5e8 rpc_init_taglist+0x25
ffffff01c6828020       1 ffffff09ba2e4da0 rpc_init_taglist+0x25
ffffff01c6828020   23152 ffffff01e2e37cc8 rpc_init_taglist+0x25
ffffff01c68265a0       1 ffffff01ee74bd30 tohex+0x32
ffffff01c6828020       2 ffffff01d632b880 xdr_array+0xae
ffffff01c6828020       1 ffffff01fe11ec00 xdr_array+0xae
ffffff01c68282e0       1 ffffff01e5e4c2b0 xdr_bytes+0x70
ffffff01c68262e0    1659 ffffff01eaae3700 xdr_bytes+0x70
ffffff01c68285a0       1 ffffff01fe136de0 xdr_bytes+0x70
ffffff01c68262e0 1571800 ffffff01e7dde3b8 xdr_bytes+0x70
------------------------------------------------------------------------
           Total 1609375 buffers, 26900440 bytes
> ffffff01e7dde3b8$<bufctl_audit
            ADDR          BUFADDR        TIMESTAMP           THREAD
                            CACHE          LASTLOG         CONTENTS
ffffff01e7dde3b8 ffffff01ea091bc8     4d6b30e7ab91 ffffff01d8245b40
                 ffffff01c68262e0 ffffff01c6b37000 ffffff01cc88be60
                 kmem_cache_alloc_debug+0x283
                 kmem_cache_alloc+0xa9
                 kmem_alloc+0xa3
                 xdr_bytes+0x70
                 xdr_mds_sid+0x21
                 xdr_ds_fh_v1+0x68
                 xdr_ds_fh+0x3f
                 xdr_decode_nfs41_fh+0xdd
                 xdr_snfs_argop4+0x5e
                 xdr_COMPOUND4args_srv+0xf4
                 svc_authany_wrap+0x22
                 svc_cots_kgetargs+0x41
                 dispatch_dserv_nfsv41+0x5d
                 svc_getreq+0x20d
                 svc_run+0x197

By the way, the entries showing only 1 or 2 leaked buffers are probably memory that was still in active use when I forced the core.
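As a cross-check on the ::findleaks output above, here is a small user-space C sketch that sums the LEAKED column per caller. The counts and caller names are copied from the output; the struct and function names (leak_row, leaked_by_caller, leaked_at_most) are made up for this illustration and are not part of mdb or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the ::findleaks table: CALLER and LEAKED only. */
struct leak_row {
	const char	*caller;
	long		leaked;
};

/* Values copied by hand from the ::findleaks output in this post. */
static const struct leak_row leak_rows[] = {
	{ "dserv_mds_do_reportavail+0x210", 1 },
	{ "mds_compound+0x54", 4 },
	{ "mds_compound+0x193", 3 },
	{ "mds_get_server_impl_id+0x30", 1 },
	{ "mds_get_server_impl_id+0x58", 1 },
	{ "mds_get_server_impl_id+0x8a", 1 },
	{ "modinstall+0x129", 1 },
	{ "modinstall+0x129", 1 },
	{ "modinstall+0x129", 1 },
	{ "rpc_init_taglist+0x25", 1 },
	{ "rpc_init_taglist+0x25", 12741 },
	{ "rpc_init_taglist+0x25", 1 },
	{ "rpc_init_taglist+0x25", 1 },
	{ "rpc_init_taglist+0x25", 23152 },
	{ "tohex+0x32", 1 },
	{ "xdr_array+0xae", 2 },
	{ "xdr_array+0xae", 1 },
	{ "xdr_bytes+0x70", 1 },
	{ "xdr_bytes+0x70", 1659 },
	{ "xdr_bytes+0x70", 1 },
	{ "xdr_bytes+0x70", 1571800 },
};

#define	NROWS	(sizeof (leak_rows) / sizeof (leak_rows[0]))

/* Sum LEAKED for one caller, or for every row when caller is NULL. */
static long
leaked_by_caller(const char *caller)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < NROWS; i++) {
		if (caller == NULL || strcmp(leak_rows[i].caller, caller) == 0)
			sum += leak_rows[i].leaked;
	}
	return (sum);
}

/* Sum LEAKED over the rows whose count is no larger than max. */
static long
leaked_at_most(long max)
{
	long sum = 0;
	size_t i;

	assert(max >= 0);
	for (i = 0; i < NROWS; i++) {
		if (leak_rows[i].leaked <= max)
			sum += leak_rows[i].leaked;
	}
	return (sum);
}
```

Calling leaked_by_caller(NULL) gives 1609375, which matches the Total line. leaked_by_caller("xdr_bytes+0x70") gives 1573461 and leaked_by_caller("rpc_init_taglist+0x25") gives 35896, so those two callers cover nearly everything. leaked_at_most(2) gives 16, so the rows of 1 or 2 are noise next to the real leaks.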

So this is the second bug I claimed to have fixed earlier today. Of note is that we never saw a panic, so at least something is correct. And I decided to fix the rpc_init_taglist bug while I was at it.

I'm going to need to add some DTrace to track down what is happening here...

Aargh! I say, aargh! nfs4_xdr.c belongs to the nfs module and not the nfssrv module. For quick turnaround, I've only been rebuilding nfssrv and not the whole kernel. It was only when I changed just nfs4_xdr.c and tried a dmake in src/uts/intel/nfssrv that I noticed nothing happened. My code may be golden after all! If it compiles, that is.

Okay, I made some other changes, but here is my compiling code:

                case OP_PUTFH: {
                        nfs_fh4 *obj = &array[i].nfs_argop4_u.opputfh.object;

                        if (obj->nfs_fh4_val == NULL)
                                continue;

                        DTRACE_NFSV4_1(xdr__i__op_putfh_version, uint32_t,
                            minorversion);
                        if (minorversion != 0) {
                                struct mds_ds_fh        *dsfh =
                                    (struct mds_ds_fh *)obj->nfs_fh4_val;

                                DTRACE_NFSV4_1(xdr__i__op_putfh_type,
                                    nfs41_fh_type_t, dsfh->type);

                                /*
                                 * Is it really a DS filehandle?
                                 */
                                if (dsfh->type == FH41_TYPE_DMU_DS) {
                                        mds_sid *sid = &dsfh->fh.v1.mds_sid;

                                        DTRACE_NFSV4_1(xdr__i__op_putfh_sid,
                                            mds_sid *, sid);

                                        if (sid->val) {
                                                kmem_free(sid->val, sid->len);
                                        }
                                }
                        }

                        kmem_free(obj->nfs_fh4_val, obj->nfs_fh4_len);
                        continue;
                }
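
The ordering in that fix (free the nested sid buffer, then the filehandle buffer that holds the pointer to it) can be tried out in user space. What follows is only a sketch under stated assumptions: the sketch_* types, the counting allocator, and the type value 2 are hypothetical stand-ins for mds_ds_fh, kmem_alloc/kmem_free, and FH41_TYPE_DMU_DS, not the real kernel definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for FH41_TYPE_DMU_DS. */
#define	SKETCH_FH_TYPE_DS	2

typedef struct sketch_sid {
	size_t	len;
	char	*val;		/* separately allocated inner buffer */
} sketch_sid_t;

typedef struct sketch_fh {
	int		type;
	sketch_sid_t	sid;
} sketch_fh_t;

/* Buffers handed out and not yet returned. */
static long live_buffers;

static void *
counted_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p != NULL)
		live_buffers++;
	return (p);
}

static void
counted_free(void *p)
{
	if (p != NULL) {
		assert(live_buffers > 0);
		live_buffers--;
		free(p);
	}
}

/* "Decode" a DS filehandle: one outer buffer plus one inner sid buffer. */
static sketch_fh_t *
sketch_fh_decode(const char *sid_bytes, size_t len)
{
	sketch_fh_t *fh = counted_alloc(sizeof (*fh));

	if (fh == NULL)
		return (NULL);
	fh->type = SKETCH_FH_TYPE_DS;
	fh->sid.val = counted_alloc(len);
	if (fh->sid.val == NULL) {
		counted_free(fh);
		return (NULL);
	}
	memcpy(fh->sid.val, sid_bytes, len);
	fh->sid.len = len;
	return (fh);
}

/* Frees only the outer buffer, so the inner sid buffer is orphaned. */
static void
sketch_fh_free_shallow(sketch_fh_t *fh)
{
	counted_free(fh);
}

/* Frees the inner sid buffer first, then the outer buffer. */
static void
sketch_fh_free_deep(sketch_fh_t *fh)
{
	if (fh == NULL)
		return;
	if (fh->type == SKETCH_FH_TYPE_DS && fh->sid.val != NULL)
		counted_free(fh->sid.val);
	counted_free(fh);
}

/*
 * Run one decode/free cycle and return how many buffers it left behind.
 * The shallow case is cleaned up afterwards so the demo itself stays tidy.
 */
static long
sketch_leaked_after(int deep)
{
	long before = live_buffers;
	long leaked;
	char *inner;
	sketch_fh_t *fh = sketch_fh_decode("sid", 3);

	if (fh == NULL)
		return (-1);
	inner = fh->sid.val;
	if (deep)
		sketch_fh_free_deep(fh);
	else
		sketch_fh_free_shallow(fh);
	leaked = live_buffers - before;
	if (!deep)
		counted_free(inner);
	return (leaked);
}
```

sketch_leaked_after(0) returns 1, one orphaned buffer per filehandle, which is the shape of the xdr_bytes+0x70 leak. sketch_leaked_after(1) returns 0.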

And I added this simple DTrace script:

[root@pnfs-17-22 ~]> more ds.d 
#!/usr/sbin/dtrace -s

nfsv4:::xdr-i-op_putfh_version
{
        printf("xdr decode a FH -- version == %u",
            (uint32_t)arg0);
}

nfsv4:::xdr-i-op_putfh_type
{
        printf("xdr decode a FH -- type == %s",
            (int)arg0 == 2 ? "DS" : "regular");
}

nfsv4:::xdr-i-op_putfh_sid
{
        sid = (mds_sid *)arg0;

        printf("xdr decode a FH -- sid == %s",
            sid == NULL ? "(null)" : "valid");
}

Which shows:

[root@pnfs-17-22 ~]> ./ds.d
dtrace: script './ds.d' matched 3 probes
CPU     ID                    FUNCTION:NAME
  0   2834 xdr_snfs_argop4_free:xdr-i-op_putfh_version xdr decode a FH -- version == 1
  0   2833 xdr_snfs_argop4_free:xdr-i-op_putfh_type xdr decode a FH -- type == DS
  0   2832 xdr_snfs_argop4_free:xdr-i-op_putfh_sid xdr decode a FH -- sid == valid
  0   2834 xdr_snfs_argop4_free:xdr-i-op_putfh_version xdr decode a FH -- version == 1
  0   2833 xdr_snfs_argop4_free:xdr-i-op_putfh_type xdr decode a FH -- type == DS
  0   2832 xdr_snfs_argop4_free:xdr-i-op_putfh_sid xdr decode a FH -- sid == valid

But I still have to check back later to see if there are memory leaks!

I've been trying to show how you would use kmdb and ::findleaks to track down memory leaks. You need to do this with XDR code, even the machine-generated stuff. You also need to do it before you integrate and not after. I've fixed two leaks that were pre-existing. They would probably have gone unnoticed until either someone had a regression test session flunk because of accumulated memory leaks (the mds_sid leaks would do it) or we sat down to find them before shipping code.

The other thing about memory leaks is that you have to test again after you fix them: you might find more, find out your fix didn't work, or find out your fix uncovered others.

And perhaps it is time to remind you of my other disclaimer: I don't hide my braindead mistakes. I show them in hopes that someone can learn from them, even if it is just me. :->


Originally posted on Kool Aid Served Daily
Copyright (C) 2009, Kool Aid Served Daily
