Jonathan Adams's Weblog
Tuesday Feb 21, 2006
Some block comments about libumem
One of the projects I've been working on recently is a wad covering the following bugs:
4720206 ::findleaks shouldn't cache results across state changes
4743353 libumem's module fails to load on idle targets
6304072 libumem seems to use more heap than it needs
6336202 d4fc7824::typegraph made mdb crash

As part of it, I made some ASCII-art comments describing the layout of a umem buffer and slab, which I thought might be of interest more generally. Here are the block comments:
/*
 * Each slab in a given cache is the same size, and has the same
 * number of chunks in it; we read in the first slab on the
 * slab list to get the number of chunks for all slabs.  To
 * compute the per-slab overhead, we just subtract the chunk usage
 * from the slabsize:
 *
 *  +------------+-------+-------+ ... --+-------+-------+-------+
 *  |////////////|       |       | ...   |       |///////|///////|
 *  |////color///| chunk | chunk | ...   | chunk |/color/|/slab//|
 *  |////////////|       |       | ...   |       |///////|///////|
 *  +------------+-------+-------+ ... --+-------+-------+-------+
 *               |        \_______chunksize * chunks_____/       |
 *  \__________________________slabsize__________________________/
 *
 * For UMF_HASH caches, there is an additional source of overhead;
 * the external umem_slab_t and per-chunk bufctl structures.  We
 * include those in our per-slab overhead.
 *
 * Once we have a number for the per-slab overhead, we estimate
 * the actual overhead by treating the malloc()ed buffers as if
 * they were densely packed:
 *
 *      additional overhead = (# mallocs) * (per-slab) / (chunks);
 *
 * carefully ordering the multiply before the divide, to avoid
 * round-off error.
 */

...

/*
 * A malloc()ed buffer looks like:
 *
 *      <----------- mi.malloc_size --->
 *      <----------- cp.cache_bufsize ------------------>
 *      <----------- cp.cache_chunksize -------------------------------->
 *      +-------+-----------------------+---------------+---------------+
 *      |/tag///| mallocsz              |/round-off/////|/debug info////|
 *      +-------+---------------------------------------+---------------+
 *              <-- usable space ------>
 *
 * mallocsz is the argument to malloc(3C).
 * mi.malloc_size is the actual size passed to umem_alloc(), which
 * is rounded up to the smallest available cache size, which is
 * cache_bufsize.  If there is debugging or alignment overhead in
 * the cache, that is reflected in a larger cache_chunksize.
 *
 * The tag at the beginning of the buffer is either 8 bytes or 16 bytes,
 * depending upon the ISA's alignment requirements.  For 32-bit allocations,
 * it is always an 8-byte tag.  For 64-bit allocations larger than 8 bytes,
 * the tag has 8 bytes of padding before it.
 *
 * 32-bit, 64-bit buffers <= 8 bytes:
 *      +-------+-------+--------- ...
 *      |/size//|/stat//| mallocsz ...
 *      +-------+-------+--------- ...
 *                      ^
 *                      pointer returned from malloc(3C)
 *
 * 64-bit buffers > 8 bytes:
 *      +---------------+-------+-------+--------- ...
 *      |/padding///////|/size//|/stat//| mallocsz ...
 *      +---------------+-------+-------+--------- ...
 *                                      ^
 *                                      pointer returned from malloc(3C)
 *
 * The "size" field is "malloc_size", which is mallocsz + the padding.
 * The "stat" field is derived from malloc_size, and functions as a
 * validation that this buffer is actually from malloc(3C).
 */

For more details on how umem works, you can look at the kmem and vmem papers:
The Slab Allocator: An Object-Caching Kernel Memory Allocator, Summer USENIX 1994
Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources, USENIX 2001
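
To make the arithmetic in the two block comments above concrete, here is a minimal sketch in C. It is not the code from the mdb module: the function names and the external_bytes parameter (standing in for the external umem_slab_t and bufctl sizes of a UMF_HASH cache) are hypothetical, and the tag-size helper only restates the 8-byte/16-byte rule from the second comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Per-slab overhead: the part of the slab that is not chunk storage
 * (the color and slab regions in the first diagram).  external_bytes
 * is a hypothetical parameter; pass the size of any external
 * structures for a UMF_HASH cache, or 0 otherwise.
 */
uint64_t
per_slab_overhead(uint64_t slabsize, uint64_t chunksize, uint64_t chunks,
    uint64_t external_bytes)
{
	assert(chunksize * chunks <= slabsize);
	return (slabsize - chunksize * chunks + external_bytes);
}

/*
 * Estimated overhead for `mallocs` buffers, treating them as densely
 * packed.  The multiply happens before the divide so that integer
 * division does not discard the fractional slab count.
 */
uint64_t
estimated_overhead(uint64_t mallocs, uint64_t per_slab, uint64_t chunks)
{
	assert(chunks > 0);
	return (mallocs * per_slab / chunks);
}

/*
 * Size of the region in front of the pointer returned by malloc(3C),
 * per the second comment: 8 bytes, except for 64-bit allocations
 * larger than 8 bytes, which get 8 bytes of padding plus the 8-byte tag.
 */
size_t
malloc_tag_size(int is_64bit, size_t mallocsz)
{
	if (is_64bit && mallocsz > 8)
		return (16);
	return (8);
}
```

For example, a 4096-byte slab holding 100 chunks of 40 bytes has 96 bytes of per-slab overhead. With 250 malloc()ed buffers, estimated_overhead(250, 96, 100) gives 240 bytes, whereas dividing first (250 / 100 = 2 in integer arithmetic) would give only 192.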
Tags: [ libumem, MDB, OpenSolaris, Solaris ]
Posted at 05:12PM Feb 21, 2006 by jwadams in Solaris