Jonathan Adams's Weblog

Tuesday Feb 21, 2006

Some block comments about libumem

One of the projects I've been working on recently is a wad covering the following bugs:

4720206 ::findleaks shouldn't cache results across state changes
4743353 libumem's module fails to load on idle targets 
6304072 libumem seems to use more heap than it needs
6336202 d4fc7824::typegraph made mdb crash

As part of that work, I wrote some ASCII-art comments describing the layout of a umem slab and of a malloc()ed buffer, which I thought might be of more general interest. Here are the block comments:
/*
 * Each slab in a given cache is the same size, and has the same
 * number of chunks in it;  we read in the first slab on the
 * slab list to get the number of chunks for all slabs.  To
 * compute the per-slab overhead, we just subtract the chunk usage
 * from the slabsize:
 *
 * +------------+-------+-------+ ... --+-------+-------+-------+
 * |////////////|       |       | ...   |       |///////|///////|
 * |////color///| chunk | chunk | ...   | chunk |/color/|/slab//|
 * |////////////|       |       | ...   |       |///////|///////|
 * +------------+-------+-------+ ... --+-------+-------+-------+
 * |            \_______chunksize * chunks_____/                |
 * \__________________________slabsize__________________________/
 *
 * For UMF_HASH caches, there is an additional source of overhead;
 * the external umem_slab_t and per-chunk bufctl structures.  We
 * include those in our per-slab overhead.
 *
 * Once we have a number for the per-slab overhead, we estimate
 * the actual overhead by treating the malloc()ed buffers as if
 * they were densely packed:
 *
 *      additional overhead = (# mallocs) * (per-slab) / (chunks);
 *
 * carefully ordering the multiply before the divide, to avoid
 * round-off error.
 */
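To make the arithmetic in that comment concrete, here is a minimal sketch of the estimate. It is not the code from the mdb module: the function names, the `external` parameter (standing in for the UMF_HASH slab and bufctl structures), and the sample numbers are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: per-slab overhead is whatever part of the slab
 * is not chunk space, plus any externally allocated bookkeeping
 * (pass 0 for "external" when the cache is not a UMF_HASH cache).
 */
static size_t
sketch_per_slab_overhead(size_t slabsize, size_t chunksize, size_t chunks,
    size_t external)
{
	assert(chunksize * chunks <= slabsize);
	return (slabsize - chunksize * chunks + external);
}

/*
 * Hypothetical helper: treat the malloc()ed buffers as densely packed
 * and charge each one its share of the per-slab overhead.  The
 * multiply happens before the divide so that the integer division
 * does not throw away the fractional part too early.
 */
static size_t
sketch_estimate_overhead(size_t mallocs, size_t per_slab, size_t chunks)
{
	assert(chunks != 0);
	return (mallocs * per_slab / chunks);
}
```

With an assumed 4096-byte slab holding 84 chunks of 48 bytes, the per-slab overhead is 64 bytes. For 100 mallocs, multiplying first gives 6400 / 84 = 76 bytes of estimated overhead, while dividing first would give 100 * (64 / 84) = 0.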
...
/*
 * A malloc()ed buffer looks like:
 *
 *      <----------- mi.malloc_size --->
 *      <----------- cp.cache_bufsize ------------------>
 *      <----------- cp.cache_chunksize -------------------------------->
 *      +-------+-----------------------+---------------+---------------+
 *      |/tag///| mallocsz              |/round-off/////|/debug info////|
 *      +-------+---------------------------------------+---------------+
 *              <-- usable space ------>
 *
 * mallocsz is the argument to malloc(3C).
 * mi.malloc_size is the actual size passed to umem_alloc(), which
 * rounds it up to the smallest available cache size that fits; that
 * size is cache_bufsize.  If there is debugging or alignment overhead
 * in the cache, that is reflected in a larger cache_chunksize.
 *
 * The tag at the beginning of the buffer is either 8 bytes or 16 bytes,
 * depending upon the ISA's alignment requirements.  For 32-bit allocations,
 * it is always an 8-byte tag.  For 64-bit allocations larger than 8 bytes,
 * the tag has 8 bytes of padding before it.
 *
 * 32-bit buffers, and 64-bit buffers <= 8 bytes:
 *      +-------+-------+--------- ...
 *      |/size//|/stat//| mallocsz ...
 *      +-------+-------+--------- ...
 *                      ^
 *                      pointer returned from malloc(3C)
 *
 * 64-bit buffers > 8 bytes:
 *      +---------------+-------+-------+--------- ...
 *      |/padding///////|/size//|/stat//| mallocsz ...
 *      +---------------+-------+-------+--------- ...
 *                                      ^
 *                                      pointer returned from malloc(3C)
 *
 * The "size" field is "malloc_size", which is mallocsz + the padding.
 * The "stat" field is derived from malloc_size, and functions as a
 * validation that this buffer is actually from malloc(3C).
 */
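The tag layout in that comment can be sketched in a few lines of C. This is an illustration under stated assumptions, not libumem's implementation: the `sketch_` names and the `SKETCH_MAGIC` constant are invented, the "stat = magic - size" encoding is just one plausible way to derive a check value from the size, and I read the diagram as saying that the stored size covers the tag (and padding) as well as mallocsz.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SKETCH_MAGIC	0x5ca1ab1eU	/* hypothetical check constant */

/* Two 4-byte fields, matching the 8-byte "size" + "stat" tag. */
typedef struct sketch_tag {
	uint32_t size;	/* total size: mallocsz + tag (+ padding) */
	uint32_t stat;	/* derived from size; validates the buffer */
} sketch_tag_t;

/* 8 bytes of tag, plus 8 bytes of padding for 64-bit buffers > 8 bytes. */
static size_t
sketch_tag_overhead(int is_64bit, size_t mallocsz)
{
	if (is_64bit && mallocsz > 8)
		return (sizeof (sketch_tag_t) + 8);
	return (sizeof (sketch_tag_t));
}

/*
 * Write the tag into "raw", which must have room for the overhead plus
 * mallocsz bytes, and return the pointer a caller of malloc would see.
 */
static void *
sketch_tag_buffer(void *raw, int is_64bit, size_t mallocsz)
{
	size_t overhead = sketch_tag_overhead(is_64bit, mallocsz);
	sketch_tag_t tag;

	tag.size = (uint32_t)(mallocsz + overhead);
	tag.stat = SKETCH_MAGIC - tag.size;	/* assumed encoding */
	/* The tag sits immediately before the user pointer. */
	memcpy((char *)raw + overhead - sizeof (tag), &tag, sizeof (tag));
	return ((char *)raw + overhead);
}

/* Check that the 8 bytes before "user" look like one of our tags. */
static int
sketch_tag_valid(const void *user)
{
	sketch_tag_t tag;

	memcpy(&tag, (const char *)user - sizeof (tag), sizeof (tag));
	return ((uint32_t)(tag.stat + tag.size) == SKETCH_MAGIC);
}
```

Tagging a 24-byte request in a 64-bit process, for example, returns a pointer 16 bytes into the raw buffer, and corrupting either tag field makes the validity check fail.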
For more details on how umem works, you can look at the kmem and vmem papers:

The Slab Allocator: An Object-Caching Kernel Memory Allocator, Summer USENIX 1994
Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources, USENIX 2001
