Detecting Memory Corruption with Solaris' libumem – Getting Started
Memory corruption incidents can be tough to handle. Solaris' libumem (an alternative allocation library) offers a debugging facility that is useful for inspecting memory when trying to detect leaks or corruption. Using libumem is recommended in any case when running a multi-threaded application (especially on a multi-processor or other multi-threaded architecture), and it can also be used purely for debugging.
There are good documents describing this facility in detail (see below). Here I will try to give a quick starting guide.
MDB (the modular debugger) is the front-end tool used to retrieve the debugging data that libumem collected. It works with a core file, so one should be generated (automatically, or manually with gcore). In order to start working with the basic (and powerful) features, you should:
- pre-load the libumem library and set environment variables for debugging
- get basic familiarity with the libumem buffer structure in debug mode
- be familiar with a few mdb commands
Here is a short description of those 3 items:
Pre-loading and Environment Variables
Have these settings active when and where you are running your application:
Pre-load libumem:
export LD_PRELOAD=libumem.so.1
(or setenv LD_PRELOAD libumem.so.1 in csh)
Define UMEM_DEBUG and UMEM_LOGGING, for example (ksh/bash):
export UMEM_DEBUG=default
(or setenv UMEM_DEBUG default in csh)
export UMEM_LOGGING=transaction
(or setenv UMEM_LOGGING transaction in csh)
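Exporting these variables affects every command started later from the same shell. If you prefer to limit the settings to one run of your application, a small wrapper can do it. This is only a minimal sketch: the helper name run_with_umem and the program name ./myapp are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run one command with libumem pre-loaded and the
# debugging variables from above set for that command only.
run_with_umem() {
    env LD_PRELOAD=libumem.so.1 \
        UMEM_DEBUG=default \
        UMEM_LOGGING=transaction \
        "$@"
}

# Example usage (./myapp is a placeholder for your application):
#   run_with_umem ./myapp
```

The variables are passed through env, so the calling shell itself stays untouched.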
Buffer Structure (when using debug)
Libumem uses memory caches, each containing a set of buffers of a pre-defined size. Thus, there might be one cache for 16-byte buffers, another for 512-byte buffers, etc. Each allocated buffer is structured this way:
| Metadata (8 bytes) | User Data | Redzone (8 bytes) | Debug metadata (8 bytes) |
The first 8 bytes of metadata are ignored here; we are interested in the user data, redzone and debug metadata segments.
Zooming in on the structure of these segments:
User Data
| Application available memory (uninitialized memory is set to 0xbaddcafe) | '0xbb', denotes end of application buffer | Rest of the allocated buffer (uninitialized memory is set to 0xbaddcafe) |
- The value '0xbaddcafe' is written to all uninitialized memory of the user data segment.
Redzone
| Value of '0xfeedface' (4 bytes) | Integer value (4 bytes) from which the application allocation size can be calculated |
- The application allocation size is calculated from the last 4 bytes of the redzone (let's denote their decimal integer value by x): allocation-size = ((x - 1) / 251) - 8
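The formula is easy to evaluate with shell arithmetic. The sketch below is only an illustration: the helper name decode_alloc_size is hypothetical, and the sample redzone value 0x38df was made up for this example (it is not taken from a real dump).

```shell
#!/bin/sh
# Hypothetical helper: decode the application allocation size from the
# last 4 bytes of the redzone, using the formula given above.
# $1 is the redzone integer x, in decimal or 0x-prefixed hex.
decode_alloc_size() {
    x=$(( $1 ))
    echo $(( (x - 1) / 251 - 8 ))
}

# Made-up sample value: 0x38df is 14559 in decimal.
decode_alloc_size 0x38df    # prints 50
```

Accepting hex input is convenient because mdb dumps are usually read in hex.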
Debug metadata
| Pointer to umem_bufctl_audit structure (4 bytes) | Checksum value (4 bytes) |
- We'll see in a minute that the umem_bufctl_audit structure, which includes the stack trace of the allocation, can be dumped inside mdb.
- XORing the pointer to umem_bufctl_audit (first 4 bytes) with the checksum value should result in the value 0xa110c8ed. If not, this segment is probably corrupted.
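The XOR check can also be done outside mdb once you have read the two 4-byte words from a dump. This is a minimal sketch: the helper name check_debug_metadata is hypothetical, and the pointer and checksum values are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: check the debug metadata of a buffer.
# $1 = pointer to umem_bufctl_audit, $2 = checksum (0x-prefixed hex).
# Succeeds (exit status 0) when the XOR of the two equals 0xa110c8ed.
check_debug_metadata() {
    [ $(( $1 ^ $2 )) -eq $(( 0xa110c8ed )) ]
}

# Made-up sample values, chosen so that the check passes.
if check_debug_metadata 0x0003d8c0 0xa113102d; then
    echo "debug metadata looks intact"
else
    echo "debug metadata is probably corrupted"
fi
```

If the check fails, the pointer is not trustworthy and should not be handed to mdb as a bufctl_audit address.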
A Few MDB Commands to start with and references to examples
Invoke mdb on a core file, simply by:
# mdb core-file
Within the mdb prompt, you might:
scan allocated buffers for potential out-of-bounds writes:
> ::umem_verify
You will get a list like:
...
umem_alloc_64 2e608 clean
umem_alloc_80 2e808 1 corrupt buffer
...
Note that “_64” and “_80” are the sizes of the user data described before. Use the address in the column next to the cache name for the next step.
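When the list is long, the caches that need attention can be filtered out with a short script. This is only a sketch: the helper name list_suspect_caches is hypothetical, and it assumes the column layout of the sample above (cache name, address, status).

```shell
#!/bin/sh
# Hypothetical helper: read ::umem_verify output on stdin and print the
# name and address of every cache that is not reported as "clean".
# Assumes the column layout shown above: name, address, status.
list_suspect_caches() {
    awk 'NF >= 3 && $2 ~ /^[0-9a-f]+$/ && $3 != "clean" { print $1, $2 }'
}

# Example, using the two sample lines from above:
printf '%s\n' \
    'umem_alloc_64 2e608 clean' \
    'umem_alloc_80 2e808 1 corrupt buffer' | list_suspect_caches
# prints: umem_alloc_80 2e808
```

Assuming mdb accepts commands on standard input in your environment, the ::umem_verify output could be piped straight into this filter.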
You can then run ::umem_verify on the specific cache:
> address::umem_verify
This gives you the addresses of the corrupted buffers. Dump as many bytes as you need in order to get to the bufctl_audit structure:
> buffer-address/numberOfBytesX (e.g., > 37f88/90X)
Match the buffer structure (explained before) with the dumped data, and retrieve the pointer to the bufctl_audit structure. Then run
> bufctl_audit-ptr::bufctl_audit
If the debug data is not corrupted, you will get the buffer information, including the allocation stack trace.
See an example here, look for 'Traditional Memory Corruption'
still on out-of-bounds writes
Sometimes the allocation stack is not sufficient. To generate a core as soon as such an offending write occurs, you can try a hidden feature. It comes with a performance impact and memory overhead, so it will probably not fit all cases.
Set UMEM_DEBUG="firewall=1" UMEM_OPTIONS="backend=mmap" and run your application.
check memory status
> ::umem_status
This will help you to detect modify-after-free incidents. See here
See Also
Identifying Memory Management Bugs Within Applications Using the libumem Library
Using libumem to detect modify-after-free corruptions
Using libumem to detect write-beyond-what-you-allocate errors
http://blogs.sun.com/jwadams/entry/debugging_with_libumem_and_mdb