Memory corruption incidents can be tough to handle. Solaris' libumem (an alternative memory allocation library) offers a debugging facility that is useful for inspecting memory when trying to detect leaks or corruption. Using libumem is recommended anyway when running a multi-threaded application (especially on a multi-processor or other multi-threaded architecture), and it can also be used just for the sake of debugging.

There are good documents describing this facility in detail (see below). Here I will try to give a quick starting guide.

MDB (the modular debugger) is used as the front-end tool to retrieve the debugging data that libumem collected. It works with a core file, so one should be generated (automatically, or manually with gcore). To start working with the basic (and powerful) features, you should:

  • pre-load the libumem library and set environment variables for debugging

  • get basic familiarity with the libumem buffer structure in debug mode

  • be familiar with a few mdb commands

Here is a short description of those 3 items:

Pre-loading and Environment Variables

Have these settings active when and where you are running your application:

Pre-load libumem:

export LD_PRELOAD=libumem.so.1

(or setenv LD_PRELOAD libumem.so.1 in csh)

Define UMEM_DEBUG and UMEM_LOGGING, for example (ksh/bash):

export UMEM_DEBUG=default

(or setenv UMEM_DEBUG default in csh)

export UMEM_LOGGING=transaction

(or setenv UMEM_LOGGING transaction in csh)
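If you would rather not change your login shell, the same three settings can be applied to a single run from a small launcher. This is only a minimal sketch: the helper names umem_debug_env and run_with_umem and the ./my_app path are hypothetical, and nothing beyond the variables shown above is set.

```python
import os
import subprocess


def umem_debug_env(base=None):
    """Return a copy of the environment with the libumem debug settings added."""
    env = dict(os.environ if base is None else base)
    env["LD_PRELOAD"] = "libumem.so.1"    # pre-load libumem
    env["UMEM_DEBUG"] = "default"         # turn on the debugging features
    env["UMEM_LOGGING"] = "transaction"   # keep the transaction log
    return env


def run_with_umem(argv):
    """Run argv (a hypothetical application command line) under libumem debugging."""
    return subprocess.run(argv, env=umem_debug_env()).returncode


# Example with a hypothetical application path:
# run_with_umem(["./my_app"])
```

Only the child process sees the variables, so other programs started from the same shell are not affected.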



Buffer Structure (when using debug)

Libumem uses memory caches, each containing a set of buffers of a pre-defined size. Thus, there might be one cache for 16-byte buffers, another one for 512-byte buffers, etc. Each allocated buffer is structured this way:



Metadata (8 bytes) | User Data | Redzone (8 bytes) | Debug metadata (8 bytes)

The first 8 bytes of metadata are ignored here; we are interested in the user data, redzone and debug metadata segments.

Zooming in on the structure of these segments:


User Data

Application available memory (uninitialized memory is set to 0xbaddcafe) | 0xbb, denotes end of application buffer | Rest of the allocated buffer (uninitialized memory is set to 0xbaddcafe)

  • '0xbaddcafe' value is written to all uninitialized memory of the user data segment.

Redzone

Value of 0xfeedface (4 bytes) | Integer value (4 bytes) from which the application allocation size can be calculated

  • The application allocation size is calculated from the last 4 bytes of the redzone (let's denote their decimal integer value by x): allocation-size = ((x - 1) / 251) - 8
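The size formula above can be tried out with a few lines of Python. This is a sketch rather than libumem code: the function names are mine, and the divisibility check is an assumption that follows from the formula (a value that is not of the form 251 * n + 1 cannot have been produced by it).

```python
def decode_alloc_size(x):
    """Apply allocation-size = ((x - 1) / 251) - 8 to the redzone size word x."""
    if (x - 1) % 251 != 0:
        # Assumption: a word that does not fit the formula was overwritten.
        raise ValueError("redzone size word does not fit the encoding")
    return (x - 1) // 251 - 8


def encode_alloc_size(size):
    """Inverse of the formula, handy for building test values."""
    return (size + 8) * 251 + 1


print(decode_alloc_size(0x4e71))  # 0x4e71 = 20081 = 80 * 251 + 1, prints 72
```

For example, a redzone size word of 0x4e71 corresponds to an application request of 72 bytes.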

Debug metadata

Pointer to umem_bufctl_audit structure (4 bytes) | Checksum value (4 bytes)

  • We'll see in a minute that the umem_bufctl_audit structure, which includes the stack trace of the allocation, can be dumped inside mdb.

  • XORing the pointer to umem_bufctl_audit (first 4 bytes) with the checksum value should result in the value 0xa110c8ed. If it does not, this segment is probably corrupted.
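The XOR check on the debug metadata is easy to do by hand, but here is a small Python sketch of it. The function name and the sample pointer value are hypothetical; only the constant 0xa110c8ed comes from the description above.

```python
EXPECTED_XOR = 0xa110c8ed  # value the two debug metadata words should XOR to


def debug_metadata_ok(bufctl_ptr, checksum):
    """Return True if the pointer and checksum words are consistent."""
    return ((bufctl_ptr ^ checksum) & 0xffffffff) == EXPECTED_XOR


# Hypothetical pointer value, with the checksum that matches it.
ptr = 0x0003a0d8
good_checksum = ptr ^ EXPECTED_XOR
print(debug_metadata_ok(ptr, good_checksum))      # prints True
print(debug_metadata_ok(ptr, good_checksum + 1))  # prints False
```

If the check fails, the pointer should not be trusted as input for the bufctl_audit dump.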

A Few MDB Commands to start with and references to examples

Invoke mdb on a core file, simply by:

# mdb core-file

Within the mdb prompt, you might:

scan allocated buffers for potential out-of-bounds writes:

> ::umem_verify

You will get a list like:

...

umem_alloc_64 2e608 clean

umem_alloc_80 2e808 1 corrupt buffer

...

Note that “_64” and “_80” are the sizes of the user data described before. Use the address in the following column for the next step.
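When the list is long, you may want to pick out the problematic caches automatically. The sketch below assumes the output was saved to text and that each cache line looks like the two sample lines above (name, address, status); real output may include header lines or other columns, which the function simply skips unless the status mentions "corrupt". The function name is hypothetical.

```python
def corrupt_caches(verify_output):
    """Return (cache_name, cache_address) pairs whose status mentions 'corrupt'."""
    found = []
    for line in verify_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line, "..." or anything too short to be a cache line
        name, addr, status = fields[0], fields[1], " ".join(fields[2:])
        if "corrupt" in status:
            found.append((name, addr))
    return found


sample = """\
umem_alloc_64 2e608 clean
umem_alloc_80 2e808 1 corrupt buffer
"""
print(corrupt_caches(sample))  # prints [('umem_alloc_80', '2e808')]
```

Each returned address is what you would pass to ::umem_verify in the next step.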

You can then run ::umem_verify on the specific cache:

> address::umem_verify

The latter will give you the addresses of the corrupted buffers. Dump as much memory as you need in order to reach the pointer to the bufctl_audit structure:

> buffer-address/countX (e.g., > 37f88/90X; X prints 4-byte hexadecimal words, so the count is a number of words rather than bytes)

Match the buffer structure (explained before) with the dumped data, and retrieve the pointer to the bufctl_audit structure. Then run:

> bufctl_audit-ptr::bufctl_audit

If the debug data is not corrupted, you will get the buffer information, including the allocation stack trace.
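The matching step (finding the redzone in the dumped words and reading the two words that follow it) can be sketched in Python. Several things here are assumptions: the dump is taken from a 32-bit process, the words were already copied from the mdb output into a list in memory order, and the first 0xfeedface word in the list is the redzone. If the redzone itself was overwritten, the pattern will not be found and you have to locate it by offset instead. The function name and all sample values are hypothetical.

```python
REDZONE_PATTERN = 0xfeedface
EXPECTED_XOR = 0xa110c8ed


def parse_buffer_tail(words):
    """Locate the redzone in a list of 32-bit words and decode what follows it."""
    if REDZONE_PATTERN not in words:
        return None  # redzone missing, possibly overwritten
    i = words.index(REDZONE_PATTERN)
    if i + 3 >= len(words):
        return None  # dump too short to include the debug metadata
    size_word, bufctl_ptr, checksum = words[i + 1:i + 4]
    return {
        "alloc_size": (size_word - 1) // 251 - 8,
        "bufctl_ptr": bufctl_ptr,
        "debug_metadata_ok": (bufctl_ptr ^ checksum) == EXPECTED_XOR,
    }


# Made-up dump: metadata, some user data, redzone, debug metadata.
dump = [0x0, 0x0, 0xbaddcafe, 0xbaddcafe,
        0xfeedface, 0x42ad, 0x0003a0d8, 0xa1136835]
info = parse_buffer_tail(dump)
print(hex(info["bufctl_ptr"]), info["alloc_size"], info["debug_metadata_ok"])
```

The bufctl_ptr value is the address to use with ::bufctl_audit, and it is only worth using when debug_metadata_ok is true.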

See an example here, look for 'Traditional Memory Corruption'

still on out-of-bounds writes

Sometimes the allocation stack is not sufficient. To generate a core immediately when such an offending write occurs, you can try a hidden feature. It comes with a performance impact and memory overhead, so it probably will not fit all cases.

Set UMEM_DEBUG="firewall=1" UMEM_OPTIONS="backend=mmap" and run your application.

check memory status

> ::umem_status

This will help you detect modify-after-free incidents. See here

See Also

Identifying Memory Management Bugs Within Applications Using the libumem Library

Using libumem to detect modify-after-free corruptions

Using libumem to detect write-beyond-what-you-allocate errors

http://blogs.sun.com/jwadams/entry/debugging_with_libumem_and_mdb

mdb/kmdb, libumem (pdf)



This blog copyright 2009 by Amit Hurvitz