Musings on realtime The jel's weblog

Thursday May 31, 2007

An important aspect of real-time performance is avoidance of page faults. Whenever a program touches a page of memory that is not yet resident, the kernel must locate the backing store for the page, find a page of physical memory to connect to it and fill that page with the data from the backing store. The data may be executable code ("text"), initialized data from the program executable file, uninitialized data from the program executable file, data from a page added to the program's heap or a stack segment. In some cases a file is read, and in other cases the page is simply zeroed out. In all cases, these operations take a relatively long time and are not something you want to have happen in the middle of time-sensitive processing.

The next few posts will focus on how to keep faults of various kinds from occurring at the wrong times. The faults still need to occur for the program to function properly, but you do have a great deal of control over where they occur.

Memory Locking
Once the physical pages representing some portion of the process' address space are attached to a process they will remain attached unless the system is low on memory and must reclaim pages. If so, the physical page will be removed. The virtual address remains part of the process' address space and the next time the page is accessed, the page fault process will occur all over again. One way of ensuring that pages remain attached to a process once they are faulted in is to "lock" them into memory.

There are three interfaces for memory locking. Two are commonly used - mlock(3C)/munlock(3C) and mlockall(3C)/munlockall(3C) - and are actually veneers over the third interface, memcntl(3C).

mlock() and munlock()
The most important point to remember is that the memory locking interfaces mlock()/munlock() act on specified address ranges. There are some important caveats to these interfaces:

  • The starting and ending addresses must be aligned on a page boundary. The default page size can be obtained from sysconf(3C) or from getpagesize(3C) which is a veneer over sysconf().
  • The entire address range must be valid for that process. One cannot lock current gaps in the address space of a process for future use.
  • Memory locking requires privilege. A process needs to have the {PRIV_PROC_LOCK_MEMORY} privilege (see privileges(5)) so as to be able to tie memory down. Why? Think denial of service. If someone were able to lock a very large portion of the physical memory of a system down, then other processes would either not run, would fail or would start frenziedly paging against themselves.
  • The amount of memory which can be locked down is constrained by the amount of physical memory on the system. The operating system does reserve a portion of physical memory which can never be locked down.

Let's look at a sample program:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>

int
main(void)
{
        int psize;
        char *buffer;

        /* allocate one page, aligned on a page boundary */
        psize = getpagesize();
        buffer = memalign(psize, psize);
        if (0x0 != buffer) {
                /* ask the system to lock that page's address range */
                if (0x0 == mlock(buffer, psize))
                        printf("worked!\n");
                else
                        perror("mlock");
        } else {
                perror("memalign");
        }
        return (0);
}

Compiling and running this as a normal user produces the output of:

mlock: not owner

If you remember, we need to have privilege to lock memory down. Let's add the PRIV_PROC_LOCK_MEMORY privilege to our login with:

# usermod -K defaultpriv=basic,priv_proc_lock_memory ourid

We now see this in /etc/user_attr for "ourid":

ourid::::type=normal;defaultpriv=basic,priv_proc_lock_memory

We now need to log out and log back in since privileges are picked up when you first log in and then passed on.

% ./t
worked!


We've now locked down a chunk of our address space. We have not, however, done anything about actually associating a physical page with that address range. That occurs when we first touch the page by reading or writing to any location within the page. Let's add a memset() call to our program (it needs <string.h>):

        (void) memset(buffer, 0, psize);

When we touch the first byte starting at address buffer, the system will recognize that we have a valid virtual address but no physical page associated with that address and do the work necessary to create that association. We can monitor faults within our process by using truss with the -m fltpage option. With this code:

        ...
        if (0x0 == mlock(buffer, psize)) {
                printf("buffer:0x%p\n", buffer);
                (void) memset(buffer, 0, psize);
                ...

We see, at the end of the truss output, these lines:

buffer:0x8061000
write(1, " b u f f e r : 0 x 8 0 6".., 17)      = 17
    Incurred fault #11, FLTPAGE  %pc = 0xCFEC8C50  addr = 0x08061000

This is where the physical page is attached to our process. Since we previously told the system to lock that address range in our process, the page is now locked down.

Truss is a useful tool to understand the faulting behavior of your program in a development environment. It should not be used in production since it has a severe negative effect on application performance.

Memory locks are not held across a fork. If you lock memory before forking and the child needs the same pages locked, the child must lock them again itself.

As a final note, you can lock the same page multiple times but a single munlock() operation will unlock the page - the locks do not nest.

mlockall() and munlockall()

Unlike mlock()/munlock(), these interfaces do not take address ranges. mlockall() takes a single flag or the logical OR of the two flags. The two flags are:

        MCL_CURRENT - mark current address space mappings as locked.
                Memory currently associated with the mapping is locked
                as is future memory faulted in for that mapping.

        MCL_FUTURE - lock down memory associated with any new mappings
                made from this point forward. As an example, memory
                associated with a file mmaped into the process'
                address space would be locked down when it is faulted
                in.

The same caveats hold for these interfaces: privilege is required, locks are not held across a fork() call, and mlockall()/munlockall() calls do not nest.

An important difference between munlock() and munlockall() is that munlockall() takes no arguments at all, neither an address range nor flags. It unlocks everything that has been locked across the entire process.
