The next few posts will focus on how to keep faults of various kinds from occurring at the wrong times. Faults still need to occur for a program to function properly, but you do have a great deal of control over where they occur.
Memory Locking
Once the physical pages representing some portion of the process' address space are attached to a process, they remain attached unless the system is low on memory and must reclaim pages. When that happens, the physical page is detached from the process, but the virtual address remains part of the process' address space; the next time the page is accessed, the page fault process occurs all over again. One way of ensuring that pages remain attached to a process once they are faulted in is to "lock" them into memory.
There are three interfaces for memory locking. Two are commonly used - mlock(3C)/munlock(3C) and mlockall(3C)/munlockall(3C) - and are actually veneers over the third interface, memcntl(3C).
mlock() and munlock()
The most important point to remember is that the memory locking interfaces mlock()/munlock() act on specified address ranges. There are some important caveats to these interfaces:
- The starting and ending addresses must be aligned on a page boundary. The default page size can be obtained from sysconf(3C) or from getpagesize(3C) which is a veneer over sysconf().
- The entire address range must be valid for that process. One cannot lock current gaps in the address space of a process for future use.
- Memory locking requires privilege. A process needs to have the {PRIV_PROC_LOCK_MEMORY} privilege (see privileges(5)) so as to be able to tie memory down. Why? Think denial of service. If someone were able to lock a very large portion of the physical memory of a system down, then other processes would either not run, would fail or would start frenziedly paging against themselves.
- The amount of memory which can be locked down is constrained by the amount of physical memory on the system. The operating system does reserve a portion of physical memory which can never be locked down.
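The alignment caveat is the easiest one to trip over, since a buffer from malloc() is rarely page aligned. Here is a minimal sketch of the rounding arithmetic, assuming the page size is a power of two. The names page_floor(), page_ceil() and lock_enclosing_pages() are invented for this example and are not part of any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/* Round addr down to the start of its page (pagesize: a power of two). */
uintptr_t
page_floor(uintptr_t addr, size_t pagesize)
{
    return (addr & ~((uintptr_t)pagesize - 1));
}

/* Round addr up to the next page boundary; aligned addresses are unchanged. */
uintptr_t
page_ceil(uintptr_t addr, size_t pagesize)
{
    return (page_floor(addr + pagesize - 1, pagesize));
}

/*
 * Lock every page that overlaps [addr, addr + len).
 * Returns -1 if the page size cannot be determined, otherwise
 * whatever mlock() returns.
 */
int
lock_enclosing_pages(void *addr, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    uintptr_t start, end;

    if (ps <= 0)
        return (-1);
    start = page_floor((uintptr_t)addr, (size_t)ps);
    end = page_ceil((uintptr_t)addr + len, (size_t)ps);
    return (mlock((void *)start, end - start));
}
```

Keep in mind that rounding outward also locks whatever else happens to live in the first and last page, which is one reason the sample program below allocates a page-aligned buffer instead.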
Let's look at a sample program:
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>

int
main(void)
{
    int psize;
    char *buffer;

    /* Allocate one page-sized, page-aligned buffer. */
    psize = getpagesize();
    buffer = memalign(psize, psize);
    if (buffer == NULL) {
        perror("memalign");
        return (1);
    }

    /* Ask the system to lock that single page into memory. */
    if (mlock(buffer, psize) == 0)
        printf("worked!\n");
    else
        perror("mlock");
    return (0);
}
Compiling and running this as a normal user produces:
mlock: not owner
As noted above, we need privilege to lock memory down. Let's add the PRIV_PROC_LOCK_MEMORY privilege to our login with:
# usermod -K defaultpriv=basic,priv_proc_lock_memory ourid
We now see this in /etc/user_attr for "ourid":
ourid::::type=normal;defaultpriv=basic,priv_proc_lock_memory
We now need to log out and log back in since privileges are picked up when you first log in and then passed on.
% ./t
worked!
We've now locked down a chunk of our address space. We have not, however, done anything about actually associating a physical page with that address range. That occurs when we first touch the page by reading or writing to any location within the page. Let's add a memset() call to our program:
(void) memset(buffer, 0, psize);    /* needs #include <string.h> */
When we touch the first byte starting at address buffer, the system will recognize that we have a valid virtual address but no physical page associated with that address and do the work necessary to create that association. We can monitor faults within our process by using truss with the -m fltpage option. With this code:
...
    if (0x0 == mlock(buffer, psize)) {
        printf("buffer:0x%p\n", buffer);
        (void) memset(buffer, 0, psize);
...
We see, at the end of the truss output, these lines:
buffer:0x8061000
write(1, " b u f f e r : 0 x 8 0 6".., 17) = 17
Incurred fault #11, FLTPAGE %pc = 0xCFEC8C50 addr = 0x08061000
This is where the physical page is attached to our process. Since we previously told the system to lock that address range in our process, the page is now locked down.
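The example above touches a single page. For a larger locked buffer, the same idea can be sketched as a loop that touches one byte per page, so that every page takes its fault up front rather than at some inconvenient moment later. This is only a sketch: touch_pages() is a name made up here, it assumes the buffer is page aligned (as a memalign() buffer is), and unlike memset() it writes each byte back unchanged so existing contents are preserved.

```c
#include <stddef.h>

/*
 * Touch one byte in every page of [buf, buf + len) so that each page
 * is faulted in. Assumes buf is page aligned and pagesize is nonzero.
 * Returns the number of pages touched.
 */
size_t
touch_pages(char *buf, size_t len, size_t pagesize)
{
    volatile char *p = buf;    /* volatile: keep the accesses from being optimized away */
    size_t off, touched = 0;

    for (off = 0; off < len; off += pagesize) {
        p[off] = p[off];       /* read, then write back the same value */
        touched++;
    }
    return (touched);
}
```

A call such as touch_pages(buffer, psize, psize) right after a successful mlock() would take the place of the memset() call in the sample program.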
Truss is a useful tool to understand the faulting behavior of your program in a development environment. It should not be used in production since it has a severe negative effect on application performance.
Memory locks are not held across a fork. If you lock memory before forking, the same pages will need to be locked in the child if appropriate.
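To make the fork behavior concrete, here is a rough sketch of a child that asks for the lock again after fork(). The function name fork_and_relock() is hypothetical, the buffer is assumed to be page aligned and already locked by the parent, and the child needs the same privilege as the parent for its mlock() call to succeed.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Fork, and have the child lock [buf, buf + len) for itself.
 * Returns 0 if the child's mlock() succeeded, 1 if it failed,
 * and -1 if the fork or the wait went wrong.
 */
int
fork_and_relock(void *buf, size_t len)
{
    pid_t pid;
    int status;

    pid = fork();
    if (pid == -1)
        return (-1);

    if (pid == 0) {
        /* Child: the parent's lock was not inherited, so ask again. */
        _exit(mlock(buf, len) == 0 ? 0 : 1);
    }

    /* Parent: collect the child's verdict. */
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return (-1);
    return (WEXITSTATUS(status));
}
```

In a real program the child would go on to do its work after the mlock() call rather than exit immediately; the early _exit() here only serves to report the result back to the parent.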
As a final note, you can lock the same page multiple times but a single munlock() operation will unlock the page - the locks do not nest.
mlockall() and munlockall()
Unlike mlock()/munlock(), these interfaces do not take address ranges. mlockall() takes a single flag or the bitwise OR of the two flags. The two flags are:
MCL_CURRENT - mark current address space mappings as locked. Memory currently associated with the mapping is locked, as is future memory faulted in for that mapping.
MCL_FUTURE - lock down memory associated with any new mappings made from this point forward. As an example, memory associated with a file mapped into the process' address space with mmap() would be locked down when it is faulted in.
The same caveats about privilege apply to these interfaces, as do the caveats about locks not being held across a fork() call and the lack of nesting of mlockall()/munlockall() calls.
An important difference between munlock() and munlockall() is that munlockall() takes no arguments at all, neither an address range nor flags. It unlocks everything that has been locked across the entire process.
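As a rough illustration of how the pair might be wrapped, the sketch below locks both current and future mappings and later releases everything in one call. The wrapper names lock_whole_process() and unlock_whole_process() are made up for this example; only mlockall(), munlockall() and the two MCL_ flags come from the system.

```c
#include <stdio.h>
#include <sys/mman.h>

/*
 * Lock everything mapped now and everything mapped later.
 * Returns 0 on success; on failure prints the reason and returns -1.
 */
int
lock_whole_process(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return (-1);
    }
    return (0);
}

/*
 * Release every lock held by the process, whichever call created it.
 * Returns 0 on success; on failure prints the reason and returns -1.
 */
int
unlock_whole_process(void)
{
    if (munlockall() != 0) {
        perror("munlockall");
        return (-1);
    }
    return (0);
}
```

Without the locking privilege, lock_whole_process() is expected to fail in the same way the mlock() sample did, so the return value is worth checking before relying on the lock.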