Darryl Gove's blog
Page size and memory layout
Support for large pages has been available since Solaris 9, and I've previously talked about the various ways that an application can be coaxed into using large pages. However, I wanted to quickly write up how the large pages are laid out in memory. Take the following code, which allocates a large chunk of memory and then iterates over it for long enough to run pmap -xs on it:
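#include <stdlib.h>

int main(void)
{
    int x, y;
    char *c;

    /* Allocate roughly 300MB so that the heap is easy to spot in pmap. */
    c = (char*)malloc(sizeof(char) * 300000000);

    /* The outer loop bound is missing from the post as published (see the comments below). */
    for (y=0; y<; y++)
        for (x = 0; x < 300000000; x++) { c[x] = c[x] + y; }
    return 0;
}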
Compiling this code to use 4MB pages and then running the resulting executable produces a pmap output like:
% cc -xpagesize=4M t.c
% a.out&
[1] 15501
% pmap -xs 15501
15501:  a.out
 Address  Kbytes     RSS    Anon  Locked Pgsz Mode   Mapped File
00010000       8       8       -       -   8K r-x--  a.out
00020000       8       8       8       -   8K rwx--  a.out
00022000    3960    3960    3960       -   8K rwx--    [ heap ]
00400000  290816  290816  290816       -   4M rwx--    [ heap ]
...
Notice that the heap starts on 8KB pages and keeps using them until the address reaches a 4MB boundary (00400000), where it switches to 4MB pages. In this case nearly 4MB of the heap (3960KB) is not on 4MB pages. If this happens to be where the majority of the program's active data resides, there will still be plenty of TLB misses.
Fortunately, it is possible to tell the linker where to start the heap. Some mapfiles are provided in /usr/lib/ld/ for various scenarios; the one that we need is map.bssalign. Recompiling with this mapfile produces the following memory layout:
% cc -M /usr/lib/ld/map.bssalign -xpagesize=4M t.c
% a.out&
[1] 19077
% pmap -xs 19077
19077:  a.out
 Address  Kbytes     RSS    Anon  Locked Pgsz Mode   Mapped File
00010000       8       8       -       -   8K r-x--  a.out
00020000       8       8       8       -   8K rwx--  a.out
00400000  294912  294912  294912       -   4M rwx--    [ heap ]
With this change the heap now starts on a 4MB boundary and is entirely mapped with 4MB pages.
Posted at 10:01AM Feb 07, 2008 by Darryl Gove in Sun | Comments[2]

<snip>
[..]
for (y=0; y<; y++)
[..]
</snip>
Are you sure your code compiled, or do you guys use a super-intelligent, AI-based compiler from Sun Labs that's in beta testing? :)
Posted by Dinuk on February 12, 2008 at 03:21 PM PST #
Definitely the latter option;) Just add a number for the comparison... I managed to delete it when hacking the html.
Posted by Darryl Gove on February 25, 2008 at 01:24 PM PST #