Darryl Gove's blog

Thursday Feb 07, 2008

Page size and memory layout

Support for large pages has been available since Solaris 9, and I've previously talked about the various ways that an application can be coaxed into using large pages. However, I wanted to quickly write up how the large pages are laid out in memory. Take the following code, which allocates a large chunk of memory and then iterates over it for long enough to run pmap -xs on it:

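#include <stdlib.h>

int main(void)
{
  int x,y;
  char *c;
  c=(char*)malloc(sizeof(char)*300000000);
  if (c==NULL) { return 1; }  /* allocation failed */
  /* The loop bound was lost from the original post (see the comments below);
     100 is an arbitrary stand-in. Pick a value that keeps the process
     running long enough to run pmap. */
  for (y=0; y<100; y++)
  for (x=0; x<300000000; x++) { c[x]=c[x]+y;}
  return 0;
}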

Compiling this code to use 4MB pages and then running the resulting executable produces a pmap output like:

% cc -xpagesize=4M t.c
% a.out&
[1] 15501
% pmap -xs 15501
15501:  a.out
 Address  Kbytes     RSS    Anon  Locked Pgsz Mode   Mapped File
00010000       8       8       -       -   8K r-x--  a.out
00020000       8       8       8       -   8K rwx--  a.out
00022000    3960    3960    3960       -   8K rwx--    [ heap ]
00400000  290816  290816  290816       -   4M rwx--    [ heap ]
...

Notice that the heap starts on 8KB pages and uses them until the address reaches a 4MB boundary (0x00400000), after which it uses 4MB pages. In this case nearly 4MB (3960KB) of the heap is not on 4MB pages. If this happens to be where the majority of the program's active data resides, then there will still be plenty of TLB misses.

Fortunately, it is possible to tell the linker where to start the heap. There are some mapfiles provided in /usr/lib/ld/ for various scenarios; the one that we need is map.bssalign. Recompiling with this produces the following memory layout:

% cc -M /usr/lib/ld/map.bssalign -xpagesize=4M t.c
% a.out&
[1] 19077
% pmap -xs 19077
19077:  a.out
 Address  Kbytes     RSS    Anon  Locked Pgsz Mode   Mapped File
00010000       8       8       -       -   8K r-x--  a.out
00020000       8       8       8       -   8K rwx--  a.out
00400000  294912  294912  294912       -   4M rwx--    [ heap ]

With this change the heap now starts on a 4MB boundary and is entirely mapped with 4MB pages.

Comments:

<snip>
[..]
for (y=0; y<; y++)
[..]
</snip>

Are you sure your code compiled, or do you guys use a super-intelligent, AI-based compiler from Sun labs that's in beta testing ? :)

Posted by Dinuk on February 12, 2008 at 03:21 PM PST #

Definitely the latter option;) Just add a number for the comparison... I managed to delete it when hacking the html.

Posted by Darryl Gove on February 25, 2008 at 01:24 PM PST #
