Why large ISM pages are not as large as I expected.
Wednesday Jul 29, 2009
I was pondering why a large SGA segment was made up of 4M pages rather than 256M pages and decided to experiment. Here is about the simplest bit of code that will create an ISM segment:
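#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    size_t sz;
    int sid;
    void *a;

    if (argc < 2)
    {
        fprintf(stderr, "usage: %s size-in-MB\n", argv[0]);
        exit(1);
    }
    /* Size in megabytes; size_t avoids int overflow at 2G and above. */
    sz = (size_t)atol(argv[1]);
    /* The pid doubles as a throwaway key; 0600 lets the owner attach. */
    if ((sid = shmget(getpid(), sz * (1024 * 1024), IPC_CREAT | 0600)) == -1)
    {
        perror("shmget failed");
        exit(1);
    }
    /* SHM_SHARE_MMU requests an ISM attach; shmat returns (void *)-1 on error. */
    if ((a = shmat(sid, (void *)0, SHM_SHARE_MMU)) == (void *)-1)
    {
        perror("shmat failed");
        exit(1);
    }
    /* Leave time to inspect the mapping with pmap -xs. */
    sleep(60);
    return 0;
}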
On a system with UltraSPARC IV+ (Panther) CPUs I found that, by default, asking for a 1G ISM segment still produced 4M pages according to pmap -xs. A little kernel code reading showed that the decision is made in map_pgszism, which looks like this:
map_pgszism(caddr_t addr, size_t len)
{
	uint_t szc;
	size_t pgsz;

	for (szc = mmu_page_sizes - 1; szc >= TTE4M; szc--) {
		if (disable_ism_large_pages & (1 << szc))
			continue;

		pgsz = hw_page_array[szc].hp_size;
		if ((len >= pgsz) && IS_P2ALIGNED(addr, pgsz))
			return (pgsz);
	}

	return (DEFAULT_ISM_PAGESIZE);
}
A little poking around with mdb shows the value of disable_ism_large_pages to be 0x36. In the common code it is set to 0x2, so some platform-specific code must be resetting this value. Poking disable_ism_large_pages to 0x2 with mdb meant the pages for the ISM segment were now 256M in size, as reported by pmap. Not recommended as a spur-of-the-moment action for your production E25K running Oracle.
disable_ism_large_pages gets set in hat_init_pagesize by OR-ing in disable_large_pages, which is itself derived by shifting and bitmasking mmu_exported_pagesize_mask. A few more hops lead to bugid 6313025, which describes why 32M and 256M pages were turned off for the Panther CPU: executing application code from the larger (>4M) pages caused nasty things to happen. The bug is dated 2005 and I had a very distant memory of it, but it was worth tracking down the specifics.










