Interconnectedness of all things

Why large ISM pages are not as large as I expected.

Wednesday Jul 29, 2009

I was pondering why a large SGA segment was made up of 4M pages rather than 256M pages and decided to experiment. Here is about the simplest bit of code that will create an ISM segment:

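#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  size_t sz;
  int sid;
  void *a;

  if (argc < 2)
    {
      fprintf(stderr, "usage: %s size_in_MB\n", argv[0]);
      exit(1);
    }

  /* size is given in megabytes; size_t avoids int overflow at 2G and above */
  sz = (size_t)atoi(argv[1]) * (1024 * 1024);

  /* use our pid as the key; 0600 lets the owner attach without being root */
  if ((sid = shmget(getpid(), sz, IPC_CREAT | 0600)) == -1)
    {
      perror("shmget failed");
      exit(1);
    }

  /* SHM_SHARE_MMU requests an ISM attach; shmat returns (void *)-1 on error */
  if ((a = shmat(sid, (void *)0, SHM_SHARE_MMU)) == (void *)-1)
    {
      perror("shmat failed");
      exit(1);
    }

  /* stay attached long enough to inspect the mapping with pmap -xs */
  sleep(60);
  return 0;
}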

On a system with UltraSPARC IV+ (Panther) CPUs I found that, by default, asking for a 1G ISM segment still produced 4M pages according to pmap -xs. A little kernel code reading showed that the decision is made in map_pgszism, which looks like this:

map_pgszism(caddr_t addr, size_t len)
{
	uint_t szc;
	size_t pgsz;

	for (szc = mmu_page_sizes - 1; szc >= TTE4M; szc--) {
		if (disable_ism_large_pages & (1 << szc))
			continue;

		pgsz = hw_page_array[szc].hp_size;
		if ((len >= pgsz) && IS_P2ALIGNED(addr, pgsz))
			return (pgsz);
	}

	return (DEFAULT_ISM_PAGESIZE);
}

A little poking around with mdb shows the value of disable_ism_large_pages to be 0x36. In the common code it is set to 0x2, so there must be some platform-specific code resetting this value. Poking disable_ism_large_pages to 0x2 with mdb meant the pages for the ISM segment were now 256M in size, as reported by pmap. Not recommended as a spur-of-the-moment action for your production E25K running Oracle.

disable_ism_large_pages gets set in hat_init_pagesize as a bitwise OR with disable_large_pages, which is itself derived by shifting and masking mmu_exported_pagesize_mask. A few more hops lead to bugid 6313025, which describes why 32M and 256M pages were turned off for the Panther CPU: executing application code from the larger (>4M) pages caused nasty things to happen. The bug is dated 2005 and I had a very distant memory of it, but it was worth tracking down the specifics.
