Tuesday August 02, 2005

swap space full; processes are dumping core...

Last week, I was looking at a few core files, along with a colleague of mine.
Fujitsu was doing some kind of stress testing on an s10 box and they were
seeing core dumps from random processes.

The stress test involved filling up the whole 'swap' space on the system and then
issuing 'init 6'.

Some of the cores were caused by SIGBUS and some by SIGSEGV.
All the cores have at least one thread with the following kind of stack:

fec7bbe0 libc_psr.so.1`memset+0x88(fec7bca8, ff1bc214, fc150c00, b400000, fc000, fb4fc000)
fec7bc40 libc.so.1`_thrp_create+0x1f0(0, fc000, 0, 0, 80, fec7bed4)
fec7be68 libc.so.1`pthread_create+0x1e8(fec7bf3c, 0, 32e7c, 15a608, 0, ff1e8bc0)
fec7bed8 startd_thread_create+0x10(32e7c, 15a608, 0, f, 52000, 15a608)
fec7bf40 restarter_event_thread+0x1f4(3eb800, 57a460, f5ef8, f5ef8, 56f50, 56f1c)
fec7bfa0 libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0)

After an hour of doing the usual things (looking at the registers, looking at the instructions that
caused these signals, etc.), it suddenly occurred to us that memory for thread stacks is allocated
through the mmap(2) system call, which is passed an important flag: MAP_NORESERVE, which says
not to reserve swap space for this mapping. When a page of such a stack is written to, the system
looks for swap space at that time, and if none is available, the write fails and the process receives
either SIGBUS or SIGSEGV, as mentioned clearly in the mmap(2) man page. That is consistent with the
memset called from _thrp_create in the stack above. So this is what was causing the random
multithreaded processes to dump core in our case. We spent an hour on a behaviour that is clearly
documented; hence I thought nobody should repeat this mistake, and so this post.
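
To make the behaviour concrete, here is a minimal sketch in C of mapping a stack-sized region with MAP_NORESERVE and then writing to it. The helper names map_noreserve and touch_region are made up for illustration; this is not the code libc uses to create thread stacks, only the same mmap(2) flag in isolation.

```c
/* glibc hides MAP_ANON under strict -std modes unless this is defined;
 * it is assumed to be harmless on other systems. */
#define _DEFAULT_SOURCE

#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: map an anonymous, private, writable region without
 * reserving swap for it up front. Returns NULL if the mapping fails. */
static void *map_noreserve(size_t size)
{
    assert(size > 0);
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON | MAP_NORESERVE, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Hypothetical helper: write to the first `len` bytes of the region.
 * The first write to each page is the point where the system has to find
 * backing store. If none is available, the process gets SIGBUS or SIGSEGV
 * here, not at mmap() time. */
static void touch_region(void *base, size_t len)
{
    memset(base, 0, len);
}
```

On a healthy system, calling map_noreserve(1 << 20) and then touch_region() on the result simply succeeds. The failure only shows up when swap is exhausted at the time of the write, which is exactly the condition the stress test created.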

I believe this flag is used for thread stacks for the following reason:
the default stack size of a userland thread is 1 MB, and not many threads are likely to have stacks
that grow beyond a few pages (with a page size of 8 KB). If mmap() set aside swap space for the
full stack of every thread created, then with just a few thousand threads the swap requirements of
the system would run into a few GB, whereas most of that space may never actually be needed at all...
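
A rough back-of-the-envelope version of that argument, as a small C sketch. The function names are hypothetical, and the figures used in the note after it (4000 threads, 2 pages touched per thread) are illustrative assumptions, not measurements from the system in question.

```c
#include <assert.h>
#include <stdint.h>

/* Swap that would be reserved up front if every thread stack reserved
 * its full size at mmap() time. */
static uint64_t reserved_swap_bytes(uint64_t nthreads, uint64_t stack_bytes)
{
    /* Guard against overflow in the multiplication. */
    assert(stack_bytes == 0 || nthreads <= UINT64_MAX / stack_bytes);
    return nthreads * stack_bytes;
}

/* Memory actually needing backing store if each thread only ever writes
 * to `pages_touched` pages of its stack. */
static uint64_t touched_bytes(uint64_t nthreads, uint64_t pages_touched,
                              uint64_t page_bytes)
{
    uint64_t per_thread = pages_touched * page_bytes;
    assert(per_thread == 0 || nthreads <= UINT64_MAX / per_thread);
    return nthreads * per_thread;
}
```

With 4000 threads and 1 MB stacks, reserved_swap_bytes(4000, 1048576) comes to 4,194,304,000 bytes (about 3.9 GB), while touched_bytes(4000, 2, 8192) is only 65,536,000 bytes (about 62.5 MB).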
( Aug 02 2005, 01:04:11 AM PDT )
