Weblog

All | General | Solaris
« Previous day (Oct 9, 2007) | Main | Next day (Oct 11, 2007) »
20071010 Wednesday October 10, 2007

Dump device configuration - part2 Dump device configuration - part2 To my earlier post on 'why the dump device configured so late', I got a very interesting reply, questioning the claim: 'window will be significantly reduced'. Let me be frank, as of the first post, I didn't conduct any studies as to quantify the gain.
So the mail prompted me to do some tests to quantify the gain.
I want to understand how many function calls get made from the word 'go' [i.e from main()] till dumpinit() gets called; And also figure out how many modules get loaded in the same window.

That seems a pretty arduous task. At the same time I can't spend days trying to just get this data. This is where OpenSolaris rocks - if not for DTrace, I would have given up on this. DTrace kernel module gets loaded very early in the boot right after the kernel starts [i.e unix/genunix/krtld modules are loaded]. DTrace supports a feature called 'Anonymous tracing' - which can be used in this context just to get the data I am interested. This won't get me right from main() but as close to main() as possible.  This option is good enough to get an idea of what the numbers look like.

I needed answers for atleast the following questions :
    - How many function calls are made from the point DTrace can start to the point of dumpinit() ?
    - How many function calls are made from the point root is mounted to the point of dumpinit() ?
    - How many function calls are made from start_other_cpus() [this routine starts the rest of the CPUs in a multi cpu system; Till this point, only the boot cpu does all the work; This opens up scope for race conditions / corruption / hangs - So even if we could capture crash file from this point, that would still be a nice win] to the point of dumpinit() ?

I put together the following simple script to answer the 3rd question.

# cat mp_init.d
BEGIN {
        dumpinit = 0;
}
/* start collecting data here */
fbt::start_other_cpus:return
/dumpinit == 0/
{
        dumpinit = 1;
}
fbt:::entry
/dumpinit == 1/
{
        @[probefunc]=count();
}
/* stop collecting data here */
fbt::dumpinit:return
/dumpinit == 1/
{
        dumpinit = 0;
}
#
And invoke the script as follows :
# dtrace -As ./mp_init.d
dtrace: saved anonymous enabling in /kernel/drv/dtrace.conf
dtrace: added forceload directives to /etc/system
dtrace: run update_drv(1M) or reboot to enable changes
#

Then reboot the system and when system comes back, just run :
#dtrace -ae > mp_init.out

mp_init.out has thousands of lines which look something like :
  tsc_scalehrtime                                              369164
  syscall_exit                                                 375467

Module information can be gotten using 'kmdb' - For this I booted the system under kmdb, put a break point in start_other_cpus() and let it continue.  When the break point is hit, I ran '::modinfo' and saved the o/p; Then I put one more break point in dumpinit() and again let it continue; When the new breakpoint is hit collect the modinfo o/p again and comparing this with the previous one gives you an idea of how many modules got loaded in between.

With simple changes to the script, I got answers for the 1st and 2nd questions as well. Module info can be gotten as  described above.

Then I used awk(1) to post process mp_init.out  and the resulting numbers look like this :
     Q) How many function calls are made from the point DTrace can start to the point of dumpinit() ?
     A) 59 million function calls were made from the point dtrace can start   to the point of dumpinit:return

     Q) How many function calls are made from the point root is mounted to the point of dumpinit() ?
      A) 52 million calls were made from mounting root to dumpinit:return
           - 120 modules got loaded in the same window
    
     Q) How many function calls are made from start_other_cpus()  to the point of dumpinit() ?
     A)  43 million calls were made from start_other_cpus() to dumpinit:return
          - 52 modules got loaded in the same window

The data speaks for itself !! Data modified by 43million routines can be captured in a dump, if we could advance dump device configuration to just before start_other_cpus(). Gain is more if it could be advanced closer to mount of root FS.
( Oct 10 2007, 05:36:01 AM PDT ) Permalink

Calendar

RSS Feeds

Search

Links

Navigation

Referers