Richard McDougall's Weblog
Commentary from Race Control

Friday July 08, 2005

Solaris Internals: 2nd Edition!

It's no secret that we hope to get an updated Solaris Internals book out. Jim and I have had this action item on our desks for a while. The good news is that it has been making quite a bit of progress of late!

The idea is to update the existing book from Solaris 7 to Solaris 10, leaning heavily on OpenSolaris, DTrace and mdb. There's a lot to add, given the onslaught of development: substantially revised virtual memory, a new file system interface, a new threads model, zones, ZFS, least privilege, SMF, and the list goes on. We scoped adding all of this, and we'd have a 2000+ page book by the time we were done.

What we've decided to do is break up the work into smaller deliverable chunks, and deliver it in parts. Yes, we're taking the Knuth approach: Solaris 10 Internals will have more than one volume. We're splitting some of the new material and most of the performance discussion out into the subsequent volume. We're enlisting a few helpers for the subsequent volume, to make it more of a community effort.

So that you can keep the pressure on us, I thought I'd share where we are with the current volume. Our target is to be done with this volume in the next couple of months.

Part                       Chapter                  Primary          Old pages  Target pages  Left

                           Preface                  JM                       6             7     2
I – Intro                  Introduction             Phil Harman/JM          36            40    10
II – Tools                 Introduction             JM                       2             2     2
                           DTrace                   Jon                      0            30     5
                           MDB                      RMC                      5             5     5
                           Kstat                    Boothby/RMC              0            15     0
III – Memory               VM Intro                 RMC                      6             6     0
                           VM Monitoring            RMC                     44            44     0
                           Large pages              RMC                     14            14     0
                           Memory Arch              RMC                     36            36     0
                           Physical Mem Mngmnt      RMC                     20            20     0
                           HAT                      Tariq                   12            20     7
                           Kernel Memory            RMC                     48            48     0
IV – Platform              Sync Intro               JM                      16            16     5
                           Sync Impl                JM                      16            16     4
                           NUMA/CMT                 RMC/Saxe/Chew           16            18     4
                           Kernel Services          JM                      37            38    20
                           Kernel Modules & Linker  JM                       0            20    20
V – Process Model          Process Model            JM                      48            48    20
                           Sched Classes & Disp     JM                      65            50    40
                           ProcFS                   JM                      22            22     6
                           Signals                  JM                      18            20    10
                           Resource Management      JM                       8            20    12
                           IPC                      JM                      48            48    10
VI – Files & File Systems  Files                    RMC                     40            40     4
                           Intro                    RMC                     18            22     6
                           FS Architecture          RMC                     46            70     0
                           UFS                      Shawn                   24            30     6
                           NFS                      Spencer/Sameer           0            30     0
                           ZFS                      RMC                     20             0    20
Appendix A                 ELF File Format          JM                      12            12    12
Appendix B                 Kernel Maps              RMC                     12            12     0

Total                                                                                    819   260

Running page total: 782

Technorati Tag: OpenSolaris

Technorati Tag: Solaris

Technorati Tag: DTrace

( Jul 08 2005, 05:45:47 PM PDT )


Tracing the Solaris 10 File System Interface

Here's a quick script to trace activity through the central file system interface. Until there is a general file system provider, this script should serve as a basic framework to help construct other file system tracing scripts.

# ./voptrace.d /tmp
Event           Device                                                Path  RW     Size   Offset
fop_putpage     -          /tmp//filebench/bin/i386/fastsu                   -     4096     4096
fop_inactive    -          /tmp//filebench/bin/i386/fastsu                   -        0        0
fop_putpage     -          /tmp//filebench/xanadu/WEB-INF/lib/classes12.jar  -     4096   204800
fop_inactive    -          /tmp//filebench/xanadu/WEB-INF/lib/classes12.jar  -        0        0
fop_putpage     -          /tmp/filebench1.63_s10_x86_sparc_pkg.tar.Z        -     4096  7655424
fop_inactive    -          /tmp/filebench1.63_s10_x86_sparc_pkg.tar.Z        -        0        0
fop_putpage     -          /tmp//filebench/xanadu/WEB-INF/lib/classes12.jar  -     4096   782336
fop_inactive    -          /tmp//filebench/xanadu/WEB-INF/lib/classes12.jar  -        0        0
fop_putpage     -          /tmp//filebench/bin/amd64/filebench               -     4096    36864

The source is below:

#!/usr/sbin/dtrace -s

/*
 * Trace the vnode interface
 *
 * USAGE: voptrace.d [/all | /mountname ]
 *
 * Author: Richard McDougall
 *
 * 7/8/2005
 */

#pragma D option quiet

:::BEGIN
{
        printf("%-15s %-10s %51s %2s %8s %8s\n",
                "Event", "Device", "Path", "RW", "Size", "Offset");
        self->trace = 0;
        self->path = "";
}


::fop_*:entry
/self->trace == 0/
{
        /* Get vp: fop_open takes a pointer to the vp (vnode_t **), the others take the vp itself */
        self->vpp = (vnode_t **)arg0;
        self->vp = (vnode_t *)arg0;
        self->vp = probefunc == "fop_open" ? (vnode_t *)*self->vpp : self->vp;

        /* And the containing vfs */
        self->vfsp = self->vp ? self->vp->v_vfsp : 0;

        /* And the paths for the vp and containing vfs */
        self->vfsvp = self->vfsp ? (struct vnode *)((vfs_t *)self->vfsp)->vfs_vnodecovered : 0;
        self->vfspath = self->vfsvp ? stringof(self->vfsvp->v_path) : "unknown";

        /* Check if we should trace the root fs */
        ($1 == "/all" ||
         ($1 == "/" && self->vfsp && \
         (self->vfsp == `rootvfs))) ? self->trace = 1 : self->trace;

        /* Check if we should trace the fs */
        ($1 == "/all" || (self->vfspath == $1)) ? self->trace = 1 : self->trace;
}

/*
 * Trace the entry point to each fop
 *
 */
::fop_*:entry
/self->trace/
{
        self->path = (self->vp != NULL && self->vp->v_path) ? stringof(self->vp->v_path) : "unknown";
        self->len = 0;
        self->off = 0;

        /* Some fops have the len in arg2 */
        (probefunc == "fop_getpage" || \
         probefunc == "fop_putpage" || \
         probefunc == "fop_none") ? self->len = arg2 : 1;

        /* Some fops have the len in arg3 */
        (probefunc == "fop_pageio" || \
         probefunc == "fop_none") ? self->len = arg3 : 1;

        /* Some fops have the len in arg4 */
        (probefunc == "fop_addmap" || \
         probefunc == "fop_map" || \
         probefunc == "fop_delmap") ? self->len = arg4 : 1;

        /* Some fops have the offset in arg1 */
        (probefunc == "fop_addmap" || \
         probefunc == "fop_map" || \
         probefunc == "fop_getpage" || \
         probefunc == "fop_putpage" || \
         probefunc == "fop_seek" || \
         probefunc == "fop_delmap") ? self->off = arg1 : 1;

        /* Some fops have the offset in arg3 */
        (probefunc == "fop_close" || \
         probefunc == "fop_pageio") ? self->off = arg3 : 1;

        /* fop_frlock has the offset in arg4 */
        probefunc == "fop_frlock" ? self->off = arg4 : 1;

        /* Some fops have a name in arg1; append it to the path of the vp */
        self->path = (probefunc == "fop_create" || \
         probefunc == "fop_mkdir" || \
         probefunc == "fop_rmdir" || \
         probefunc == "fop_remove" || \
         probefunc == "fop_lookup") ?
                strjoin(self->path, strjoin("/", stringof(arg1))) : self->path;
        printf("%-15s %-10s %51s %2s %8d %8d\n",
                probefunc,
                "-", self->path, "-", self->len, self->off);
        self->type = probefunc;
}

::fop_*:return
/self->trace == 1/
{
        self->trace = 0;
}


/* Capture any I/O within this fop */
io:::start
/self->trace/
{
        printf("%-15s %-10s %51s %2s %8d %8u\n",
                self->type, args[1]->dev_statname,
                self->path, args[0]->b_flags & B_READ ? "R" : "W",
                args[0]->b_bcount, args[0]->b_blkno);

}
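
As one illustration of building on this framework, a much smaller script could simply count how often each fop is called rather than printing every event. This is a minimal sketch and not part of voptrace.d: it assumes the fop_* functions are instrumented through the fbt provider, and the aggregation name @calls is arbitrary.

```
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Count every call into a fop_* entry point, keyed by function name */
fbt::fop_*:entry
{
        @calls[probefunc] = count();
}

/* On Ctrl-C, print one line per fop with its call count */
dtrace:::END
{
        printa("%-20s %@8d\n", @calls);
}
```

Unlike voptrace.d, this sketch does no filtering by mount point; the vfs checks from the script above would have to be added as a predicate to restrict it to one file system.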

Technorati Tag: OpenSolaris

Technorati Tag: Solaris

Technorati Tag: DTrace

( Jul 08 2005, 04:05:55 PM PDT )


