Sherry Q. Moore's Weblog



20080130 Wednesday January 30, 2008

 Honorary Members of the ZFS Team



( Jan 31 2008, 10:05:47 AM PST / Jan 30 2008, 01:53:15 PM PST ) Permalink
Trackback: http://blogs.sun.com/sherrym/entry/zfs_recognization_dinner1

20070221 Wednesday February 21, 2007

 Feeling philosophical

"The second-named motive, ambition or, in milder terms, the aiming
at recognition and consideration, lies firmly fixed in human nature.
With absence of mental stimulus of this kind, human cooperation
would be entirely impossible; the desire for the approval of one's
fellowman certainly is one of the most important binding powers of
society.  In this complex of feelings, constructive and destructive
forces lie closely together.  Desire for approval and recognition
is a healthy motive; but the desire to be acknowledged as better,
stronger or more intelligent than a fellow being or fellow scholar
easily leads to an excessively egoistic psychological adjustment,
which may become injurious for the individual and for the community."

--From Albert Einstein's Out of my later years, 
On Education



( Feb 21 2007, 10:16:02 PM PST / Feb 21 2007, 10:16:02 PM PST ) Permalink
Trackback: http://blogs.sun.com/sherrym/entry/feeling_philosophical

20051116 Wednesday November 16, 2005

 Congratulations to the zfs team!

My two children and I are proud members of the zfs family team. The kids thought everybody at Sun worked on zfs, and were shocked to find out otherwise. I can see their point: if not for the fun problems and the goofy people, at least join for the wonderful parties and Bill's famous cheesecake. :)

The kids are thrilled that they finally get to see Daddy again. They make it sound like such a treat that we feel terribly guilty.

My favorite thing about zfs integration is that now I can start every conversation with Bill like this:
  • "Now that zfs is in the gate, maybe you can mow the lawn?"
  • "Now that zfs is in the gate, can you take the kids to their tennis lesson on Saturday?"
  • "Now that zfs is in the gate, maybe I should get an iPOD nano?"
  • ...
:) Congratulations!

( Nov 16 2005, 10:16:27 AM PST / Nov 16 2005, 09:00:00 AM PST ) Permalink Comments [20]
Trackback: http://blogs.sun.com/sherrym/entry/congratulations

20050614 Tuesday June 14, 2005

 Whose bug is it anyway?

While getting Solaris to compile with the Sun Studio 10 (aka Vulcan) compilers, I debugged numerous problems. For some of them, it was not obvious at the time whose bug it was. Here is one of them:

When libc.so.1 was compiled with GCC, everything worked fine; when it was compiled with Vulcan, all multithreaded programs hung. After some more debugging, the problem seemed to be in the Vulcan compiled usr/src/lib/libc/port/threads/synch.c: if I linked all the object files with a GCC compiled synch.o, everything worked. "It must be a compiler bug!"

I was in the middle of debugging 5 other panics and hangs at the time, so I made an offer to my compiler buddies: "Beer and lunch are on me for whoever figures it out." They tried, but at the end of the day, there was still no root cause. So I took a closer look. It appeared that the hung thread was waiting for a mutex that nobody owned, yet the thread was never woken up. I looked at synch.c, and something caught my eye: the various lock routines call swap32. swap32 is an inline function, and GCC and Vulcan have different inline implementations. A bug there could explain why the GCC compiled version worked but the Vulcan compiled version did not. Here is the Vulcan inline:


        .inline swap32, 0
        xchgl   (%rdi), %esi
        .end

Let's see how it can be used:


void
spin_lock_clear(mutex_t *mp)
{
        ulwp_t *self = curthread;

        mp->mutex_owner = 0;
        if (swap32(&mp->mutex_lockword, 0) & WAITERMASK) {
                (void) ___lwp_mutex_wakeup(mp);
                if (self->ul_spin_lock_wakeup != UINT_MAX)
                        self->ul_spin_lock_wakeup++;
        }
        preempt(self);
}

Ah ha! So we did the swap, but we never returned anything to the caller: the old lock word ended up in %esi, while the caller expects the return value in %eax. In spin_lock_clear, we were checking whatever happened to be in %rax to see if there were waiters. If %rax happened to be 0, the calling thread would think that there was no waiter to wake up, leaving the poor thread that was waiting for the mutex waiting forever!

To fix the problem, I changed swap32 to the following:


        .inline swap32, 0
        movl    %esi, %eax
        xchgl   (%rdi), %eax
        .end
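
As a rough illustration of the contract that spin_lock_clear relies on, here is a minimal C11 sketch in which the swap returns the previous contents of the lock word. It uses atomic_exchange from <stdatomic.h> rather than a Sun Studio inline template, and swap32_sketch, lock_clear_has_waiters, swap32_sketch_demo, and the SKETCH_WAITERMASK value are made-up names for illustration, not the libc definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mask for illustration; not the value libc uses. */
#define SKETCH_WAITERMASK 0x00ffffffu

/* Atomically store newval into *target and return what was there before. */
static uint32_t
swap32_sketch(_Atomic uint32_t *target, uint32_t newval)
{
        return (atomic_exchange(target, newval));
}

/*
 * Clear the lock word and report whether any waiter bits were set in
 * the OLD value. This only works if the swap really hands back the old
 * value, which is exactly what the broken inline failed to do.
 */
static int
lock_clear_has_waiters(_Atomic uint32_t *lockword)
{
        return ((swap32_sketch(lockword, 0) & SKETCH_WAITERMASK) != 0);
}

/* Small self-check: a word with a waiter bit set reports a waiter. */
static void
swap32_sketch_demo(void)
{
        _Atomic uint32_t word = 0x01000001u;

        assert(lock_clear_has_waiters(&word));
        assert(word == 0);
}
```

If the swap returned garbage instead of the old value, lock_clear_has_waiters could report "no waiters" for a word that had some, which is the hang described above.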

So the moral of the story is that things are not always what they seem on the surface.



( Jun 14 2005, 08:37:19 AM PDT / Jun 14 2005, 08:36:14 AM PDT ) Permalink Comments [42]
Trackback: http://blogs.sun.com/sherrym/entry/whose_bug_is_it_anyway

20050517 Tuesday May 17, 2005

 Compilation Options for Best Performance

Compilation Options

        Target Hardware         Compilation Options
        32-bit x86, no SSE      -xtarget=pentium{3|4}
        32-bit x86, SSE         -xtarget=pentium{3|4} -xarch=sse
        32-bit amd64            -xtarget=opteron
        64-bit amd64            -xtarget=opteron -xarch=amd64

* -xtarget=opteron implies -xarch=sse2, -xchip=opteron, and -xcache=64/64/2:1024/64/16



( May 17 2005, 05:28:17 PM PDT / May 17 2005, 05:26:33 PM PDT ) Permalink Comments [6]
Trackback: http://blogs.sun.com/sherrym/entry/compilation_options_for_best_performance

20050513 Friday May 13, 2005

 Obtaining Function Arguments on AMD64

Now that you have experienced enough pain debugging on AMD64 platforms without arguments, you would be delighted to hear that there are options out there to help you!

The Studio 10 patch compilers (minimum patch number 117846-03; use ube -V to verify) offer an option, -Wu,-save_args, on amd64 that saves INTEGER type function arguments passed via registers onto the stack. When this option is specified, up to 6 arguments are saved on the stack on function entry and are not modified throughout the life of the routine (the checkpoint effect we have all dreamed about). For example,
        void
        foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7)
        {
        ...
        }
Disassembled code will look something like the following:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $0x30, %rsp                     **
        movq    %rdi, -0x8(%rbp)
        movq    %rsi, -0x10(%rbp)
        movq    %rdx, -0x18(%rbp)
        movq    %rcx, -0x20(%rbp)
        movq    %r8, -0x28(%rbp)
        movq    %r9, -0x30(%rbp)
        ...
**: The space being reserved is in addition to what the current function prolog already reserves.

The resulting stack layout, from higher to lower addresses:
        return PC
        %rbp
        %rdi
        %rsi
        %rdx
        %rcx
        %r8
        %r9
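
To make the layout concrete, here is a small C sketch of how a debugger-like tool might locate a saved argument relative to the frame pointer. It is only an illustration under stated assumptions: saved_arg and saved_arg_demo are hypothetical names, the stack is simulated with an array, and the frame is assumed to belong to a function compiled with -save_args.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAVE_ARGS_MAX   6       /* only the first 6 integer arguments are saved */

/*
 * Return saved argument n (0-based) of a frame whose %rbp value is fp.
 * Following the disassembly above, %rdi lives at -0x8(%rbp), %rsi at
 * -0x10(%rbp), and so on: one 8-byte slot lower per argument.
 */
static uint64_t
saved_arg(const uint64_t *fp, size_t n)
{
        assert(n < SAVE_ARGS_MAX);
        return (fp[-(ptrdiff_t)(n + 1)]);
}

/* Build a simulated frame (lowest address first) and read argument 0. */
static uint64_t
saved_arg_demo(void)
{
        uint64_t frame[8] = {
                0x66,   /* %r9  at -0x30(%rbp) */
                0x55,   /* %r8  at -0x28(%rbp) */
                0x44,   /* %rcx at -0x20(%rbp) */
                0x33,   /* %rdx at -0x18(%rbp) */
                0x22,   /* %rsi at -0x10(%rbp) */
                0x11,   /* %rdi at -0x8(%rbp)  */
                0,      /* saved %rbp; fp points here */
                0       /* return PC */
        };

        return (saved_arg(&frame[6], 0));       /* first argument, from %rdi */
}
```

Arguments beyond the sixth are not covered by this scheme, so a real tool would have to find them where the ABI already places them.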

Nothing special is done for arguments beyond the first 6. If a function has an odd number of arguments, additional space is reserved on the stack to maintain 16-byte alignment. For example,
        argc == 0: no argument saving.
        argc == 3: save 3, but reserve space for 4 to maintain stack alignment.
        argc == 7: save 6.
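
The rule in these examples can be written down as a little arithmetic. The following C sketch is my reading of the rule as described here, not the compiler's actual code; save_args_saved, save_args_reserved_bytes, and save_args_check are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

#define SAVE_ARGS_MAX   6       /* at most 6 register arguments are saved */
#define SLOT_BYTES      8       /* each saved argument takes one 8-byte slot */

/* Number of arguments actually saved for a function taking argc arguments. */
static size_t
save_args_saved(size_t argc)
{
        return (argc < SAVE_ARGS_MAX ? argc : SAVE_ARGS_MAX);
}

/*
 * Extra stack space reserved for the saved arguments. The slot count is
 * rounded up to an even number so the total stays a multiple of 16 bytes.
 */
static size_t
save_args_reserved_bytes(size_t argc)
{
        size_t saved = save_args_saved(argc);
        size_t slots = (saved + 1) & ~(size_t)1;        /* round up to even */

        return (slots * SLOT_BYTES);
}

/* The three cases listed above, plus the 6-argument case (0x30 bytes). */
static void
save_args_check(void)
{
        assert(save_args_reserved_bytes(0) == 0);
        assert(save_args_reserved_bytes(3) == 4 * SLOT_BYTES);
        assert(save_args_reserved_bytes(6) == 0x30);
        assert(save_args_saved(7) == 6);
}
```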
The -save_args flag has no direct association with the optimization level. In other words, you can use any optimization level along with -save_args.

A new DWARF attribute has been introduced to indicate whether a function has been compiled with -save_args:
        DW_AT_SUN_amd64_parmdump        = 0x2224
The attribute has the value of 1 or 0, and is only added when the value is 1. It is attached to the DW_TAG_subprogram tag.

You might wonder about the following:
  • How does the extra argument saving affect performance?

    With a call stack 20 deep of small functions, each taking 6 arguments (to cause maximum argument saving), the impact of the extra saving is 18 nanoseconds, around a 10% hit.
            #define FUNC(i, j) \
                    static int      \
                    func##i(int i1, int i2, int i3, int i4, int i5, int i6) \
                    {                                                       \
                            i3 = i1 + i2;                                   \
                            i4 = i2 + i3;                                   \
                            i5 = i3 + i4;                                   \
                            i6 = func##j(i1, i2, i3, i4, i5, i6);           \
                            return (i3 + i4 + i5 + i6);                     \
                    }
        
    This is with a hot cache, where the first store to the stack won't suffer a page fault. Since in reality functions do something more complicated, the actual hit should be much smaller. If it turns out that the -save_args option does affect the performance of your particular application, you can always turn it off in production code.
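
For anyone who wants to try a similar measurement, here is a minimal sketch of a harness built around the FUNC macro above. It is not the harness behind the numbers quoted here: the terminating function func21, the ns_per_chain helper, and the use of clock_gettime with CLOCK_MONOTONIC are all my assumptions. You would build it twice, with and without -Wu,-save_args, and compare the results. Note that an optimizing compiler may inline the whole chain, which would defeat the measurement.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime() */
#endif
#include <assert.h>
#include <time.h>

#define FUNC(i, j) \
        static int      \
        func##i(int i1, int i2, int i3, int i4, int i5, int i6) \
        {                                                       \
                i3 = i1 + i2;                                   \
                i4 = i2 + i3;                                   \
                i5 = i3 + i4;                                   \
                i6 = func##j(i1, i2, i3, i4, i5, i6);           \
                return (i3 + i4 + i5 + i6);                     \
        }

/* Hypothetical end of the chain: does no work and returns 0. */
static int
func21(int i1, int i2, int i3, int i4, int i5, int i6)
{
        (void) i1; (void) i2; (void) i3;
        (void) i4; (void) i5; (void) i6;
        return (0);
}

/* Callees are defined before their callers: func20 calls func21, etc. */
FUNC(20, 21) FUNC(19, 20) FUNC(18, 19) FUNC(17, 18) FUNC(16, 17)
FUNC(15, 16) FUNC(14, 15) FUNC(13, 14) FUNC(12, 13) FUNC(11, 12)
FUNC(10, 11) FUNC(9, 10)  FUNC(8, 9)   FUNC(7, 8)   FUNC(6, 7)
FUNC(5, 6)   FUNC(4, 5)   FUNC(3, 4)   FUNC(2, 3)   FUNC(1, 2)

/* Average wall-clock nanoseconds for one trip down the 20-deep chain. */
static double
ns_per_chain(long iterations)
{
        struct timespec start, end;
        volatile unsigned long sink = 0;        /* keeps the calls alive */
        double elapsed;
        long n;

        assert(iterations > 0);
        (void) clock_gettime(CLOCK_MONOTONIC, &start);
        for (n = 0; n < iterations; n++)
                sink += (unsigned long)func1((int)(n & 1), 2, 0, 0, 0, 0);
        (void) clock_gettime(CLOCK_MONOTONIC, &end);

        elapsed = (double)(end.tv_sec - start.tv_sec) * 1e9 +
            (double)(end.tv_nsec - start.tv_nsec);
        return (elapsed / (double)iterations);
}
```

Calling ns_per_chain with a large iteration count after a short warm-up run gives the hot-cache case described above.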

  • Why was it implemented as callee-saved instead of caller-saved?

    • Smaller code size when functions are called by many callers.
    • Avoids useless argument saving when calling assembly functions.
    • Can be enabled only on the module that's being debugged.


  • So what does the output look like?

    Ha, I thought you would never ask!
    
    stack pointer for thread fffffe8123debe80: fffffe80006296c0
    [ fffffe80006296c0 unix`_resume_from_idle+0xde() ]
      fffffe8000629700 unix`swtch+0x241()
      fffffe8000629730 genunix`cv_wait+0x83(ffffffff82a44ed8, ffffffff82a44ed0)
      fffffe80006297a0 ufs`ufs_check_lockfs+0x14c(ffffffff82a44e00, ffffffff82a44eb0, 80000030)
      fffffe8000629800 ufs`ufs_lockfs_begin+0x14e(ffffffff82a44e00, fffffe8000629840, 80000030)
      fffffe8000629920 ufs`ufs_readlink+0x7e(ffffffff90377300, fffffe8000629980, ffffffff832e9428)
      fffffe8000629950 genunix`fop_readlink+0x24(ffffffff90377300, fffffe8000629980, ffffffff832e9428)
      fffffe80006299d0 genunix`pn_getsymlink+0x66(ffffffff90377300, fffffe8000629b20, ffffffff832e9428)
      fffffe8000629bc0 genunix`lookuppnvp+0x3f5(fffffe8000629ca0, 0, 1, 0, fffffe8000629e10, ffffffff8c907b80)
      fffffe8000629c60 genunix`lookuppnat+0x13e(fffffe8000629ca0, 0, 1, 0, fffffe8000629e10, 0)
      fffffe8000629d40 genunix`lookupnameat+0x88(805bd38, 0, 1, 0, fffffe8000629e10, 0)
      fffffe8000629dd0 genunix`cstatat_getvp+0x17d(ffd19553, 805bd38, 1, 1, fffffe8000629e10, fffffe8000629e18)
      fffffe8000629e60 genunix`cstatat32+0x68(ffd19553, 805bd38, 1, fcfdbef8, 0, 10)
      fffffe8000629e80 genunix`stat32+0x33(805bd38, fcfdbef8)
      fffffe8000629eb0 genunix`xstat32+0x26(2, 805bd38, fcfdbef8)
      fffffe8000629f00 unix`sys_syscall32+0x1ff()
    
        


( May 16 2005, 08:58:25 AM PDT / May 13 2005, 10:31:05 AM PDT ) Permalink Comments [4]
Trackback: http://blogs.sun.com/sherrym/entry/obtaining_function_arguments_on_amd64

20050506 Friday May 06, 2005

 Welcome

I currently work in Solaris Kernel Development at Sun Microsystems. My projects over the last 1 1/2 years include:
  • Solaris port to AMD64 platforms, for which we won the 2005 Chairman's Award.
  • Improved write performance by 80-120% on AMD64 as measured by libMicro.
  • Got -save_args option implemented by Sun Studio compilers for AMD64 so that function arguments passed via register are available to the debugger (more on this later).
  • Improved debuggability on AMD64 in general.
Prior to this new adventure in x86 land, I spent 6 1/2 years working in Sun's Enterprise Server Group, mostly on the SunFire 4800-6800 product line (code name Serengeti). I designed and implemented:
  • POST (Power On Self Test)
  • Parts of the System Controller software (test sequencer, domain console and domain communication channel)
  • The Solaris driver for communicating with the system controller
  • The Solaris drivers for DR (Dynamic Reconfiguration).
Prior to Sun I worked at Intel, and still have fond memories of the Pentium II launch party held at OMSI.

In addition to my day job, I also play mommy for two wonderful young children. When at times I exclaimed, "I found the bug!", my son would respond with the same enthusiasm, "Did you kill it?".


( Feb 20 2007, 02:44:06 PM PST / May 06 2005, 10:14:16 AM PDT ) Permalink Comments [78]
Trackback: http://blogs.sun.com/sherrym/entry/welcome

