Friday July 16, 2004

Buffer overflow, register window and register allocation.



I work on Sun's compiler, mainly the SPARC code generator. An inevitable (sometimes boring, sometimes the most interesting) part of my job is evaluating bugs and, of course, fixing them if I can. But as any engineer working on complex software knows, more often than not a bug turns out to be a user error. In a compiler's case, that can mean the user's code has a bug.


This is a story of one recent case of not-a-bug.

One of our largest ISVs filed a bug saying their application received SIGSEGV when compiled at -xO4 or above with our S1S8 compiler. The program worked just fine with WS6U2 at the same optimization level, so the customer naturally thought this was a compiler bug. I can't fault them for that, since they had experienced quite a few compiler bugs in the past.

Because the bug went away whenever you turned off the global register allocator, it was sent to me (since I was the author of the register allocator). This particular ISV application was one of the most difficult ones to deal with, because this ISV, like most other large ISVs, does not allow their code to be shipped to us, thus we have to rely on either their engineer or our support engineer working on their site.

Since there's always a possibility of a user error, running dbx's rtc or a purify-like tool is one way to rule out some of the most common programming errors. Unfortunately, this application was too large and complex for dbx rtc or purify to handle correctly and produce a useful report.

The symptom was quite simple: the program got a SEGV, and at the time of the SEGV the stack trace showed that one pointer parameter had the upper 32 bits of its 64-bit pointer zeroed. So obviously the caller of the function was the first suspect. Upon manual inspection of the disassembly, the caller's code was clearly correct. It looked like the following:

add %fp,1xxx,%l0

...bunch code including many calls...

call problematic_func
mov %l0,%o0


In dbx, %l0 contained the correct value right after the add, but somehow the upper 32 bits of %l0 had been zeroed out by the time control reached the problematic call. Further dbx output showed that %l0 changed after a call to one particular function.

Assuming the save and restore instructions are correctly placed, the only other way to modify %l0 is to change the register window save area. It just so happens that %l0 is the first entry in the register window save area. Since SPARC is big-endian, the upper 32 bits (the most significant half) are stored at the lower address. This all suggested that the function in question was overwriting the first 32 bits of the register window save area. This can happen, among other ways, if there's a buffer overflow on a local array. Because the compiler allocates stack space for local variables from the higher address to the lower address in order of appearance in the source, the first variable is usually placed at the top, right below the current %fp (or the %sp of the caller). Of course, optimization can move things around and get rid of variables, and most scalar variables are allocated in registers, so there's no guarantee that the above rule holds.
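To make the big-endian point concrete, here is a minimal sketch in portable C. It models an 8-byte save slot as a byte array rather than touching a real SPARC register window, and the names (store_be64, load_be64, clobber_first_word) are made up for illustration. It shows that a stray 4-byte store of zero at the slot's lowest address wipes out exactly the upper half of the saved 64-bit value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 64-bit value into an 8-byte slot in big-endian order:
   the most significant byte goes to the lowest address. */
static void store_be64(unsigned char slot[8], uint64_t value) {
    for (int i = 0; i < 8; i++) {
        slot[i] = (unsigned char)(value >> (56 - 8 * i));
    }
}

/* Read the 8-byte big-endian slot back into a 64-bit value. */
static uint64_t load_be64(const unsigned char slot[8]) {
    uint64_t value = 0;
    for (int i = 0; i < 8; i++) {
        value = (value << 8) | slot[i];
    }
    return value;
}

/* Hypothetical helper: save a pointer-sized value, then model a stray
   32-bit store of zero to the first (lowest) four bytes of the slot. */
uint64_t clobber_first_word(uint64_t pointer_value) {
    unsigned char slot[8];
    store_be64(slot, pointer_value);
    assert(load_be64(slot) == pointer_value); /* round-trip sanity check */
    memset(slot, 0, 4);                       /* overwrite the low-address half */
    return load_be64(slot);
}
```

For example, clobber_first_word(0xFFFFFFFF7FFFE000ULL) comes back as 0x000000007FFFE000: the lower 32 bits survive and the upper 32 bits are zero, which matches the symptom seen in the stack trace.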

The preprocessed source code for the function in question looked like the following:

returntype func(something *ptr,...) {
   wchar_t a[81];
   wchar_t b[81];

   ...initialize b by calling some initfunc...
   for(i = 0;i < wslen(b);i++) {
      ...do some operation on b[i]...
   }
   b[i] = 0;
   ...more code...
}


The array "a" wasn't used in the function, so the compiler didn't bother to allocate it on the stack. Thus "b" was at the top of the stack. If b were to overflow, the window save area could be overwritten, i.e. b[81] = 0 would overwrite the upper 32 bits of the %l0 save area.
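The layout argument can be sketched as a small simulation. This is only a model under stated assumptions: wchar_t is taken to be 4 bytes, the frame is a flat array of 32-bit words with b[0..80] directly below the two words of the caller's %l0 save slot, and the V9 stack bias is ignored. The function and constant names are hypothetical, and the actual bug in initfunc is not described here, so the sketch only shows what the final "b[i] = 0" does for a given index.

```c
#include <assert.h>
#include <stdint.h>

enum {
    B_ELEMS     = 81,           /* wchar_t b[81], assumed 4 bytes per element */
    SAVE_HI     = B_ELEMS,      /* word holding the upper half of saved %l0 */
    SAVE_LO     = B_ELEMS + 1,  /* word holding the lower half of saved %l0 */
    FRAME_WORDS = B_ELEMS + 2
};

/* Model the frame, write the terminator at b[index], and return what the
   caller would get back in %l0 when its register window is restored. */
uint64_t saved_l0_after_terminator(uint64_t l0, int index) {
    uint32_t frame[FRAME_WORDS];
    assert(index >= 0 && index < FRAME_WORDS); /* stay inside the model */

    for (int i = 0; i < B_ELEMS; i++) {
        frame[i] = 'x';                        /* contents of b */
    }
    frame[SAVE_HI] = (uint32_t)(l0 >> 32);     /* big-endian: high half first */
    frame[SAVE_LO] = (uint32_t)l0;

    frame[index] = 0;                          /* the "b[i] = 0" store */

    return ((uint64_t)frame[SAVE_HI] << 32) | frame[SAVE_LO];
}
```

With index 80 (the last valid element) the saved value is untouched. With index 81, one past the end, the upper half of the saved %l0 becomes zero while the lower half is preserved.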

After hearing the above analysis, our support engineer looked at the code of the initfunc and found a bug as expected, and the bug was closed as not a bug.

One may wonder why this code worked fine in the past. That's because %l0 wasn't live across that particular function call. The moral of the story is that any slight change in register assignment can reveal a user error.

( Jul 16 2004, 03:44:06 PM PDT )
Comments:

This is an awesome blog entry!! Do you happen to have an example stack frame to illustrate where each item (locals, stack frame save area, etc.) is actually stored? I am curious to see where in memory these items would go. Thanks for the awesome post, - Ryan

Posted by Ryan on September 26, 2004 at 06:28 AM PDT #

Check out here or here. Or, you can even take a look at the SPARC Compliance Definition, which defines the SPARC application binary interface.

Posted by Seongbae Park on September 27, 2004 at 11:02 AM PDT #

Thanks for the links. I have read through this link before: http://www.sics.se/~psm/sparcstack.html but am confused by your statement "The array "a" wasn't used in the function, so the compiler didn't bother to allocate it on the stack. Thus "b" was at the top of the stack. If b was to overflow, the window save area could be overwritten - i.e. b[81] = 0 would overwrite the upper 32bit of %l0 save area." Wouldn't the local array be offset 0 bytes from the frame pointer, and the local register save area offset 0 bytes from the stack pointer? The diagram from that URL seems to indicate this (unless I am misinterpreting it). Thanks again for the awesome blog entries, - Ryan

Posted by Ryan on September 27, 2004 at 06:12 PM PDT #

Figure 3 in http://www.sics.se/~psm/sparcstack.html is a bit confusing: the high address is at the bottom. So the local array ("b" in my example) actually sits at fp-81 to fp-1 (inclusive), counting in array elements. fp is the sp of the caller, so the 4 bytes (or 8 bytes, in v9) at fp+0 are actually at sp+0 of the caller, and thus the caller's %l0 save area. Does this make sense?

Posted by Seongbae Park on September 28, 2004 at 09:09 AM PDT #

Yeah! Thanks for the clarification. - Ryan

Posted by Ryan on October 02, 2004 at 05:42 PM PDT #
