Andrew Rutz's blog

Monday Jun 25, 2007

Who needs a compiler?

Yes, compilers are useful tools if one is going to run the resulting executable MANY times, but if the executable is only run a FEW times, then interpretation is the way to go... or, if you are in the situation I was in, binary rewriting gives you the best of both worlds.

I wanted to try a fix to libpicldevtree.so, but I did not have a Solaris 9 development environment at my disposal, and I did not want to wait the several hours it takes to download and build the closure of the files needed to build my lib.

The library is loaded by picld, which runs as a daemon. I attached mdb to the running process and disassembled the function of interest, libdevinfo_init. The instruction at offset 0x28 within that function was the one that needed to be modified:

# ps -ef | grep pi[cl]
    root   341     1  0 16:15:16 ?        0:01 /usr/lib/picl/picld
# mdb -p 341
Loading modules: [ ld.so.1 libthread.so.1 libc.so.1 libnvpair.so.1 ]
> libdevinfo_init::dis -n 20
libpicldevtree.so.1`libdevinfo_init:    save      %sp, -0x70, %sp
libpicldevtree.so.1`libdevinfo_init+4:  call      +8            <libpicldevtree.so.1`libdevinfo_init+0xc>
libpicldevtree.so.1`libdevinfo_init+8:  sethi     %hi(0x13800), %o1
libpicldevtree.so.1`libdevinfo_init+0xc:mov       %i0, %l3
libpicldevtree.so.1`libdevinfo_init+0x10:       add       %o1, 0x218, %o1
libpicldevtree.so.1`libdevinfo_init+0x14:       mov       %i1, %l2
libpicldevtree.so.1`libdevinfo_init+0x18:       add       %o1, %o7, %l1
libpicldevtree.so.1`libdevinfo_init+0x1c:       sethi     %hi(0xdc00), %o1
libpicldevtree.so.1`libdevinfo_init+0x20:       ld        [%l1 + 0x16c], %o0
libpicldevtree.so.1`libdevinfo_init+0x24:       call      +0x140c8      <PLT=libdevinfo.so.1`di_init>
libpicldevtree.so.1`libdevinfo_init+0x28:       add       %o1, 0x307, %o1
libpicldevtree.so.1`libdevinfo_init+0x2c:       orcc      %g0, %o0, %l0
I wanted to change the instruction so that 0x327 was added to %o1, not 0x307. This can easily be done to the executing process by using these mdb commands:
> libdevinfo_init+0x28/i
libpicldevtree.so.1`libdevinfo_init+0x28:       add       %o1, 0x307, %o1
> libdevinfo_init+0x28/X
libpicldevtree.so.1`libdevinfo_init+0x28:       92026307
> libdevinfo_init+0x28/W92026327
libpicldevtree.so.1`libdevinfo_init+0x28:       0x92026307      =       0x92026327
> libdevinfo_init+0x28/i
libpicldevtree.so.1`libdevinfo_init+0x28:       add       %o1, 0x327, %o1
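Why does changing 0x307 to 0x327 only change the low bits of the word? In the SPARC V8 format-3 layout, the immediate operand occupies the low 13 bits of the instruction. The following is a minimal sketch of that encoding, assuming a shell with 64-bit arithmetic; the function name encode_add_imm is hypothetical and only covers the immediate form of ADD:

```shell
#!/bin/sh
# encode_add_imm RD RS1 IMM
# Print the 32-bit encoding of "add %rs1, IMM, %rd" (immediate form).
# Fields: op=2 [31:30], rd [29:25], op3=0 for add [24:19],
#         rs1 [18:14], i=1 [13], simm13 [12:0]
encode_add_imm() {
    printf '%08x\n' $(( (2 << 30) | ($1 << 25) | (0 << 19) | ($2 << 14) | (1 << 13) | ($3 & 0x1fff) ))
}

# %o1 is register number 9 (the "out" registers are numbers 8 through 15).
encode_add_imm 9 9 0x307   # prints 92026307
encode_add_imm 9 9 0x327   # prints 92026327
```

The two results match the words that mdb displayed before and after the /W command.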
However, picld is started during init(1M) processing, so the modification had to be made to the library file on disk in order to survive each of the several reboots that I was doing. (I was measuring the modification's effect on boot time, so my needs resembled those of someone who chooses interpretation over compilation: I only needed to run my binary a few times, I was experimenting, and I was not yet "married" to my modification.)

As a result, I used the wonderful utility dd(1M) to replace the original ADD instruction in libpicldevtree.so with the ADD instruction that I needed. The following script does the trick:

# cat a.sh
#!/usr/bin/ksh

in=libpicldevtree.so.1
out=${in}.new

# copy the first 4483 (decimal) four-byte "blocks" from the input-file to the output-file.
#  note that the blocks are numbered from 0 to 4482.
#
dd if=$in of=$out bs=4 count=4483

# write the 32 bits of our new ADD instruction to the output-file at block 4483:
#
echo "\0222\002\0143\047" | dd of=$out bs=4 seek=4483 count=1

# skip over the four-byte blocks (0-4483) in both the input-file and output-file
#  and then commence copying each four-byte block from input-file to output-file until
#  end-of-file is reached.
dd if=$in of=$out bs=4 skip=4484 oseek=4484
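An alternative to the three-pass copy is to copy the whole file once and then overwrite the one block in place. This is only a sketch under assumptions, not what the script above does: it assumes a dd that supports the POSIX conv=notrunc operand and a printf that accepts octal escapes in its format string, and the function name patch_word is hypothetical:

```shell
#!/bin/sh
# patch_word FILE BLOCK ESCAPES
# Overwrite the four-byte block at index BLOCK of FILE, in place.
# ESCAPES is a printf format made of four octal escapes, e.g. '\222\002\143\047'.
patch_word() {
    file=$1
    block=$2
    escapes=$3
    # printf emits exactly four bytes (no trailing newline, unlike echo).
    # conv=notrunc leaves the bytes after the patched block untouched.
    printf "$escapes" | dd of="$file" bs=4 seek="$block" count=1 conv=notrunc 2>/dev/null
}

# Example, run against a copy so the original stays intact:
#   cp libpicldevtree.so.1 libpicldevtree.so.1.new
#   patch_word libpicldevtree.so.1.new 4483 '\222\002\143\047'
```

Working on a copy keeps the original available for a quick rollback, which matters when the file is loaded at boot.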
The only remaining issue is: "where did all those magic numbers come from?"

We begin by disassembling the libdevinfo_init function using the dis(1) command:

# dis -F libdevinfo_init libpicldevtree.so.1
                ****   DISASSEMBLER  ****

disassembly for libpicldevtree.so.1

section .text
libdevinfo_init()
        45e4:  9d e3 bf 90         save         %sp, -0x70, %sp
        45e8:  40 00 00 02         call         0x45f0
        45ec:  13 00 00 4e         sethi        %hi(0x13800), %o1
        45f0:  a6 10 00 18         mov          %i0, %l3
        45f4:  92 02 62 18         add          %o1, 0x218, %o1
        45f8:  a4 10 00 19         mov          %i1, %l2
        45fc:  a2 02 40 0f         add          %o1, %o7, %l1
        4600:  13 00 00 37         sethi        %hi(0xdc00), %o1
        4604:  d0 04 61 6c         ld           [%l1 + 0x16c], %o0
        4608:  40 00 50 32         call         0x186d0
        460c:  92 02 63 07         add          %o1, 0x307, %o1
        4610:  a0 90 00 08         orcc         %g0, %o0, %l0
We can see our ADD instruction (add %o1, 0x307, %o1) at byte offset 0x460c, which is relative to the start of the binary file. Our goal is, therefore, to copy bytes 0 through (0x460c - 1), replace the four bytes at 0x460c, then copy the remaining bytes starting at (0x460c + 4) until the end-of-file.

Even though 0x460c is a hexadecimal number in units of bytes, it is easier to think of our problem in units of 32-bit words, as this is the size of the item we are re-writing (an instruction, which is four bytes in this SPARC binary). The following command computes how many four-byte "blocks" reside before byte offset 0x460c:

# echo "460c%4=D" | mdb
                4483
This command divides the hexadecimal number 0x460c by the number of bytes in a four-byte block and then translates the value into decimal, as that is the radix that dd uses. The "prolific" use of "4483" in the above script now becomes evident: we copy blocks 0 through 4482, re-write block 4483, then copy blocks 4484 until the end-of-file.
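If mdb is not at hand, the same block index can be computed with plain shell arithmetic. A minimal sketch, assuming a POSIX shell whose arithmetic expansion accepts hexadecimal constants; the variable names are illustrative:

```shell
#!/bin/sh
offset=0x460c

# dd can only seek in whole blocks, so the offset must be a multiple of 4.
[ $(( $offset % 4 )) -eq 0 ] || echo "offset is not word-aligned" >&2

block=$(( $offset / 4 ))
echo "$block"    # prints 4483
```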

The only "incantation" left to resolve is this line in the script:

# write the 32 bits of our new ADD instruction to the output-file at block 4483:
#
echo "\0222\002\0143\047" | dd of=$out bs=4 seek=4483 count=1
This line has echo write the four bytes of our new ADD instruction (0x92026327), spelled as octal escapes, to stdout, which the pipe connects to dd's stdin. Because no if= argument is given, dd reads its input from stdin and writes this single four-byte block to the output-file. (echo also emits a trailing newline, but count=1 limits dd to one four-byte block.) The seek argument tells dd to move the current file pointer (in the output-file) to block index 4483; the writing commences there.

0x92026327 was shown above to be the encoding of the ADD instruction that we want. Since echo can only emit an arbitrary byte through an octal escape (\0 followed by up to three octal digits), we have to compute the octal representation of each of the four eight-bit bytes of 0x92026327. Using mdb, one gets:

sh> mdb
> 92=O
                0222
> 2=O
                02
> 63=O
                0143
> 27=O
                047
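The four conversions can also be done in one step with shell arithmetic and printf. This is a sketch, assuming a shell with 64-bit arithmetic; the function name word_to_echo_escapes is hypothetical. It always emits three octal digits after the \0 prefix, so it prints \0002 where the script above writes \002; both denote the byte 0x02:

```shell
#!/bin/sh
# word_to_echo_escapes WORD
# Print the four bytes of a 32-bit word, most significant byte first,
# as \0ddd octal escapes suitable for an XSI-style echo.
word_to_echo_escapes() {
    word=$(( $1 ))
    printf '\\0%03o\\0%03o\\0%03o\\0%03o\n' \
        $(( (word >> 24) & 0xff )) \
        $(( (word >> 16) & 0xff )) \
        $(( (word >> 8) & 0xff )) \
        $(( word & 0xff ))
}

word_to_echo_escapes 0x92026327   # prints \0222\0002\0143\0047
```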
We can check our work by issuing this command and looking at the first four bytes of output (9202 6327), which represent the instruction that we modified:
# dd if=libpicldevtree.so.1.new bs=4 skip=4483 | od -x |more
0000000 9202 6327 a090 0008 1280 0004 0100 0000
0000020 81c7 e008 91e8 2001 4000 502e 0100 0000
0000040 e204 6024 9290 0008 1280 0004 d024 6000
0000060 81c7 e008 91e8 2001 9010 0018 9210 0019
[...]
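A complementary check is to confirm that nothing else changed. The sketch below assumes a POSIX cmp, whose -l option prints one line per differing byte (1-based decimal offset, then the two byte values in octal); the function name changed_byte_count is hypothetical:

```shell
#!/bin/sh
# changed_byte_count ORIG NEW
# Print how many byte positions differ between two files of equal length.
changed_byte_count() {
    cmp -l "$1" "$2" | wc -l | tr -d ' '
}

# Example with the file names used in this post:
#   changed_byte_count libpicldevtree.so.1.orig libpicldevtree.so.1.new
#   cmp -l libpicldevtree.so.1.orig libpicldevtree.so.1.new
```

For this patch one would expect a count of 1, since only the last byte of the word changes (0x07 becomes 0x27); cmp -l should then show that single position with the octal values 7 and 47.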
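The greatest satisfaction would have been an identical checksum, but that cannot happen since one byte changed; the number-of-blocks and number-of-bytes, however, are identical. The checksum difference, 10840 - 10808 = 32 (0x20), equals 0x27 - 0x07, which is consistent with a single changed byte under a byte-sum checksum: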
# sum libpicldevtree.so.1*
10840 162 libpicldevtree.so.1.new
10808 162 libpicldevtree.so.1.orig
# ls -l libpicldevtree.so.1*
-rw-r--r--   1 root     other      82692 Jun 25 15:46 libpicldevtree.so.1.new
-rwxr-xr-x   1 root     sys        82692 Jul  4  2005 libpicldevtree.so.1.orig
In the end, though, after manually stopping picld, installing my binary, and re-starting picld, the output of prtdiag was correct, which meant that picld had successfully loaded. Also, /var/adm/messages did not show any errors.

As a result, a quick script to perform some binary re-writing greatly improved the turnaround time of my experiment. Have fun (and danger!) with your own binary-rewriting. One last tip: the binary encoding for a SPARC NOP instruction is 0x01000000 ...or, in our "pig-octal", it is:

echo "\01\00\00\00"
