Andrew Rutz's blog
Who needs a compiler?
Yes, compilers are useful tools if one is going to run the created executable MANY times, but if the executable is only run a FEW times, then Interpretation is the way to go... or... if you were in the situation I was in, then binary rewriting is the best of both worlds.
I wanted to try a fix to libpicldevtree.so, but I did not have a Solaris 9
development environment at my disposal, and I did not want to wait the several hours it takes
to download and build the closure of the files needed to build my lib.
The library is loaded by picld, which runs as a daemon, so I attached mdb to the running
picld and disassembled the function of interest. Specifically, the instruction at offset
0x28 was the one that needed to be modified:
# ps -ef | grep pi[cl]
root 341 1 0 16:15:16 ? 0:01 /usr/lib/picl/picld
# mdb -p 341
Loading modules: [ ld.so.1 libthread.so.1 libc.so.1 libnvpair.so.1 ]
> libdevinfo_init::dis -n 20
libpicldevtree.so.1`libdevinfo_init: save %sp, -0x70, %sp
libpicldevtree.so.1`libdevinfo_init+4: call +8 <libpicldevtree.so.1`libdevinfo_init+0xc>
libpicldevtree.so.1`libdevinfo_init+8: sethi %hi(0x13800), %o1
libpicldevtree.so.1`libdevinfo_init+0xc: mov %i0, %l3
libpicldevtree.so.1`libdevinfo_init+0x10: add %o1, 0x218, %o1
libpicldevtree.so.1`libdevinfo_init+0x14: mov %i1, %l2
libpicldevtree.so.1`libdevinfo_init+0x18: add %o1, %o7, %l1
libpicldevtree.so.1`libdevinfo_init+0x1c: sethi %hi(0xdc00), %o1
libpicldevtree.so.1`libdevinfo_init+0x20: ld [%l1 + 0x16c], %o0
libpicldevtree.so.1`libdevinfo_init+0x24: call +0x140c8 <PLT=libdevinfo.so.1`di_init>
libpicldevtree.so.1`libdevinfo_init+0x28: add %o1, 0x307, %o1
libpicldevtree.so.1`libdevinfo_init+0x2c: orcc %g0, %o0, %l0
I wanted to change the instruction so that 0x327 was added to %o1, not 0x307. This can
easily be done to the executing process by using these mdb commands:
> libdevinfo_init+0x28/i
libpicldevtree.so.1`libdevinfo_init+0x28: add %o1, 0x307, %o1
> libdevinfo_init+0x28/X
libpicldevtree.so.1`libdevinfo_init+0x28: 92026307
> libdevinfo_init+0x28/W92026327
libpicldevtree.so.1`libdevinfo_init+0x28: 0x92026307 = 0x92026327
> libdevinfo_init+0x28/i
libpicldevtree.so.1`libdevinfo_init+0x28: add %o1, 0x327, %o1
However, picld is started during init(1M) processing, so the modification had to be made to the binary file on disk in order to survive each of the several reboots that I was doing. (I was measuring the modification's effect on boot time, and therefore my needs were those of someone who picks Interpretation over Compilation: I only needed to run my binary a few times; I was experimenting; I was not yet "married" to my modification.)
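A note on the /W write above: the two words differ only in the instruction's immediate field. Here is a minimal sketch of how 0x92026327 follows from 0x92026307, assuming the SPARC V8 format-3 layout (op in bits 31..30, rd in 29..25, op3 in 24..19, rs1 in 18..14, the i bit in 13, and a 13-bit immediate in bits 12..0). The variable names are mine and purely illustrative.

```shell
#!/bin/sh
# Sketch only: derive the patched instruction word from the original one.
# Assumes the SPARC V8 format-3 encoding with a 13-bit immediate in bits 12..0.
old=0x92026307      # add %o1, 0x307, %o1 (from the mdb /X output)
new_imm=0x327       # the immediate we want instead

# Pull out a few fields to confirm we are looking at the right instruction.
rd=$(( ($old >> 25) & 0x1f ))    # destination register number (9 is %o1)
rs1=$(( ($old >> 14) & 0x1f ))   # source register number (9 is %o1)
imm=$(( $old & 0x1fff ))         # current immediate (0x307)

# Clear the low 13 bits, then OR in the new immediate.
new=$(( ($old & ~0x1fff) | $new_imm ))
new_hex=$(printf '%08x' "$new")

printf 'rd=%d rs1=%d imm=0x%x new=0x%s\n' "$rd" "$rs1" "$imm" "$new_hex"
```

The resulting word is the value handed to /W in the mdb session, and the same four bytes that the dd script further down writes into the file.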
Since the change had to live in the file itself, I used the wonderful utility dd(1M) to
replace the original ADD instruction in libpicldevtree.so with the ADD instruction that I
needed. The following script does the trick:
# cat a.sh
#!/usr/bin/ksh
in=libpicldevtree.so.1
out=${in}.new
# copy the first 4483 four-byte "blocks" from the input-file to the output-file.
# note that the blocks are numbered from 0 to 4482.
#
dd if=$in of=$out bs=4 count=4483
# write the 32 bits of our new ADD instruction to the output-file at block 4483:
#
echo "\0222\002\0143\047" | dd of=$out bs=4 seek=4483 count=1
# skip over the four-byte blocks (0-4483) in both the input-file and output-file
# and then commence copying each four-byte block from input-file to output-file until
# end-of-file is reached.
dd if=$in of=$out bs=4 skip=4484 oseek=4484
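The three dd passes above can also be collapsed into a copy followed by one in-place write. What follows is a sketch of that alternative, not the script I actually ran: patch_word is a hypothetical helper name, it relies on dd's conv=notrunc operand so that the rest of the output file is left untouched, and it uses printf '%b' rather than echo to turn the octal escapes into bytes.

```shell
#!/bin/sh
# Sketch only: patch one four-byte block of a copy of FILE.
# patch_word is a hypothetical helper, not part of any standard tool.
#   $1 = file to copy and patch (the result is written to $1.new)
#   $2 = zero-based index of the four-byte block to overwrite
#   $3 = the four new bytes as octal escapes, e.g. '\0222\002\0143\047'
patch_word() {
    file=$1
    block=$2
    escapes=$3

    # Start from a full copy so every other byte is already in place.
    cp "$file" "$file.new" || return 1

    # printf '%b' expands the \0ooo escapes into raw bytes; conv=notrunc
    # stops dd from truncating the output file after the written block.
    printf '%b' "$escapes" |
        dd of="$file.new" bs=4 seek="$block" count=1 conv=notrunc 2>/dev/null
}
```

For the library in this post the call would be patch_word libpicldevtree.so.1 4483 '\0222\002\0143\047'.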
The only remaining issue is: "where did all those magic numbers come from?"
We begin by disassembling the libdevinfo_init function using the dis(1) command:
# dis -F libdevinfo_init libpicldevtree.so.1
**** DISASSEMBLER ****
disassembly for libpicldevtree.so.1
section .text
libdevinfo_init()
45e4: 9d e3 bf 90 save %sp, -0x70, %sp
45e8: 40 00 00 02 call 0x45f0
45ec: 13 00 00 4e sethi %hi(0x13800), %o1
45f0: a6 10 00 18 mov %i0, %l3
45f4: 92 02 62 18 add %o1, 0x218, %o1
45f8: a4 10 00 19 mov %i1, %l2
45fc: a2 02 40 0f add %o1, %o7, %l1
4600: 13 00 00 37 sethi %hi(0xdc00), %o1
4604: d0 04 61 6c ld [%l1 + 0x16c], %o0
4608: 40 00 50 32 call 0x186d0
460c: 92 02 63 07 add %o1, 0x307, %o1
4610: a0 90 00 08 orcc %g0, %o0, %l0
We can see our ADD instruction (92 02 63 07) at byte offset 0x460c, which is relative to
the start of the binary file. Our goal is, therefore, to copy bytes 0 through (0x460c - 1),
replace the four bytes at 0x460c, then copy the remaining bytes starting at (0x460c + 4)
until the end-of-file.
Even though 0x460c is a hexadecimal number in units of bytes, it is easier to think of our problem in units of 32-bit words, as this is the size of the item we are re-writing (an instruction, which is four bytes on this SPARC binary). The following command computes how many four-byte "blocks" reside before byte offset 0x460c:
# echo "460c%4=D" | mdb
4483
This command divides the hexadecimal number 0x460c by the number of bytes in a four-byte
block (in mdb, % is integer division) and then prints the result in decimal, as that is
the radix that dd uses. The "prolific" use of "4483" in the above script now becomes
evident: we copy blocks 0 through 4482, re-write block 4483, then copy blocks 4484 until
the end-of-file.
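If mdb is not at hand, the same arithmetic can be sketched with plain shell arithmetic. The variable names here are mine; the remainder check is an extra sanity test that the offset really falls on a four-byte boundary.

```shell
#!/bin/sh
# Sketch only: turn the byte offset of the instruction into a dd block index.
offset=0x460c                 # byte offset of the ADD instruction, from dis(1)

block=$(( $offset / 4 ))      # four-byte blocks that precede the instruction
rem=$(( $offset % 4 ))        # must be 0, since instructions are word-aligned

echo "block=$block rem=$rem"  # prints "block=4483 rem=0"
```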
The only "incantation" left to resolve is this line in the script:
# write the 32 bits of our new ADD instruction to the output-file at block 4483:
#
echo "\0222\002\0143\047" | dd of=$out bs=4 seek=4483 count=1
This line writes the four bytes of our new ADD instruction (0x92026327), spelled as octal escapes, into the pipe. dd reads them from stdin as its input-file and writes this four-byte block to the output-file. echo also emits a trailing newline, but bs=4 count=1 makes dd stop after the first four bytes. The seek argument tells dd to move the current file pointer (in the output-file) to block index 4483; the writing commences there.
0x92026327 was shown above to be the encoding of the ADD instruction that we want. Since
echo only seems to be able to emit a raw byte when it is written as an octal escape, we
have to compute the octal representation of each of the four eight-bit bytes of
0x92026327. Using mdb, one gets:
sh> mdb
> 92=O
0222
> 2=O
02
> 63=O
0143
> 27=O
047
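The same per-byte conversion can be sketched in plain shell, which saves retyping each byte into mdb. This is an illustrative variant with variable names of my own choosing; it produces \02 for the second byte rather than \002, which denotes the same byte.

```shell
#!/bin/sh
# Sketch only: build the octal-escape string for a 32-bit instruction word,
# most significant byte first (SPARC is big-endian).
word=0x92026327
escapes=""

for shift in 24 16 8 0; do
    # Isolate one byte of the word.
    byte=$(( ($word >> shift) & 0xff ))
    # Append it as a backslash, a leading 0, and the byte's octal digits.
    escapes="$escapes$(printf '\\0%o' "$byte")"
done

printf '%s\n' "$escapes"    # prints \0222\02\0143\047
```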
We can check our work by issuing this command and looking at the first four bytes
(9202 6327), which represent the instruction that we modified:
# dd if=libpicldevtree.so.1.new bs=4 skip=4483 | od -x |more
0000000 9202 6327 a090 0008 1280 0004 0100 0000
0000020 81c7 e008 91e8 2001 4000 502e 0100 0000
0000040 e204 6024 9290 0008 1280 0004 d024 6000
0000060 81c7 e008 91e8 2001 9010 0018 9210 0019
[...]
The checksums of the two files differ, as they must once a byte has changed (here by 32, which is 0x27 - 0x07), but the number of blocks and the number of bytes are identical:
# sum libpicldevtree.so.1*
10840 162 libpicldevtree.so.1.new
10808 162 libpicldevtree.so.1.orig
# ls -l libpicldevtree.so.1*
-rw-r--r-- 1 root other 82692 Jun 25 15:46 libpicldevtree.so.1.new
-rwxr-xr-x 1 root sys 82692 Jul 4 2005 libpicldevtree.so.1.orig
In the end, after manually stopping picld, installing my binary, and re-starting picld, the output of prtdiag was correct, which meant that picld had successfully loaded. Also, /var/adm/messages did not show any errors.
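A stricter check than sum and ls is to count exactly how many bytes differ between the two files. The sketch below uses cmp -l, which lists every differing byte; count_changed_bytes is a hypothetical helper name of mine. For the patch in this post I would expect it to report 1, since only the byte 0x07 became 0x27.

```shell
#!/bin/sh
# Sketch only: report how many byte positions differ between two files of
# equal length. count_changed_bytes is a hypothetical helper name.
count_changed_bytes() {
    # cmp -l prints one line per differing byte (1-based decimal offset,
    # then the two byte values in octal); count those lines.
    cmp -l "$1" "$2" | wc -l | tr -d ' '
}
```

It would be called as count_changed_bytes libpicldevtree.so.1.orig libpicldevtree.so.1.new; running cmp -l directly also shows where the differing byte sits.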
In short, a quick script to perform some binary re-writing greatly improved the turnaround time of my experiment. Have fun (and danger!) with your own binary-rewriting. One last tip: the binary encoding for a SPARC NOP instruction is 0x01000000. ...or, in our "pig-octal", it is:
echo "\01\00\00\00"
Posted at 04:47PM Jun 25, 2007 by Andrew Rutz in debug