I've got this new code to XDR encode in userland and XDR decode in the kernel. I can run the kernel and debug issues live, but that is bulky. So, instead, I'm going to write some tools to help me. First off, I already have routines for printing a policy, so I've already got the hard part out of the way.
Or do I?
You can't just call printf() in the kernel. And while you can use syslog() or cmn_err(), they have the nasty habit of inserting a line feed after every call. So this
printf("Hello ");
printf("Nurse!\n");
Would yield
Hello Nurse!
On a strict translation. So, the easiest way to proceed is to collect it all in a buffer:
void
spe_print_policy_list(spe_policy_t *policies)
{
spe_policy_t *sp;
char *buf;
int offset;
buf = kmem_zalloc(MAX_PRINT_BUF + 1, KM_SLEEP);
offset = 0;
for (sp = policies; sp; sp = sp->next) {
spe_print_policy(sp, buf, &offset);
cmn_err(CE_WARN, "%s", buf);
buf[0] ='\0';
offset = 0;
}
free(buf, MAX_PRINT_BUF + 1);
}
Note that this is debug code and probably will be thrown away. So, I still make sure to do garbage cleanup correctly, but I don't care about whether it should be CE_WARN or CE_NOTE.
Why am I passing an offset?
Well, we have the following functions, most of which emit information:
spe_bitcount(unsigned int n) spe_print_address(uint_t addr, char *buf) spe_print_netmask(uint_t mask, char *buf) spe_print_leaf(spe_leaf_t *sl, char *buf) spe_print_thunk(spe_thunk_t st, char *buf) spe_print_attribute(spe_interior_t *si, char *buf) spe_print_policy(spe_policy_t *sp, char *buf) spe_print_policy_list(spe_policy_t *policies)
If I pass in just the buffer, I could advance it every time I wrote:
len = sprintf(buf, "%c%u", sep, rev);
spe_print_fool(fool, &buf[len + 1]);
Note, I wrote that second line off the top of my head, don't trust it!
But that doesn't help you during a tree traversal. You print out the left side, come back out and you don't know much (or if at all) spe_print_fool changed the offset. So, instead I also pass in the offset to be used:
void
spe_print_netmask(uint_t mask, char *buf, int *poffset)
{
int t;
int len;
t = spe_bitcount(mask);
len = sprintf(&buf[*poffset], "/%u", t);
*poffset = len + 1;
}
This way, the caller of spe_print_netmask will know where to write.
Now, this may be embarrassing, but we really need to debug this debug code before we add it into the kernel. I.e., we want to make sure we are focused on debugging our real code and not the support code. Luckily, we can write some stand-alone code and test it out there.
Since I know the print code used to dump out a policy correctly, I know know that this hunch was correct:
[th199096@aus1500-home ~/tmp]> ./a.out 10, 200 20, 1 30 40,/
And here was the input:
10, 16, 64k, 35865356092523570, uid == 200096 20, 32, 2k, 35865356092523570, uid == 1066 30, 64, 1k, 35865356092523570, uid == 0 40, 2, 2k, 35865356092523570, subnet == 10.1.233.0/24
It is easy to see in the debugger that I reset poffset instead of incrementing it:
96 len = sprintf(&buf[*poffset], "%s",
8: *poffset = 28
7: len = 0
(gdb)
98 *poffset = len + 1;
8: *poffset = 28
7: len = 3
(gdb)
94 for (i = 0; i < Attribute_count; i++) {
8: *poffset = 4
7: len = 3
That should be *poffset += len + 1.
Better, but not a go:
[th199096@aus1500-home ~/tmp]> ./a.out 10, 16, 64000, 4054008882l 20, 32, 2000, 4054008882l 30, 64, 1000, 4054008882l 40, 2, 2000, 4054008882l
The offending line starts with:
len = sprintf(&buf[*poffset], "%u, %u, %u, %ul ", sp->sp_id, sp->sp_stripe_count,
sp->sp_interlace, sp->sp_guuids[0]);
I suspect that the %ul is coming out as %ul. I.e., print an unsigned followed by an 'l'. We see that the data is correct in memory:
(gdb) p *sp
$2 = {sp_id = 10, sp_stripe_count = 16, sp_interlace = 64000,
sp_attr_expr = 0x80687e0, sp_name = 0x0, sp_guuids = {35865356092523570, 0,
0, 0, 0, 0, 0, 0}, next = 0x80699f0}
Crud, that is a bad formatting and I've already fixed it once! Looks like I lost some code. :-< (see here for prior head banging Putting that throw away code to good use). And I'll bet the remaining problem has something to do with code changes I made due to the move to xdr!
And I would be wrong. Things look okay, but no change:
96 len = sprintf(&buf[*poffset], "%s", 5: spe_attribute_table[i].at_attr = SPE_ATTR_UID 4: buf = 0x8066748 "10, 16, 64000, 35865356092523570 " 3: sl->sl_attr = SPE_ATTR_UID 1: i = 13
I now suspect I am off by 1!
(gdb) p &buf[33] $5 = 0x8066769 "" (gdb) p &buf[34] $6 = 0x806676a "uid" (gdb) p spe_attribute_table[i].at_name $7 = 0x8054ded "uid"
Yes, and now Thunderbirds are go!
[th199096@aus1500-home ~/tmp]> ./a.out 10, 16, 64000, 35865356092523570 uid == 200096 20, 32, 2000, 35865356092523570 uid == 1066 30, 64, 1000, 35865356092523570 uid == 0 35, 2, 2000, 35865356092523570 subnet == 10.1.233.0/24 && gid == 55 40, 2, 2000, 35865356092523570 subnet == 10.1.233.0/24
Note I added a new test case to handle a more complex rule!
Hi Tom,
good job! About 'Lessons Learned':
>If your environment is too complex for easy testing, write a stand-alone test suite.
In our project we spend two years with writing our own test suite and finally adopted existing one for a month.
It's hard to get the code to be testable, but if it's the case try to use existing frameworks. There are lot of advanced staff available. 'Delegate' part of your work to some one else :)
Tigran.
Posted by Tigran Mkrtchyan on September 13, 2008 at 02:58 AM CDT #