« May 2008
SunMonTueWedThuFriSat
    
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
       
Today
XML

Tom Haynes

loghyr.com
excfb.com

Blogs to Gander At

Navigation

Editing

AllMarks

Referers

Today's Page Hits: 1630

Powered by Roller Weblogger.

statcounter.com

clustrmaps.com

Locations of visitors to this page

technorati.com

www.alesti.org

Add to Alesti RSS Reader

South Park as I was 10 years ago

South Park Fantasy

South Park today

South Park Reality

I have more hair and it isn't so grey. :->

10 years ago, really

Toon Tom

Today, literally

Tom Today

Site notes

This page validates as XHTML 1.0, and will look much better in a browser that supports web standards, but it is accessible to any browser or Internet device. It was created using techniques detailed at glish.com/css/.

Main | Next page »
20080328 Friday March 28, 2008
Followup om pNFS testing under VMWare

I got a pNFS community and client to run under VMWare on a XP box. Okay, so I made sure to make independent clones of the same machine as before. And this time I went from 512M to 1G of available RAM. The other thing I changed is that when building the cthon tests, I changed the config a bit.

I went from 512M to 1G because the archive updates were taking forever on the clones with 512M but went fast on the one with about 2G:

updating /platform/i86pc/boot_archive
updating /platform/i86pc/amd64/boot_archive

I'm talking 45 minutes or more. Once I pushed the memory up, these updated much faster.

I used the following configuration for using gcc on a 64bit OpenSolaris:

[tdh@m-client cthon04]> diff tests.init ~/cthon04/tests.init
57c57
< PATH=/opt/SUNWspro/bin:/usr/ccs/bin:/sbin:/bin:/usr/bin:/usr/ucb:/etc:.
---
> #PATH=/opt/SUNWspro/bin:/usr/ccs/bin:/sbin:/bin:/usr/bin:/usr/ucb:/etc:.
61c61
< #PATH=/opt/gnu/bin:/usr/ccs/bin:/sbin:/bin:/usr/bin:/usr/ucb:/etc:.
---
> PATH=/usr/sfw/bin:/usr/ccs/bin:/sbin:/bin:/usr/bin:/usr/ucb:/etc:.
133c133
< CC=/opt/SUNWspro/bin/cc
---
> #CC=/opt/SUNWspro/bin/cc
135c135
< #CC=/opt/gnu/bin/gcc
---
> CC=/usr/sfw/bin/gcc
138c138
< CFLAGS=`echo -DSVR4 -DMMAP -DSOLARIS2X -DSTDARG`
---
> #CFLAGS=`echo -DSVR4 -DMMAP -DSOLARIS2X -DSTDARG`
145c145
< #CFLAGS=`echo -DSVR4 -DMMAP -DSOLARIS2X -DSTDARG -m64`
---
> CFLAGS=`echo -DSVR4 -DMMAP -DSOLARIS2X -DSTDARG -m64 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE`
150c150
< LOCKTESTS=`echo tlocklfs tlock64`
---
> #LOCKTESTS=`echo tlocklfs tlock64`
152c152
< #LOCKTESTS=`echo tlocklfs`
---
> LOCKTESTS=`echo tlocklfs`

In the previous run, I hadn't set -D_LARGEFILE64_SOURCE and I didn't fix LOCKTESTS correctly. While the -D_LARGEFILE64_SOURCE might have been what was killing me, I don't think so.

The DS hung during the write/read of the 30 MB file. It was unresponsive on the console. I heard the disk chugging, I killed off Thunderbird and Firefox. And it did come back. My guess is that the 512M on the earlier systems was insufficient. I've had problems in the past with virtual machines that were trying large IO. (A real machine wrote a 50G file to a NFS simulator and they complained about the speed. They shut up when I had them go against the real box.)

So the experiment works. I'm working on VirtualBox on the side still.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Trying to do pNFS testing under VMWare

I'm trying to get a NFSv4.1 (aka pNFS) DS, MDS, and client all running as VMWare machines on my XP box. I took a base nevada 85 system (with a Core Custom load - which only eats up 1.4G of disk space) and loaded the on-pnfs-draft19-onnv85-bfu-20080324.i386.tar.bz2 bfu bits (check out our pNFS Download page in OpenSolaris: http://www.opensolaris.org/os/project/nfsv41/downloads/).

I then cloned the resulting system into m-client, m-ds, and m-mds. I was able to configure everything up okay, but system is locking up during the NFS Cthon tests:


write/read 30 MB file

After some investigation, I don't think this is a pNFS issue. The m-ds machine is hanging, consistently. It will hang even if I don't run the test. It isn't dropping into kmdb and is totally unresponsive on the console.

Either I didn't clone the original machine correctly or I'm running out of resources. I've had at least 3 machines running concurrently in the past, so I doubt it is resources. Also, the machines each are limited to 512M of memory.

I may play with this a bit more or try VirtualBox, which can be hosted under Solaris and OpenSolaris. I can run it on my w2100z. It now has a whopping 16G of RAM and should be able to handle plenty of virtual machines.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

20080320 Thursday March 20, 2008
auth records will not load if no share enabled

If you fail to share a zfs pool on the mds, then nfs will not be enabled and therefore the mds auth records will not load.

#  mdsadm -o add -t auth -a ip=10.1.233.117
adding: IP Addr - 10.1.233.117
Mar 20 20:37:41 pnfs-3-15 nfs: NFS Server not loaded
# zpool create -f nippy /dev/dsk/c1t0d0s7
#  echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k

mdb: failed to perform walk: unknown walk name

Above we see mdsadm clearly telling us we have an issue. And the fact that we can't use our simple mdb command to see the auth records is a worry.Just as I rebooted, I realized what the problem was. And we can check it out:

#  mdsadm -o add -t auth -a ip=10.1.233.117
adding: IP Addr - 10.1.233.117
Mar 20 20:42:54 pnfs-3-15 nfs: NFS Server not loaded
# echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
mdb: failed to perform walk: unknown walk name
# echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
mdb: failed to perform walk: unknown walk name
# zfs set sharenfs=rw,anon=0 nippy
#  echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k

We can see that once the share is enabled, we can walk the structures. We can redo the mdsadm command:

# mdsadm -o add -t auth -a ip=10.1.233.117
adding: IP Addr - 10.1.233.117
# echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k

What is wrong now? We need to enable the ds:

# dservadm enable

And:

# echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
{
    dbe = 0xffffff02f4156f08
    dev_addr = {
        na_r_netid = 0xffffff02f61bfc80 "tcp"
        na_r_addr = 0xffffff02efde8a88 "10.1.233.117.147.49"
    }
    dev_flags = 0x3
    dev_infop = 0xffffff02f4157f78
    dev_list_next = {
        list_next = 0xffffff02f4157fb8
        list_prev = 0xffffff02f4157fb8
    }
}

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

20080319 Wednesday March 19, 2008
Some behind the scenes tips for getting a community working

In no particular order...


20080306 Thursday March 06, 2008
sped not being called

So I have a new daemon and I have it loaded:

# ./sped
10, 16, 64000, 1647890964l uid == 501
20, 32, 2000, 1647890964l uid == 1066
30, 64, 1000, 1647890964l uid == 0
40, 2, 2000, 1647890964l subnet ==  192.168.2.0/24

And I have a door upcall which starts in do_rfs4_op_mknod(). A door is an IPC mechanism used to communicate from the kernel to user land. Also, since the sped hands out policies for open creation, this should be a good point to call it from. Notice the foreshadowing invoked by the word "should" and remember that foreshadowing is the sign of good journalism (I think this is a quote from Bloom County?).

Okay, it didn't work - the sped is not being called. I can whip out kmdb and struggle through trying to see if the door upcall is invoked. But first, we should do a sanity check to see if do_rfs4_op_mknod() is even being called:

# dtrace -m nfssrv > /tmp/dtrace.log
dtrace: description 'nfssrv' matched 2262 probes
^C# grep do /tmp/dtrace.log
 0  51547              mds_do_lookup:entry
 0  51548             mds_do_lookup:return
 0  52741         do_rfs4_op_getattr:entry
 0  52742        do_rfs4_op_getattr:return
 0  52379          mds_findopenowner:entry
 0  52380         mds_findopenowner:return
 0  51593            mds_do_opennull:entry
 0  52127          do_rfs4_set_attrs:entry
 0  52128         do_rfs4_set_attrs:return
 0  51777             vop_fid_pseudo:entry
 0  51778            vop_fid_pseudo:return
 0  52641           do_41_deleg_hack:entry
 0  52642          do_41_deleg_hack:return
 0  51591                mds_do_open:entry
 0  51592               mds_do_open:return
 0  51594           mds_do_opennull:return
 0  52741         do_rfs4_op_getattr:entry
 0  52742        do_rfs4_op_getattr:return
 0  52741         do_rfs4_op_getattr:entry
 0  52742        do_rfs4_op_getattr:return
 1  52127          do_rfs4_set_attrs:entry
 1  52128         do_rfs4_set_attrs:return
 1  51547              mds_do_lookup:entry
 1  51548             mds_do_lookup:return
 1  52741         do_rfs4_op_getattr:entry
 1  52742        do_rfs4_op_getattr:return 

Remember, the rfs component of the function name means that is in the nfssrv module and not the nfs module. Also, note that you won't see "mds_" names in code drops from Nevada. You will need to download the pNFS tree over at: OpenSolaris Project: NFS version 4.1 pNFS

Okay, it was never called. So either dtrace is hosed or something is going wrong. What is being called? mds_createfile()

 0  51593            mds_do_opennull:entry
 0  51987             mds_createfile:entry
 0  52647            nfsauth4_access:entry
 0  52377             nfsauth_access:entry
 0  52378            nfsauth_access:return
 0  52648           nfsauth4_access:return
 0  52399       nfs4_ntov_table_init:entry
 0  52400      nfs4_ntov_table_init:return 

I'm off to build a new nfssrv to see if making my upcall from mds_createfile() works.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

20080228 Thursday February 28, 2008
How to determine MDS, DS, and client

With the OpenSolaris pnfs binary drops, things are still pretty raw. (Hey, we could hold onto the code until it is fully baked!) A common problem is telling which machines plays which role - i.e., I've got a pNFS community set up, I've done some testing, and I've rebooted the machines. How do I remember which machine is which?

The dservadm command will help you. If it shows nothing, you have either a client or the MDS:

# dservadm listpools
zpools:

Okay, we find the DS:

# dservadm listpools
zpools:
    train
# uname -a
SunOS silent 5.11 pnfs-curr i86pc i386 i86pc

Now, how do I tell which of the remaining two are the MDS and client? Well, dservadm can tell me a bit more:

# dservadm listmds
mds:
    192.168.2.119.8.1
# host 192.168.2.119
119.2.168.192.in-addr.arpa domain name pointer stealth

Don't worry about the extra numbers at the end of the mds identifier - just realize the first four are octets. Okay, we can now start back up with our testing.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

20080131 Thursday January 31, 2008
Finishing the spedebugger

Sometimes you just get busy and can't recall where your time went. I just finished up the spedebugger. I wish I had blogged more about it, but it isn't going to happen - I need to turn it into a real spe daemon in two weeks time. The Austin Bake-A-Thon is coming up and I need a demo.

Anyway, here is the finished product, which still does not support IPv6: spedebug.tar.bz2

Besides reading the README.txt, here is a quick howto get it up and running:

> bzcat spedebug.tar.bz2 | tar xf -
> ./make.sh
spedebug.c:27: warning: ignoring #pragma ident
In file included from spedebug.c:49:
speadm.h:27: warning: ignoring #pragma ident
> ./spedebug tests/domain.txt
Looking for a matching server policy:
No matching policy, default would apply.
> ./spedebug -v tests/domain.txt
Here are the server policies:
1, 25, 34000, uid != 1066 && domain == centrl.sun.com
10, 25, 34000, uid == 1066 && weekday == sun
20, 25, 34000, domain == excfb.com
Looking for a matching server policy:
No matching policy, default would apply.
> ls -la
total 260
drwxr-xr-x   4 tdh      staff         10 Jan 30 12:49 .
drwxr-xr-x  12 tdh      staff         16 Jan 30 12:41 ..
-r--r--r--   1 tdh      staff       3583 Jan 30 12:38 README.txt
drwxr-xr-x   2 tdh      staff          5 Jan 30 12:46 SCCS
-rwxr-xr-x   1 tdh      staff         65 Jan 30 12:02 make.sh
-r--r--r--   1 tdh      staff       2641 Jan 30 12:38 speadm.h
-rwxr-xr-x   1 tdh      staff      44156 Jan 30 12:46 spedebug
-r--r--r--   1 tdh      staff      49826 Jan 30 12:46 spedebug.c
-rw-r--r--   1 tdh      staff      21778 Jan 30 12:48 spedebug.tar.bz2
drwxr-xr-x   2 tdh      staff         17 Jan 29 23:04 tests

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

20080122 Tuesday January 22, 2008
Exploring pulling apart a path

Given a directory path entry to a file, I want to find:

I can probably find a standard function to do this, but where is the fun in that?

You might notice I am being vague with the extension and basename, I'm giving the definition we know from common usage, not a precise one. We'll see why in a bit. And I am assuming we are passing in a file and not a directroy.

I've got the code sitting in the policy prototype, but I pulled it out to focus on and to do some fun unit testing. So, here is the code:

#include <stdio.h>
#include <stdarg.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

typedef struct {
	char	*ext;
	char	*path;
	char	*base;
	char	*file;
} policyAttributes_t;

int
main (int argc, char *argv[])
{
	policyAttributes_t	pat;

	int	rc = 0;

	char	*junk;
	char	*s;

	/*
	 * Do some initialization.
	 */
	memset(&pat, '\0', sizeof(pat));

	pat.path = strdup(argv[1]);
	if (pat.path == NULL) {
...
	}

	/*
	 * Now we need the file.
	 */
	s = strrchr(argv[1], '/');
	if (s) {
		junk = strdup(s+1);
		if (junk == NULL) {
...
		}

		pat.file = strdup(junk);
		if (pat.file == NULL) {
...
		}

		s = strrchr(junk, '.');
		if (s) {
			pat.ext = strdup(s+1);
			if (junk == NULL) {
...
			}

			*s = '\0';
		}

		pat.base = junk;
	}

cleanup:

	if (pat.path) {
		printf("path = <%s>\n", pat.path);
		free(pat.path);
	}

	if (pat.ext) {
		printf("ext  = <%s>\n", pat.ext);
		free(pat.ext);
	}

	if (pat.base) {
		printf("base = <%s>\n", pat.base);
		free(pat.base);
	}

	if (pat.file) {
		printf("file = <%s>\n", pat.file);
		free(pat.file);
	}

	return (rc);
}

You can see I pulled it out of the other code - I preserved the pat variable. Anyway, I was mostly worried about the pointer manipulation, but I shouldn't have:

% ./a.out /foo/bar.c
path = </foo/bar.c>
ext  = <c>
base = <bar>
file = <bar.c>
% ./a.out /foo/grabba/yabba/bar.c
path = </foo/grabba/yabba/bar.c>
ext  = <c>
base = <bar>
file = <bar.c>
% ./a.out /foo/grabba/yabba/bar
path = </foo/grabba/yabba/bar>
base = <bar>
file = <bar>

Now, in a traditional CIFS only world, everything would be hunky-dory. I could probably even do the 8+3 filename stuff. But we are wearing big boy pants now, so another test yields a result I do not agree with:

% ./a.out /foo/grabba/yabba/.cshrc
path = </foo/grabba/yabba/.cshrc>
ext  = <cshrc>
base = <>
file = <.cshrc>

I would argue that the base name is '.cshrc' and there is no extension.

Anyone have any different views?


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Subnet evaluation is working

I knew I was close to having a lot more work last night. I spent a little bit of time this morning getting the error handling done the way I wanted for the evaluation (does a return of FALSE mean an error or not to match?). I also spliced the network parsing into the command line. Still needs to be reworked a bit, a network address is not a subnet after all. But, it now handles evaluating addresses correctly:

% /a.out -r tests/hulk.txt -d 12 -u 1167 -a 192.168.3.211
4, 25, 34000, uid == 1066
5, 30, 32000, uid == 1067 || uid == 1065
6, 15, 2000, uid == 1068 && gid == 500
7, 40, 8000, gid == 500
9, 40, 8000, day == 10
15, 3, 15000, subnet ==  192.168.3.0/24
16, 3, 15000, subnet ==  10.10.20.0/24
25, 2, 48000, ip ==  192.168.2.211
1114, 30, 32000, !(uid == 1167 || uid == 1165)
1115, 40, 8000, !(day == 11)
The matching policy is: 15, 3, 15000, subnet ==  192.168.3.0/24
% ./a.out -r tests/hulk.txt -d 12 -u 1167 -a 192.168.2.211
...
The matching policy is: 25, 2, 48000, ip ==  192.168.2.211
% ./a.out -r tests/hulk.txt -d 12 -u 1167 -a 10.10.20.5
...
The matching policy is: 16, 3, 15000, subnet ==  10.10.20.0/24

Looks like I should have an option to dump the policies and by default not do it.

What I am struggling with now is how to differentiate between essentially testing this on the server and on any other machine. If it is the server, then I can state that the only command line parameters for attributes are those that I can pull out of a NFS request. I don't need anything else.

But if this is being run on a different computer, then I might need to provide some mock-up ability. E.g., I might want to set a netmask, a domain name, etc. In short, any of the secondary attributes.

I'm thinking of taking these off of the command line and putting them in a .speadm file. I want to focus on the command line looking like a NFS packet, but still allow for flexibility when debugging.

The latest code is at: speadm.c and speadm.h.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Simple policy evaluation is working

I'm not going to walk through the code right now. It is incomplete, mostly the command line parsing, and has XXXs all over the place. I pushed ahead just enough to get some simple integers working:

% cat tests/hulk.txt 
4, 25, 34k, uid == 1066
6, 40, 8k, gid == 500
6, 40, 8k, day == 10
% ./a.out -r tests/hulk.txt -d 10 -u 1066
4, 25, 34000 uid == 1066
6, 40, 8000 gid == 500
6, 40, 8000 day == 10
The matching policy is: 4, 25, 34000 uid == 1066
% ./a.out -r tests/hulk.txt -d 10 
4, 25, 34000 uid == 1066
6, 40, 8000 gid == 500
6, 40, 8000 day == 10
The matching policy is: 6, 40, 8000 day == 10

BTW: Notice what I meant about duplicate ids. I still need to fix that.

If you are trying the current code out, avoid the strings for right now. That would also include the networking tests.

What is in there now is the evaluation code, parsing of network and single machine addresses in the policies file, etc. I think I added 1k of code today and I can't figure out when. Well, 650 lines came in since the last blog entry.

The evaluation code stole the framework from the printing code. I'd love to abstract out the looping from the action to be performed. But, I think that may end up being more trouble than it is worth.

Hmm, I beefed up the tests just a bit and found that both '||' and '&&' are working:

% cat tests/hulk.txt 
4, 25, 34k, uid == 1066
5, 30, 32k, uid == 1067 || uid == 1065
6, 15, 2k, uid == 1068 && gid == 500
7, 40, 8k, gid == 500
9, 40, 8k, day == 10
% ./a.out -r tests/hulk.txt -u 1067
4, 25, 34000 uid == 1066
5, 30, 32000 uid == 1067 || uid == 1065
6, 15, 2000 uid == 1068 && gid == 500
7, 40, 8000 gid == 500
9, 40, 8000 day == 10
The matching policy is: 5, 30, 32000 uid == 1067 || uid == 1065
% ./a.out -r tests/hulk.txt -u 1065
4, 25, 34000 uid == 1066
5, 30, 32000 uid == 1067 || uid == 1065
6, 15, 2000 uid == 1068 && gid == 500
7, 40, 8000 gid == 500
9, 40, 8000 day == 10
The matching policy is: 5, 30, 32000 uid == 1067 || uid == 1065
% ./a.out -r tests/hulk.txt -u 1068 -g 500
4, 25, 34000 uid == 1066
5, 30, 32000 uid == 1067 || uid == 1065
6, 15, 2000 uid == 1068 && gid == 500
7, 40, 8000 gid == 500
9, 40, 8000 day == 10
The matching policy is: 6, 15, 2000 uid == 1068 && gid == 500
% ./a.out -r tests/hulk.txt -u 1068 -g 501
4, 25, 34000 uid == 1066
5, 30, 32000 uid == 1067 || uid == 1065
6, 15, 2000 uid == 1068 && gid == 500
7, 40, 8000 gid == 500
9, 40, 8000 day == 10
No matching policy, default would apply.

What about '!'?

% ./a.out -r tests/hulk.txt -d 12 -u 1167
4, 25, 34000, uid == 1066
5, 30, 32000, uid == 1067 || uid == 1065
6, 15, 2000, uid == 1068 && gid == 500
7, 40, 8000, gid == 500
9, 40, 8000, day == 10
10, 30, 32000, !(uid == 1167 || uid == 1165)
15, 40, 8000, !(day == 11)
The matching policy is: 10, 30, 32000, !(uid == 1167 || uid == 1165)

Looks wrong (notice I fixed a missing ',' before the policy attribute-expression). Hmm, here is the bug:

               bLHS = spe_eval_thunk(si->si_branches[0], pat, &sa);

                /*
                 * Lazy, but only 1 op - which is '!'.
                 */
                if (b == TRUE) {
                        b = FALSE;
                } else {
                        b = TRUE;
                }

That should have been a test on bLHS. Again, quick coding leads to bugs. With the fix:

% ./a.out -r tests/hulk.txt -d 12 -u 1167
4, 25, 34000, uid == 1066
5, 30, 32000, uid == 1067 || uid == 1065
6, 15, 2000, uid == 1068 && gid == 500
7, 40, 8000, gid == 500
9, 40, 8000, day == 10
10, 30, 32000, !(uid == 1167 || uid == 1165)
15, 40, 8000, !(day == 11)
The matching policy is: 15, 40, 8000, !(day == 11)

The latest code is at: speadm.c and speadm.h.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

20080121 Monday January 21, 2008
theShepler asks about inet_ntoa

theShepler asked me offline why I wasn't looking at inet_ntoa(3SOCKET) for Reversing a network address from an uint_t. His email was dry, but I think there was a "D'oh!" in there somewhere.

I had considered those routines, but I went with the stuff in mountd because for the standalone, we don't have sockets being attached. In the finished product, I'll be looking at this. Also, since I will need to support IPv6, I'll have more need to not be doing my own parsing. But for right now, I'm going to proceed without IPv6 and without inet_aton(3SOCKET).

Okay, I did two things with the little network parsing code:

And here is the code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * http://infolab.stanford.edu/~manku/bitcount/bitcount.html
 */
int
spe_bitcount (unsigned int n)     
{
        int count = 8 * sizeof(int) ;
        n ^= (unsigned int) -1 ;
        while (n) {
                count-- ;
                n &= (n - 1) ;
        }

        return (count);
}


int
main (int argc, char *argv[])
{
        char    *p = argv[1];
        char    *junk;
        uint_t  addr = 0, mask = 0;
        uint_t  rev;
        int     i;
        int     shift;
        int     t;
        char    sep = ' ';

        unsigned char   b = 0;

        for (i = 0; i < 4; i++) {
                t = strtol(p, &junk, 10);
                if (junk && junk[0] == '/' && i == 3) {
                        b = 1;
                } else if (junk && junk[0] != '.' && junk[0] != '\0') {
                        fprintf(stderr, "%s is wrong in the %d octet,"
                                " %s has non-digits as |%s|\n", argv[1],
                                i+1, p, junk);
                        exit (-1);
                } else if (t > 255 || t < 0) {
                        fprintf(stderr, "%s is wrong in the %d octet,"
                                " %d is out of range\n", argv[1],
                                i+1, t);
                        exit (-1);
                }

                addr |= t << ((3-i) * 8);
                p = strchr(p, '.');
                if (p == NULL)
                        break;
                p++;
        }

        if (b) {
                p = &junk[1];   /* advance over '/' */
                t = strtol(p, &junk, 10);

                if (junk && junk[0] != '\0') {
                        fprintf(stderr, "%s is wrong in the netmask,"
                                " %s has non-digits as |%s|\n", argv[1],
                                p, junk);
                        exit (-1);
                } else if (t < 0 || t > 32) {
                        fprintf(stderr, "%s is wrong in the netmask,"
                                " %d is out of range\n", argv[1], t);
                        exit (-1);
                }

                mask = t ? ~0 << ((sizeof (struct in_addr) * NBBY) - t) : 0;
        } else {
                if ((addr & 0x00ffffff) == 0)
                        mask = 0xff000000;
                else if ((addr & 0x0000ffff) == 0)
                        mask = 0xffff0000;
                else if ((addr & 0x000000ff) == 0)
                        mask = 0xffffff00;
                else
                        mask = 0xffffffff;
        }

        printf("%s yields %u (%u) and that reverse maps to ", argv[1],
                addr, mask);

        for (i = 0; i < 4; i++) {
                shift = ((3-i) * 8);

                rev = (addr & (0xff << shift)) >> shift;

                printf("%c%u", sep, rev);
                sep = '.';
        }

        sep = '(';
        t = 0;
        for (i = 0; i < 4; i++) {
                shift = ((3-i) * 8);

                rev = (mask & (0xff << shift)) >> shift;

                printf("%c%u", sep, rev);
                sep = '.';
        }

        t = spe_bitcount(mask);
        printf(" - %u)\n", t);

        return (0);
}

And some test runs:

% ./a.out 192.168.2.0/23
192.168.2.0/23 yields 3232236032 (4294966784) and that reverse maps to  192.168.2.0(255.255.254.0 - 23)
% ./a.out 192.168.2.0/24
192.168.2.0/24 yields 3232236032 (4294967040) and that reverse maps to  192.168.2.0(255.255.255.0 - 24)
% ./a.out 192.168.2.0/31
192.168.2.0/31 yields 3232236032 (4294967294) and that reverse maps to  192.168.2.0(255.255.255.254 - 31)
% ./a.out 192.168.2.0/32
192.168.2.0/32 yields 3232236032 (4294967295) and that reverse maps to  192.168.2.0(255.255.255.255 - 32)
% ./a.out 192.168.2.0/0
192.168.2.0/0 yields 3232236032 (0) and that reverse maps to  192.168.2.0(0.0.0.0 - 0)
% ./a.out 192.168.2.0/1
192.168.2.0/1 yields 3232236032 (2147483648) and that reverse maps to  192.168.2.0(128.0.0.0 - 1)
% ./a.out 192.168.2.0/8
192.168.2.0/8 yields 3232236032 (4278190080) and that reverse maps to  192.168.2.0(255.0.0.0 - 8)
% ./a.out 192.168.2.0/33
192.168.2.0/33 is wrong in the netmask, 33 is out of range

I just need to stuff this code into my prototype.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Wrapping up the debugging work

Okay, the amount of delta turns out to be small to getting working code. I spent more time thinking about things. I tried putting the '(' code handling into the RHS and struggled with what node do I attach it to? I understood why for the LHS case it went to si, but after I created some new storage, I could not figure out how I knew si already had 2 children allocated.

Consider (path == /db) && (ext == jpg). From just doing the '(' code from the LHS, I knew there was no way that I had a correct si to attach the LHS ((ext == jpg)). Which is when I looked for where I had the correct code, which was in the get_compound state. And once I realized that, I realized I could keep the get_lhs focused on just handling data. The changes I had to make were in the OP code:

        case ('|') :
                /* 
                 * Rewind so that the compound operator can handle this correctly.
                 */
                if (*expression-- != '|') {
                        fprintf(stderr,
...
                }
                
                goto get_compound;
        
        case ('&') :
                /* 
                 * Rewind so that the compound operator can handle this correctly.
                 */
                if (*expression-- != '&') {
..
                }
                
                goto get_compound;

This also means that I only have to worry about '!' and '(' in the get_lhs state. This code may not be what a generator would have produced and it may make me groan when I come back to it, but it works.

Anyway, the current set of tests pass:

1, 16, 64000 (path == /db)
2, 16, 64000  !(path == /db)
3, 16, 64000 path == /db && ext == jpg
4, 16, 64000 path == /db && ext == jpg || ext == jpeg
5, 16, 64000 (path == /db && ext == jpg) || ext == jpeg
6, 16, 64000 path == /db && (ext == jpg || ext == jpeg)
7, 16, 64000 path == /db
8, 16, 64000 (path == /db && ext == jpg) || (path != /db && ext == jpeg)
9, 16, 32000 (path == /db && ext == jpg)
10, 16, 32000  !(path == /db && ext == jpg)
% cat tests/parens.txt
1, 16, 64k, (path == /db)
2, 16, 64k, !(path == /db)
3, 16, 64k, path == /db && ext == jpg
4, 16, 64k, path == /db && ext == jpg || ext == jpeg
5, 16, 64k, (path == /db && ext == jpg) || ext == jpeg
6, 16, 64k, path == /db && (ext == jpg || ext == jpeg)
7, 16, 64k, path == /db
8, 16, 64k, (path == /db && ext == jpg) || (path != /db && ext == jpeg)
9, 16, 32k, (path == /db && ext == jpg)
10, 16, 32k, !(path == /db && ext == jpg)

I need to go back and strip out all of the debug printfs, fix the storage of data in the get_rhs state, finish off the command line processing, etc. All of which needs to be done before I write the evaluation code. And that will be the real test case. I also need to add a new command line option to just verify the input. I've basically done that with the attribute-expression parsing, but I'd like a mode which does not load into the "global" set of policies. Note, I am thinking about the final product here.

Also, the code assumes that all id's are unique. If I were to have multiple ids in my rules-files, they would just load. BTW: I need to start calling it my policies-file. I think what I want to do in the final product is treat the loading of a policies-file to be atomic. To that end, they would load into a local list (which now has to become a parameter) and a warning would be spit out if there were duplicate ids. Only when the policies are all loaded correctly would the local copy be merged with the global one. And at that point, there would be no warnings about duplicates.

The latest code is at: speadm.c and speadm.h.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Some more debugging work

Turns out test case 4 is also horked, so let us work on it first:

4, 16, 64000 path == /db && ext == jpg && ext == jpeg

versus the expected

4, 16, 64000 path == /db && ext == jpg && ext == jpeg

5 and 6 also have this problem... Pretty easy to find once we noticed it:

        case ('|') :
                if (*expression++ != '|') {
...
                }                

                sop = SPE_OP_AND;
                break;

        case ('&') :
                if (*expression++ != '&') {
...
                }

                sop = SPE_OP_AND;
                break;

...
        /*
         * Both children are interior nodes...!
         */
        si_new->si_op = sop;

Cutting and pasting when in a hurry or tired is going to lead to bugs. Fixed and working.

Back to 5 and 6:

5, 16, 64000 (path == /db) && ext == jpg || ext == jpeg
6, 16, 64000 path == /db && (ext == jpg) || ext == jpeg

I did the expressions out on paper and I think the basic mechanism may be correct here. It might be that the recording of the parentheses is the problem. I.e., they look to be in the correct area, but on the wrong interior node. Hard to explain what I mean, let me see if I can find the bug and illustrate what I suspect.

First I'll add two new test cases:

8, 16, 64k, (path == /db && ext == jpg) || (path != /db && ext == jpeg)
9, 16, 32k, (path == /db && ext == jpg)

And look at the results:

8, 16, 64000 (path == /db) && ext == jpg || (path != /db) && ext == jpeg
9, 16, 32000 (path == /db) && ext == jpg

So my parentheses handling only works for the simple case of !(path == /db). It does not work for compound operators at all. As a matter of fact, a new test case suggests itself:

10, 16, 32k, !(path == /db && ext == jpg)

And we get:

10, 16, 32000  !(path == /db) && ext == jpg

I think what we should be doing is that when we encounter a '(' we recursively parse the subexpression and when that comes back, we throw the parentheses over the root of the subtree. And that reminds me, we are only handling match detecting for when we hit '!'.

We can test this claim pretty quickly as the '!' almost does this exact algorithm. A quick test shows with this:

                if (!(si->si_branches[0].st_node =
                    spe_parse_attribute_expression(si->si_branches[0].st_node, sub_expr,
                    pf, piLine, prc))) {
                        goto cleanup; 
                }
        
                {
                        spe_interior_t  *sip;
                
                        if (si->si_branches[0].st_is_interior == FALSE) {
...
                        }
                
                        sip = (spe_interior_t *)si->si_branches[0].st_node;
                        sip->si_parens = TRUE;
                }

bears fruit:

2, 16, 64000  !(path == /db)
10, 16, 32000  !(path == /db && ext == jpg)

What I'm trying to get at is that I believe the parse trees are correct, but it is just the annotation on them to produce parentheses which is incorrect. I.e., they would evaluate file creations correctly. And this is a very dangerous claim to make - it may simply be that my test cases support it and new test cases would tear it apart. I'm not going to investigate this angle, I have other fish to fry.

And while this latest change appears to work, what I really need to do is push the parentheses handling out of the '!' case and make it a first class citizen.

The big difference here is that we need to consume the parentheses first when getting the sub-expression:

                t = expression;
                iLen = strlen(t) + 1;
                b = FALSE;

                iParens = 1;
                for (i = 0; i < iLen - 1; i++) {

If we don't consume it, then we will never have a base case to halt recursion. The new code, works, kinda. I actually was predicting the failure to myself.

spe_parse_attribute_expression: EOS when parsing operator for Line 10 in 
1, 16, 64000 (path == /db)
2, 16, 64000  !(path == /db)
3, 16, 64000 path == /db && ext == jpg
4, 16, 64000 path == /db && ext == jpg || ext == jpeg
5, 16, 64000 (path == /db && ext == jpg)
6, 16, 64000 path == /db && (ext == jpg || ext == jpeg)
7, 16, 64000 path == /db
8, 16, 64000 (path == /db && ext == jpg)
9, 16, 32000 (path == /db && ext == jpg)
10, 16, 32000  !(path == /db && ext == jpg)

Note the warning I've included. The problem is the the code was structured to handle LHS OP RHS, where the allowed ops were '==' and '!='. This worked when we had the latest parentheses bug. You can see it fail in policies 5 and 8 where the op is '||'.

We need to allow the ops '||' and '&&' to now appear in the OP slot. We also need to allow a '(' to appear in the RHS slot.

I'm trying to decide if the code is about to become too kludgey or not. I think I put too much emphasis in trying to parse it without recursion. You might have noticed my hesitation about some things earlier (like the labels).

I think the best thing is to continue down the path of finishing the prototype. Then I can decide if I need to rewrite it for the final product.

I'm going to cut this entry off here - this is a big enough of an issue. The final code is at: speadm.c and speadm.h.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
Debugging the print routine

I quickly wrote a recursive print routine for an attribute-expression and also stuffed all RHS data into the string data type. Here is the resulting code:

char *spe_ops_list[] = {
	"&&",
	"||",
	"!",
	"==",
	"!=",
	NULL
};

...

void
spe_print_leaf (spe_leaf_t *sl)
{
	if (sl->sl_is_attr == TRUE) {
	} else {
		switch (sl->sl_type) {
		case (SPE_DATA_STRING) :
			fprintf(stdout, "%s", sl->sl_data.sz);
			break;
...
		default :
			fprintf(stderr,
			    "spe_print_leaf: Unknown type %d\n", sl->sl_type);
			break;
		}
	}
}

void
spe_print_thunk (spe_thunk_t st)
{
	if (st.st_is_interior == TRUE) {
		spe_print_attribute((spe_interior_t *)st.st_node);
	} else {
		spe_print_leaf((spe_leaf_t *)st.st_node);
	}
}

void
spe_print_attribute (spe_interior_t *si)
{
	int	i;

	if (!si) {
		return;
	}

	if (si->si_children == 1) {
		fprintf(stdout, " %s", spe_ops_list[si->si_op]);
		spe_print_thunk(si->si_branches[0]);
	} else {
		if (si->si_parens) {
			fprintf(stdout, "(");
		}

		spe_print_thunk(si->si_branches[0]);
		fprintf(stdout, " %s ", spe_ops_list[si->si_op]);
		spe_print_thunk(si->si_branches[1]);

		if (si->si_parens) {
			fprintf(stdout, ")");
		}
	}
}

void
spe_print_policy (spe_policy_t *spe)
{
	if (!spe) {
		return;
	}

	fprintf(stdout, "%u, %u, %u ", spe->sp_id, spe->sp_stripe_count,
	    spe->sp_interlace);

	spe_print_attribute(spe->sp_attr_expr);

	fprintf(stdout, "\n");
}

void
spe_print_policy_list (void)
{
	spe_policy_t	*spe;

	for (spe = spe_policy_list; spe; spe = spe->next) {
		spe_print_policy(spe);
	}
}

/*
 * When we parse an attribute-expression...
 */
spe_interior_t *
spe_parse_attribute_expression (spe_interior_t *si, char *expression, FILE *pf,
    int *piLine, int *prc)
{
...
		/*
		 * The sub_expr is basically the tested.
		 */
		printf("RHS is %s\n", sub_expr);

		/*
		 * XXX: Hack to get to debugging!
		 */
		sl = (spe_leaf_t *)si->si_branches[0].st_node;
		sl->sl_is_attr = FALSE;
		sl->sl_type = SPE_DATA_STRING;
		sl->sl_data.sz = sub_expr;

		if (sub_expr) {
#if 0
			free(sub_expr);
#endif
			sub_expr = NULL;
		}
	}

And here is some output:

% cat !$
cat tests/parens.txt
1, 16, 64k, (path == /db)
2, 16, 64k, !(path == /db)
3, 16, 64k, path == /db && ext == jpg
4, 16, 64k, path == /db && ext == jpg || ext == jpeg
5, 16, 64k, (path == /db && ext == jpg) || ext == jpeg
6, 16, 64k, path == /db && (ext == jpg || ext == jpeg)
% ./a.out -r tests/parens.txt 
1, 16, 64000 (/db == (null))
2, 16, 64000  !(/db == (null))
3, 16, 64000 jpg == (null)
4, 16, 64000 jpeg == (null)
5, 16, 64000 jpeg == (null)
6, 16, 64000 jpeg == (null)

I've already spotted a bug in spe_print_leaf(), a quick fix of:

void
spe_print_leaf (spe_leaf_t *sl)
{
        int     i;
        
        if (!sl) {
                return;
        }       
        
        if (sl->sl_is_attr == TRUE) {
                for (i = 0; i < attribute_count; i++) {
                        if (spe_attribute_table[i].at_attr == sl->sl_attr) {
                                fprintf(stdout, "%s", spe_attribute_table[i].at_name);
                        }       
                }       
        } else {

Hmm, sl_attr could come out of the table, the index should suffice. But that is for later. What results do we get?

1, 16, 64000 (/db == (null))
2, 16, 64000  !(/db == (null))
3, 16, 64000 jpg == (null)
4, 16, 64000 jpeg == (null)
5, 16, 64000 jpeg == (null)
6, 16, 64000 jpeg == (null)

I'm actually quite happy with that, I didn't think that should have been the issue. Okay, what do we know? Well, first off, the attribute expression looks backwards. Hmm, the parentheses in 1 and the negation in 2 appear correct.

A little exploration in gdb shows that the first node is storing the path:

(gdb) print si
$1 = (spe_interior_t *) 0x100160
(gdb) print *si
$2 = {
  si_op = SPE_OP_EQUAL, 
  si_parens = 1 '\001', 
  si_children = 2, 
  si_branches = 0x100170
}
(gdb) print si->si_branches[0]
$3 = {
  st_is_interior = 0 '\0', 
  st_node = 0x100180
}
(gdb) print (spe_leaf_t *)si->si_branches[0].st_node
$4 = (spe_leaf_t *) 0x100180
(gdb) print *(spe_leaf_t *)si->si_branches[0].st_node
$5 = {
  sl_is_attr = 0 '\0', 
  sl_attr = SPE_ATTR_PATH, 
  sl_type = SPE_DATA_STRING, 
  sl_data = {
    uid = 1049088, 
    gid = 1049088, 
    i = 1049088, 
    sz = 0x100200 "/db", 
    net = {
      sn_netname = 0x100200 "/db", 
      sn_network = 0, 
      sn_netmask = 0
    }
  }
}

Okay, the bug is in the quick hack to get printing to work, wrong index here:

		/*
		 * XXX: Hack to get to debugging!
		 */
		sl = (spe_leaf_t *)si->si_branches[0].st_node;

Fix it and:

1, 16, 64000 (path == /db)
2, 16, 64000  !(path == /db)
3, 16, 64000 ext == jpg
4, 16, 64000 ext == jpeg
5, 16, 64000 ext == jpeg
6, 16, 64000 ext == jpeg
7, 16, 64000 path == /db

Great, I added a new simple test 7 and it with the first 2 tests are working correctly. As thought though, the more complex parentheses tests are failing. Just not the way I expected. Sheesh! By the way, I'm leaking memory like crazy as well.

Walking through the code, I think the bug is here, where I only allocate 1 branch and not 2:

        /*
         * Both children are interior nodes...!
         */
        si_new->si_op = sop;
        si_new->si_children = 2;
        si_new->si_branches = (spe_thunk_t *)calloc(1, sizeof(spe_thunk_t));

Crap, my fix was this:

        /*
         * Both children are interior nodes...!
         */
        si_new->si_op = sop;
        si_new->si_children = 2;
        si_new->si_branches = (spe_thunk_t *)calloc(si_new->si_children, sizeof(spe_thunk_t));

And in applying it everywhere I did the calloc, I found where I must have cut and paste the bug from:

                /*
                 * We do not yet know what the operator is...
                 */
                si->si_children = 2;
                si->si_branches = (spe_thunk_t *)calloc(1, sizeof(spe_thunk_t));

While nasty, those are not the bug which is killing me. I think it is this code:

	/*
	 * The first child is the previous root of the parse tree.
	 */
	si_new->si_branches[0].st_is_interior = TRUE;
	si_new->si_branches[0].st_node = si;

	/*
	 * This is temporary until we get the second child allocated.
	 * This is done to allow proper cleanup in the case of a
	 * memory allocation when getting the second child.
	 */
	si = si_new;

	si_new->si_branches[1].st_is_interior = TRUE;
	si_new->si_branches[1].st_node = (spe_interior_t *)calloc(1,
	    sizeof(spe_interior_t));
	if (!si_new->si_branches[1].st_node) {
		fprintf(stderr, "spe_parse_attribute_expression:"
		    " Not enough memory\n");
		*prc = ENOMEM;
		goto cleanup;
	}

	/*
	 * Now we get the correct subtree to fill out.
	 */
	si = (spe_interior_t *)si_new->si_branches[1].st_node;

	/*
	 * Finally, we realize that we have more input to process.
	 */
	szError = expression;
	goto get_lhs;

The problem is that what I want to be returning (si_new) and what I want to be working on (si) are not the same. This may be where not using recursion is really going to bite me! Hmm, is it as simple as making a recursive call right here?

Yes, that is a good thing:

	/*
	 * The first child is the previous root of the parse tree.
	 */
	si_new->si_branches[0].st_is_interior = TRUE;
	si_new->si_branches[0].st_node = si;

	/*
	 * We hoist the new node into place.
	 */
	si = si_new;

	si_new->si_branches[1].st_is_interior = TRUE;
	si_new->si_branches[1].st_node = (spe_interior_t *)calloc(1,
	    sizeof(spe_interior_t));
	if (!si_new->si_branches[1].st_node) {
		fprintf(stderr, "spe_parse_attribute_expression:"
		    " Not enough memory\n");
		*prc = ENOMEM;
		goto cleanup;
	}

	/*
	 * Handle the rest by recursion.
	 */
	if (!(si_new->si_branches[1].st_node =
	    spe_parse_attribute_expression(si_new->si_branches[1].st_node, expression,
	    pf, piLine, prc))) {
		goto cleanup;
	}

	/*
	 * XXX: Do we want to check to make sure we've exhausted input? And how?
	 */

cleanup:

Which yields:

1, 16, 64000 (path == /db)
2, 16, 64000  !(path == /db)
3, 16, 64000 path == /db && ext == jpg
4, 16, 64000 path == /db && ext == jpg && ext == jpeg
5, 16, 64000 (path == /db) && ext == jpg && ext == jpeg
6, 16, 64000 path == /db && (ext == jpg) && ext == jpeg
7, 16, 64000 path == /db

Only 5 and 6 are horked. But those are the ones I suspected from the start would be that way.

And I'm okay with that for right now. I'm getting tired and I almost zapped my source while making a backup! Here is the current speadm.c and speadm.h.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

20080120 Sunday January 20, 2008
And presto, more code is done

Okay, I had to grind out some code and do a bunch of unit testing. I couldn't chart my journey, I had to really focus on what was going on. If I wrote about what I was doing as I did it, it would have tripled the development time, and I'm already on a tight self-imposed deadline.

Anyway, I have the attribute-expression parsing "done". By that I mean everything is mostly loaded in memory (except the data from the RHS), but I don't have a clue as to whether it is correctly loaded in memory. I have a sneaky suspicion that if I had the policy:

6, 16, 64k, path == /db && (ext == jpg || ext == jpeg)

Then if printed, it would be incorrect. I.e., I suspect my parentheses handling is very bogus.

It is getting to the point where my deltas between posts is getting huge. The raw source is at speadm.c and speadm.h. I'm going to version these so that you get to see changes. Note that the CDDL has been applied. :->

What I do want to do is go over the ~400 lines of parsing code that I have and annotate parts of it. I'm not happy with some things and others have tripped me up.

The first thing which I tried to do was have a simple while loop processing the input string. That wasn't going to work - I need to be parsing out LHS OP RHS and repeating when I find a compound. I know you are not supposed to use goto with a target before the goto, but this seemed the natural way to express that I was processing a token. These labels form a state table and you will see me jumping backwards at least once from the end of the RHS to the start of an OP.

        /*
         * When processing the LHS, we know that the only recursion we have
         * to face is when we encounter a '!'. Otherwise, it has to be the case
         * that we are eventually consuming an attribute followed by an
         * operation.
         */
get_lhs:
        switch (*expression++) {
        case (' ') :
        case ('\t') :
                goto get_lhs;

        /*
         * Whoa, we did not expect this, did we?
         */
        case ('\0') :
...
                goto cleanup;

And I'll look at the ! handling again, you can contrast it against the earlier post on Parsing the ! operator:

        case ('(') :
                si->si_parens = TRUE;
                goto get_lhs;

        case ('!') :
                /*
                 * Scan ahead and make sure the next token is '('.
                 */
                t = expression;
                iLen = strlen(t) + 1;
                b = FALSE;

We need to both make sure that we find the start token for the parentheses:

                for (i = 0; i < iLen - 1; i++) {
                        if (t[i] == ' ' || t[i] == '\t') {
                                continue;
                        } else if (t[i] == '(') {
                                b = TRUE;
                                break;
                        }

                        break;
                }

But we have to scan ahead to make sure that we find a balanced ending token. I tripped up here by starting on the '(' and counting it twice.


                /*
                 * Okay, now we need to scan ahead and find the end ')'.
                 * Note that we must be sitting on a '(' from above, so we
                 * will need to advance past it.
                 */
                b = FALSE;
                iParens = 0;
                for (; i < iLen - 1; i++) {
                        if (t[i] == '(') {
                                iParens++;
                        } else if (t[i] == ')') {
                                iParens--;
                                if (iParens == 0) {
                                        b = TRUE;
                                        sub_expr = (char *)calloc(i+i, 1);
                                        if (!sub_expr) {
...
                                        }
                                        strncpy(sub_expr, expression, i+1);
                                        expression = &t[i+1];
                                        break;
                                }
                        }
                }

I know this part is how I want it. I send in the new expression to parse and attach it below the operator for the !.

                if (!(si->si_branches[0].st_node =
                    spe_parse_attribute_expression(si->si_branches[0].st_node, sub_expr,
                    pf, piLine, prc))) {
                        goto cleanup;
                }

And we have a case to end the recursion. I think I caught this one before I hit it.

                /*
                 * Note that we can be done with this attribute-expression
                 * if it were "!(path == /db)".
                 */
                if (*expression == '\0') {
                        /*
                         * We could not have changed the parent interior node,
                         * so leave it alone.
                         */
                        goto cleanup;
                }

The next thing I want to cover is the handling for what happens if you process an attribute-expression and discover you have more:

3, 16, 64k, path == /db && ext == jpg

I.e., after we handle /db, we are done with the RHS of an attribute-expression. But we know that if we have anymore text, then we must already be in the middle of another attribute-expression:

get_compound:
        
        /*
         * We are done processing the RHS. That means if there is anything
         * left, then we are part of a compound statement. The first thing
         * in the remaining expression better be either && or ||.
         */     
        switch (*expression++) {
        case (' ') :
        case ('\t') :
                goto get_compound;

Note: I invariably caused a cut and paste bug with the above, it would go to get_lhs and the input would be really screwed up. I don't like this use of labels, but I find it very elegant and I like avoiding as many function calls here as possible.

                
        case ('\0') :
                /*
                 * Not an error, done with input. 
                 */     
                goto cleanup;   
                                
        case ('|') :            
                if (*expression++ != '|') {
...                             
                }               
                                
                sop = SPE_OP_AND;
                break;

        case ('&') :
                if (*expression++ != '&') {
...
                }

                sop = SPE_OP_AND;
                break;

        default :
...
                goto cleanup;
        }

Now that we have verified that we have a valid compound operator, we need to glue it into the tree.

Another huge difference from earlier postings is that you will start to see comments appearing in the code. This isn't just for reviewers, I need it to understand what my intent was. I also use it as times as a prod for come back here later and see if a change is needed.

        /*
         * Now we know we have a valid operation. So we need to
         * allocate a new internal node and place it above the
         * existing one.
         */
        si_new = (spe_interior_t *)calloc(1, sizeof(*si));
        if (!si_new) {
...     
        }
        
        /* 
         * Both children are interior nodes...!
         */
        si_new->si_op = sop;
        si_new->si_children = 2;
        si_new->si_branches = (spe_thunk_t *)calloc(1, sizeof(spe_thunk_t));
        if (!si->si_branches) {
...     
        }
        
        /* 
         * The first child is the previous root of the parse tree.
         */
        si_new->si_branches[0].st_is_interior = TRUE;
        si_new->si_branches[0].st_node = si;
        
        /* 
         * This is temporary until we get the second child allocated.
         * This is done to allow proper cleanup in the case of a
         * memory allocation when getting the second child.
         */
        si = si_new;
        
        si_new->si_branches[1].st_is_interior = TRUE;
        si_new->si_branches[1].st_node = (spe_interior_t *)calloc(1,
            sizeof(spe_interior_t));
        if (!si_new->si_branches[1].st_node) {
...     
        }
        
        /* 
         * Now we get the correct subtree to fill out.
         */
        si = (spe_interior_t *)si_new->si_branches[1].st_node;

And here is where I use the labels as a state transition:

        
        /* 
         * Finally, we realize that we have more input to process.
         */
        szError = expression;
        goto get_lhs;

To recap, I'm at the point where I need to add the RHS data into the parse trees, which means all of the validity checking. Hmm, I could put it in as all strings for right now. :->

I also have to write a real routine to print out my parse trees. It is only by comparing the input versus the output that I will see if I am constructing the trees correctly.


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

Copyright (C) 2007, Kool Aid Served Daily