This is already outdated, but here is the latest high level design note for the spe -- spe: And the design thereof.
And here are the notes based on the implementation of the new design -- spe: A first implementation. It too is a a bit dated, but mainly because the implementation is not finished. In particular, getting the guids in the /etc/policies.spe is not implemented. As you might have noticed in the earlier article about XDR encoding (An XDR example from scratch), I am going to implement /etc/npools.spe
The evolution of the design to the implementation and the mutation thereof is quite interesting. The group is in a state of flux. We aren't quite sure if we are going to use pnfsadm and I want to integrate the spe back into our dev gate. As it is a dev gate and not the nightly ON gate, I can get away with adding two files, /etc/policies.spe and /etc/npool.spe. These will be gone by the time the dev gate integrates back with ON.
By using them, I can get the core functionality of the spe tested out. I care that the XDR, nfssys() call, and the policy evaluation are all exercised.
My remaining task list for this initial integration is:
The real big item is the extraction of the guids from the data servers. To prove that I have the right values, I'm going to have to write an observability tool to dump them out. I'm also going to have to change the REPORTAVAIL code to also provide the paths. I.e., we want to encode storage pools as:
npool1: ds1.nfs41.sun.com:/dpool1 ds2.nfs41.sun.com:/dpool1 ds2.nfs41.sun.com:/dpool2
and not:
npool1: 35865356092523570 75365456092253973 75365456092254015
For one thing, it is probably easy to automate a regression test suite that always uses the same storage paths, but the underlying guids change every time the regression testing is started (because the machines are reinstalled from scratch).
For another, it is easier for us to spot mistakes with the strings.
Oh, and I forgot a last item on my task list, update these documents! :->
I used OmniGraffle to bring to life a data structure for a blog entry -- A little more about eliminating tail recursion.
I find it easy to use and I didn't mind paying for it. I've used xfig, dot, etc. and eventually grouping and ease of use does them all in for me.
Grouping the boxes and locking them down was simple.
I was talking with Mike Eisler in the hallway about my experience with rpcgen and how it did eliminate tail recursion. He then asked if it only did it for the last element in the structure or for any?
Well, we can easily find out:
struct npool {
struct npool *next;
int i;
};
program NPOOL_DESTROYER {
version FIRVERS {
npool_policy_gram
NPOOLTRANSFER(ttype) = 1;
} = 1;
} = 1000334;
Yields:
bool_t
xdr_npool(xdrs, objp)
XDR *xdrs;
npool *objp;
{
rpc_inline_t *buf;
if (!xdr_pointer(xdrs, (char **)&objp->next,
sizeof (npool), (xdrproc_t) xdr_npool))
return (FALSE);
if (!xdr_int(xdrs, &objp->i))
return (FALSE);
return (TRUE);
}
So no elimination of tail recursion.
Why is it difficult if we are not the last element? Consider:
struct stringlist {
string name;
struct stringlist *next;
};
struct npool {
struct stringlist *dses;
struct npool *prev;
struct npool *next;
};
Are we supposed to eliminate tail recursion on prev or next? Remember, rpcgen can't comprehend what the field names mean. Hmm, I'd probably handcraft the prev field because I'd only want the pointer value and no recursion.
But also consider this one:
struct npool {
struct stringlist *dses;
struct npool *child;
struct npool *sibling;
};
Note that the blue arrows are for children and the red arrows are for siblings.
This is actually the data structure I used to store ephemeral trees in the mirror mount implementation, a forest of trees. Unlike the previous example, we do care to visit the additional pointer. But how does rpcgen know this?
The answer is probably that we will need some recursion, but the trick is to tail eliminate the one we expect to be deepest. I.e., that is the one likely to blow the stack. Note that with this example, if you keep a back pointer, you should be able to eliminate both recursions. And by this, I mean in C code and not standard XDR code. I.e., rpcgen wouldn't generate what I'm about to describe.
You need the back pointer to save the state that would normally go on the stack. I.e., if I'm on home and I visit thomas, I can do this recursively like this:
evaluate_node(np->child);
evaluate_node(np->sibling);
And because we are on the stack, we "remember" we just executed the call to evaluate the child and not the sibling.
If you look at usr/src/uts/common/fs/nfs/nfs4_stub_vnops.c, you'll see code like this:
2140 while (e) {
2141 if (e->ne_state == NFS4_EPHEMERAL_VISIT_CHILD) {
2142 e->ne_state = NFS4_EPHEMERAL_VISIT_SIBLING;
2143 if (e->ne_child) {
2144 e = e->ne_child;
2145 e->ne_state =
2146 NFS4_EPHEMERAL_VISIT_CHILD;
2147 }
2148
2149 continue;
2150 } else if (e->ne_state ==
2151 NFS4_EPHEMERAL_VISIT_SIBLING) {
2152 e->ne_state = NFS4_EPHEMERAL_PROCESS_ME;
2153 if (e->ne_peer) {
2154 e = e->ne_peer;
2155 e->ne_state =
2156 NFS4_EPHEMERAL_VISIT_CHILD;
2157 }
2158
2159 continue;
2160 } else if (e->ne_state ==
2161 NFS4_EPHEMERAL_CHILD_ERROR) {
2162 prior = e->ne_prior;
I use ne_state to say what we are doing to this current node and traverse down the tree until we reach a leaf node. At that point I operate on any of its siblings. When I finally reach a node which has neither any children or siblings, I operate on it. And then I start backing up the tree. Since a node can be pointed to either by a parent or sibling (and this can change as we add or delete nodes), we need to know who is pointing to it. We can't store that in a local variable, because we would need a stack. We store it in the node.
I need to create a npool file for the spe. It will be a simple file with the format:
npool-name list of data storage units
And I know I will have to pass that along with XDR. So, this time, before I code the C data structures, I thought I'd go ahead and use rpcgen from the start.
Right off the bat, I can see that I will have two recursive lists going on. I have an unlimited number of npools and each can have an unlimited number of storage units. So I am interested in seeing if rpcgen can handle both of these at once. Or if I will have to play some tricks with it.
const NPOOL_MAX_NPOOLS = 32;
const NPOOL_MAX_DSES = 32;
struct npool {
string dses;
struct npool *next;
};
program NPOOL_DESTROYER {
version FIRVERS {
npool_policy_gram
NPOOLTRANSFER(ttype) = 1;
} = 1;
} = 1000334;
Notice how I am using a string instead of a NULL terminated C representation of char *. Also, I'm pretty sure this does not capture an array of strings. Let's check:
[th199096@warlock spe]> rpcgen npool.x [th199096@warlock spe]> more npool.h /* * Please do not edit this file. * It was generated using rpcgen. */ #ifndef _NPOOL_H_RPCGEN #define _NPOOL_H_RPCGEN #include#define NPOOL_MAX_NPOOLS 32 #define NPOOL_MAX_DSES 32 struct npool { char *dses; struct npool *next; };
Yes, it doesn't do what I want. My first hack doesn't compile:
struct npool {
string dses;
struct npool *next;
};
But I think this will:
const NPOOL_MAX_NPOOLS = 32;
const NPOOL_MAX_DSES = 32;
struct stringlist {
string name;
struct stringlist *next;
};
struct npool {
struct stringlist *dses;
struct npool *next;
};
program NPOOL_DESTROYER {
version FIRVERS {
npool_policy_gram
NPOOLTRANSFER(ttype) = 1;
} = 1;
} = 1000334;
And yes, yes it does:
struct stringlist {
char *name;
struct stringlist *next;
};
typedef struct stringlist stringlist;
struct npool {
struct stringlist *dses;
struct npool *next;
};
typedef struct npool npool;
And by now, I expect that the code will handle tail recursion in both data structures. You can run rpcgen on it yourself to check out the resulting code.