I was trying to relax and I realized we would have an ongoing problem in keeping the new ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate in sync with our copy of the closed binaries. But, I think we will be saved by a couple of things:
Plus, with the mail set to go out to the dev mailing list, people will be able to see when they need to pick up a new set of closed binaries.
[thud@adept src]> hg incoming
comparing with ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
searching for changes
changeset:   7743:c672b1cb86be
tag:         tip
user:        Thomas Haynes
date:        Thu Oct 02 22:28:30 2008 -0500
summary:     Added tag closedv1 for changeset 9fab48a31a4a
The group has so much to do and it feels like so little time to do it. I don't think anyone just codes. I'm looking at my action list and it is all over the place:
[thud@adept src]> hg incoming
comparing with ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
searching for changes
changeset:   7742:9fab48a31a4a
tag:         tip
user:        Thomas Haynes
date:        Thu Oct 02 21:19:03 2008 -0500
summary:     Test of push to osol

[thud@adept src]> hg pull -u
pulling from ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
We just opened up a new Mercurial gate of NFSv41 on OpenSolaris.org. Eventually it will automatically push changes as they occur to our gate. I also need to figure out a way to automatically update the closed-bins.
The hardest part was figuring out the naming convention. Some links of interest are: Some work on libMicro; Mercurial transition notes; and finally How to Use Mercurial (hg) Repositories (look for the section For Project Leads: How to set up a Mercurial repository).
Update: Also, SCMVolunteers, look for Setting up a new (Mercurial) Project repository on OpenSolaris.org.
In any event, you can grab a copy of the source at:
hg clone ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
Note the lack of a double '/' after the FQDN - normally I would take that as a sign of a bug with Mercurial.
Note that while this compiles, you can't run it without a corresponding closed-bins.
Eventually, you should be able to browse the source via Cross Reference: nfs41-gate.
And a big thanks to David Marker for providing the help necessary to get this live!
I wanted to blog about this (see I Left Out One Detail from the September 2008 pNFS Bake-A-Thon Report) because of the hard work of a friend of mine, Pranoop Erasani. I couldn't because of the NDA in place.
I actually don't know any details about their server implementation, but I do know he was quite proud of the work he did leading up to the BakeAThon - I saw him later in the Austin airport and he was beaming with pride.
I'll pick on Mike now, he states:
Congratulations to NetApp's Pranoop Erasani, who is leading our Data ONTAP pNFS server project and the rest of the Data ONTAP NFS team.
The first time I read this, I thought Mike was just pimping out Pranoop, the leader of both the pNFS server project and the Data ONTAP NFS team. It took me a couple of tries to realize that Pranoop hadn't been promoted again, and instead Mike was pimping out both Pranoop and the rest of the Data ONTAP NFS team.
For spe, I need to understand the database stuff in usr/src/uts/common/fs/nfs/nfs4_db.c. Note that this code is not part of the NFSv4 spec; it is a Sun implementation detail. Specifically, I need to know how to create two indexes on the same set of data. I could create two tables, but I think having a second copy is overkill. Also, it would be ugly to make sure that they stayed in sync.
The issue is that I need to keep track of the mapping from data source name to guuid, whereas the existing code only needs to be aware of the guuid.
I've gone through this code in the past, but I've forgotten most of what I learned. So I'm going to walk through it and annotate it here. I'll provide quick links which should stay relevant as the code changes.
We can see the server start to use the database code in rfs4_state_init:
/* Create the overall database to hold all server state */
rfs4_server_state = rfs4_database_create(rfs4_database_debug);
It then creates some tables and indexes:
/* Now create the individual tables */
rfs4_client_cache_time *= rfs4_lease_time;
rfs4_client_tab = rfs4_table_create(rfs4_server_state,
    "Client",
    rfs4_client_cache_time,
    2,
    rfs4_client_create,
    rfs4_client_destroy,
    rfs4_client_expiry,
    sizeof (rfs4_client_t),
    TABSIZE,
    MAXTABSZ/8, 100);
rfs4_nfsclnt_idx = rfs4_index_create(rfs4_client_tab,
    "nfs_client_id4", nfsclnt_hash,
    nfsclnt_compare, nfsclnt_mkkey,
    TRUE);
rfs4_clientid_idx = rfs4_index_create(rfs4_client_tab,
    "client_id", clientid_hash,
    clientid_compare, clientid_mkkey,
    FALSE);
Looks like I now know I can create two different indexes on the same table.
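To make the idea concrete for myself, here is a minimal user-space sketch of one table with two indexes over the same entries. It is not the rfs4 database code: every name in it (toy_table_t, toy_index_t, the by_name and by_guid indexes, and so on) is made up for illustration, and a linear scan stands in for the real hashing. The point is only that both indexes hold pointers to a single copy of each entry, so nothing has to be kept in sync.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8

/* One stored entry; both indexes point at this same storage. */
typedef struct toy_entry {
    const char *name;     /* hypothetical "data source name" key */
    unsigned long guid;   /* hypothetical numeric id key */
} toy_entry_t;

/* An index is a compare callback plus its own array of entry pointers. */
typedef struct toy_index {
    int (*compare)(const toy_entry_t *entry, const void *key);
    toy_entry_t *slots[TOY_MAX_ENTRIES];
    size_t count;
} toy_index_t;

/* The table owns the only copy of the data and two indexes over it. */
typedef struct toy_table {
    toy_entry_t entries[TOY_MAX_ENTRIES];
    size_t count;
    toy_index_t by_name;
    toy_index_t by_guid;
} toy_table_t;

static int name_compare(const toy_entry_t *entry, const void *key)
{
    return strcmp(entry->name, (const char *)key) == 0;
}

static int guid_compare(const toy_entry_t *entry, const void *key)
{
    return entry->guid == *(const unsigned long *)key;
}

void toy_table_init(toy_table_t *t)
{
    memset(t, 0, sizeof (*t));
    t->by_name.compare = name_compare;
    t->by_guid.compare = guid_compare;
}

/* Store the entry once, then register the same pointer in both indexes. */
toy_entry_t *toy_table_insert(toy_table_t *t, const char *name,
    unsigned long guid)
{
    toy_entry_t *e;

    if (t->count == TOY_MAX_ENTRIES)
        return NULL;
    e = &t->entries[t->count++];
    e->name = name;
    e->guid = guid;
    t->by_name.slots[t->by_name.count++] = e;
    t->by_guid.slots[t->by_guid.count++] = e;
    return e;
}

/* Linear scan of one index; the real code hashes the key instead. */
toy_entry_t *toy_index_search(const toy_index_t *idx, const void *key)
{
    size_t i;

    for (i = 0; i < idx->count; i++) {
        if (idx->compare(idx->slots[i], key))
            return idx->slots[i];
    }
    return NULL;
}
```

Searching by_name with a string key and by_guid with a pointer to an unsigned long both hand back the same toy_entry_t pointer, which is the property I want for the name to guuid mapping.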
Of interest is that only one index can be used to create entries in the table. That is given by the last parameter to rfs4_index_create. Indeed, we can see that rfs4_nfsclnt_idx is the create index for rfs4_client_tab.
BTW: Using the naming convention (see Some usr/src/uts/common/fs/nfs naming conventions), we can easily tell that the code, table, and indexes are all for the NFSv4 server.
Going back to the code, we can see this property is enforced:
381 if (createable) {
382 table->ccnt++;
383 if (table->ccnt > 1)
384 panic("Table %s currently can have only have one "
385 "index that will allow creation of entries",
386 table->name);
387 idx->createable = TRUE;
388 } else {
389 idx->createable = FALSE;
390 }
Lines 383-386 are basically a VERIFY which spits out useful information. Note that only developers should see this panic, as it should happen the first time they try to boot up a kernel. I.e., index creation happens when the server state is initialized and does not normally depend on executing special runtime paths.
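The same check can be sketched in user space. This is only an approximation under my own assumptions: the sketch_table_t and sketch_index_t types and the sketch_index_create function are hypothetical, and the function returns -1 where the kernel code panics, purely so that the behavior can be exercised without taking the process down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the table and index structures. */
typedef struct sketch_table {
    const char *name;
    int ccnt;            /* number of createable indexes so far */
} sketch_table_t;

typedef struct sketch_index {
    const char *name;
    bool createable;
} sketch_index_t;

/*
 * Register an index on a table. Returns 0 on success and -1 if the
 * caller asks for a second createable index. The kernel code panics
 * in that case; returning an error keeps this sketch testable.
 */
int sketch_index_create(sketch_table_t *table, sketch_index_t *idx,
    const char *name, bool createable)
{
    assert(table != NULL && idx != NULL);

    if (createable && table->ccnt >= 1) {
        fprintf(stderr, "Table %s can have only one index that "
            "allows creation of entries\n", table->name);
        return -1;
    }
    if (createable)
        table->ccnt++;
    idx->name = name;
    idx->createable = createable;
    return 0;
}
```

With a table named "Client", registering "nfs_client_id4" as createable and "client_id" as not createable both succeed, and a third request for another createable index is refused.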
Hmm, that was quick. I guess when I have to dive further, I'll add more annotation. ;>
In usr/src/uts/common/fs/nfs/, there are some historical naming conventions that can help you understand where you are in the code:
You may end up in code which doesn't follow these conventions, e.g., the spe code I will be adding, some of the mirror mount code, etc. But if you are looking at a stack trace, you would be able to see this type of code being called by a function following the above patterns.
I just put a code review request out for 6751438 mirror mounted mountpoints panic when umounted on nfs-discuss (see [nfs-discuss] Code reviewers wanted for 6751438 mirror mounted mountpoints panic when umounted).
The hardest part was finding time to test. The panic resulted from a fix made a couple of months ago. At that time, both unit and mini-PIT testing showed no panics, but now the mirrormount test suite inside mini-PIT could reliably trigger one. Luckily, I understand what the bug is and the panics have stopped.
I'm also about to ask for a code review for 6738223 Can not share a single IP address, which is quite simple to fix and we probably never would have fixed it except for:
The basic issue is that you cannot share to a single IP address without explicitly mentioning a netmask. I go on about it in these old blog entries: [Open]Solaris and sharing subnets and single machines and Checking a host entry - some code analysis.
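As a rough illustration of why a bare address ought to work, a single host can be treated as a network whose mask is all ones (a /32 prefix for IPv4). The sketch below is not the share code and not the fix for 6738223; the ipv4, prefix_mask, and addr_matches helpers are hypothetical names I made up, and it only handles IPv4 addresses kept in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four octets. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
        ((uint32_t)c << 8) | (uint32_t)d;
}

/* Mask for a prefix length of 0..32; 0 is special-cased to avoid a 32-bit shift. */
uint32_t prefix_mask(unsigned prefix_len)
{
    assert(prefix_len <= 32);
    if (prefix_len == 0)
        return 0;
    return (uint32_t)(UINT32_MAX << (32 - prefix_len));
}

/*
 * Does the client fall inside entry/prefix_len? A single host entry
 * is just the case where prefix_len is 32.
 */
int addr_matches(uint32_t client, uint32_t entry, unsigned prefix_len)
{
    uint32_t mask = prefix_mask(prefix_len);

    return (client & mask) == (entry & mask);
}
```

With a prefix length of 32, only the exact address matches; with 24, any host in the same /24 matches. A share entry without a netmask could be handled as the first case.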
The fix is easier than the testing, but I'll do that in the morning after a fresh build and ask for the code review later in the day.