Friday October 03, 2008
So the closed binaries are live

I just announced on nfs41-discuss that the closed-binaries went live! See Mercurial Repository created.

When I configured my community, I followed all of the steps outlined at Setting up a pnfs community except for the mdsadm command on the MDS server:

[root@pnfs-9-10 ~]> mdsadm -o add -t auth -a ip=10.1.233.50
adding: IP Addr - 10.1.233.50

I saw the following messages on my MDS console; they are being investigated but do not appear to be fatal:

[root@pnfs-9-14 ~]> Oct  3 17:00:11 pnfs-9-14 /usr/lib/nfs/nfsd[101025]: write failed for /var/nfs/v4_state/mds_/010.001.233.053-6448e6939a: write(669938620) returned -1 errno=14 ss_len=669938600
Oct  3 17:05:55 pnfs-9-14 nfssrv: NOTICE: op_destroy_session: SP4_NONE

And ditto for these on the DS console:

[root@pnfs-9-13 ~]> dservadm enable
[root@pnfs-9-13 ~]> Oct  3 16:52:29 pnfs-9-13 dserv[101033]: bad cmd: 3
Oct  3 16:52:29 pnfs-9-13 last message repeated 1 time
Oct  3 16:52:32 pnfs-9-13 dserv: WARNING: CLNT_CALL() ds protocol to mds failed: 5
Oct  3 16:52:32 pnfs-9-13 dserv[101033]: ioctl failed: I/O error
sahre
sahre: Command not found.
[root@pnfs-9-13 ~]> share
-@data/nfs4     /data/nfs4   anon=0,sec=sys,rw   ""  
[root@pnfs-9-13 ~]> Oct  3 17:00:27 pnfs-9-13 /usr/lib/nfs/nfsd[101019]: write failed for /var/nfs/v4_state/mds_/010.001.233.053-6448e693c4: write(419178265) returned -1 errno=14 ss_len=419178245

Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily
New gate and closed bins build a working pNFS community

So I have a successful community up and running. I'll push the closed binaries out later tonight. Life impinges...


Hey, the source browser is up and running

Check out nfsv41/nfs41-gate.


First successful push to nfs41-gate on opensolaris

We had the first developer make a real push to the nfs41-gate on the OpenSolaris NFSv41 Project Repository.

Jim Wahlig had this to push:

[thud@adept nfs41-gate]> hg incoming
comparing with ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate
searching for changes
changeset:   7743:c672b1cb86be
user:        Thomas Haynes 
date:        Thu Oct 02 22:28:30 2008 -0500
summary:     Added tag closedv1 for changeset 9fab48a31a4a

changeset:   7744:763bfa203d1a
tag:         tip
user:        jwahlig@aus-build3
date:        Fri Oct 03 11:52:59 2008 -0500
summary:     fix stable storage on x86.

The only issue we encountered was that the notification mail was not accepted by the nfs41-discuss mailing list. I'll have to look at that.

You can also see above the tag I pushed for closedv1.


Building OpenSolaris inside SWAN

Okay, building OpenSolaris with the opensolaris.sh environment inside SWAN is different. I first tried it with:

% ws cleanroom
% nightly opensolaris.sh

And got garbage. I tried playing with some environment variables and didn't get anywhere. I then tried it with bldenv:

% exit
% cd cleanroom
% bldenv -d opensolaris.sh
% nightly opensolaris.sh

That went fast:

/opt/SUNWspro/bin/dmake
dmake: Sun Distributed Make 7.7 2005/10/13
number of concurrent jobs = 36

No 32-bit compiler found
*** Error code 1
The following command caused the error:
if /builds/th199096/cleanroom/usr/src/tools/proto/opt/onbld/bin/i386/cw -_cc -_versions >/dev/null 2>/dev/null; then \

Finally, I went back to the ws approach, this time with the following opensolaris.sh diffs:

[th199096@jhereg cleanroom]> diff opensolaris.sh usr/src/tools/env/opensolaris.sh 
45c45
< GATE=cleanroom;                       export GATE
---
> GATE=testws;                  export GATE
48c48
< CODEMGR_WS="/builds/th199096/$GATE";                  export CODEMGR_WS
---
> CODEMGR_WS="/export/$GATE";                   export CODEMGR_WS
91c91
< STAFFER=th199096;                             export STAFFER
---
> STAFFER=nobody;                               export STAFFER
157c157
< #BUILD_TOOLS=/opt;                            export BUILD_TOOLS
---
> BUILD_TOOLS=/opt;                             export BUILD_TOOLS
159,161c159,160
< #SPRO_ROOT=/opt/SUNWspro;                     export SPRO_ROOT
< #SPRO_VROOT=$SPRO_ROOT;                               export SPRO_VROOT
< #__SSNEXT="";                                 export __SSNEXT
---
> SPRO_ROOT=/opt/SUNWspro;                      export SPRO_ROOT
> SPRO_VROOT=$SPRO_ROOT;                                export SPRO_VROOT
186d184
< export CW_NO_SHADOW=1

That seems to have worked. Now I need to test a pNFS community setup and run cthon.


Remember to read the README

So I have the closed binaries which correspond to the new nfs41-gate up on osol. I grabbed a copy of that source and started a build up. And it failed.

My thoughts were that either:

  1. I hosed the push to osol and thus all of the source was not there.
  2. The recent switch to Sun Studio 12 is impacting me.

The first is justifiable paranoia and the second has happened to me before. So, I searched my blog (more than 51% of why I blog is to have an easy-to-search repository of tips, tricks, and efdups) and found this tidbit: RTFR - Or make sure you do read all of the README. Now it wasn't a direct hit, but what the hey, while I'm here I should read that README.

And sure enough, it has something on the compiler switch:

   Please note that the compiler that comes with the Solaris Developer
   Express release is Studio 12, which is not the standard compiler
   for OpenSolaris code.  If you use Studio 12, you will need to set
   __SSNEXT to the null string in your environment file.  Please do
   report problems with Studio 12, particularly if the problem goes
   away when you use Studio 11 (the current standard compiler).

I'll rebuild with that change and see if it is a hit or the paranoia is justifiable after all.


How to get a Mercurial workspace after creating a ZFS clone

I wrote about how I didn't know how to mix Mercurial and ZFS data sets together to get a new clone on a new dataset. Dave Marker provided this insight:

zfs create pool/ws/th199096/spe-build
cd /pool/ws/th199096/spe-build
hg init
echo "[paths]" > .hg/hgrc
echo "default = ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate" >> .hg/hgrc
hg pull -u

The trick is realizing that there is nothing magical about 'hg clone'.
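
Since the hgrc is just a small ini-style file, the two echo lines can also be sketched in Python. This is a minimal sketch using only the standard library; write_default_path is a hypothetical helper name, it works in a scratch directory, and it does not run hg at all:

```python
import configparser
import os
import tempfile

GATE_URL = "ssh://anon@hg.opensolaris.org/hg/nfsv41/nfs41-gate"


def write_default_path(repo_root, url):
    """Write a [paths] section with a default entry to repo_root/.hg/hgrc."""
    hg_dir = os.path.join(repo_root, ".hg")
    # 'hg init' normally creates .hg; create it here so the sketch runs alone.
    os.makedirs(hg_dir, exist_ok=True)
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["paths"] = {"default": url}
    hgrc = os.path.join(hg_dir, "hgrc")
    with open(hgrc, "w") as fh:
        cfg.write(fh)
    return hgrc


# Use a scratch directory rather than a real workspace.
scratch = tempfile.mkdtemp()
hgrc_path = write_default_path(scratch, GATE_URL)
with open(hgrc_path) as fh:
    hgrc_text = fh.read()
print(hgrc_text)
```

After that, 'hg pull -u' is what actually fetches the changesets and updates the working copy.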

And if at this point I want to do a closed gate, I can use my normal incantation because it will be on the same dataset.

And I can then create a ZFS snapshot and clone that to my heart's desire.


Setting up a Development Project Gate

When we transitioned from TeamWare (tw) to Mercurial (hg), I made several attempts to craft a group workspace. The killer always seemed to be that I had to manually do an 'hg update' in the gate and that was too difficult to remember. In the end, I decided to mimic what the ON gatekeepers were doing. At first, I did it all by brute force, copying everything that they had in place. Eventually, since I didn't know Python, I started asking Dave Marker for help. And boy, did I get some. Anyway, here is what I went through to set up nfs41-gate and nfs41-clone.

Set up a restricted user account

You want to create a restricted user account for a couple of reasons. At first I thought this was to just keep people from sneaking a look at the hgrc files and such, but there is a broader need in that you want to force all writes to the gate to come through a single account. That way you can configure the push process to always go through some sanity checks. You don't want this to be your regular account, because it will restrict your ability to do things. I'll show that to you in a bit.

[nfs4hg@aus1500-home hook]> grep nfs4hg /etc/passwd 
nfs4hg:x:3530:1813:Mr. NFS4 HG:/pool/nfs4hg:/usr/bin/tcsh
[nfs4hg@aus1500-home hook]> grep 1813 /etc/group
mhg::1813:th199096

You want a uid and gid which are not in the NIS maps and you want the account to be local to your gate machine.
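
One way to double-check the "local" part is to look at the files themselves rather than going through the name service. A minimal sketch, assuming the standard seven-field passwd and four-field group formats; the helper names are hypothetical and the sample text is the grep output shown above:

```python
PASSWD_SAMPLE = "nfs4hg:x:3530:1813:Mr. NFS4 HG:/pool/nfs4hg:/usr/bin/tcsh\n"
GROUP_SAMPLE = "mhg::1813:th199096\n"


def find_passwd_entry(passwd_text, name):
    """Return (uid, gid, home, shell) for name from passwd-format text, or None."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) == 7 and fields[0] == name:
            return int(fields[2]), int(fields[3]), fields[5], fields[6]
    return None


def find_group_name(group_text, gid):
    """Return the name of the group with the given gid, or None."""
    for line in group_text.splitlines():
        fields = line.split(":")
        if len(fields) == 4 and fields[2] == str(gid):
            return fields[0]
    return None


entry = find_passwd_entry(PASSWD_SAMPLE, "nfs4hg")
group = find_group_name(GROUP_SAMPLE, entry[1])
print(entry, group)
```

On the gate machine you would feed it the contents of /etc/passwd and /etc/group; a None result means the account is coming from somewhere other than the local files.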

Set up a ZFS filesystem for your gate and clone

You want to leverage ZFS because snapshots and clones are your safety nets. A snapshot saves you from a bad command and a clone lets you try new things in a sandbox.

zfs create pool/ws/nfs41-gate
zfs create pool/ws/nfs41-clone
chown nfs4hg:mhg /pool/ws/nfs41-gate /pool/ws/nfs41-clone

Populate your gate and clone

Be sure to log in as your restricted user:

warlock % ssh aus1500-home 
aus1500-home % su - nfs4hg

Get used to doing it this way - you won't be able to ssh directly as nfs4hg before too long.

I'm going to assume a new branch off of onnv-gate. If you have an existing tw or hg workspace, you can substitute in the relevant commands to create the gate. I never want to speak of migrating TeamWare to Mercurial.

I don't know why, but I have to do something like:

cd /pool/ws/nfs41-gate
hg clone ssh://anon@hg.opensolaris.org/hg/onnv/onnv-gate
mv onnv-gate/.hg* .
mv onnv-gate/usr .
rm -rf onnv-gate

I find Mercurial doesn't like it if the target directory already exists. If you are inside SWAN, make sure to also get the closed bits:

cd usr
hg clone ssh://anon@onnv.eng/export/onnv-clone/usr/closed

And now make your clone off of your new gate. You want to make sure that the hg paths are correct:

[nfs4hg@aus1500-home nfs41-clone]> hg paths
default = /pool/ws/nfs41-gate

Retrieve the gatekeeping Python extensions

You can get these via:

[thud@adept ~/foo]> hg clone ssh://anon@hg.opensolaris.org/hg/scm-migration/onnv-gk-tools
destination directory: onnv-gk-tools
requesting all changes
adding changesets
adding manifests
adding file changes
added 40 changesets with 171 changes to 50 files
35 files updated, 0 files merged, 0 files removed, 0 files unresolved

The onnv-gk-tools are the heart of setting up your project gate. For the nfs41-gate, I put them outside of both the gate and the clone:

[nfs4hg@aus1500-home ws]> zfs list | grep onnv-gk
pool/onnv-gk-tools                                                630K  4.31T   630K  /pool/onnv-gk-tools

Again, leverage ZFS for things like this tool set.

Read onnv-gk-tools/README at this point. I may have done things from there and forgotten to mention them here.

Copy hgrc files to gate and clone

You'll want to copy the existing hgrc files over to your gate and clone:

cp gate-hgrc /pool/ws/nfs41-gate/.hg/hgrc
cp gate-closed-hgrc /pool/ws/nfs41-gate/usr/closed/.hg/hgrc
cp clone-hgrc /pool/ws/nfs41-clone/.hg/hgrc
cp clone-closed-hgrc /pool/ws/nfs41-clone/usr/closed/.hg/hgrc

Set permissions on the gate and clone

I can't say this enough, from the README:

gate enforcement:
        Mercurial only lets you pull if you can read {REPO}/.hg
        Mercurial only lets you push if you can write {REPO}/.hg

        So {GATE} and {GATE}/usr/closed are owned by onhg/gk
        Mode is set to 0770

        {CLONE} and {CLONE}/usr/closed are also owned by onhg/gk
        But mode is set to 0775
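
As I read that excerpt, 0770 on the gate means only the restricted account and its group can get at .hg, while 0775 on the clone lets everyone read (pull) but not write (push). Here is a small sketch of what those two modes mean for "other" users; it uses scratch directories as stand-ins for {GATE} and {CLONE}, and the helper names are hypothetical:

```python
import os
import stat
import tempfile


def others_can_read(path):
    """True if the 'other' bits allow both listing and entering the directory."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & stat.S_IROTH) and bool(mode & stat.S_IXOTH)


def others_can_write(path):
    """True if the 'other' bits allow writing into the directory."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & stat.S_IWOTH)


# Scratch stand-ins; the real directories are owned by the gate account.
scratch = tempfile.mkdtemp()
gate = os.path.join(scratch, "gate")
clone = os.path.join(scratch, "clone")
os.mkdir(gate)
os.mkdir(clone)
os.chmod(gate, 0o770)
os.chmod(clone, 0o775)

gate_open = others_can_read(gate)
clone_open = others_can_read(clone)
print("gate readable by others:", gate_open)
print("clone readable by others:", clone_open)
print("clone writable by others:", others_can_write(clone))
```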

Comment out temp hack in hook/notify.py

So, this should be configurable, but for now you want to comment out the following to avoid spamming a mailing list:

[nfs4hg@aus1500-home onnv-gk-tools]> diff hook/notify.py ~/onnv-gk-tools/hook/notify.py
80c80
<     # m.msg["Bcc"] = "onnv-flagdays@onnv.eng"
---
>     m.msg["Bcc"] = "onnv-flagdays@onnv.eng"

Note that you want to make sure the '#' is added right where the 'm' was - I hear Python really cares about indentation.
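
For what it is worth, Python skips comment-only lines when it works out indentation, so keeping the '#' aligned is mostly tidiness. The case that does bite is commenting out the only statement in a block. A minimal sketch; the notify function and header names inside the strings are made up for illustration and are not the contents of hook/notify.py:

```python
def compiles(source):
    """Return True if source is syntactically valid Python."""
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


# Comment keeps the indentation of the statement it replaces.
aligned = (
    "def notify(m):\n"
    "    m['To'] = 'someone'\n"
    "    # m['Bcc'] = 'list'\n"
    "    return m\n"
)

# Comment pushed out to column 0: comment-only lines are ignored.
column_zero = (
    "def notify(m):\n"
    "    m['To'] = 'someone'\n"
    "# m['Bcc'] = 'list'\n"
    "    return m\n"
)

# The real hazard: the block is left with no statement at all.
empty_block = (
    "def notify(m):\n"
    "    # m['Bcc'] = 'list'\n"
)

results = {
    "aligned": compiles(aligned),
    "column_zero": compiles(column_zero),
    "empty_block": compiles(empty_block),
}
print(results)
```

If you ever do need to comment out the last statement in a block, putting a 'pass' in its place keeps the file loadable.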

Copy on-hg.py to the homedir

You'll want to copy this file to the restricted account's homedir and rename it as well. You don't want to inadvertently refer to the ON one at any point:

cp on-hg.py ~/nfs4-hg.py

Edit the new etc/config.py file

This is the step that configures Mercurial to understand your gate.

I'm not going to step through the changes; I feel they are self-explanatory. Note, though, that some of the Python scripts do not make use of parts of this file, as we will see later. For example, GATE_USER could be used in the hook/updateoso.py file.

Hmm, and so far, all of the user accounts and paths needed for these changes exist.

[nfs4hg@aus1500-home onnv-gk-tools]> diff etc/config.py ~/onnv-gk-tools/etc/config.py 
87c87
< GATE_NAME = "nfs41"
---
> GATE_NAME = "onnv"
89c89
< GATE_WS = "/pool/ws/%s-gate" % (GATE_NAME)
---
> GATE_WS = "/ws/%s-gate" % (GATE_NAME)
91c91
< CLONE_WS = "/pool/ws/%s-clone" % (GATE_NAME)
---
> CLONE_WS = "/ws/%s-clone" % (GATE_NAME)
94c94
< GATE_DIR = "/pool/ws/nfs41-gate"
---
> GATE_DIR = "/export/onnv-gate"
96c96
< CLONE_DIR = "/pool/ws/nfs41-clone"
---
> CLONE_DIR = "/export/onnv-clone"
99,104c99,104
< GATE_HOST = "aus1500-home"
< GATE_ALTHOST = "aus1500-home"
< GATE_HOST_X = "aus1500-home"
< GATE_HOST_S = "aus1500-home"
< GATE_DOMAIN = "central"
< GATE_MAIL = "aus1500-home.central"
---
> GATE_HOST = "elpaso"
> GATE_ALTHOST = "juarez"
> GATE_HOST_X = "elpaso"
> GATE_HOST_S = "juarez"
> GATE_DOMAIN = "sfbay"
> GATE_MAIL = "onnv.eng"
106,110c106,110
< GATEKEEPER = "th199096"
< ASSTGATEKEEPER = "rmesta"
< TECHLEAD = "th199096"
< ASSTTECHLEAD = "rmesta"
< CTEAMLEAD = "webaker"
---
> GATEKEEPER = "dm120769"
> ASSTGATEKEEPER = "suha"
> TECHLEAD = "jbeck"
> ASSTTECHLEAD = "nickto"
> CTEAMLEAD = "muolla"
112,113c112,113
< ALIAS_GK = "th199096@%s" % (GATE_MAIL)
< ALIAS_GATEKEEPER = "th199096@%s" % (GATE_MAIL)
---
> ALIAS_GK = "gk@%s" % (GATE_MAIL)
> ALIAS_GATEKEEPER = "gatekeeper@%s" % (GATE_MAIL)
115,116c115,116
< GATE_USER = "nfs4hg"
< GATE_GROUP = "mhg"
---
> GATE_USER = "onhg"
> GATE_GROUP = "gk"
118,119c118,119
< SNAPS_DIR = "/pool/ws/snapshot"
< BUILDS_DIR = "/pool/ws/builds"
---
> SNAPS_DIR = "/export/snapshot"
> BUILDS_DIR = "/export/builds"
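
Several of these settings are built from others with Python's % string formatting, so changing GATE_NAME or GATE_MAIL ripples into the derived values. A small sketch that copies only the nfs41 side of the diff above and prints what the derived names expand to; it is not the whole config.py:

```python
# Values copied from the nfs41 side of the diff.
GATE_NAME = "nfs41"
GATE_WS = "/pool/ws/%s-gate" % (GATE_NAME)
CLONE_WS = "/pool/ws/%s-clone" % (GATE_NAME)

GATE_MAIL = "aus1500-home.central"
ALIAS_GK = "th199096@%s" % (GATE_MAIL)
ALIAS_GATEKEEPER = "th199096@%s" % (GATE_MAIL)

# Show the expanded values so typos in the templates are easy to spot.
for name in ("GATE_WS", "CLONE_WS", "ALIAS_GK", "ALIAS_GATEKEEPER"):
    print("%-17s %s" % (name, globals()[name]))
```

GATE_DIR and CLONE_DIR are plain strings in the file, so they have to be edited by hand to match.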

Edit the hgrc files

Now we go back and edit the hgrc files for the various pieces. These modifications tell the gate how to interact with the clone, etc.

Gate's hgrc

I will annotate these changes:

[nfs4hg@aus1500-home .hg]> diff hgrc ~/onnv-gk-tools/gate-hgrc 
17c17
< hook = /pool/onnv-gk-tools/hook
---
> hook = /export/onnv-gate/public/python/hook

Okay, we need to tell the gate where our config.py file is and how to use the extensions. The above does that. Note that if we do not make this change, we could impact ON.

20c20
< gatename = nfs41-gate
---
> gatename = onnv-gate
23c23
< wlock = nfs4hg, th199096
---
> wlock = onhg, dm120769, suha
27,28c27,28
< recv = pnfs-core@sun.com
< #logmail = onnv-gate-putback-log@onnv.eng
---
> recv = onnv-gate-notify@onnv.eng
> logmail = onnv-gate-putback-log@onnv.eng
32,33c32,33
< recv = thomas.haynes@sun.com
< rti = False
---
> recv = onnv-putback-diffs@onnv.eng
> rti = True

With a development gate, you bypass the RTI process. So, we should bypass the checking for it.

36,38d35
< [web]
< baseurl = http://aus1500-home.central
< 
40c37
< url = http://aus1500-home.central/pool/ws/nfs41-gate
---
> url = http://onnv.sfbay/net/onnv.sfbay
43c40
< temp = /pool/nfs4hg/webrev
---
> temp = /space/webrev

Ah, we will want to create this directory. I understand that you want this directory to have parents that do not have ".hg/" or "Codemgr_wsdata/". If even one parent has a subdirectory with one of these names, it will mess things up.
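
A quick way to check a candidate location is to walk up from it and look for those two names. This is only a sketch of my understanding of the rule; offending_parents is a hypothetical helper, and it checks the directory itself as well as every ancestor:

```python
import os
import tempfile

MARKERS = (".hg", "Codemgr_wsdata")


def offending_parents(path):
    """Return marker directories found in path or any of its ancestors."""
    hits = []
    current = os.path.abspath(path)
    while True:
        for marker in MARKERS:
            candidate = os.path.join(current, marker)
            if os.path.isdir(candidate):
                hits.append(candidate)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return hits


# Scratch layout: one webrev dir under a workspace, one under a clean parent.
scratch = tempfile.mkdtemp()
ws = os.path.join(scratch, "ws")
os.makedirs(os.path.join(ws, ".hg"))
bad = os.path.join(ws, "webrev")
os.makedirs(bad)
good = os.path.join(scratch, "clean", "webrev")
os.makedirs(good)

bad_hits = offending_parents(bad)
good_hits = offending_parents(good)
print("bad:", bad_hits)
print("good:", good_hits)
```

For the real setup you would call it on /pool/nfs4hg/webrev and expect an empty list.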

45,48c42,49
< #[rti]
< #webrticli = /net/webrti/export/home/bin/webrticli
< #url = http://webrti.sfbay/rti/xml/index.php
< #project = on
---
> # advocate is only set for restricted builds.
> # When set only those listed (separated by commas) are valid RTI advocates.
> # Any others used will cause a rollback from rti.py
> [rti]
> webrticli = /ws/onnv-gate/public/bin/webrticli
> url = http://webrti.sfbay/rti/xml/index.php
> project = on
> #advocate = John (dot) Beck (at) sun (DOT) com

We really, really want to bypass RTI checking and John really doesn't want to be spammed by this. (And this is the only place I changed the source.)

51c52
< comchk = False
---
> comchk = True

We know the comments in a development gate are not going to be valid, so do no checks.

74c75
< #pretxnchangegroup.2 = python:hook.rti.rti
---
> pretxnchangegroup.2 = python:hook.rti.rti

Again, no RTI at all!

82c83
< changegroup.0 = /usr/bin/hg push -R /pool/ws/nfs41-gate /pool/ws/nfs41-clone
---
> changegroup.0 = /usr/bin/hg push -R /export/onnv-gate /export/onnv-clone
91d91
< 

Ahh, we should get the above out of etc/config.py, no?

BTW: The two lines I care about most here are:

changegroup.0 = /usr/bin/hg push -R /pool/ws/nfs41-gate /pool/ws/nfs41-clone
changegroup.1 = /usr/bin/hg update

Basically, after a push occurs, first push that change to the clone and then run update. See, I don't want to be doing that manually!

Also, note that because of the following lines:

[gatehooks]
gatename = nfs41-gate
logdir = public/log
lockdir = public/lock

You will want to create:

cd /pool/ws/nfs41-gate
mkdir -p public/log
mkdir public/lock
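
The same thing can be driven from the hgrc so the directories never drift from the settings. A minimal sketch, assuming logdir and lockdir are relative to the gate root (which is what the mkdir commands above imply); make_gatehook_dirs is a hypothetical helper, and Python's configparser may not cope with everything a real hgrc can contain:

```python
import configparser
import os
import tempfile

HGRC_SNIPPET = """\
[gatehooks]
gatename = nfs41-gate
logdir = public/log
lockdir = public/lock
"""


def make_gatehook_dirs(gate_root, hgrc_text):
    """Create the logdir and lockdir named in [gatehooks] under gate_root."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(hgrc_text)
    created = []
    for key in ("logdir", "lockdir"):
        target = os.path.join(gate_root, cfg["gatehooks"][key])
        os.makedirs(target, exist_ok=True)  # same effect as mkdir -p
        created.append(target)
    return created


gate_root = tempfile.mkdtemp()  # stand-in for /pool/ws/nfs41-gate
created = make_gatehook_dirs(gate_root, HGRC_SNIPPET)
print(created)
```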

Gate's closed hgrc

All of the above changes apply, but the only real diff is the following:

81,82c81
< # push to hg.os.o will be done out of cron.
< changegroup.0 = /usr/bin/hg push -R /pool/ws/nfs41-gate/usr/closed /pool/ws/nfs41-clone/usr/closed
---
> changegroup.0 = /usr/bin/hg push -R /export/onnv-gate/usr/closed /export/onnv-clone/usr/closed
91d89

And again, the changes will automatically propagate to the clone.

Clone's hgrc

This one is much simpler, mainly because there is not much there:

[nfs4hg@aus1500-home .hg]> cd /pool/ws/nfs41-clone/.hg
[nfs4hg@aus1500-home .hg]> diff hgrc ~/onnv-gk-tools/clone-hgrc 
12c12
< default = /pool/ws/nfs41-gate
---
> default = /export/onnv-gate

We tell the clone where the parent is located.

15c15
< hook = /pool/onnv-gk-tools/hook
---
> hook = /export/onnv-gate/public/python/hook
19c19
< gate = file:/pool/ws/nfs41-gate
---
> gate = file:/export/onnv-gate
36d35

Where are the hooks and the gate? All of this should be in etc/config.py.

BTW an important line here is:

# These hooks are run from bghook() in the background
bg-changegroup.0 = python:hook.updateoso.updateoso

When the clone gets updated, then we will push a change out to OpenSolaris!

If you don't have a repository out there, shame on you! Well, just comment out this line.

Clone's closed hgrc

Exact same diffs as above.

Understanding some things in the hgrcs

A big difference between the gate and the clone is in the hgrcs. The gate is write only and the clone is read only.

So the clone has to prevent writes before they occur. This line does that:

# This prevents boneheaded gatekeepers and gives a more useful message
# to gatelings who trust our hooks.
prechangegroup.0 = python:hook.cloneincoming.cloneincoming
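
I have not reproduced hook/cloneincoming.py here, but the general shape of an in-process Mercurial hook that refuses incoming changes is short. In Mercurial, a Python hook that returns a true value (or raises) is treated as a failure, and a failing prechangegroup hook stops the changegroup before it is added. A sketch under those assumptions; reject_incoming and FakeUI are made-up names, FakeUI exists only so the example runs without hg, and the bytes message follows current Mercurial rather than the 2008 version:

```python
def reject_incoming(ui, repo, hooktype, **kwargs):
    """Refuse any incoming changegroup; a true return value means failure."""
    ui.warn(b"this repository is read only; push to the gate instead\n")
    return True


class FakeUI:
    """Tiny stand-in for Mercurial's ui object so the sketch runs alone."""

    def __init__(self):
        self.messages = []

    def warn(self, msg):
        self.messages.append(msg)


ui = FakeUI()
failed = reject_incoming(ui, repo=None, hooktype="prechangegroup")
print(failed, ui.messages)
```

It would be wired up the same way as the line above, with the module path pointing at wherever the function lives.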

And I haven't figured out how the gate keeps people from reading. Note, yes I have, see onnv-gk-tools/README. So part of the above may be wrong....

Modify that file in the homedir

These appear pretty self-explanatory:

[nfs4hg@aus1500-home ~]> diff nfs4-hg.py onnv-gk-tools/on-hg.py 
45c45
< HGLOGIN = "nfs4hg"
---
> HGLOGIN = "onhg"
54,55c54,55
<     "/pool/ws/nfs41-gate",
<     "/pool/ws/nfs41-gate/usr/closed",
---
>     "/export/onnv-gate",
>     "/export/onnv-gate/usr/closed",

Configuring ssh access

Okay, we almost have everything done that I remember. At this point, you need to start emailing your developers to ask them to send you their SSH public keys -- see opensolaris.org SSH key help. They need to do this for ON anyway.

Once you get them, add them to the ~/.ssh/authorized_keys of your restricted account. The format of each entry will be:

command="~/nfs4-hg.py 'th199096 ' ",no-port-forwarding,no-X11-forwarding,no-agent-forwarding [the contents of their id_dsa.pub file]

You will have one per user.

The format needed here is discussed in onnv-gk-tools/README.
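
With more than a handful of developers, typing those entries by hand gets error prone. Here is a minimal sketch that builds one line in the shape shown above; authorized_keys_entry is a hypothetical helper, the user string is passed through exactly as given (the README defines what it should contain), and the key in the example is a placeholder, not a real key:

```python
OPTIONS = "no-port-forwarding,no-X11-forwarding,no-agent-forwarding"


def authorized_keys_entry(script, who, public_key):
    """Build one authorized_keys line that forces logins through script."""
    key = public_key.strip()
    if "\n" in key:
        raise ValueError("public key must be a single line")
    if '"' in who or "'" in who:
        raise ValueError("quotes in the user string would break the command option")
    return 'command="%s \'%s\' ",%s %s' % (script, who, OPTIONS, key)


# Placeholder key text; paste the developer's real public key file contents.
example_key = "ssh-dss AAAAexampleonly th199096@example\n"
entry = authorized_keys_entry("~/nfs4-hg.py", "th199096 ", example_key)
print(entry)
```

Append the resulting line to the restricted account's ~/.ssh/authorized_keys, one line per developer.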

Disclaimer

I did all of this a month or so ago. I am reconstructing what I did. I may have missed some steps.

All of the steps reported here are mine. All mention of possible bugs is my opinion.

Notes

Cron jobs

The only cron job I have running is:

[nfs4hg@aus1500-home ~/onnv-gk-tools]> crontab -l
7 3 * * * /pool/ws/scripts/buildtags.sh /pool/ws/nfs41-clone/developer.sh

This rebuilds the cscope and tags databases in the clone every day at 03:07. I could do this in another clone, but I like it happening in a well-known place. I do not want it in the gate.

Automatic push to OpenSolaris

I haven't provided the details on how to configure an automatic push to OpenSolaris...

A good link for jumping off: How to Use Mercurial (hg) Repositories

