Wednesday Apr 22, 2009

The Problem

Recently one of my customers migrated a legacy SPARC system to a Solaris 8 branded zone on a new T5220 platform. This system provided an anonymous FTP service which no longer worked correctly within the branded zone. More specifically, files could be uploaded and downloaded, but directory listings always came back empty. Non-anonymous users reported no problems with any of the functionality.

The Investigation

Initial investigation began by trussing in.ftpd within the Solaris 8 zone to see what was happening. Very quickly it was possible to see that in.ftpd forks and execs /bin/ls to generate the directory listing, and that this exec was failing. It's worth pointing out that on Solaris 8 anonymous FTP requires a chroot() environment (as per in.ftpd(1M)) and this was configured correctly:

19710/1: execve("/bin/ls", 0x000353F0, 0xFFBFFE0C) Err#2 ENOENT

Very odd, as ENOENT indicates 'no such file or directory' and /bin/ls definitely exists. The next step was to look more closely at the execve() syscall. I knocked up a very quick fbt DTrace script to do this before realising that this was Solaris 8, which has no DTrace! Fortunately we can DTrace from the global zone, so after a few changes I came up with the following to trace the Solaris 8 exec calls:

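#pragma D option flowindent

/* Start following the thread when in.ftpd enters the Solaris 8 exec routine. */
fbt::s8_elfexec:entry
/ execname == "in.ftpd" /
{ printf("s8_exec: execname: %s", execname); self->follow = 1; tracing = 1; }

/* While following, record every kernel function entry and return. */
fbt:::
/ self->follow /
{}

/* On each return, also print the return value (arg1). */
fbt:::return
/ self->follow /
{ trace(arg1); }

/* Stop once the exec routine itself returns. */
fbt::s8_elfexec:return
/ self->follow /
{ trace(arg1); self->follow = 0; exit(0); }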

Running this from the global zone while issuing an LS in the anonymous FTP client, I could see that ENOENT was being returned from a call to lookupnameat(). I then added the following clause to the DTrace script to print the full name of the file being looked up:

fbt::lookupnameat:entry
/ self->follow /
{ printf("%s", stringof(args[0])); }

Another LS from the client and the problem becomes clear -- we are trying to load /.SUNWnative/usr/lib/s8_brand.so.1. Until now I was not familiar with the /.SUNWnative directory but this exists on all Solaris 8 & 9 zones. Within this directory are three lofs mounts that present /lib, /usr and /platform to the local zone from the global:

/zones/s8_lt203398/root/.SUNWnative/lib on /lib read only/setuid/nodevices/dev=4010002 on Tue Apr 14 16:59:39 2009
/zones/s8_lt203398/root/.SUNWnative/platform on /platform read only/setuid/nodevices/dev=4010002 on Tue Apr 14 16:59:39 2009
/zones/s8_lt203398/root/.SUNWnative/usr on /usr read only/setuid/nodevices/dev=4010002 on Tue Apr 14 16:59:39 2009

Those that have kept up will now know why things are failing: anonymous FTP runs in a chroot() environment, and the Solaris 8 zone-specific mounts do not exist inside it (why would they? in.ftpd(1M) was written years before Zones were around).
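
The confusing part of this failure is that execve() can report ENOENT for a file other than the one named in the call. The kernel also has to find whatever is needed to start the program (here the brand library under /.SUNWnative), and a failed lookup there comes back as the same errno. A rough, portable analogue, not the Solaris 8 brand code itself, is a script whose "#!" interpreter is missing: the script exists and is executable, yet execve() still fails with ENOENT. The sketch below assumes a POSIX system with a writable /tmp that allows execution. The function name and the interpreter path are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create an executable script whose "#!" interpreter does not exist,
 * confirm the script itself is present and executable, then try to exec it.
 * Returns the errno set by execve(), or -1 if the setup failed.
 * The interpreter path below is hypothetical.
 */
int exec_errno_with_missing_interpreter(void)
{
    char path[] = "/tmp/enoent_demo_XXXXXX";
    const char *script = "#!/nonexistent/interpreter\n";
    size_t len = strlen(script);
    char *argv[] = { path, NULL };
    char *envp[] = { NULL };
    int fd, ok, err;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;

    ok = write(fd, script, len) == (ssize_t)len && fchmod(fd, 0700) == 0;
    close(fd);

    /* The file we are about to exec really exists and is executable. */
    if (!ok || access(path, X_OK) != 0) {
        unlink(path);
        return -1;
    }

    /* execve() only returns on failure, so reaching the next line means it failed. */
    execve(path, argv, envp);
    err = errno;

    unlink(path);
    return err;
}
```

Calling exec_errno_with_missing_interpreter() and comparing the result with ENOENT shows the same symptom seen in the truss output: the named file is there, but something the kernel needs in order to run it is not.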

The Fix

Despite all of this investigation the real fix is far more obvious: don't use the legacy Solaris 8 FTP daemon when your system is running Solaris 10 (or OpenSolaris)!

The same functionality can be achieved by creating a minimal Solaris 10 local zone and setting up anonymous FTP there. The newer Solaris 10 ftpd provides additional functionality over Solaris 8. The last step in this method is to set up a loopback mount in the global zone to present the relevant directory into the Solaris 8 zone.

The Workaround

Not everybody will have the option of adding an additional Solaris 10 zone to deliver "The Fix", so here is a workaround to get Solaris 8 anonymous FTP working. The good news is that it isn't at all hacky; it's just a few extra steps that are missing from in.ftpd(1M).

Armed with the knowledge that we are missing the /.SUNWnative mounts within the chroot() environment we simply add these in and things will begin to work. The updated mount output will now include:

/export/home/ftp/.SUNWnative on /.SUNWnative read/write/setuid/zone=s8_lt203398/dev=4010009 on Wed Apr 15 02:20:15 2009

Here my chroot environment exists under /export/home/ftp. A more permanent way to achieve this is to add the following line to your zone's vfstab:

/.SUNWnative    -    /export/home/ftp/.SUNWnative    lofs    -    yes    ro

As a final note, in.ftpd(1M) didn't seem to mention that /lib/ld.so.1 should also be included in the chroot() environment. It should, and things won't work unless it is there. Don't forget to ensure the permissions are all set correctly.
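
Since a missing file inside the chroot only shows up as an unhelpful ENOENT, it can help to check the tree before testing with a client. The following is a minimal sketch, assuming the three paths mentioned in this post are the ones that matter on your system. count_missing_in_chroot is a hypothetical helper, not part of Solaris. It does not check permissions, and it resolves absolute symlinks against the real root rather than the chroot, so treat a clean result as a hint rather than proof.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Paths taken from this post; each must resolve inside the chroot. */
static const char *const required_paths[] = {
    "/bin/ls",
    "/lib/ld.so.1",
    "/.SUNWnative/usr/lib/s8_brand.so.1",
};

/*
 * Return how many required paths are missing under chroot_dir
 * (for example "/export/home/ftp"), printing each missing one to stderr.
 * Returns -1 if a combined path would not fit in the buffer.
 */
int count_missing_in_chroot(const char *chroot_dir)
{
    size_t n = sizeof(required_paths) / sizeof(required_paths[0]);
    int missing = 0;

    for (size_t i = 0; i < n; i++) {
        char full[1024];
        struct stat st;
        int len = snprintf(full, sizeof(full), "%s%s",
                           chroot_dir, required_paths[i]);

        if (len < 0 || (size_t)len >= sizeof(full))
            return -1;

        /* stat() fails if any component of the path is absent. */
        if (stat(full, &st) != 0) {
            fprintf(stderr, "missing: %s\n", full);
            missing++;
        }
    }
    return missing;
}
```

A return value of 0 from count_missing_in_chroot("/export/home/ftp") would mean all three paths were found under that directory.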

Sunday Feb 01, 2009

udp_get_next_priv_port() from http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/inet/udp/udp.c
Assuming the system is unlabeled, what port will be returned when we're called (hint: it isn't IPPORT_RESERVED-1)?
No doubt this will be trivial for the C guru, but it had me scratching my head for a little while. It's probably time that I went back and read Programming 101...
/*
 * Return the next anonymous port in the privileged port range for
 * bind checking.
 */
static in_port_t
udp_get_next_priv_port(udp_t *udp)
{
        static in_port_t next_priv_port = IPPORT_RESERVED - 1;
        in_port_t nextport;
        boolean_t restart = B_FALSE;
        udp_stack_t *us = udp->udp_us;

retry:
        if (next_priv_port < us->us_min_anonpriv_port ||
            next_priv_port >= IPPORT_RESERVED) {
                next_priv_port = IPPORT_RESERVED - 1;
                if (restart)
                        return (0);
                restart = B_TRUE;
        }

        if (is_system_labeled() &&
            (nextport = tsol_next_port(crgetzone(udp->udp_connp->conn_cred),
            next_priv_port, IPPROTO_UDP, B_FALSE)) != 0) {
                next_priv_port = nextport;
                goto retry;
        }

        return (next_priv_port--);
}
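
One way to explore the question is to strip the function down to its unlabeled path and call it a few times in user space. The sketch below is not the kernel code: MIN_ANONPRIV_PORT stands in for us->us_min_anonpriv_port and its value of 512 is an assumption, as is the name next_priv_port_sim. It keeps the two details that seem to matter, the static local (initialised once, not on every call) and the post-decrement in the return statement.

```c
#include <stdint.h>

#define IPPORT_RESERVED   1024
#define MIN_ANONPRIV_PORT 512   /* assumed stand-in for us->us_min_anonpriv_port */

typedef uint16_t in_port_t;

/* User-space imitation of the unlabeled path through udp_get_next_priv_port(). */
in_port_t next_priv_port_sim(void)
{
    /* Initialised once at program start, NOT on every call. */
    static in_port_t next_priv_port = IPPORT_RESERVED - 1;

    /* Wrap back to the top once the counter leaves [MIN, IPPORT_RESERVED). */
    if (next_priv_port < MIN_ANONPRIV_PORT ||
        next_priv_port >= IPPORT_RESERVED)
        next_priv_port = IPPORT_RESERVED - 1;

    /* Post-decrement: return the current value, then store value - 1. */
    return (next_priv_port--);
}
```

In this simulation only the very first call returns IPPORT_RESERVED - 1. Each later call returns one less than the previous one until the counter drops below the minimum and wraps, so on a system that has already handed out privileged ports the answer depends on how many calls came before.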

Wednesday Jan 28, 2009

Thanks to an invitation from Clive King I spent the day at Cardiff University presenting on Solaris CIFS and VSCAN as part of a set of tech talks for academic customers.

As promised to those guests, I'm providing a copy of my slides (thanks Jarod & our lab interns), at the end of which there is a list of references along with links to the relevant sites. You can also check out the sites I visited while gathering the information for the slides, as well as my notes.

If there are any other questions, just drop a comment and I'll endeavour to give you an answer.

Wednesday Sep 05, 2007

I recently had a customer contact the Solutions Centre requesting a procedure to grow a Solaris Volume Manager (previously known as Solstice DiskSuite) RAID1 mirror using an old, unwanted volume.

In order to answer the customer's question authoritatively I turned to Sun's extensive 'Global Labs', where I booked a suitable host and installed the same version of Solaris as the customer. In this circumstance the global lab I picked was actually in the same building as me, but the procedure was done remotely using the lab tools. For all intents and purposes, the system could very well have been in the global lab in Singapore.

With Solaris installed I quickly duplicated the customer's configuration. He was using two disks in his mirror, giving him good redundancy:

d10 (RAID 1 mirror, 20GB; mounted on /export/home)
|- d11 (c1t0d0s0, 20GB)
\- d12 (c2t0d0s0, 20GB)

d20 (RAID 1 mirror, 30GB; mounted on /export/backup)
|- d21 (c1t0d0s1, 30GB)
\- d22 (c2t0d0s1, 30GB)

In this example, the customer is now using a remote host for backup, so the d20 metadevice is no longer being used. He wanted to grow d10 by the capacity of d20.

The following simple steps achieve this with 0% downtime. It is not even necessary to unmount the volume that is being grown:

  1. Take a backup of /export/home. Verify this backup.

  2. Unmount d20:

     # umount /export/backup

  3. Remove all references to the d20 mirror and all sub-mirrors with metaclear. In this step the -r flag indicates that d20, d21 and d22 will be removed:

     # metaclear -r d20

  4. Attach the slices previously used in the d20 mirror to each of the d10 sub-mirrors:

     # metattach d11 c1t0d0s1
     # metattach d12 c2t0d0s1

  5. Tell d10 that the sub-mirrors have changed and that the size of the metadevice has changed:

     # metattach d10

  6. Finally, we need to use growfs to grow the UFS filesystem that lives on the d10 metadevice:

     # growfs -M /export/home /dev/md/rdsk/d10

Hopefully this simple six-step procedure will be of use to those new to Volume Manager.

This blog copyright 2009 by Lewis Thompson