Weblog

Wednesday June 15, 2005

A Tour Through Devfs and Device Permissions

Admit it. You've always been curious to understand how the specifications in /etc/minor_perm get reflected in device permissions enforced at device access time. Perhaps even more than curious. Perhaps obsessed, even.

As a friend of mine would say with a sarcastic sneer: yeah, sure.

No doubt you're just like the rest of us and have never given minor_perm any thought at all. Anyone who would want to spend even a nanosecond thinking about such a bland feature in an operating system when there's got to be a sitcom rerun playing on some channel would have to have a screw loose, right? Before you reflexively answer that question in the affirmative, allow me to give you a brief tour of how devfs manages the permissions specified in /etc/minor_perm.

Before Solaris 10 and devfs, the hierarchy below /devices was simply implemented in the root filesystem as a series of mkdir and mknod operations. Utilities such as devfsadm or drvconfig, responsible for creating the /devices hierarchy, would simply read /etc/minor_perm, applying any necessary changes with chmod and chown operations. This was simple but inefficient and relatively static.

Enter devfs. Devfs, as a filesystem mounted on top of /devices, has access to the files and directories of the underlying filesystem. The initial delivery of devfs still relied upon devfsadm to statically perform chmod/chown operations on device nodes as necessary. These attribute operations were implemented in devfs by simply pushing the operation down to the underlying root filesystem. All device access required checking the attributes of the underlying filesystem. This was necessary for two reasons: the information available in /etc/minor_perm was not available to devfs, and the only persistence mechanism available to devfs was to record the change in the underlying filesystem.

With the kernel-based filesystem implementation of devfs, however, we had the opportunity to do something better. Well in time for the final builds of Solaris 10, we rewhacked the implementation of attribute management by pushing the minor_perm information into the kernel. Together with a simple regular expression handler, this permits devfs to determine a minor node's ownership and permission attributes dynamically as a device is being configured on demand.
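To make the idea concrete, here is a minimal user-level sketch of matching a minor node against minor_perm-style entries. It is not the kernel's handler: it assumes entries of the form driver:minor-pattern mode owner group, uses Python's fnmatch as a stand-in for the "simple regular expression handler", and assumes the first matching entry wins. The sample entries and the names parse_minor_perm and lookup are made up for illustration.

```python
from fnmatch import fnmatchcase

# Illustrative entries in minor_perm style (assumed, not copied from a real system):
#   driver:minor-pattern mode owner group
SAMPLE_MINOR_PERM = """\
# comment lines are skipped
sd:* 0640 root sys
zs:[a-z],cu 0600 uucp uucp
"""


def parse_minor_perm(text):
    """Parse minor_perm-style text into (driver, pattern, mode, owner, group) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        spec, mode, owner, group = line.split()
        driver, pattern = spec.split(":", 1)
        entries.append((driver, pattern, int(mode, 8), owner, group))
    return entries


def lookup(entries, driver, minor):
    """Return (mode, owner, group) for the first matching entry, or None.

    First-match-wins is an assumption of this sketch.
    """
    for drv, pattern, mode, owner, group in entries:
        if drv == driver and fnmatchcase(minor, pattern):
            return mode, owner, group
    return None


entries = parse_minor_perm(SAMPLE_MINOR_PERM)
```

With these sample entries, lookup(entries, "sd", "a,raw") yields mode 0640 owned by root:sys, while a minor that matches no entry yields None and would need some other source of attributes.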

Further, devfs gives us the opportunity to hard-wire certain aspects of system behavior so they can no longer be modified to operate incorrectly. For example, the permissions of traditional UNIX nodes such as /dev/null are now fixed, so it's impossible to remove write permission from this device. This is probably more of a performance optimization than a security feature, but it still seems nice to simplify the operation of certain commonly used and essential features in this fashion.

The actual ordering of ownership and permission attributes enforced within devfs is:

So, what's cool about this?

Solaris no longer relies so heavily on the device nodes in the root filesystem underlying /devices. The system is more dynamic, scales a little better and requires fewer disk accesses to operate. In fact, Solaris can boot with an entirely empty base /devices hierarchy and still properly enforce the necessary permissions for all device nodes. This was unimaginable only a few years ago. As delivered, /devices is not yet empty, but that's just a matter of some more bug fixes here and there.

A minor_perm specification can now be applied dynamically to the system. Previously, permissions derived from minor_perm were statically applied once. Now, an update to minor_perm performed via update_drv(1M) is applied dynamically to the system, and will be honored by all subsequent device accesses. Granted, most device permissions, with the exception of features such as logindevperm, are generally static, so this is for the most part a convenience to the occasional driver developer.

The administrator has explicit control. Device permissions can be modified for all minors of a driver via update_drv(1M), or the permissions of a selected device can be changed via standard system tools. By the way, changing the permission of a minor node back to the minor_perm defaults is effected by removing the underlying attribute node, as minor_perm specifications are automatically applied to a minor with no underlying attribute node override.
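The override behavior just described can be modeled in a few lines. This is a toy sketch, not devfs code: it only captures the two rules stated here, that an underlying attribute node overrides minor_perm and that removing it reverts the minor to the current minor_perm values. The class and method names are hypothetical.

```python
class MinorNode:
    """Toy model of one minor node's attributes (hypothetical names throughout)."""

    def __init__(self, minor_perm_attrs):
        # Attributes derived from minor_perm: (mode, owner, group)
        self.minor_perm_attrs = minor_perm_attrs
        # Stands in for the attribute node in the underlying filesystem
        self.override = None

    def set_override(self, mode, owner, group):
        """Model an administrator changing one device with standard tools."""
        self.override = (mode, owner, group)

    def remove_override(self):
        """Model removing the underlying attribute node."""
        self.override = None

    def update_minor_perm(self, mode, owner, group):
        """Model a dynamic minor_perm update; applies to subsequent accesses."""
        self.minor_perm_attrs = (mode, owner, group)

    def effective(self):
        """An explicit override wins; otherwise minor_perm applies."""
        if self.override is not None:
            return self.override
        return self.minor_perm_attrs


node = MinorNode((0o640, "root", "sys"))
node.set_override(0o600, "jerry", "staff")
overridden = node.effective()
node.update_minor_perm(0o644, "root", "sys")
still_overridden = node.effective()
node.remove_override()
reverted = node.effective()
```

In this model the minor_perm update only becomes visible once the override is removed, which is the "back to minor_perm defaults" behavior described above.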

There, I bet that was more interesting than you initially thought it would be. Somehow I can hear my friend saying, yeah, sure, Jerry.


Technorati Tag: OpenSolaris
Technorati Tag: Solaris ( Jun 15 2005, 04:53:25 PM PDT )

Tuesday June 14, 2005

Devfs and the Devid Cache

Before delving into a discussion of the devid cache itself, I'd like to present some background information. One of the most technically interesting (and personally satisfying) projects my group delivered in Solaris 10 was an improved method of device configuration known as devfs.

What is really cool about devfs is that it is a fully dynamic, filesystem-driven method of device enumeration and attach. Being a filesystem, devfs lives entirely within the kernel and provides the usual filesystem services such as lookup and readdir. Mounted on /devices, the devfs filesystem controls the lookup of pathnames below that mount point. This means that a userland operation such as open(2) or stat(2) of a pathname such as /devices/pci@1c,600000 is satisfied by devfs enumerating and attaching the named instance of the pci driver.

This works similarly for arbitrary pathnames. For example, opening the device path:

first attaches an instance of the pci driver, then attaches below that an instance of a SCSI adapter driver, then finally an instance of the disk driver sd for the disk with target address 1, lun 0. The open would complete with an active file reference to the attached device instance. Note that devfs orders all operations so that a device instance is fully attached before an open of that instance can be initiated, and that each parent device instance is attached before beginning the configuration of a child of that parent.
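The parent-before-child ordering can be sketched as a walk over the path components. This is a simplified model rather than devfs itself: attaching is reduced to adding a path prefix to a set, and the example path used below, relative to /devices, is hypothetical.

```python
def open_device(path, attached):
    """Attach each path component parent-first, then 'open' the last one.

    path     -- device path relative to /devices (hypothetical example paths)
    attached -- set of already-attached prefixes, updated in place
    Returns the list of prefixes newly attached by this open, in order.
    """
    newly_attached = []
    prefix = ""
    for component in path.strip("/").split("/"):
        prefix = prefix + "/" + component
        if prefix not in attached:
            # A child is only configured after its parent is attached.
            attached.add(prefix)
            newly_attached.append(prefix)
    return newly_attached


attached = set()
first = open_device("/pci@1c,600000/scsi@2/sd@1,0", attached)
second = open_device("/pci@1c,600000/scsi@2/sd@1,0", attached)
```

The first open attaches the three instances from the top down; the second finds everything already attached and has nothing left to configure.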

What this means is that only the specific devices needed by the higher order operation need to be enumerated and attached. This is true even at boot time; to boot successfully, only the boot device and the devices composing the bus interconnect to that device need to be attached. Past releases of Solaris would require that all disk devices be attached when mounting the root disk during boot, for example. Devfs is also fully dynamic: a device plugged into a hot-plug capable bus requires no special configuration to bring the device fully attached into a running Solaris. Within /devices, the special operation known as a reconfiguration boot is no longer necessary.

So devfs represents a not inconsiderable advance of the state of the device configuration art.

(Note: Solaris still does require a reconfiguration boot, but only to generate the /dev links to the physical devices in /devices. Yes, we have plans to improve this.)

Which, finally, brings me to the present topic: the devid cache.

Some filesystems and volume managers, such as SVM, use the Solaris device id, or devid, facility to access the volume's underlying components. The benefit of open-by-devid over open-by-path is that the system automatically finds the right device even if the device has been moved or the interconnect to the device has been rerouted. (Refer to ldi_open_by_dev(9F) for more information on what is meant by open-by-devid, and libdevid(3LIB) for information on devids in general.)

The problem with devids has been the way the system implements the device discovery underlying open-by-devid. The system has typically required that all devices be attached in order to discover the one device addressed by a specific devid. This meant that any use of open-by-devid for the most part negated the advantages provided by devfs.

To ameliorate this, we introduced a devid cache, which is little more than a straightforward mapping of devid to device path. When a disk providing a devid attaches, the driver registers that device's devid with ddi_devid_register(9F). The system can then establish the mapping of devid to path. This mapping is persisted in /etc/devices/devid-cache for now. With a devid registered, an open-by-devid can be reduced to a simple open-by-path of the associated device. Of course, for the open to succeed, the device must again register the same devid; the system ensures that is the case.

If the desired device has been moved, the device now at that physical path would not register a matching devid. Or the device may no longer be functional. When open-by-path isn't successful, open-by-devid falls back to full device discovery. It's relatively rare that devices get shuffled around or fail, however, and the full discovery does provide the desired semantics of open-by-devid: a reference to the correct device, if present and functional, will be returned regardless of the maze of interconnect to that device.
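The fast path and the fallback can be sketched as follows. This is a toy model under stated assumptions, not the Solaris implementation: the cache is a plain dict from devid to a list of paths, devices maps each physical path to the devid the device there would register, and full discovery is modeled by rebuilding the cache from every device. All names and paths are hypothetical.

```python
def register(cache, devid, path):
    """Record the devid-to-path mapping, as happens when a device attaches."""
    paths = cache.setdefault(devid, [])
    if path not in paths:
        paths.append(path)


def open_by_devid(devid, cache, devices):
    """Return (path, how), where how is 'cache' or 'discovery'.

    cache   -- dict: devid -> list of cached paths (updated in place)
    devices -- dict: path -> devid registered by the device at that path
    """
    # Fast path: open by cached path, then check the device there
    # really does register the devid we asked for.
    for path in cache.get(devid, []):
        if devices.get(path) == devid:
            return path, "cache"

    # Fallback: full discovery. Every device attaches and registers,
    # which in this sketch rebuilds the cache from scratch.
    cache.clear()
    for path, registered in devices.items():
        register(cache, registered, path)

    paths = cache.get(devid, [])
    return (paths[0] if paths else None), "discovery"


devices = {"/pci@0/disk@1": "id-A", "/pci@0/disk@2": "id-B"}
cache = {"id-A": ["/pci@0/disk@2"]}  # stale entry: the disk was moved
moved = open_by_devid("id-A", cache, devices)
cached = open_by_devid("id-A", cache, devices)
```

The first open misses because the cached path now holds a different disk, so it pays for a full discovery; the second is satisfied straight from the refreshed cache.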

Open-by-devid reducing to open-by-path is particularly desirable in SAN environments where full device discovery, meaning enumeration and attach of all extant devices on the SAN, would be prohibitive.

Other tidbits of information about the devid cache may be of interest. Since the cache is built up as devices attach, no configuration or administration of the cache is necessary. It is automatically updated to reflect the state of the machine as devices attach. Of course, on a machine on which disks are not moved around or replaced, for example your typical small server or desktop, no cache updates will be necessary. Multiple paths per devid are supported, with open-by-devid returning a validated attached reference for each active path to a multipathed device. The cache repository is checksummed for consistency; if missing or corrupted, the cache will be rebuilt on demand.
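As an illustration of the checksum idea only, the sketch below packs a devid-to-paths mapping with a digest and refuses to load it if the digest does not match. The real on-disk format of the cache is not described here; the JSON encoding, the SHA-256 digest and the function names are all assumptions of this sketch.

```python
import hashlib
import json


def pack(cache):
    """Serialize the cache as '<hex digest>\\n<body>' (format is an assumption)."""
    body = json.dumps(cache, sort_keys=True).encode()
    digest = hashlib.sha256(body).hexdigest().encode()
    return digest + b"\n" + body


def unpack(blob):
    """Return the cache dict, or None if the blob is missing or corrupted.

    A None result signals the caller to rebuild the cache on demand.
    """
    if not blob:
        return None
    digest, sep, body = blob.partition(b"\n")
    if not sep or hashlib.sha256(body).hexdigest().encode() != digest:
        return None
    return json.loads(body)


original = {"id-A": ["/pci@0/disk@1"], "id-B": ["/pci@0/disk@2"]}
blob = pack(original)
corrupted = blob[:-2] + b"X" + blob[-1:]  # flip one byte of the body
```

A caller would treat a None from unpack the same way as a missing file: start with an empty cache and let device attaches repopulate it.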


Technorati Tag: OpenSolaris
Technorati Tag: Solaris ( Jun 14 2005, 12:08:11 PM PDT )
