Friday September 07, 2007
pkg(5): a no scripting zone
In my previous two posts, we examined two packaging system options—installer-specific knowledge and integrated build system—that I believe present costs that exceed their benefits. Here, we will again examine a design choice from a negative perspective: package-associated scripting.
System V packaging is rich with scripting hooks: scripts named
checkinstall, preinstall, postinstall, preremove, request, and
the class action scripts. Each of these scripts can do anything it
likes. A script, even in a relatively primitive shell, is an open-ended
program, opaque to the invoking framework. It's difficult to catch
an incorrect script prior to package publication time, which blocks our
intent to prevent propagation of bad package versions. With a more
limited set of actions—potentially with that limit enforced or
marked—a class of incompletely known resource handling mechanisms
can be kept off the most conservative systems.
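To make the contrast concrete, here is a minimal sketch of how a closed set of actions could be checked before publication, which is exactly what an opaque script does not allow. The action kinds, attribute names, and the `validate` function are hypothetical illustrations, not the vocabulary or API of the pkg(5) prototype.

```python
from dataclasses import dataclass

# Hypothetical closed set of action kinds and their required attributes.
# These names are assumptions for illustration, not pkg(5)'s real actions.
REQUIRED_ATTRS = {
    "file": {"path", "mode", "owner", "group"},
    "dir": {"path", "mode", "owner", "group"},
    "link": {"path", "target"},
}


@dataclass
class Action:
    kind: str
    attrs: dict


def validate(manifest):
    """Return a list of problems; an empty list means the manifest passes."""
    problems = []
    for i, action in enumerate(manifest):
        required = REQUIRED_ATTRS.get(action.kind)
        if required is None:
            # Anything outside the closed set is rejected outright.
            problems.append(f"action {i}: unknown kind {action.kind!r}")
            continue
        missing = required - set(action.attrs)
        if missing:
            problems.append(f"action {i}: missing {sorted(missing)}")
        path = action.attrs.get("path", "")
        # Paths must stay relative to the image they are installed into.
        if path.startswith("/") or ".." in path.split("/"):
            problems.append(f"action {i}: path not image-relative: {path!r}")
    return problems


if __name__ == "__main__":
    manifest = [
        Action("dir", {"path": "etc", "mode": "0755",
                       "owner": "root", "group": "sys"}),
        Action("script", {"body": "exit 0"}),
    ]
    for problem in validate(manifest):
        print(problem)
```

Because every action is data rather than code, a publication-time check like this can refuse a bad package version before it propagates; nothing comparable is possible for an arbitrary postinstall script short of running it.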
One goal we have is to preserve or improve the hands-off behaviour
associated with package operations. Legacy packaging allows hands-off
operation only by imposing a series of tasks on the deploying administrator.
The pkgask(1M) tool can enable the deployer to develop a response to
the request script; coming up with an appropriate admin(4) can
restrict the framework's built-in interactive queries. (Interaction
with signed packages also requires the deployer to modify their pkgadd
invocation.) Removing the scripting degree of freedom means that
obstacles to hands-off behaviour come solely from an interactive
installer or from interactive services acting during system startup.
There are some amusingly egregious violations of the hands-off principle across the space of known packages. Less fun is that these set a bad example for later package developers.
A particularly error-prone aspect of the scripting interface in packaging comes from the variety of contexts the package developer must understand (and test within). It is legitimate to install packages on live systems, in alternate filesystem hierarchies of the same or different architecture, and in whole-root and filesystem inheriting zones; in fact, you have multiple choices about how your package should install in a zone.
We can expect the proliferation of virtualized systems, via the various mechanisms like LDOMs and xVM, to keep all of these contexts relevant as degrees of sharing make virtualization even more appealing. Making sure that the package system operates safely in these shared contexts is critical—another of our goals.
Returning to the zones case, the example pseudo-script in
pkginfo(4)—a series of nested shell if ; then blocks to
navigate some of these contexts—is helpful, but misleading. There
is much more variable state a package developer needs to consider to
reach correctness. In fact, if you aren't required to rediscover or
reinvent a set of resource-handling cases for each component your
package delivers, it becomes substantially simpler to make the package
and return to improving the software it contains. Reducing the set of
steps reduces developer burdens associated with packaging.
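One way to picture the reduced burden: if the framework, rather than the package, owns the notion of "where the image lives", the same action applies unchanged to a live root, an alternate root, or a zone root. The sketch below assumes that model; `apply_dir`, `apply_file`, and the image-root parameter are hypothetical names, and temporary directories stand in for the different install contexts.

```python
import os
import tempfile


def _resolve(image_root, rel_path):
    """Map an image-relative path to a real path, refusing to leave the image."""
    root = os.path.abspath(image_root)
    target = os.path.normpath(os.path.join(root, rel_path))
    if os.path.commonpath([root, target]) != root:
        raise ValueError(f"{rel_path!r} escapes image root {root!r}")
    return target


def apply_dir(image_root, rel_path, mode=0o755):
    """Create a directory inside the image."""
    target = _resolve(image_root, rel_path)
    os.makedirs(target, mode=mode, exist_ok=True)
    return target


def apply_file(image_root, rel_path, data, mode=0o644):
    """Deliver file contents inside the image, creating parent directories."""
    target = _resolve(image_root, rel_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as f:
        f.write(data)
    os.chmod(target, mode)
    return target


if __name__ == "__main__":
    # Two temporary directories stand in for, say, a live root and a zone root.
    with tempfile.TemporaryDirectory() as live, \
            tempfile.TemporaryDirectory() as zone:
        for root in (live, zone):
            print(apply_file(root, "etc/motd", b"hello\n"))
```

The package author writes "deliver etc/motd" once; whether the destination is live, alternate, or shared is a question the framework answers, not a nest of shell conditionals in each package.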
Two particular resources stand out: device drivers and smf(5)
services. Although some limited amount of awareness—or at least
easily duplicated code—makes these resources somewhat well-behaved
during package operations, there are still problems that scripting
presents: the addition of new contexts, the provision of multiple
genealogies of copied code, and the failure to discover an associated
best practice for any particular kind of resource.
There are other resources, of course; as a start, you could duplicate
our survey of the ON postinstall and class action scripts.
I believe the key counterargument supporting scripting is that the set of configuration patterns on Unix-like systems is large, and that the easiest means of upgrading each of these potential patterns is to allow a complete programming environment to the package developer. Probably true, but if we look at service and application configuration with respect to when a correct configuration state is required, the update step appears to separate into three classes:
Correct at system startup, no runtime context needed. These are the configuration settings that the various low-level boot components, the kernel, and the drivers need to bring the system to its running state. This class of configuration is generally limited to a specific set of resources, potentially established by a packaging system via corresponding resource-handling actions—or by an installer.
Correct at system startup, requiring runtime context. These are settings where the manipulating agent might be influenced by policy or require some form of interprocess communication to effect configuration changes.
smf(5) is an example of the latter, and handles its configuration evolution via the manifest-import service. Manipulation of the various local name service tables, like passwd or the RBAC configuration, is another example, since data about potential principals must be correct for a group of affected services. Since such configuration can be required on the system as a result of package operations, these resources must also be handled via packaging, or require the use of an appropriate installer.
Correct prior to service startup. Most service and application configuration falls into this class. It's not necessary, for instance, to bring a web server's configuration up to date if the service has no enabled instances. There seem to be a number of avenues for handling this kind of configuration: leaving it to the service or application, providing assistance via a configuration mechanism, or giving a hook where such updates can be made as needed. But the packaging system needn't provide this hook; there are a number of possible facilities, of varying suitability.
I should point out that David is making the smf(5) configuration
update scenarios much more capable and precise with the Enhanced
Profiles project. So, at least, a "configuration mechanism
with assistance" is likely to be present soon.
Since the first and second classes, and how their configuration manipulations vary across the operating contexts, are generally known, removing the third class from the packaging system's responsibilities makes precise, no-scripting packages a viable design choice.
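As an illustration of the third class, "correct prior to service startup", here is a minimal sketch in which the service itself brings its configuration up to date when it is started, so the package only has to deliver files. The JSON layout, the version field, and the `migrate` and `start_service` names are all assumptions made for this example; they do not describe smf(5) or any real service.

```python
import json
import os
import tempfile

CURRENT_VERSION = 2  # hypothetical schema version the new software expects


def migrate(config):
    """Upgrade a config dict to CURRENT_VERSION, one schema step at a time."""
    version = config.get("version", 1)
    if version < 2:
        # Hypothetical v1 -> v2 change: single 'port' becomes a 'listen' list.
        config["listen"] = [config.pop("port", 80)]
        version = 2
    config["version"] = version
    return config


def start_service(config_path):
    """Load the config, migrating it first if it predates the software."""
    with open(config_path) as f:
        config = json.load(f)
    if config.get("version", 1) < CURRENT_VERSION:
        config = migrate(config)
        # Write to a sibling file and rename, so a crash never leaves
        # a half-written configuration behind.
        tmp_path = config_path + ".new"
        with open(tmp_path, "w") as f:
            json.dump(config, f)
        os.replace(tmp_path, config_path)
    return config


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "httpd.json")
        with open(path, "w") as f:
            json.dump({"port": 8080}, f)
        print(start_service(path))
```

The migration runs on a booted system, with the service's own knowledge of its formats, and only when an instance is actually started; a disabled service costs nothing at package install time.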
That's a long series of arguments in favour of a scripting-free package
system. It would be reasonable to ask: "can you actually do it?" So,
as a check on our prototype, we used the branded zone capability to let
us create a pkg(5)-based whole root zone. Here's a transcript:
    # zonecfg -z pkg_test
    pkg_test: No such zone configured
    Use 'create' to begin configuring a new zone.
    zonecfg:pkg_test> create -t SUNWipkg
    zonecfg:pkg_test> set zonepath=/export/pkg_test
    zonecfg:pkg_test> commit
    zonecfg:pkg_test> ^D
    # zoneadm -z pkg_test install
    Preparing image
    Retrieving catalog
    Installing SUNWcs SUNWesu SUNWadmr SUNWts SUNWipkg
    Setting up SMF profile links
    Copying SMF seed repository
    Done (115s)
There's dependency following, but no constraint handling; there's no filtering or snapshotting, but also none of the obvious performance optimizations has been implemented (for our 211MB resultant image). But the main point is: it works—installs, boots, upgrades, and still boots—with no scripting. Time for a project proposal.
[ T: OpenSolaris Solaris zone Indiana pkg ]
(2007-09-07 16:28:45.0)
Comments [10]
if i understand correctly, the problem with the current (scripting) approach is that there are simply too many situations in which the package should work but which the packager might not have considered at the time he created the package. what you seem to be proposing is that initial/update configuration of the package should be left outside the packaging framework and instead be handled by smf.
am i on track?
Posted by nacho on September 07, 2007 at 11:34 PM PDT #
Progress is wonderful!
So which part of that transcript is the UPS prototype :o)
Posted by MC on September 08, 2007 at 01:41 AM PDT #
@nacho: Yes, although smf(5) isn't the only mechanism we could utilize.
@MC: The "Preparing... Retrieving... Installing..." are all pkg(1) operations. The SMF lines are fix-ups in the zone install, until we construct the "preserve" action and "opensolaris.zone" facet handling. Then this installation process should be purely pkg(1)-based (which won't be true for all installers, of course).
-- Stephen
Posted by Stephen on September 09, 2007 at 10:08 AM PDT #
you've made an interesting assumption there. you basically just said that the only reason those scripts exist is to configure the services, but is that really so?
also getting rid of the said scripts and moving that to smf (or another mechanism you havent explained yet) doesnt solve the problem, it just moves it to another layer and since smf scripts are just scripts you still have that opacity.
am i missing something?
Posted by nacho on September 09, 2007 at 10:32 AM PDT #
Removing the ability to run scripts during PKG installation will break many customer-made scripts which depend heavily on this feature. I know of some big companies whose software distribution concept would be called into question if the scripting interface in pkgadd were removed.
Posted by dominik on September 10, 2007 at 02:36 AM PDT #
@nacho: The scripts generally are either inspecting or modifying the destination image to bring its aspects into line with the package's expectations. Such modifications are generally to components that represent configuration in some fashion. By moving scripting out of packaging, we limit the contexts the script developer must consider: for instance, scripting in smf(5) means that there is a running system atop the image.
@dominik: Yes. The present plan is to leave System V packaging in place for compatibility, but not to use it for any of the packages delivered as part of the distribution. "New" system packages will provide, up to a point, aliases for the legacy packages they replace. The benefits are believed to outweigh the costs imposed on consumers of the current packaging APIs.
-- Stephen
Posted by Stephen on September 10, 2007 at 09:31 AM PDT #
do you know when we will have some more documentation about this? i'd like to know how you solved a ton of stuff: sysv packaging compatibility, dependency checking. i guess i'm just a curious idiot :P
Posted by nacho on September 10, 2007 at 09:21 PM PDT #
@nacho: I'm writing the project proposal for the Installation CG to approve now. I don't claim we've solved every issue by any means--we're just coming out of a prototyping phase. -- Stephen
Posted by Stephen on September 11, 2007 at 11:22 AM PDT #
First of all, excellent work, the prototype looks impressive!
So one question is whether or not there continues to be a concept of 'clusters' and what kind of public interface or knowledge of clusters pkg(1) has. The reason I ask this question is that in Solaris there's a limited set of clusters, but there are clearly reasons people may want to have variations on clusters of higher-level packages. Each one of those may have different configuration defaults.
You probably know one project I'm thinking of that could use something like this (Web Stack), but that's not an exclusive use case. With Java Enterprise System, say, there are use cases where you'll configure a portal server against a webserver, or a portal server against an application server. Right now, that whole area has something even more ugly than pkgask and friends (IMHO), but it sure would be nice to be able to provide OS level support for clusters of packages, configuration and all, to things outside Solaris itself.
Posted by Matt Ingenthron on September 14, 2007 at 03:53 PM PDT #
@Matt: I have a dream: Install the packages for Apache, Database, Scripting-Language and the Application (e.g. Typo3). The dumb packages do almost no configuration (minimalistic scripting). The pieces are glued together by configuration recipes which can be tweaked by site-local profiles or an interactive dialog with the administrator.
The big advantage would be that simple, standardized base packages are the foundation for multiple use-cases, while the more complex configuration tasks are covered by one configuration recipe for the whole stack.
Posted by Thomas Wagner on September 18, 2007 at 07:40 AM PDT #