Predictable
Stephen Hahn's blog at Sun Microsystems

20040929 Wednesday September 29, 2004

Puzzle #2: cal(1)

(I've been fixing little smf(5) bugs, as well as revising our documentation, presentations and--most importantly--more block diagrams for this blog. But I bumped into an annoyance and thought I should share.)

As a young old-school Unix developer, I tend to live in terminal windows. One of my favourite commands is cal(1), which has a great default mode:

$ cal
   September 2004
 S  M Tu  W Th  F  S
          1  2  3  4
 5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30

But if you want to see an October calendar, you might get confused:

$ cal 10



                                10

         Jan                    Feb                    Mar
 S  M Tu  W Th  F  S    S  M Tu  W Th  F  S    S  M Tu  W Th  F  S
          1  2  3  4                      1                      1
 5  6  7  8  9 10 11    2  3  4  5  6  7  8    2  3  4  5  6  7  8
12 13 14 15 16 17 18    9 10 11 12 13 14 15    9 10 11 12 13 14 15
19 20 21 22 23 24 25   16 17 18 19 20 21 22   16 17 18 19 20 21 22
26 27 28 29 30 31      23 24 25 26 27 28      23 24 25 26 27 28 29
                                              30 31
         Apr                    May                    Jun
 S  M Tu  W Th  F  S    S  M Tu  W Th  F  S    S  M Tu  W Th  F  S
       1  2  3  4  5                1  2  3    1  2  3  4  5  6  7
 6  7  8  9 10 11 12    4  5  6  7  8  9 10    8  9 10 11 12 13 14
13 14 15 16 17 18 19   11 12 13 14 15 16 17   15 16 17 18 19 20 21
20 21 22 23 24 25 26   18 19 20 21 22 23 24   22 23 24 25 26 27 28
27 28 29 30            25 26 27 28 29 30 31   29 30

         Jul                    Aug                    Sep
 S  M Tu  W Th  F  S    S  M Tu  W Th  F  S    S  M Tu  W Th  F  S
       1  2  3  4  5                   1  2       1  2  3  4  5  6
 6  7  8  9 10 11 12    3  4  5  6  7  8  9    7  8  9 10 11 12 13
13 14 15 16 17 18 19   10 11 12 13 14 15 16   14 15 16 17 18 19 20
20 21 22 23 24 25 26   17 18 19 20 21 22 23   21 22 23 24 25 26 27
27 28 29 30 31         24 25 26 27 28 29 30   28 29 30
                       31
         Oct                    Nov                    Dec
 S  M Tu  W Th  F  S    S  M Tu  W Th  F  S    S  M Tu  W Th  F  S
          1  2  3  4                      1       1  2  3  4  5  6
 5  6  7  8  9 10 11    2  3  4  5  6  7  8    7  8  9 10 11 12 13
12 13 14 15 16 17 18    9 10 11 12 13 14 15   14 15 16 17 18 19 20
19 20 21 22 23 24 25   16 17 18 19 20 21 22   21 22 23 24 25 26 27
26 27 28 29 30 31      23 24 25 26 27 28 29   28 29 30 31
                       30

It's an interesting UI choice to assume that anyone would want the calendar for the year 10 C.E. Certainly I never do, and I'm pretty sure someone would have told me if UNIX systems were the professional historian's first choice for computing...

So today's puzzle is to make cal(1) more usable. If I enter cal month_num, give me that month of the current year; if I enter "cal now", give me the 3-month window around the current month, like so:

$ cal now
   August 2004             September 2004          October 2004
 S  M Tu  W Th  F  S     S  M Tu  W Th  F  S     S  M Tu  W Th  F  S
 1  2  3  4  5  6  7              1  2  3  4                    1  2
 8  9 10 11 12 13 14     5  6  7  8  9 10 11     3  4  5  6  7  8  9
15 16 17 18 19 20 21    12 13 14 15 16 17 18    10 11 12 13 14 15 16
22 23 24 25 26 27 28    19 20 21 22 23 24 25    17 18 19 20 21 22 23
29 30 31                26 27 28 29 30          24 25 26 27 28 29 30
                                                31

Other than that, all other standard invocations of cal(1) should work as usual.

My example solution is a couple-dozen-line ksh(1) shell function, and I'll post it along with the best submissions. (Perl folks: no non-core modules, please.)
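
For readers who want a starting point, here is one possible shape for such a wrapper. It is a minimal sketch in POSIX shell, not the ksh(1) solution mentioned above: the names mcal and month_window are made up for illustration, and it prints the three months of "cal now" one after another rather than side by side.

```shell
# Sketch only: mcal and month_window are hypothetical names.

# month_window MONTH YEAR: print "month year" for the previous, given, and
# next month, one pair per line, rolling over the year at either end.
month_window() (
    m=$1 y=$2
    pm=$((m - 1)) py=$y
    nm=$((m + 1)) ny=$y
    if [ "$pm" -eq 0 ]; then pm=12 py=$((y - 1)); fi
    if [ "$nm" -eq 13 ]; then nm=1 ny=$((y + 1)); fi
    printf '%s %s\n' "$pm" "$py" "$m" "$y" "$nm" "$ny"
)

# mcal: cal(1) with two extra single-argument forms.
mcal() {
    if [ $# -eq 1 ]; then
        case $1 in
        now)
            # date prints e.g. "09 2004"; strip the leading zero so the
            # shell does not treat the month as an octal number.
            set -- $(date '+%m %Y')
            month_window "${1#0}" "$2" |
                while read -r m y; do command cal "$m" "$y"; done
            return ;;
        [1-9] | 1[0-2])
            # A lone 1-12 is read as a month of the current year.
            command cal "$1" "$(date +%Y)"
            return ;;
        esac
    fi
    # Everything else goes to cal(1) unchanged.
    command cal "$@"
}
```

With this, mcal 10 shows October of the current year and mcal now shows the window; any other invocation (no arguments, mcal 2004, mcal 9 2004) is passed to cal(1) untouched. The cost is that a lone argument from 1 to 12 is always read as a month, so the calendar for the year 10 C.E. needs cal(1) invoked directly.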

(2004-09-29 14:21:05.0)
20040923 Thursday September 23, 2004

smf(5): authorizations built-in

I mentioned yesterday that you can manipulate services if you have the appropriate authorizations, without needing to possess any privileges. For instance, my current shell has the following privileges and authorizations:

$ ppriv $$
117292: bash
flags = 
        E: basic
        I: basic
        P: basic
        L: all
$ auths
solaris.device.cdrw,solaris.profmgr.read,solaris.jobs.users,solaris.mail.mailq

And if we try to manipulate a service managed by smf(5) with this set of authorizations, we'll get a predictable result.

$ svcadm restart network/smtp
svcadm: network/smtp: Permission denied.

However, smf(5) defines two big authorizations: solaris.smf.manage and solaris.smf.modify.

(There are also lesser built-in authorizations—solaris.smf.modify.method, solaris.smf.modify.dependency, solaris.smf.modify.application, and solaris.smf.modify.framework—which allow the manipulation of properties within property groups of the mentioned type. And you can also customize the authorizations to allow an action at the instance level and to manipulate properties at the property group level.)

To make assigning batches of authorizations scale, the role-based access control facility (RBAC) allows the definition of rights profiles. (The definitions are contained in /etc/security/prof_attr, and the documentation is prof_attr(4).) The service management facility delivers two rights profiles we think are convenient:

$ grep ^Service /etc/security/prof_attr
Service Management:::Manage services:auths=solaris.smf.manage,solaris.smf.modify
Service Operator:::Administer services:auths=solaris.smf.manage,solaris.smf.modify.framework
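
Each prof_attr(4) entry is a colon-separated record whose fifth field holds semicolon-separated key=value attributes, so pulling out the authorizations behind a profile is a short awk(1) exercise. This is only a sketch: profile_auths is a hypothetical helper, and it assumes the description field contains no colons.

```shell
# profile_auths PROFILE [FILE]: print the auths= value for PROFILE from a
# prof_attr(4)-format file (default: /etc/security/prof_attr).
# Hypothetical helper, for illustration only.
profile_auths() {
    awk -F: -v prof="$1" '
        $1 == prof {
            # Field 5 is a ;-separated list of key=value attributes.
            n = split($5, kv, ";")
            for (i = 1; i <= n; i++)
                if (kv[i] ~ /^auths=/)
                    print substr(kv[i], 7)   # drop the "auths=" prefix
        }' "${2:-/etc/security/prof_attr}"
}
```

Given the two entries above, profile_auths "Service Operator" would print solaris.smf.manage,solaris.smf.modify.framework.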

We can then add an entry to the user_attr(4) database to connect a user with the appropriate profile, like:

$ grep sch /etc/user_attr
sch::::profiles=Service Management

(You can edit /etc/user_attr by hand or, if your password database is local, using the -P option to useradd(1M). The equivalent configuration via direct authorizations would read

$ grep sch /etc/user_attr
sch::::auths=solaris.smf.manage,solaris.smf.modify

and will work fine except that if the Service Management profile were to be enhanced subsequently, a user or role with the old explicit authorizations might not have the correct set for future operations.)

So now our authorization list is expanded, but our privileges are unchanged:

$ auths
solaris.smf.manage,solaris.smf.modify,solaris.device.cdrw,solaris.profmgr.read,solaris.jobs.users,solaris.mail.mailq
$ ppriv $$
117292: bash
flags = 
        E: basic
        I: basic
        P: basic
        L: all

and we can carry out our operation from an authorized, but unprivileged user account:

$ svcs network/smtp
STATE          STIME    FMRI
online         Sep_21   svc:/network/smtp:sendmail
$ svcadm restart network/smtp
$ svcs network/smtp
STATE          STIME    FMRI
online         23:58:21 svc:/network/smtp:sendmail
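
If a script wants to check for an authorization before attempting an operation, it can scan the comma-separated list that auths(1) prints. A minimal sketch, where has_auth is a made-up helper that does exact matching only; wildcard grants (names ending in *) are not handled:

```shell
# has_auth AUTH LIST: succeed if AUTH appears in the comma-separated LIST.
# Hypothetical helper; exact string match only.
has_auth() {
    # Wrap both sides in commas so "solaris.smf.modify" does not match
    # "solaris.smf.modify.framework".
    case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
    esac
}
```

A caller might then write has_auth solaris.smf.manage "$(auths)" && svcadm restart network/smtp.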

And, finally, since the user_attr(4) database has network name service backends, you can actually make authorization grants that apply across an administrative domain, whether you're giving out the big authorizations illustrated here or custom authorizations specific to a set of services running at your site.

(2004-09-23 00:04:33.0)
20040922 Wednesday September 22, 2004

smf(5): asking versus doing

Let's consider how applications are traditionally started: we execute (or the system executes) a command, such as fooadm(1M), which in turn calls fork(2), does some detachment work, and then calls exec(2) to run food(1M) (which is what we wanted). A schematic of this sequence would look like

For long-running, always-needed applications (which we call services), this model raises some questions:

(Lest anyone assume I'm pretending to novelty: most restarters (init(1M)) or super-servers (inetd(1M)) have answered the first two of these questions by offering a single, specific application model. But many daemons we run on systems today fit neither of these application models well.)

In smf(5), the service management facility, directly forking a service is discouraged. Instead, one requests that a service be enabled, and the master restarter, svc.startd(1M) or a delegate—like inetd(1M)—will do the fork(2)–exec(2) sequence. The equivalent diagram might be drawn as

Upon receiving an enable request from smf_enable_instance(3SCF) or svcadm enable fmri, svc.startd(1M) determines if the service's dependencies are satisfied and, if so, requests that the responsible restarter start an instance of the service. What the responsible restarter is doing is:

(The combination of the master restarter and the delegates is handling these calculations and operations for every service on the system, propagating their state changes and evaluating the impact of those state changes in turn.)

Moreover, because the smf_enable_instance(3SCF) request is evaluated based on the authorizations of the caller, fooadm(1M) can be run with no significant privileges. Since we split authorizations into action authorizations (non-persistent operations, like "restart this service") and modify authorizations (changing configuration aspects), it becomes straightforward to create an operator role that can tend a service, but not change its configuration or affect any other independent service on the system.

More flexible administrative assignment is one aspect of inserting smf(5) into Solaris, but we'll contrast these two approaches again and reveal exactly what those purple rectangles represent.

* Not every restarter need offer a fork(2)–exec(2) application model, but presently all smf(5) restarters do.
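
To make the contrast concrete, here is a rough shell sketch of the two approaches. Both function names are invented for this example; start_detached only approximates the detachment work a real fooadm(1M) would do (it redirects standard I/O and changes directory, nothing more), and request_enable simply wraps the svcadm(1M) invocation mentioned above.

```shell
# "Doing": fork a child, detach its standard I/O, and exec the daemon in it.
# Prints the child's PID so the caller can keep track of the process.
# Hypothetical helper; real daemons do more detachment work than this.
start_detached() {
    ( cd / && exec "$@" ) </dev/null >/dev/null 2>&1 &
    echo "$!"
}

# "Asking": hand the FMRI to smf(5) and let the responsible restarter
# perform the fork(2)-exec(2) sequence on our behalf.
# Hypothetical helper wrapping svcadm(1M).
request_enable() {
    svcadm enable "$1"
}
```

In the first form the caller owns everything that follows (tracking the PID, noticing failure, restarting); in the second, all of that stays with the restarter.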

(2004-09-22 01:34:38.0)
20040921 Tuesday September 21, 2004

Had no effect, given my coloured lenses

The Register is displaying with a Sun Blue theme today.

(2004-09-21 11:37:41.0)
20040920 Monday September 20, 2004

smf(5): the system knowing more means...

you can choose to know less. For instance, if you need to know what application model your program runs under, then you have to know how to start or restart or stop your application. The common example is that, if you run a process under inetd(1M), then telling inetd(1M) to take notice of your new service (or your newly deactivated service) requires knowing that

# pkill -HUP -x inetd

will cause inetd(1M) to rescan inetd.conf(4), its configuration file.

But for a service started by init(1M) out of inittab(4), you edit that configuration file and then

# init Q

And for a service that is started somewhere in /etc/rc*.d, restarting looks something like

# /etc/init.d/foo restart
Usage: /etc/init.d/foo {start|stop}
# /etc/init.d/foo stop
# /etc/init.d/foo start

and making this service start on boot consists of creating a link to the /etc/init.d file under some sequence-numbered name in the appropriate run level directory. (I'll omit discussion of saf(1M), but that's another distinct method for managing services.)

All of this, of course, could be simpler.

If the service describes what other service is responsible for starting it and stopping it (and restarting it or asking for its configuration to be refreshed), then a single command can relay the appropriate instructions to the responsible restarter. In smf(5), this command is svcadm(1M). Our examples above all reduce to

# svcadm restart application/foo

or to

# svcadm enable application/foo

depending on whether you wanted to restart or merely enable (and start) the service in question.

Of course, since we also know what other services application/foo requires, we can actually enable all required services automatically, by following our service graph. Let's make this more concrete: to enable the SSH daemon on Solaris, all you need to do is:

# svcadm enable -r network/ssh
# svcs -p network/ssh
STATE          STIME    FMRI
online         Sep_14   svc:/network/ssh:default
               Sep_14     100152 sshd

What's svcadm(1M) doing? We can ask it for verbose output:

# svcadm -v enable -r network/ssh
svc:/network/ssh enabled.
svc:/system/cryptosvc enabled.
svc:/system/filesystem/minimal:default enabled.
svc:/system/device/local enabled.
svc:/system/filesystem/usr enabled.
svc:/system/filesystem/root enabled.
svc:/network/loopback enabled.
svc:/system/filesystem/usr:default enabled.

svcadm(1M)'s output represents its traversal of the dependencies for sshd(1M). Taken across the many services included in Solaris, that's a lot of knowledge we've formalized and moved into the system. It becomes the basis for a lot of "meta-service" administration, including automated restart.

If you didn't know these dependencies already, you can interrogate the system (using svcs(1)) and cement your understanding; if you did, then you can answer the second-order questions like, "what is affected by a failure of system/utmp?" much more rapidly than in the past. Or you can instead know less and devote your newly freed neural capacity to understanding your application stack as a whole (or to maintaining encyclopedic knowledge of Simpsons trivia...).
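
For scripted interrogation, the output of svcs(1) is easy to filter. The sketch below assumes the three-column STATE/STIME/FMRI layout shown above; fmris_in_state is a hypothetical helper, not part of smf(5).

```shell
# fmris_in_state STATE: read svcs(1)-style output on standard input and
# print the FMRI of every service whose first column equals STATE.
# Header and indented process lines (from svcs -p) never match a state
# name, so they fall through silently.
fmris_in_state() {
    awk -v want="$1" '$1 == want { print $3 }'
}
```

For example, svcs -D system/utmp | fmris_in_state online would list the online services that depend on system/utmp (the -D option asks svcs(1) for dependents).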

(2004-09-20 18:05:10.0)
20040917 Friday September 17, 2004

smf(5): a view from the moon

One interesting aspect of smf(5) is that we have pulled apart many of the assumed interrelationships between system services, and made them explicit. Doing this makes building availability and failure models much easier, but it also lets us see one projection of Solaris's shambling shape. (There's another interesting technique for dynamic discovery of relationships via DTrace, but I'll let Bryan show the image from those experiments when he's ready.) Everyone wanted to visualize the service graph that results, so Dan Price and David Bustos came up with a way to generate one. Here's the result, generated on my two-way Opteron system earlier today:

Because we'll be tweaking the graph a bit more, I'm only showing this scaled down version, but we can take a bit of a tour just from the gross features:

As you might guess, we've had to write numerous graph-aware diagnosis algorithms to make a large structure like this one navigable. We're looking forward to further enhancing our reporting and visualization tools to make troubleshooting easier still.

(Of course, knowing all the dependencies in Solaris doesn't protect you from the occupational hazard of overdiagnosis, as John and I spent an hour poking at every possible aspect of his system, which ultimately required a new network cable. I figure once a year I still end up following an "all possible software causes" algorithm, ending up in mdb(1) poking around the kernel, rather than checking cable connections or a bad software install.)

(2004-09-17 00:02:00.0)
20040915 Wednesday September 15, 2004

smf(5): sun.com, new blogging, boot

We had our moment on sun.com last week, which was pretty gratifying: my first slides on doing this project are from June 2001, and I'm not certain how long before that—1997? 1998?—we started sketching out aspects of the problem. smf(5), which we developed under the codename "Greenline", is now rolling out pretty well inside Sun, and with the inclusion of the Java Desktop System, makes for a nice fast workstation startup.

After being strapped into their ergonomic chairs for a year or two, people are understandably itchy to start explaining what they've been working on to others: Tobin's first entry suggests that he's setting himself a lofty goal of touring the entire facility in his blog. I know Liane and a couple of others are ready to get the word out as well. But, being a dilettante at heart, I'll just try to show some of the pieces I think are interesting.

The past couple of days I spent refining verbose boot, which lets you see what services the system is starting and in what order (and what went wrong). I'm eliding some of the boring parts, but it looks roughly like this:

SunOS Release 5.10 Version smf-fixes-sparc 64-bit
Copyright 1983-2004 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
DEBUG enabled
misc/forthdebug (467758 bytes) loaded
[ system/filesystem/root:default starting (Root filesystem mount) ]
[ network/pfil:default starting (pfil) ]
[ network/loopback:default starting (Loopback network interface) ]
[ network/physical:default starting (Physical network interfaces) ]
Sep 15 16:26:33/11: system start time was Wed Sep 15 16:26:20 2004
[ system/filesystem/usr:default starting (/usr and / mounted read/write) ]
[ system/device/local:default starting (Standard Solaris device configuration.) ]
[ system/identity:node starting (System identity) ]
[ system/filesystem/minimal:default starting (Local filesystem mounts) ]
[ milestone/devices:default starting (Device configuration milestone.) ]
Hostname: gremlins-b10
[ system/sysevent:default starting (System event notification service.) ]
[ system/identity:domain starting (System identity) ]
NIS domain name is mpklab.sfbay.sun.com
[ system/cryptosvc:default starting (Cryptographic services) ]
[ system/manifest-import:default starting (Service manifest import) ]
[ system/rmtmpfiles:default starting (Remove temporary files) ]
[ system/sysidtool:net starting (sysidtool) ]
[ system/power:default starting (Power Management) ]
[ system/keymap:default starting (Keyboard defaults) ]
[ system/name-service-cache:default starting (Name service cache daemon) ]
....
[ milestone/single-user:default starting (Single-user milestone) ]
....
[ system/console-login:default starting (Console login) ]
[ milestone/multi-user:default starting (Multi-user milestone) ]
[ network/inetd:default starting (inetd) ]

gremlins-b10 console login:

(On a Java Desktop System with GDM2 enabled, it just presents the GDM graphical login shortly after system/utmp is available.) One new aspect of boot is that the service "common" names are actually localizable although, since Solaris supports various splits between the root filesystem, /usr, and other filesystems, there's a bit of hidden acrobatics involved. As another example, the reason we print out the start time is that, on this system, prior to that message, we can't access the timezone data to convert the time-of-day to the local timezone. Fun.
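
Since each service start is reported on a line of the form "[ fmri starting (common name) ]", a captured console log can be reduced to the start order with a one-line filter. A sketch, assuming exactly the format shown above (boot_order is an invented name):

```shell
# boot_order: read a verbose boot log on standard input and print the
# service instances in the order their "starting" lines appear.
# Lines that are not of the form "[ FMRI starting (...) ]" are dropped.
boot_order() {
    sed -n 's/^\[ \([^ ]*\) starting (.*) ]$/\1/p'
}
```

Fed the excerpt above, it would begin with system/filesystem/root:default, network/pfil:default, and network/loopback:default, skipping the kernel banner and hostname lines.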

(2004-09-15 16:45:12.0)
20040903 Friday September 03, 2004

Back from scatteredness

August was hectic. smf(5) required care and feeding, as did our newborn Nathaniel. I took a strange kind of quasi-paternity leave, by remaining home Tuesdays and Thursdays. (Don't do this—you're left with very few long concentration blocks on the workdays.) I also took Ben to Muskoka to celebrate a family history centennial, and we all went to Chicago for a memorial service for Dina's father. Plus we had a yellowjacket nest in the walls of our home. I'll write these all up shortly.

But I'm back at work now, and cooking up some entries on smf(5), resource management, and Solaris in general. (And for all of you who've written me in the meanwhile, if I haven't already replied, then I'm probably composing a reply now.)

(2004-09-03 10:16:16.0)
Stephen Hahn
Sun Microsystems
sch@sun.com
17 Network Circle
MS MPK17-301
Menlo Park CA 94025 USA