A recurring concern about smf(5) is the configuration repository. To some folks, it resembles the Windows registry too much for comfort. Rather than trying to contrast it with the Windows registry or other registries such as GConf, I thought I'd talk about the design choices we made when deciding how to implement the smf(5) repository. Below is the high-level list of design criteria. It may not be complete, but it captures a reasonable amount of what we were thinking when designing the repository.

  1. Transactional.

    All of smf(5) is designed to be completely restartable from the ground up. Do you doubt? First, try killing all user processes (kill -9 -1) on a non-critical Solaris 9 system -- one that nobody's using, please:

          wands console login: root
          Password:
          Feb  7 13:26:30 wands login: ROOT LOGIN /dev/console
          Last login: Tue Feb  1 14:44:40 on console
          Sun Microsystems Inc.   SunOS 5.9       Generic January 2003
          # ptree
          59    /usr/lib/sysevent/syseventd
          73    /usr/lib/picl/picld
          130   /usr/sbin/in.routed
          149   /usr/sbin/rpcbind
          152   /usr/sbin/keyserv
          162   /usr/lib/netsvc/yp/ypbind -broadcast
          178   /usr/sbin/inetd -s
          199   /usr/lib/nfs/lockd
          201   /usr/lib/nfs/statd
          202   /usr/lib/autofs/automountd
          214   /usr/sbin/syslogd
          222   /usr/sbin/cron
          227   /usr/sbin/nscd
          240   /usr/lib/power/powerd
          251   /usr/lib/utmpd
          263   /usr/sadm/lib/smc/bin/smcboot
            270   /usr/sadm/lib/smc/bin/smcboot
            271   /usr/sadm/lib/smc/bin/smcboot
          268   /usr/lib/im/htt -port 9010 -syslog -message_locale C
            275   htt_server -port 9010 -syslog -message_locale C
          285   /usr/lib/sendmail -bd -q15m
          286   /usr/lib/sendmail -Ac -q15m
          311   /usr/dt/bin/dtlogin -daemon
          312   /usr/lib/snmp/snmpdx -y -c /etc/snmp/conf
            315   mibiisa -r -p 15488
          323   /usr/lib/dmi/dmispd
          324   /usr/lib/dmi/snmpXdmid -s wands
          329   /usr/sbin/vold
          336   /usr/lib/saf/sac -t 300
            339   /usr/lib/saf/ttymon
          337   -sh
            1234  ptree
          340   /usr/lib/ssh/sshd
          # kill -9 -1
    
          wands console login: root
          Password:
          Last login: Mon Feb  7 13:26:30 on console
          Sun Microsystems Inc.   SunOS 5.9       Generic January 2003
          # ptree
          1235  /usr/lib/saf/sac -t 300
            1238  /usr/lib/saf/ttymon
          1236  -sh
            1242  ptree
          # 
          

    You have little chance of recovering your system without a reboot. Note that even init(1M) has disappeared. Even restarting it manually won't do the trick, as it doesn't maintain its process table in a persistent place in Solaris 9.

    Now try the same kill -9 -1 on a non-critical Solaris 10 system. Again, all user processes are killed, including init, svc.startd, svc.configd, inetd, and everything else. You'll be logged out, but log back in and poke around. You'll see that nearly the entire system comes back (services started by their legacy init.d scripts won't, though). Based on doing this experiment now, I'll be filing bugs against a few services, but the experiment is generally successful. All of the core daemons I mentioned dying have returned (we kill them individually as part of our standard testing), and we give restarting all services a good college try.

    In order to implement restart from the kernel up completely, we needed a transactional backing store for all of our service information, including things like service state. If any of our daemons die halfway through an operation, they need to pick up where they left off when they return. Thus, the repository must be transactional to allow us to implement recoverability.

  2. Typed.

    We want to be able to validate that configuration information is at least of the appropriate form. In the future, we expect to be able to do even further validation than just on the type.

  3. Single point of access.

    We wanted all configuration and runtime data access to go through a single API that can be maintained across release boundaries. Flat file administration tools usually allow access through multiple mechanisms -- e.g. editing the file directly or using an admin GUI which edits the file. This type of access reduces the ability to write event-based APIs -- e.g. "tell me if this service has changed configuration or state". While we don't have many of those APIs yet, they're coming. If we allowed vi as a tool to manipulate the repository, there would be no precise way to provide the notification API. A single API also decreases the time to write layered administrative tools.

  4. Access control.

    Allow a subset of configuration changes to be safely delegated to non-root users, without requiring that all configuration changes be allowed by those users. We, however, didn't include provisions for configuration data to be hidden from unprivileged users or applications. While modification is protected, reading is not.

  5. Layerable.

    Our configuration store must be designed to support a mix of configuration (with overrides) at the network level (shared among many machines) as well as at the local machine. This isn't here in Solaris 10, but we've designed to allow configuration to span multiple machines. It was important our initial implementation didn't impede that goal. It's easy to imagine that our underlying storage format for local data might not be the same as that for network data.

  6. Service/instance model specific.

    This is more of a non-goal, but we didn't want to design for a general data storage model. We wanted to constrain ourselves to the general service/instance schema that we've designed for smf(5). That isn't to say the schema couldn't be abused for other data, but we didn't design to make it easy. I realize that perhaps nobody's done the service/instance split blog entry yet. I mentioned it in passing in the service developer intro, but will try to write a dedicated entry later.

  7. Rollback.

    Allow administrators to easily revert to previous configuration versions. This is sometimes solved manually with a revision control system (e.g. SCCS) and flat text files.

  8. Checkable consistency.

    We should be able to confirm on startup that at least the format of the system's configuration data looks sane. Obvious filesystem corruption should be flagged explicitly rather than parsed as lack of or incorrect configuration.

  9. Fast.

    It's pretty tricky to implement a structured, typed, and transactional common store as flat files that's still quick enough for the state-change updates we need to do. Binary-format files are usually the way to go. Some other projects doing parallel startup use only a binary cache of plain text files, but that doesn't handle the other design criteria we had. I'm sure there will be comments pointing me to specific projects that have solved this problem using plain text files, but the ability to leverage other code can decrease development time.

  10. Endian-neutral export.

    Allow export of all configuration data in an endian-neutral format, so that configuration can easily be moved from machine to machine, regardless of architecture. An easy way to marshal the data out of a machine-specific format and into a standard format (e.g. XML) was considered sufficient.

  11. Embeddable.

    Any open-source solution used must be usable in a commercial product without licensing/royalty issues. Obviously, writing something ourselves easily gets around this constraint, but that would be a pretty significant additional investment over the implementation we already needed to do for smf(5).
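
To make criteria 1 and 2 a little more concrete, here is a minimal Python sketch of a typed store whose updates are all-or-nothing. It is an illustration only, not how svc.configd is implemented: the table layout, the function names, and the simplified validators are all assumptions made up for this example, and the type names merely follow the style of smf(5) property types.

```python
import sqlite3

# Simplified, assumed validators; smf(5) defines its own property types and rules.
VALIDATORS = {
    "boolean": lambda v: v in ("true", "false"),
    "count": lambda v: v.isascii() and v.isdigit(),
    "astring": lambda v: True,
}


def open_store(path=":memory:"):
    """Create a toy property table (an invented schema, not the real repository's)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS props ("
        "name TEXT PRIMARY KEY, type TEXT NOT NULL, value TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def apply_changes(conn, changes):
    """Apply every (name, type, value) change atomically, or none of them."""
    with conn:  # commits on success, rolls back if an exception escapes
        for name, type_, value in changes:
            check = VALIDATORS.get(type_)
            if check is None or not check(value):
                raise ValueError(f"{name}: {value!r} is not a valid {type_}")
            conn.execute(
                "INSERT OR REPLACE INTO props (name, type, value) VALUES (?, ?, ?)",
                (name, type_, value),
            )


def read_all(conn):
    """Return the current properties as a {name: value} dict."""
    return dict(conn.execute("SELECT name, value FROM props ORDER BY name"))
```

If a batch contains one invalid value, the rows already written earlier in that batch are rolled back with it, so a reader never sees a half-applied change. That is the property a daemon needs in order to die halfway through an operation and pick up cleanly when it returns.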

Based on this set of design considerations, we decided on a four-part scheme.

  1. Service Manifests: XML service descriptions provide a transportable way to deliver individual services. No knowledge of the underlying data format nor the full service creation API is necessary for simple service delivery.

  2. libscf: A library provides the fundamental API which all tools can build on. In addition to providing transactional create/change/delete semantics, it allows us to write tools which dump the repository in a standard format. Try svccfg archive (redirect the output to a file) to dump the existing configuration of all services and instances in our standard XML manifest format. While it doesn't contain things like snapshot information, it does provide all the information that's necessary to restore a system to its current configuration.

  3. svc.configd: A daemon to manage the data store, providing a single point of access to underlying data for security, layering, etc.

  4. Repository/SQLite: A back-end transactional database to provide file-level storage for smf(5) configuration data. To be precise, we've got two backing databases. One is for persistent property groups which, well, persist across system restart. It's located in /etc/svc/repository.db. The other is for the non-persistent properties, such as states, which don't need to be kept across system restart. The non-persistent database is kept in /etc/svc/volatile/svc_nonpersist.db.
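
The idea behind exporting to XML (the endian-neutral export criterion, and what svccfg archive does for real) can be sketched in a few lines of Python. This is a toy under stated assumptions: the element and attribute names below are illustrative and are not the smf(5) manifest DTD, and the service name and properties in the usage note are invented.

```python
import struct
import xml.etree.ElementTree as ET


def export_xml(service, properties):
    """Serialize {(group, name): (type, value)} to XML text.

    Element and attribute names are illustrative; this is not the smf(5) DTD.
    """
    root = ET.Element("service", name=service)
    groups = {}
    for (group, name), (type_, value) in sorted(properties.items()):
        pg = groups.get(group)
        if pg is None:
            pg = groups[group] = ET.SubElement(root, "property_group", name=group)
        ET.SubElement(pg, "property", name=name, type=type_, value=str(value))
    return ET.tostring(root, encoding="unicode")


def import_xml(text):
    """Inverse of export_xml; values come back as strings."""
    root = ET.fromstring(text)
    props = {}
    for pg in root.findall("property_group"):
        for p in pg.findall("property"):
            props[(pg.get("name"), p.get("name"))] = (p.get("type"), p.get("value"))
    return root.get("name"), props


def native_bytes_differ(n):
    """Show why a raw binary dump is not portable: byte order changes the bytes."""
    return struct.pack("<I", n) != struct.pack(">I", n)
```

Calling export_xml("network/demo", {("config", "port"): ("count", 8080)}) produces text that reads back identically on any architecture, whereas the packed binary form of 8080 differs between little-endian and big-endian layouts.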

We decided to use SQLite for the local-repository implementation because there was simply no need to re-invent the wheel and implement a transactional database ourselves. SQLite fit all of our other design criteria. However, we haven't exposed that implementation in the interfaces. If SQLite no longer fulfills all of our requirements, we'll change to using a different underlying implementation. Existing code based on libscf(3LIB) or svcprop(1) will continue to work unmodified. That's the nice thing about hiding the data format behind a set of standard interfaces.
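
As a rough illustration of why hiding the storage format pays off, here is a minimal Python sketch in which client code talks only to a small repository interface and the storage backend can be swapped underneath it. Every class and method name here is hypothetical, as is the service FMRI; none of this is the libscf(3LIB) API.

```python
import sqlite3
from abc import ABC, abstractmethod


class Backend(ABC):
    """What the access layer needs from any storage implementation."""

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def put(self, key, value): ...


class SqliteBackend(Backend):
    """Stores values in a SQLite table (an invented layout)."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")

    def get(self, key):
        row = self._conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return None if row is None else row[0]

    def put(self, key, value):
        with self._conn:  # one transaction per update
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )


class DictBackend(Backend):
    """A stand-in for some future, non-SQLite implementation."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class Repository:
    """The only interface client code sees; the backend is an implementation detail."""

    def __init__(self, backend):
        self._backend = backend

    def get_property(self, fmri, prop):
        return self._backend.get(f"{fmri}:{prop}")

    def set_property(self, fmri, prop, value):
        self._backend.put(f"{fmri}:{prop}", value)


def client(repo):
    """Client code: identical no matter which backend sits underneath."""
    repo.set_property("svc:/network/demo:default", "config/port", "8080")
    return repo.get_property("svc:/network/demo:default", "config/port")
```

Here client(Repository(SqliteBackend())) and client(Repository(DictBackend())) return the same value, and client() never needs to change when the backend does.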

By the way, now that I've pulled back the covers on our implementation I should give the warning: direct access to the underlying repository is completely unsupported. If you scrog your repository using direct (sqlite) access, you're on your own. If you'd like to take a copy of repository.db and poke around in it, go for it! But, don't muck with the running copy lest you end up with a 'repository corrupt' message.

Comments:

And to think of it, I was just discussing the "whys and wherefores" of smf with a colleague today...great info, great summary!!! Thanks!

Posted by Umang on February 08, 2005 at 09:56 PM PST #

Apple and Slim Devices (http://slimdevices.com/) have also adopted SQLite to provide a stable, transactional data store. It's nifty seeing a project like this get adopted by so many companies.

Posted by Lars Kellogg-Stedman on March 13, 2005 at 06:20 PM PST #

Thanks for your structured and thorough info, really helps to a broader picture of the smf.

Posted by Per Öberg on April 07, 2005 at 11:08 PM PDT #


This blog copyright 2009 by lianep