Thursday September 17, 2009 | Constantin's Blooog |
New OpenSolaris ZFS Auto-Scrub Service Helps You Keep Proper Pool Hygiene
One of the most important features of ZFS is the ability to detect data corruption through the use of end-to-end checksums. In redundant ZFS pools (pools that are either mirrored or use a variant of RAID-Z), this can be used to fix broken data blocks by using the redundancy of the pool to reconstruct the data. This is often called self-healing.
This mechanism works whenever ZFS accesses any data, because it will always verify the checksum after reading a block of data. Unfortunately, this does not work if you don't regularly look at your data: Bit rot happens, and with every broken block that is not checked (and therefore not corrected), the probability increases that even the redundant copy will be affected by bit rot too, resulting in data corruption.
Therefore, ZFS lets you scrub a pool: a scrub reads every block in the pool and verifies its checksum, repairing what it can from the pool's redundancy. It should now be clear that every system should regularly scrub its pools to take full advantage of the ZFS self-healing feature. But you know how it is: You set up your server and often those little things get overlooked and that
Introducing the ZFS Auto-Scrub SMF Service
Here's a service that is easy to install and configure and that makes sure all of your pools are scrubbed at least once a month. Advanced users can set up individualized schedules per pool with different scrubbing periods. It is implemented as an SMF service which means it can be easily managed using
The service borrows heavily from Tim Foster's ZFS Auto-Snapshot Service. This is not just coding laziness: it also helps minimize bugs in common tasks (such as setting up periodic cron jobs) and provides better consistency across multiple similar services. Plus: why reinvent the wheel?
Requirements
The ZFS Auto-Scrub service assumes it is running on OpenSolaris. It should run on any recent distribution of OpenSolaris without problems. More specifically, it uses the -d switch of the GNU variant of date(1) to parse human-readable date values. Make sure that /usr/gnu/bin/date is available (which is the default in OpenSolaris).
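To give a feel for why the -d switch of GNU date matters here, the following is a minimal sketch of a "is this pool due for a scrub?" check. It is not the code shipped with the service: the names GNU_DATE, to_epoch and scrub_is_due are made up for illustration, and the sketch assumes the date of the last scrub has already been obtained as a human-readable string by some other means. A missing or unparsable date is treated as "due", in the spirit of scrubbing pools whose last scrub time is unknown.

```shell
#!/bin/sh
# Sketch only; all names below are hypothetical, not taken from the service.

# Prefer the OpenSolaris location of GNU date. Otherwise fall back to the
# "date" found on PATH, which is ASSUMED to be GNU date (as on most Linux systems).
if [ -x /usr/gnu/bin/date ]; then
    GNU_DATE=/usr/gnu/bin/date
else
    GNU_DATE=date
fi

# to_epoch DATE_STRING
# Prints DATE_STRING (interpreted as UTC) as seconds since the epoch.
to_epoch() {
    "$GNU_DATE" -u -d "$1" +%s
}

# scrub_is_due LAST_SCRUB PERIOD_DAYS [NOW]
# Returns 0 (true) if at least PERIOD_DAYS days lie between LAST_SCRUB and NOW.
# An empty or unparsable LAST_SCRUB counts as due.
scrub_is_due() {
    # GNU date accepts an empty string as "today", so check for it explicitly.
    [ -n "$1" ] || return 0
    last=$(to_epoch "$1" 2>/dev/null) || return 0
    now=$(to_epoch "${3:-now}") || return 2
    [ $(( now - last )) -ge $(( $2 * 86400 )) ]
}
```

Used from another script, a call such as `scrub_is_due "2009-08-01" 30 && echo "time to scrub"` would report that a pool last scrubbed on August 1st is due under a 30-day period. The regular Solaris date(1) has no comparable parser for human-readable date strings, which is why the sketch leans on the GNU variant.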
Right now, this service does not work on Solaris 10 out of the box (unless you install GNU date in /usr/gnu/bin). A future version of this script will work around this issue to make it easily usable on Solaris 10 systems as well.
Download and Installation
You can download Version 0.5b of the ZFS Auto-Scrub Service here. The included README file explains everything you need to know to make it work. After unpacking the archive, start the install script as a privileged user:
The script will copy three SMF method scripts into
After installation, you need to activate the service. This can be done easily with:
or by running the GUI with:
This will activate a pre-defined instance of the service that makes sure each of your pools is scrubbed at least once a month. This is all you need to do to make sure all your pools are regularly scrubbed. If your pools haven't been scrubbed before, or if the time of their last scrub is unknown, the script will proceed and start scrubbing. Keep in mind that scrubbing consumes a significant amount of system resources, so if you feel that a currently running scrub slows your system too much, you can interrupt it by saying:
In this case, don't worry, you can always start a manual scrub at a more suitable time or wait until the service kicks in by itself during the next scheduled scrubbing period. Should you want to get rid of this service, use:
The script will then disable any instances of the service, remove the manifests from the SMF repository, delete the scripts from
Advanced Use
You can create your own instances of this service for individual pools at specified intervals. Here's an example:
    constant@fridolin:~$ svccfg
    svc:> select auto-scrub
    svc:/system/filesystem/zfs/auto-scrub> add mypool-weekly
    svc:/system/filesystem/zfs/auto-scrub> select mypool-weekly
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> addpg zfs application
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/pool-name=mypool
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/interval=days
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/period=7
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/offset=0
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> setprop zfs/verbose=false
    svc:/system/filesystem/zfs/auto-scrub:mypool-weekly> end
    constant@fridolin:~$ svcadm enable auto-scrub:mypool-weekly
This example will create and activate a service instance that makes sure the pool "mypool" is scrubbed once a week. Check out the
Implementation Details
Here are some interesting aspects of this service that I came across while writing it:
Lessons learned
It's funny how a very simple task like "Write an SMF service that takes care of regular zpool scrubbing" can develop into a moderately complex thing. It grew into three different services instead of one, each with its own scripts and SMF manifests. It required an extra RBAC role to make it more secure. I ran into some zpool(1M) limitations which I now feel are worthy of RFEs, and working around them made the whole thing slightly more complex. Add an install and de-install script and some minor quirks like using GNU date(1) instead of the regular one to have a reliable parser for human-readable date strings, not to mention a GUI, and you cover quite a lot of ground even with a service as seemingly simple as this.
But this is what made this project interesting to me: I learned a lot about RBAC and SMF (of course), picked up some new scripting hacks from the existing ZFS Auto-Snapshot service, found a few minor bugs (in the ZFS Auto-Snapshot service) and RFEs, programmed some Java including the use of the NetBeans GUI builder, and had some fun with scripting, finding solutions and making sure stuff is more or less cleanly implemented.
I'd like to encourage everyone to write their own SMF services for whatever tools they install or write for themselves. It helps you think your stuff through and make it easy to install and manage, and you get a better feel for how Solaris and its subsystems work. And you can have some fun too.
The easiest way to get started is by looking at what others have done. You'll find a lot of SMF scripts in
If you happen to be in Dresden for OSDevCon 2009, check out my session on "Implementing a simple SMF Service: Lessons learned" where I'll share more of the details behind implementing this service, including the Visual Panels part.
Edit (Sep. 21st): Changed the link to CR 6878281 to the externally visible OpenSolaris bug database version, added a link to the session details on OSDevCon.
"New OpenSolaris ZFS Auto-Scrub Service Helps You Keep Proper Pool Hygiene" has been brought to you by Constantin's Blooog.
This entry was created on 2009-09-17 07:25:34.0 PST and is associated with the following tags:
opensolaris
script
scrub
service
smf
solaris
tool
useful
zfs
zpool