ZFS Automatic Snapshots - now with send/receive!
I've finished writing the next round of features for my ZFS Automatic Snapshots SMF Service prototype. You can download this as zfs-auto-snapshot-0.6.tar.gz.
The main new features in this release are:
- ZFS Send/Receive support
- Multiple schedules per filesystem
The send/receive support means that, if you want it to, the service can send a backup of each snapshot as either a full stream or an incremental stream, depending on how the service is configured. The service will also send snapshots of all child filesystems, if required, though without the send -r support in ZFS, this is a little unwieldy at the moment.
There's an SMF property which the user can set to the command that should receive the backup stream. Typically, this would be a "zfs receive", but there's no reason why you couldn't simply cat the output to a unique file on an NFS server. I've altered the bundled GUI to also ask for these new options when it's constructing a new manifest.
The multiple schedules per filesystem feature allows the user to assign an optional label to each snapshot schedule, allowing multiple schedules for the same dataset. For example, for a given filesystem you might choose to take monthly full backups, sent to a remote server (and backed up to tape as a flat file), but also daily incremental backups, perhaps via zfs send/receive to a different server.
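To make the difference between the two stream types concrete, here is a rough sketch that only builds and prints the pipeline rather than running it. The function name build_send_cmd and its arguments are hypothetical and are not taken from the service's method script; only "zfs send", "zfs send -i" and "zfs receive -d" are real commands.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the pipeline for one backup.
#   $1 = previous snapshot name, or "" to request a full stream
#   $2 = snapshot to send
#   $3 = command that consumes the stream (the "save" command)
build_send_cmd() {
    prev=$1
    snap=$2
    save_cmd=$3
    if [ -n "$prev" ]; then
        # Incremental stream: only the changes between $prev and $snap.
        printf 'zfs send -i %s %s | %s\n' "$prev" "$snap" "$save_cmd"
    else
        # Full stream: the entire contents of $snap.
        printf 'zfs send %s | %s\n' "$snap" "$save_cmd"
    fi
}
```

Under these assumptions, a monthly schedule might call build_send_cmd with an empty first argument and a save command such as "cat > some-file", while a daily schedule would pass the previous snapshot and a save command such as "zfs receive -d backup". The dataset, snapshot and file names are whatever your own setup uses.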
The label is also useful to quickly tell which services are running for which filesystems. For example, here's the configuration on my desktop at the moment:
root@haiiro[236] svcs | grep zfs
online         Aug_31   svc:/system/filesystem/zfs/auto-snapshot:space-archive
online         Aug_31   svc:/system/filesystem/zfs/auto-snapshot:tank-root_filesystem
online         13:28:27 svc:/system/filesystem/zfs/auto-snapshot:space-timf
online         17:47:37 svc:/system/filesystem/zfs/auto-snapshot:default
online         18:00:02 svc:/system/filesystem/zfs/auto-snapshot:tank-new,backup
online         18:01:02 svc:/system/filesystem/zfs/auto-snapshot:tank-new,moreoften
I've updated the documentation and README for these new features, but let me know if anything's unclear.
Finally, I'm trying hard to do the right thing in the face of failure. The service will move to maintenance should a backup fail for any reason, and the cron job should be removed in that case. Also, I'm doing some basic locking, to see if zfs send commands are still running before attempting to send another backup stream from the same instance. Unfortunately, there doesn't seem to be an atomic way to set/get properties from SMF from what I can see, but feedback is welcome.
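For comparison, one common shell idiom for this kind of locking is mkdir, which either creates a directory or fails in a single step. The sketch below shows that general technique only; it is not what the service does, and the lock directory path and function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical lock location; in practice one directory per service instance.
LOCK_DIR="${LOCK_DIR:-${TMPDIR:-/tmp}/zfs-auto-snapshot-demo.lock}"

# Try to take the lock. mkdir fails if the directory already exists,
# so two callers cannot both succeed.
acquire_lock() {
    mkdir "$LOCK_DIR" 2>/dev/null
}

# Give the lock back by removing the directory.
release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}

# Run the given command only if no other run holds the lock,
# and return that command's exit status.
run_locked() {
    if acquire_lock; then
        "$@"
        rc=$?
        release_lock
        return "$rc"
    fi
    echo "another send appears to be running; skipping" >&2
    return 1
}
```

A caller would wrap the send in run_locked, so that a second invocation from cron is skipped while the first is still sending. A stale lock left behind by a crashed run would need to be cleaned up by hand in this simple version.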
I hope you find this stuff useful, and if you run into problems, bug reports would be great!
ps. Chris is also doing some pretty snazzy stuff with ZFS snapshots - over on his blog : well worth checking out!
[ update here ]
Posted by sickness on September 07, 2006 at 01:50 AM IST #
Posted by Tim Foster on September 07, 2006 at 11:39 AM IST #
Posted by Amit Kulkarni on November 20, 2006 at 06:07 PM GMT #
Posted by grif on July 06, 2007 at 11:43 PM IST #
Posted by Tim Foster on July 06, 2007 at 11:52 PM IST #
Posted by grif on July 13, 2007 at 10:54 AM IST #
Posted by Tim Foster on July 17, 2007 at 07:47 PM IST #
This is great! I have a basic question and then wondering why I have an error in the log.
1) When setting this up for the first time to use incremental back-ups, do I need to have a first back-up in place? For example, I just did " > /lc/bkups/hello" for the script. Not sure if it needed a send?
2) Have this error in one of my logs and don't understand why:
[ Sep 5 19:20:02 Disabled. ]
[ Sep 5 19:20:02 Rereading configuration. ]
[ Sep 5 19:20:14 Enabled. ]
[ Sep 5 19:20:14 Executing start method ("/lib/svc/method/zfs-auto-snapshot start") ]
[ Sep 5 19:20:14 Method "start" exited with status 0 ]
[ Sep 6 00:00:01 Rereading configuration. ]
[ Sep 6 00:00:01 No 'refresh' method defined. Treating as :true. ]
[ Sep 6 00:00:01 Stopping for maintenance due to administrative_request. ]
[ Sep 6 00:00:01 Executing stop method ("/lib/svc/method/zfs-auto-snapshot stop") ]
[ Sep 6 00:00:01 Method "stop" exited with status 0 ]
[ Sep 6 00:00:01 Stopping for maintenance due to administrative_request. ]
[ Sep 6 00:00:01 Rereading configuration. ]
thanks!
Posted by aorchid on September 06, 2007 at 09:35 PM IST #
Hi aorchid - glad you're finding it useful ( look for version 0.8 if you're not already using it )
If you're doing incremental backups, you don't need an initial backup - the system should create one of those for you first. It looks for an older snapshot that matches the same naming policy ("zfs-auto-snapshot:label"), and if that doesn't exist, takes a full backup the first time. Subsequent backups are incremental.
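The decision described above can be sketched roughly as follows. This is an illustration, not the method script's actual code: the function names are hypothetical, and the naming prefix is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper: read snapshot names on stdin (oldest first, one per
# line) and print the most recent one whose name after "@" starts with $1.
# Prints nothing if no snapshot matches.
latest_matching_snapshot() {
    prefix=$1
    latest=""
    while IFS= read -r name; do
        case "${name#*@}" in
            "$prefix"*) latest=$name ;;
        esac
    done
    printf '%s' "$latest"
}

# Report which kind of backup would be taken, given the snapshot list on
# stdin and the naming prefix in $1.
backup_mode() {
    prev=$(latest_matching_snapshot "$1")
    if [ -n "$prev" ]; then
        echo "incremental from $prev"
    else
        echo "full"
    fi
}
```

Feeding it a list in which only "zz/ftp@today" exists, with the prefix "zfs-auto-snap", reports "full"; once an older snapshot with that prefix is in the list, it reports an incremental from the newest match. On a real system the list could come from something like "zfs list -H -t snapshot -o name -s creation".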
For the backup command, the script does a " zfs send $LAST_SNAP | $BACKUP_SAVE_CMD" - so your backup command should probably be "cat > foo", rather than just "> foo" (anyone else, feel free to correct me if you've better ideas - I'm usually wary of using cat in this way)
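To see why the "cat" matters, here is a small stand-alone sketch that pipes a stand-in stream into a save command given as a string. It assumes the string is handed to a shell via "sh -c"; how the method script itself invokes the command may differ, and save_stream is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: pipe whatever arrives on stdin into a save command
# given as a string. "sh -c" is used so that redirections inside the
# string (such as "cat > file") are interpreted by a shell.
save_stream() {
    sh -c "$1"
}
```

With this helper, "printf 'fake stream' | save_stream 'cat > foo'" leaves the stream contents in foo, because cat reads the pipe and writes it out. With just "> foo" as the save command, the shell creates an empty foo and nothing ever reads the pipe, so the stream is lost.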
As for the errors you're seeing, unfortunately logging is one of the known weaknesses of this service - see more in the last paragraph of http://blogs.sun.com/timf/entry/smf_philosophy_more_on_zfs
You might find more information in the cron logs, usually mailed to the cron user - /var/mail/root probably in this case.
Finally, when debugging this, it's sometimes easiest to import the manifest, and start the service, and then directly run the method script from the command line giving the SMF URI as the argument, rather than waiting for cron to do it - eg.
# /lib/svc/method/zfs-auto-snapshot svc:/system/filesystem/zfs/auto-snapshot:space-timf,frequent
Hope this helps ?
Posted by Tim Foster on September 07, 2007 at 11:34 AM IST #
Thanks for your pointers with this (running version .8). I can run the backup from the command line using:
1. pfexec zfs send home/ftp@today | pfexec zfs recv -d zz/bkups, then
2. pfexec zfs send -i today home/ftp@zfs-auto-snap-2007-09-23-12:18:32 | pfexec zfs recv -d zz/bkups
Checking zfs list -t snapshot shows that all snapshots are present. Strangely enough, all these auto-snapshot services are being put into maintenance once they run, yet one of them continues to make snapshots locally, though not on the backup disk:
@solenv ~ % zfs list -t snapshot
NAME                                       USED  AVAIL  REFER  MOUNTPOINT
zz/ftp@today                                15K      -    19K  -
zz/ftp@zfs-auto-snap-2007-09-21-00:00:01      0      -    19K  -
zz/ftp@zfs-auto-snap-2007-09-23-12:17:58      0      -    19K  -
zz/ftp@zfs-auto-snap-2007-09-23-12:18:32      0      -    19K  -
svcs:
maintenance Sep_21 svc:/system/filesystem/zfs/auto-snapshot:zz-ftp
You can see that the service has been in maintenance mode since the 21st, but there is a snapshot from it dated 9-23. How is that happening?
thanks
Posted by aorchid on September 23, 2007 at 08:33 PM IST #
That's interesting - I'd thought that my marking the service as maintenance (read the method script, there are a few scenarios where we deem this a sane thing to do) would disable the service, and remove the crontab entry for that set of automatic snapshots - evidently not. You can verify this by running "crontab -l" as root. If there's still an entry for zfs-auto-snapshot, then that'll be it.
As regards further debugging, rather than just running "pfexec zfs send", try actually running the method script directly from /lib/svc/method, using the FMRI as the argument to it. You'll get even more information by turning on the verbose property in the service, or just cut right to the chase, and invoke using ksh -x, eg. "ksh -x /lib/svc/method/zfs-auto-snapshot svc:/system/filesystem/zfs/auto-snapshot:space-timf,frequent"
That zfs send/recv works is good to know - now I just need to work out why my method script is deciding we should be moving to maintenance.
Feel free to mail me offline with more details, and we'll get this sorted out. [ first.last@sun.com ]
Posted by Tim Foster on September 23, 2007 at 09:25 PM IST #
Note for future visitors - version 0.6 is out of date! The latest version of this service is available from the sidebar of my blog (at the time of writing, this is version 0.10)
Posted by Tim Foster on June 15, 2008 at 01:10 PM IST #
Hello Tim,
I'm just playing with your tool on Solaris 10 (10/09) but I can't see sending/receiving support there. Is it working on Solaris 10 too?
If I run "pfexec ./zfs-auto-snapshot-admin.sh rpool/export/test"
it asks about frequency, number to save, children and label. That is all.
Version: zfs-auto-snapshot-0.6.tar.gz
Thanks for answer.
Jan Hlodan
Posted by Jan Hlodan on October 15, 2009 at 05:57 PM IST #