ZFS automatic snapshot service logging
I've just finished a relatively minor update to the ZFS Automatic Snapshots SMF Service.
The main change came about because Bill complained (nicely) that the service was a bit noisy when run from the cron script. Too right! Every time the service ran, it would spew the FMRI for the service, even when everything was working properly, making the output noisier than necessary.
(For some reason, I'm now quite sensitive to unexpected spewing, and can definitely sympathise about babies^H^H^H^H^H^Hdaemons that are noisier than they could be (Bananas had her 2 month immunisations today, and the poor wee soul is a bit under the weather as a result))
So as well as fixing that, I had another look at the way the method script does its logging. Before, stdout from the script was handled by cron, which mailed the output to root. I've now changed all of the logging in the script so that messages are reported via logger(1) to syslogd(1M). I'd still love to have the messages land in the appropriate SMF log for each instance, but since we're running the script from cron, not SMF, this is a bit tricky. With these changes, when running the service you'll now see messages in syslog like:
Nov 27 11:55:16 haiiro zfs-auto-snap: [ID 702911 daemon.notice]
space/timf@zfs-auto-snap:frequent-2006-10-28-22:30:00 being destroyed as
per retention policy.
- the service isn't too verbose, but let me know if you've any feedback. I've updated the README to mention this.
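For anyone curious what logging through logger(1) looks like in a shell script, here is a minimal sketch. The function name `print_log` is only illustrative and is not necessarily what the method script uses; the `daemon.notice` priority and the `zfs-auto-snap` tag are taken from the sample syslog line above.

```shell
# Illustrative helper (hypothetical name): send a message to syslog
# instead of stdout, so that cron has nothing to mail to root.
print_log() {
    # -p sets the facility.level, -t sets the tag shown before the message
    logger -p daemon.notice -t zfs-auto-snap "$*"
}

# Example call (commented out so that sourcing this file logs nothing):
# print_log "space/timf@zfs-auto-snap:frequent-2006-10-28-22:30:00 being destroyed as per retention policy."
```

Whether the message actually lands in a log file depends on how `daemon.notice` is routed in your syslog.conf.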
The new version is now available in zfs-auto-snapshot-0.7.tar.gz
So, we tried doing incremental backups using zfs send -i. There are only a few gigs of difference (maybe 10G) between snap1 and snap2 on my 2.7TB filesystem, but it takes about 40 hours to complete.
For people using ZFS for large tasks, what can we do? Is there a way to prioritize the zfs send over other I/O traffic on the zpool?
Posted by Theo Schlossnagle on November 29, 2006 at 04:51 AM GMT #
Hi Theo - that's interesting. Does copying a random 10GB of data from the filesystem give similar performance? I'm curious to see whether it's really zfs send that's the problem, or whether the pool is just I/O saturated.
When you're using zfs send, what are you doing with the backup stream ? S10 06/06 I take it ?
I don't know of a way to give zfs send (or any other operation) a higher I/O priority - sorry.
Posted by Tim Foster on November 29, 2006 at 09:39 AM GMT #
Posted by mz on December 20, 2006 at 03:08 PM GMT #
Recursive send isn't yet supported by ZFS, but there is an RFE open to allow that (6421958), and I believe it's actively being worked on.
In the meantime, it would be possible to change this automatic snapshot service to do recursive sends (by simply iterating across all child filesystems) -- is that what you're after ? I'll add it in when I get a chance. Thanks for the interest.
Posted by Tim Foster on December 20, 2006 at 04:30 PM GMT #
Ack! What am I saying - it must be all this Christmas cheer :-)
I just checked the code, and see that I *already* support recursive send in this script, doing precisely what I mentioned above!
Now, there still isn't "zfs send -r" support in ZFS, but here, if the service property <code>zfs/snapshot-children</code> is set to <code>true</code>, and you've set <code>zfs/backup</code> to <code>full</code> or <code>incremental</code>, then we'll recursively send child filesystem snapshots by iterating across them.
See lines 402 and 412 in the method script installed as <code>/lib/svc/method/zfs-auto-snapshot</code>. Hope this helps!
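The idea of sending recursively by iterating across child filesystems can be sketched roughly as follows. This is not the code from the method script: the function name `send_recursive` and the way the stream is handed to a caller-supplied command are assumptions made for illustration, and it assumes a snapshot with the same name already exists on every child.

```shell
# Hypothetical sketch: send the snapshot named $2 for filesystem $1 and
# each of its child filesystems, piping every stream into the command
# given by the remaining arguments.
send_recursive() {
    fs=$1
    snap=$2
    shift 2
    # -H: no headers, -r: include descendants, -t filesystem: skip volumes
    # and snapshots, -o name: print only the dataset name
    zfs list -H -r -t filesystem -o name "$fs" | while read -r child; do
        # One full stream per filesystem; stdin is redirected so that
        # zfs send cannot consume the list being read by the loop.
        zfs send "$child@$snap" </dev/null | "$@"
    done
}

# Example call (commented out; needs a real pool and a receiving host):
# send_recursive space/timf mysnap ssh backuphost "cat >> /backup/stream"
```

An incremental variant would need the previous snapshot name for each child as well, which is left out here to keep the sketch short.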
Posted by Tim Foster on December 20, 2006 at 04:42 PM GMT #
Posted by Dick Davies on January 12, 2007 at 04:11 PM GMT #
Posted by Tim Foster on January 12, 2007 at 04:19 PM GMT #
Which results in :
Posted by Tim Foster on January 12, 2007 at 04:31 PM GMT #
Some more ideas for the crontab entries.
First, for simplicity, why be so explicit about it? You could use:
hourly: 0 * * * * (0 minutes, every hour, every day)
daily: 0 2 * * * (2:00am, every day)
weekly: 0 3 * * 0 (3:00am, every sunday)
monthly: 0 0 1 * * (midnight on the first of every month)
yearly: 0 0 1 1 * (midnight on the first of january)
Wouldn't such a scheme be easier to understand and less error prone than the explicit listing of each day/hour/week?
Note that the weekly backup is more predictable, load-wise, than using 1,8,15,22,29 as in the current script, because the administrator could pick a day of the week (e.g. Sunday) instead of having the 1st and subsequent days fall on different weekdays each month.
Just some food for thought.
Posted by Reid Spencer on November 28, 2007 at 09:56 PM GMT #
Hi Reid, yep - I did think about something like that, but it restricts you to taking single snapshots only on given boundaries (daily, weekly, hourly, etc.) rather than being able to have, say, "every 3 days" or "every 2 weeks".
In terms of when to take the snapshot, I'd originally intended the "zfs/offset" property to let you choose how far into the period your snapshot/backup would be triggered. I just haven't implemented it yet.
(it'd be used in the get_divisor() function to add to the first argument, so START=$(($1 + $OFFSET)) )
I think what I'm really after is a new cron implementation that allows for the flexibility I need. Having the '/' operator would make for cleaner crontab entries at least... Ultimately the aim here is to hide the complexity of cron from the user.
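To make the "explicit list" approach concrete, here is a small sketch of how a comma-separated crontab field could be generated for a cron without the '/' operator, with an optional offset added to the starting value. The function name `expand_schedule` and its arguments are hypothetical; this is not the get_divisor() function from the method script.

```shell
# Hypothetical helper: print START+OFFSET, then every STEP-th value after
# it, up to MAX, as a comma-separated list suitable for a crontab field.
# Usage: expand_schedule START STEP MAX [OFFSET]
expand_schedule() {
    start=$1
    step=$2
    max=$3
    offset=${4:-0}
    value=$((start + offset))
    list=""
    while [ "$value" -le "$max" ]; do
        # Append with a comma only when the list is already non-empty
        list="${list:+$list,}$value"
        value=$((value + step))
    done
    printf '%s\n' "$list"
}

expand_schedule 1 7 31     # days of month, every 7 days: 1,8,15,22,29
expand_schedule 0 3 23 1   # hours, every 3 hours, offset by 1: 1,4,7,10,13,16,19,22
```

The first call reproduces the 1,8,15,22,29 list mentioned in Reid's comment; the second shows how an offset would shift the whole series.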
Posted by Tim Foster on November 29, 2007 at 11:18 AM GMT #
Note for future visitors - this version is now out of date. The latest version is available via a link on the sidebar of my blog (at the time of writing, this is version 0.10).
Posted by Tim Foster on June 15, 2008 at 01:14 PM IST #