Wow - Mika has a patch for my ZFS Automatic Snapshots service, to have it clean up old snapshots on the backup server.

We talked about this a bit via email, and I'm thinking I don't really want to incorporate the patch into this code. My thinking was two-fold - first, I'm not sure I'd really want clients futzing with the backup server: let the backup server admin decide what they want to do with the backups as they roll in.

Next, if you have a few hundred clients all running the ZFS Automatic Snapshots service, and you add disk space to your backup server, you then perhaps need to change each client in order to tell it not to destroy server-side backups any more. Centralised administration, yeah?

Now, on the other hand, if we had a means of accessing a remote SMF repository, the client's zfs/backup-destroy-cmd could be retrieved from the backup server, properly allowing centralised administration of this feature.
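
To make the server-side alternative concrete, here's a rough sketch of a retention rule the backup server admin could run centrally, rather than each client destroying its own backups. It is not part of the service or of Mika's patch: the function name, the `keep` parameter and the snapshot naming (a `%Y-%m-%d-%H:%M` timestamp after the `@`) are all assumptions made up for illustration.

```python
from datetime import datetime


def snapshots_to_destroy(snapshot_names, keep):
    """Return the snapshots to prune, oldest first, keeping the `keep` newest.

    Assumes a hypothetical naming scheme where the text after '@' is a
    timestamp such as 2006-09-17-12:02.
    """
    def taken_at(name):
        stamp = name.split("@", 1)[1]
        return datetime.strptime(stamp, "%Y-%m-%d-%H:%M")

    ordered = sorted(snapshot_names, key=taken_at)
    if keep <= 0:
        # Nothing is retained, so everything is a candidate.
        return ordered
    # Slicing off the last `keep` entries leaves only the older ones.
    return ordered[:-keep]


if __name__ == "__main__":
    names = [
        "tank/backup/clientA@2006-09-17-12:00",
        "tank/backup/clientA@2006-09-15-12:00",
        "tank/backup/clientA@2006-09-16-12:00",
    ]
    for name in snapshots_to_destroy(names, keep=2):
        # Only prints the candidates; actually destroying them is left to the admin.
        print(name)
```

Changing `keep` in one place on the server, say after adding disk space, would then change the policy for every client at once.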

Anyway, I think it's waay cool that Mika's thought about this and published his patch - he's got a two-server setup, where he wants backups from one machine to be copied over to the other machine, but also destroyed periodically. Not a backup server, so much as a "poor-man's cluster" (as Mika so eloquently put it) - so if you don't want to set up an actual cluster (for free!), then Mika's patch might be what you're after!

Update: Of course, I should also mention this thread, which talks about actual clustering using ZFS. (As shipped, ZFS is not a clustered filesystem: don't plug a ZFS pool into two hosts simultaneously and expect it to work. More here and here.)


Comments:

Hi Tim,

A few points that I could not find answers to in any ZFS FAQ, and that you could maybe point me to the answers to...

1) Is there any way of downloading the ZFS packages to an already-installed-and-running ZFS-less Solaris 10 (SPARC) system? I sure can't see it!

2) Is it possible to have ZFS manage non-local storage in its pools, eg over NFS from cheap Network Attached Storage (NAS)? Can that storage be heterogeneous?

3) What happens with special cases where copy-on-write seems inappropriate for security and space reasons, eg: swap files, and security file over-writers (that try to scrub your data from the disc)? My reading of ZFS behaviour so far is that (a) they would fail to overwrite and (b) they would end up filling up the storage with the random data they are trying to overwrite with.

Rgds

Damon

Posted by Damon Hart-Davis on September 17, 2006 at 12:02 PM IST #

Hi Damon,

Sounds like we need to update the FAQ!

Here goes at attempting to answer your questions:

1. Yes, it's possible to patch a Solaris 10 system up to ZFS functionality, but there's quite a few patches needed to do this (start with 122640). My personal opinion is that it's probably easier to upgrade to Solaris 10U2 (06/06) if that's an option at all?

2. Yes, there's a few options - if your NAS server can give you iSCSI, then you can use the Solaris iSCSI initiator to give you a device that ZFS can use. I'd be more inclined to have my NFS server run ZFS, and share ZFS-managed storage via NFS or Samba to my clients. (A Thumper would be great for this!)

3. Sounds like you're interested in the ZFS encryption project? You can already use zvols, via ZFS, as swap devices.

Let me know if these answer your questions - and if so, whether we should add them to the FAQ, cheers!

Posted by Tim Foster on September 17, 2006 at 01:19 PM IST #

Hi Tim,

1) OK, answers it in a way I suppose, though not the way I wanted! OK, I may upgrade a current Sol 9 fileserver all the way to 10U2 with ZFS...

2) The NAS box I just invested in is plain NFSv3 or SMB (from a stand-alone black-box unit). Basically it's ~1TB for GBP1000-ish. I have two main "reliable" options with that box: (a) have it do some internal secret RAID-5 variant and export one filesystem or (b) have it export 4 raw filesystems on separate internal discs and knit them together on the Solaris box through ZFS to give me better-than-RAID-5 plus checksums (since this is valuable data that has already suffered bit-rot through the years). I'm currently doing (a) but would like to do (b). Can I?

3) No, not really, yet. Just running my personal Java disc-scrubber utility and realised that it would probably not work at all on vanilla ZFS, and wondered how to solve the same problem, eg secure data destruction, and telling ZFS that the app wants to manage data integrity for certain files (eg swap files, RDBMSs in files, etc).

Yes, this stuff should be in a FAQ somewhere!

Rgds

Damon

Posted by Damon Hart-Davis on September 17, 2006 at 09:51 PM IST #

Hey Damon,

1) Upgrading from s9 to s10U2, yep - that's supported afaik.

2) Okay, again you're not going to like this, but I think the easiest way to get your black-box storage solution to run ZFS, is to get a hammer, open the box and install OpenSolaris! If the box can only give you NFS or SMB, really you've already got a filesystem - ZFS can't help you. Now, ZFS does allow you to create pools on top of existing files (eg. mkfile 64m /foo/bar.dat; zpool create pool /foo/bar.dat) so you could be tempted to create those on top of NFS, but this doesn't really buy you anything: your pool reliability is then only as good as the reliability of the underlying filesystem :-/

3) Right. There's half a thread here about secure destruction using ZFS. Summary: you still need to wait for the ZFS crypto project. Encrypt the filesystem first, and then when you want to securely destroy it, swallow the key for that encrypted filesystem and that should be secure enough (assuming you have nice strong stomach acid). The guys working on the project would know more details - I haven't yet played with it.

Posted by Tim Foster on September 18, 2006 at 01:24 PM IST #

Hi Tim,

Continuing with (2) for the moment.

Supposing that I don't trust the reliability of a single NAS box as much as I trust a single Solaris (fileserver) box, and I want to guard against bit-rot with ZFS checksums (something I can't even do with the metafs and mirroring in the current S9 box that I use AFAIK).

Could I then create huge files over NFS on each of two or more cheap NAS boxes and use ZFS pooling to guard against (a) gross disc failure and (b) silent bit-rot on any of the NAS discs? That would be worth a lot to me.

Rgds

Damon

Posted by Damon Hart-Davis on September 18, 2006 at 02:13 PM IST #

Woah, careful! A few things here will end up in catastrophe. If you back ZFS with NFS-based files, then you can only base your safety on the validity of the data that NFS is telling you.

Can you be sure your NFS server is telling the truth? If the volume manager backing the NFS file system is lying, or NFS is lying, or your hardware is lying to NFS, then ZFS will be blissfully ignorant - you're in trouble: silent data corruption abounds.

Moreover, if your network goes down, then your ZFS server looking at NFS-backed zpools may panic! (You can try this today with file-based zpools.)

File-backed pools are really only for evaluation of ZFS's capabilities, they're not meant to be used in production. Going through multiple filesystem layers makes everything complex.

Posted by Tim Foster on September 18, 2006 at 02:53 PM IST #

Hi,

OK, I was looking at ZFS as potentially a high-level data-integrity, reliability, pooling and performance abstraction layer ABOVE some cheap (NFS) NAS storage in the same way that Solaris metafs can do some of the job BELOW the filesystem layer.

(I run my increasingly-disc-hungry pro-bono site on a bit-more-than-shoestring, and thus was hoping that ZFS could "raise the game" of the commodity NAS a little!)

That means that I may just have to go on suffering bit-rot for the time being I guess. It's a nuisance seeing files rot in situ. I already keep (MD5 and other) checksums over the raw files, but really more to detect problems on the site mirrors' caches or in transit over the Net, since by the time the file checksum is seen to fail on the master it may be too late... Yes, I do have backups, lots of them, but backup media is not necessarily longer lasting than the primary disc...
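
For readers wondering what keeping checksums over the raw files might look like, here is a minimal sketch. It is not Damon's actual tooling: the function names are made up for illustration, and SHA-256 is used as an assumed default (passing `"md5"` to `hashlib.new` would give MD5 instead). As noted above, this only detects rot; it cannot repair it.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p, algorithm)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_rot(root, manifest, algorithm="sha256"):
    """Return relative paths that are missing or whose digest has changed."""
    root = Path(root)
    damaged = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p, algorithm) != expected:
            damaged.append(rel)
    return damaged
```

The manifest would be built once while the data is known to be good, stored somewhere other than the disc it describes, and `find_rot` run periodically against it.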

Rgds

Damon

Posted by Damon Hart-Davis on September 18, 2006 at 04:11 PM IST #

Hmm, I see what you're after, but I'm not sure ZFS can help you here: the beauty of ZFS is that it manages the stack all the way to the disk, to paraphrase, "it's checksums all the way down". Once you put more layers between ZFS and the disk (beyond, say, a driver) you're introducing more potential points of failure, which ZFS /may/ not be able to detect. It is a local filesystem as well, not a networked filesystem.

That said, these are just imho - it might be worthwhile dropping a mail to zfs-discuss to see what the others think.

Posted by Tim Foster on September 18, 2006 at 04:39 PM IST #

Hi,

Hmmm, I don't think that I want to add to my mail volume of 10,000+ mails per day (99.99% SPAM of course) and that looks like quite a busy list!

I guess I'll just have to knock up some application-level FEC to stop the rot!
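
As a rough idea of what the simplest application-level scheme could look like, here is a sketch using a single XOR parity block plus per-block digests. This is a stand-in, not a real FEC code such as Reed-Solomon: it can rebuild at most one damaged block per group, and it assumes the parity block and the digests themselves are stored intact elsewhere. All names are hypothetical.

```python
import hashlib


def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def protect(blocks):
    """Return (parity, digests) for a group of equal-length data blocks."""
    parity = xor_blocks(blocks)
    digests = [hashlib.sha256(b).digest() for b in blocks]
    return parity, digests


def repair(blocks, parity, digests):
    """Return the blocks with at most one damaged block rebuilt from parity."""
    bad = [
        i for i, b in enumerate(blocks)
        if hashlib.sha256(b).digest() != digests[i]
    ]
    if not bad:
        return list(blocks)
    if len(bad) > 1:
        raise ValueError("more than one damaged block; XOR parity cannot recover")
    survivors = [b for i, b in enumerate(blocks) if i != bad[0]]
    rebuilt = xor_blocks(survivors + [parity])
    if hashlib.sha256(rebuilt).digest() != digests[bad[0]]:
        # The parity block itself is probably damaged.
        raise ValueError("rebuilt block does not match its digest")
    fixed = list(blocks)
    fixed[bad[0]] = rebuilt
    return fixed
```

The digests say which block has rotted and the parity says what it used to contain, which is the same division of labour as checksums plus redundancy in a pool, only done by hand at the application level.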

Thanks again though!

Rgds

Damon

Posted by Damon Hart-Davis on September 18, 2006 at 05:23 PM IST #

Heh, come on in - the water's fine, and there's always the forums? Good luck with the solution though - hope all works out!

Posted by Tim Foster on September 18, 2006 at 05:30 PM IST #

Will ZFS work with EMC's metaluns? Before, with UFS, format was unable to see the new geometry of the concat or stripe LUN. So I would zero out the VTOC and then grow the file system.

Posted by sid wilroy on October 13, 2006 at 04:06 AM IST #

Hi Sid - I haven't tried this to be honest, but I believe you're describing the same thing as 6475340, in which case the answer is: not yet, but it's on the todo list. Hope this helps.

Posted by Tim Foster on October 13, 2006 at 01:41 PM IST #


This blog copyright 2009 by timf