People are finding that setting 'zil_disable' seems to increase their performance - especially NFS/ZFS performance. But what does setting 'zil_disable' to 1 really do? It completely disables the ZIL. Ok fine, what does that mean?
Disabling the ZIL causes ZFS to not immediately write synchronous operations to disk/storage. With the ZIL disabled, the synchronous operations (such as fsync(), O_DSYNC, OP_COMMIT for NFS, etc.) will still be written to disk, just with the same guarantees as asynchronous operations. That means the server can return success to applications/NFS clients before the data has been committed to stable storage. In the event of a server crash, if the data hasn't been written out to the storage, it is lost forever.
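To make "synchronous operation" concrete, here is a minimal Python sketch of the kind of write an application issues when it wants its data on stable storage before continuing. The helper name `durable_write` and the file name are made up for illustration, and `O_DSYNC` is looked up with `getattr` because not every platform defines it.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data and return only after the OS reports it is on stable storage."""
    # O_DSYNC asks for every write() to be a synchronous data write;
    # fall back to 0 on platforms that do not define the flag.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, data[written:])
        # fsync() is the explicit request to flush this file to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        durable_write(os.path.join(d, "record.dat"), b"committed?\n")
```

The application code is the same either way. What changes with the ZIL disabled is what a successful return means: the calls come back with only the guarantees of an asynchronous write.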
With the ZIL disabled, no ZIL log records are written.
Note: disabling the ZIL does NOT compromise filesystem integrity. Disabling the ZIL does NOT cause corruption in ZFS.
Disabling the ZIL is definitely frowned upon and can cause your applications much confusion. Disabling the ZIL can cause corruption for NFS clients in the case where a reply is sent to the client before the server crashes, and the server crashes before the data is committed to stable storage. If you can't live with this, then don't turn off the ZIL.
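The failure case above can be sketched with a toy model. This is not ZFS or NFS code: `ToyServer`, `acked_but_lost` and the two dictionaries are hypothetical stand-ins. In particular, the "ZIL enabled" branch simply stores the data in `stable`, whereas the real ZIL writes a log record. The only point is the ordering of "reply to client" versus "data on stable storage".

```python
class ToyServer:
    """Toy model of a file server; all names here are hypothetical."""

    def __init__(self, zil_enabled):
        self.zil_enabled = zil_enabled
        self.stable = {}   # contents that survive a crash
        self.pending = {}  # in-memory contents, lost on a crash

    def sync_write(self, name, data):
        """Handle a synchronous write and return the reply sent to the client."""
        if self.zil_enabled:
            self.stable[name] = data   # committed before the reply goes out
        else:
            self.pending[name] = data  # handled like an asynchronous write
        return "ok"

    def flush(self):
        """Periodic write-out of pending data to stable storage."""
        self.stable.update(self.pending)
        self.pending.clear()

    def crash_and_reboot(self):
        """Everything that was only in memory is gone."""
        self.pending.clear()


def acked_but_lost(zil_enabled):
    """Return the names the client was told were written but the server lost."""
    server = ToyServer(zil_enabled)
    acked = {}
    if server.sync_write("f", b"v1") == "ok":
        acked["f"] = b"v1"
    server.crash_and_reboot()  # crash happens before any flush()
    return {k for k, v in acked.items() if server.stable.get(k) != v}


if __name__ == "__main__":
    print(acked_but_lost(True))   # set()
    print(acked_but_lost(False))  # {'f'}
```

In the second run the server's own state is perfectly consistent after the reboot. It is the client that holds a success reply for data that no longer exists anywhere.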
The 'zil_disable' tunable will go away once 6280630 zil synchronicity is putback.
Hmm, so all of this sounds shady - so why did we add 'zil_disable' to the code base? Not for people to use, but as an easy way to do performance measurements (to isolate areas outside the ZIL).
If you'd like more information on how the ZIL works, check out Neil's blog and Neelakanth's blog.
(2006-11-27 08:49:17.0/2006-11-21 10:47:51.0)
Trackback: http://blogs.sun.com/erickustarz/en_US/entry/zil_disable
Posted by Joe Little on November 21, 2006 at 05:52 PM PST #
It's on Neil's list of things to do, though I don't think it's at the top. We're trying to increase the performance of the system by default before giving people ways to circumvent the ZIL.
I would imagine it would be per filesystem as our nice infrastructure already supports per-filesystem properties, and there currently is one ZIL per filesystem.
Posted by 192.18.43.10 on November 22, 2006 at 09:15 AM PST #
Posted by Robert on November 22, 2006 at 09:39 AM PST #
Posted by Joe Little on November 22, 2006 at 10:18 AM PST #
hey Robert, this is orthogonal to memory usage. I would use '::kmastat' and 'arc::print' via 'mdb -k' to see where your memory is being used. kmastat will provide a kernel-wide view, and the arc's 's' will show you how many bytes the ARC is holding onto.
Posted by eric kustarz on November 22, 2006 at 11:33 AM PST #
hey joe, is the performance for NFS on ZFS worse than what you're seeing with UFS (or VxFS)? If so, we'd like to know.
I don't think the problem is ZFS (let alone the ZIL). It's the way the NFS protocol works. That's been explained on zfs-discuss, but I'll be posting a separate blog for that. Disabling the ZIL is a hack and really a violation of NFS/POSIX.
Posted by eric kustarz on November 22, 2006 at 11:36 AM PST #
Posted by Joe Little on November 22, 2006 at 03:32 PM PST #
Posted by Bfactor on November 26, 2006 at 08:22 PM PST #
For the UFS to ZFS comparison, make sure you're not comparing a single UFS filesystem against a RAID-Z. It's not a fair comparison, as RAID-Z offers redundancy and a single UFS filesystem does not.
I agree that the whole NFS/ZFS perceived problem is really that it's a single-threaded app, and the way to get better performance is to send more requests to the local filesystem.
Posted by eric kustarz on November 27, 2006 at 08:42 AM PST #
Hey Bfactor, putting the ZIL on a separate storage device is a good idea and one Neil is working on right now. NVRAM in its simplest form is a faster device than physical disks, so yeah, assigning the ZIL to an NVRAM device would help performance.
Posted by eric kustarz on November 27, 2006 at 08:45 AM PST #