Java and security bits
My reasons for choosing ZFS
A few weeks ago I migrated the data on my main development and build server to a ZFS pool. Let me explain why performance was not the most important reason to choose ZFS.
Discussions of ZFS and filesystems in general seem to quickly degenerate into performance discussions. Maybe that is because most filesystems are very similar in all other respects, but ZFS has so much more to offer that just talking about performance misses the most important points. Don't get me wrong: performance is clearly a precondition. If a filesystem does not perform competitively, nobody will use it. And ZFS performs very well and is still getting better with performance fixes going in all the time. But my reasons for choosing ZFS were different:
- Data integrity: ZFS provides much better data integrity than other common filesystems because of two key differences: transactions and checksums. ZFS is a transactional copy-on-write filesystem: each operation either succeeds or it does not. Either way you have a consistent on-disk state; if the commit failed you merely revert to the previous state from some 5 seconds ago. All this holds even in the presence of a volatile disk write cache. In addition, ZFS keeps strong hierarchical checksums of all data. They make it possible to reliably verify that the data is valid.
Compare that to traditional filesystems and the particular problem that made me switch: I had some data on a disk with a UFS filesystem. For some reason the disk stopped responding temporarily (problem with disk? software? a power fluctuation? who knows). After a reboot, Solaris was not happy with the disk and told me to run fsck. I did, and fsck reported some cryptic messages about problems it found and fixed. After that I could access the disk again, but did I lose any files? Did any files get corrupted? There is no way to know! (other than to compare with a backup)
With ZFS, the disk would not have gotten into this state, plus I could have run zpool scrub to verify that all data on the disk was still intact and not corrupted. As far as I am concerned, this alone is reason enough to use ZFS whenever I can. The choice between strong verifiable data integrity and limited non-verifiable data integrity is no choice at all! All the other ZFS features are merely a bonus, but keep reading.
- Easy administration: all sysadmins are overworked, so easy administration is important to everyone. But it is even more important to people like myself, who have a full time day job and merely spend a couple of hours a month on sysadmin tasks. In other words, I have a lot more time to forget stuff :-( ZFS has built-in support for mirroring and RAID, which has traditionally been performed by a different piece of hardware or software with its own administrative interface. In ZFS, this is integrated and designed to be as simple as possible. You only need to know two commands: zpool and zfs.
- Easy data migration: if the server ever dies, all I need to do is find another Solaris machine with SATA ports, plug in the disks, and type zpool import. No need to find a machine with a matching hardware RAID controller. No need to find a machine with the matching CPU architecture (SPARC or x86). No need to connect the disks to the right ports in the right order (ZFS finds the data anyway). No RAID configuration data or mountpoint information to migrate (ZFS stores all configuration information in the pool). And if I ever want to move the data to a different set of disks, ZFS also includes fast and easy zfs send and zfs receive commands.
- Snapshots: ZFS supports an unlimited number of constant-time snapshots. If I need to look at the state of a workspace from last week, I just do so. And because ZFS is copy-on-write, snapshots take only as much disk space as is necessary to store the changes.
- New features are still being added, such as double-parity RAID (RAID-Z2), clone promotion, iSCSI integration, delegated administration, encryption, and improvements to the Solaris install experience (especially liveupgrade).
And of course since it is completely open source, ports to other platforms are already in progress.
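To make the data integrity point a bit more concrete, here is a minimal sketch in Python of the idea behind hierarchical checksums and a scrub. It is a toy under loud assumptions, not ZFS's on-disk format: SHA-256 from hashlib simply stands in for the checksum, and the names ToyPool and scrub are made up for this illustration. The point it tries to show is that each block's checksum is kept in a parent record rather than next to the data, and the parent record is itself covered by a root checksum, so silent corruption can be detected and located.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Assumption: SHA-256 is only a stand-in checksum for this toy model.
    return hashlib.sha256(data).hexdigest()


class ToyPool:
    """Hypothetical toy model: data blocks, a parent table of their
    checksums, and a root checksum covering that table."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        # Checksums are stored in the parent, not alongside the data.
        self.block_sums = [checksum(b) for b in self.blocks]
        self.root_sum = self._compute_root()

    def _compute_root(self) -> str:
        # The root checksum covers the whole table of block checksums.
        return checksum("".join(self.block_sums).encode())

    def scrub(self):
        """Return the indices of blocks that no longer match their checksum."""
        if self._compute_root() != self.root_sum:
            raise ValueError("checksum table itself is corrupted")
        return [
            i for i, block in enumerate(self.blocks)
            if checksum(block) != self.block_sums[i]
        ]


pool = ToyPool([b"alpha", b"beta", b"gamma"])
print(pool.scrub())       # [] : everything verifies

pool.blocks[1] = b"bet4"  # simulate silent corruption of one block
print(pool.scrub())       # [1] : the damaged block is identified
```

With fsck on a traditional filesystem there is nothing comparable to check file contents against, which is exactly the "did any files get corrupted?" question from my UFS story above.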
To me ZFS is one of those trailblazing technologies that are destined to be used (or imitated) by everyone. As soon as you hear about the concept for the first time, it makes complete sense and the only question you ask yourself is why didn't I think of that? Just like a portable, multi-threaded, secure programming platform (Java), multicore processors (Niagara UltraSPARC-T1), or transportable easy-to-deploy datacenters (Project Blackbox).
PS: the next entry will be about Java topics again ;-)
Posted at 18:18 Nov 04, 2006 by Andreas Sterbenz in General