This is an incredibly exciting day for me. I have been working on ZFS for almost five years, and now we are finally going public with it! The last five years have been fantastic. There is nothing to compare with the experience of working with a small group of really talented engineers to write a significant piece of software from scratch. I particularly want to mention Matthew Ahrens, my partner-in-crime for the DMU. Matt continues to amaze me with his insight and coding ability. I also want to mention Mark Shellenbaum. I started this project working with Mark on the ZPL (ZFS POSIX layer). When I moved on to work on the DMU, he stuck with the thankless nitty-gritty job of turning ACLs from a little-used feature of most file systems into a powerful and useful feature of ZFS. Finally, I have to mention Jeff Bonwick, the guy who started the whole thing and kept it all together over the long strange trip. ZFS really is his brainchild.
OK, so ZFS has tons of cool new features in it. Many of them catch your imagination immediately: Pooled Storage, Instantaneous Snapshots, or Clones. But some of them don't sound so cool, or maybe sound like something you've heard about before. I'm going to blog about a couple of the features in this category: Quotas and Reservations. You may think you know all about quotas: how they were the bane of your existence at the always disk-space-starved university you went to. But those were the old days. Read on and find out what they mean in the brand new world of ZFS.
Why Do I Care?
In the pooled storage environment provided by ZFS, the file system becomes a point of control for system administrators. File systems in this environment are cheap, fast, and easy to create. It's likely that administrators will be creating a file system for every single user and probably a file system for every project as well. ZFS provides a powerful hierarchical naming scheme for managing this potentially very large name space. But, since all of these file systems will draw their space from a common pool, there is also a need to manage the space consumption in the pool. Administrators need to be able to place limits on the amount of space any single user's or project's file system can consume. They also need to be able to guarantee space will be available for user and project file systems. ZFS introduces quota and reservation properties for the file system to provide this management ability.
How Is This Different?
In traditional file systems, there is a one-to-one mapping between file systems and physical storage. The amount of space allocated to a file system is pre-determined at creation time. In this type of environment there is no need for the management controls described above, and so they simply don't exist.
The concepts of quotas and reservations do exist in today's file systems. However, they are applied at different levels and so have very different semantics. A "traditional" quota is on space use by all files owned by a particular user, and limits the amount of space that user can consume within a file system. It has no impact on the amount of space consumed by the file system itself. A "traditional" reservation is usually applied to a file. It guarantees space within the file system for the file. It also has no impact on the amount of space available to the file system as a whole.
So What Does It Really Mean?
The ZFS quota and reservation properties are designed to give the file system administrator the ability to manage the way space is consumed within the storage pool. Each file system within the pool can have a quota or reservation assigned to it. A quota is a limit on the amount of space the file system can consume. For example, if a pool (named tank) is created with 36 gigabytes of space, and some file system (named fs1) is created within that pool and given a quota of 10 gigabytes, the file system fs1 will never be allowed to use more than 10GB of the 36GB of space in the pool. A reservation is a guarantee of space to a file system. For example, in pool tank just described, if some file system (named fs2) is created and given a reservation of 10GB, the pool will now report only 26GB of available space, since 10GB has been committed to file system fs2. File system fs2 could grow to use all 36GB, but the sum of the space used by all other file systems can never be more than 26GB.
A quota is not subject to the available space limitations of a pool. It is possible to set a quota greater than the space available in a pool. For example, if we increase the reservation of file system fs2 to 30 gigabytes (permitted only if fs1 is currently less than 6GB in size), the file system fs1 will not be able to grow beyond 6GB even though it has a quota of 10GB. There is now only 6GB of space available in the pool for all file systems other than fs2. Note that it is illegal to set a quota for less than the current file system size, as the file system would be immediately in violation of its quota.
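The arithmetic in the tank example can be checked with a few lines of Python. This is only a rough model of a flat pool (no nested file systems), not ZFS code; the function name `available_to` and the dictionary layout are made up for illustration, and sizes are in whole gigabytes.

```python
import math

def available_to(fs, pool_size, used, quota, reservation):
    """Space `fs` could still consume in a flat pool, in GB (illustrative model)."""
    # Every other file system ties up whichever is larger:
    # the space it already uses or the space reserved for it.
    committed_elsewhere = sum(
        max(used[other], reservation.get(other, 0))
        for other in used
        if other != fs
    )
    pool_limit = pool_size - committed_elsewhere - used[fs]
    quota_limit = quota.get(fs, math.inf) - used[fs]
    return max(0, min(pool_limit, quota_limit))

# The 36GB pool "tank": fs1 has a 10GB quota, fs2 a 10GB reservation.
used = {"fs1": 0, "fs2": 0}
quota = {"fs1": 10}
reservation = {"fs2": 10}

print(available_to("fs1", 36, used, quota, reservation))  # 10 (capped by its quota)
print(available_to("fs2", 36, used, quota, reservation))  # 36 (may use the whole pool)

# Raising the reservation on fs2 to 30GB squeezes fs1 below its quota.
reservation["fs2"] = 30
print(available_to("fs1", 36, used, quota, reservation))  # 6
```

The point of the sketch is that a quota and a reservation are independent limits: whichever one leaves the least room wins.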
A reservation, in contrast, is limited by the available space in the pool. It is simply not possible to reserve more space than is available. It is possible, however, to set a reservation below the current amount of space used by a file system. While this has no impact when first set, it does have meaning if the size of the file system ever drops below the set reservation. Space freed is returned to the pool, for distribution to any file system, if the file system is using more space than its reservation. However, if the file system is below its reservation then freed space remains reserved for future use by only this file system.
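What happens to freed space can be sketched the same way. The helper below is a hypothetical illustration, not part of ZFS: given how much a file system used before a delete, how much it frees, and its reservation, it works out how much actually goes back to the shared pool.

```python
def space_returned_to_pool(used_before, freed, reservation):
    """Portion of `freed` that becomes available to other file systems."""
    if not 0 <= freed <= used_before:
        raise ValueError("cannot free more space than is in use")
    used_after = used_before - freed
    # Only usage above the reservation was ever drawn from the shared pool,
    # so only the drop in that excess is handed back.
    excess_before = max(0, used_before - reservation)
    excess_after = max(0, used_after - reservation)
    return excess_before - excess_after

# File system with a 10GB reservation:
print(space_returned_to_pool(15, 3, 10))  # 3: still above the reservation
print(space_returned_to_pool(12, 5, 10))  # 2: the other 3GB stays reserved
print(space_returned_to_pool(8, 2, 10))   # 0: already below the reservation
```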
Quotas and reservations are particularly powerful in the hierarchical file system environment supported by ZFS. The quota and reservation properties are not inherited the way other properties are in ZFS. Rather, they impact their descendants directly. For example, giving a reservation of 10GB to file system fs1 does not mean that some child fs1/child also receives a 10GB reservation. A quota limits the sum of the space consumed by the file system it is placed on and all of its descendants. A reservation reserves space for use by the file system it is placed on and all of its descendants. So quotas and reservations limit or reserve space that can be consumed from that point in the hierarchy down. Note that snapshots are considered descendants of the file system they originated from.
Each file system tracks its own quota, reservation, and the amount of space it's using. The space used is a sum of the space used directly by the file system and the space used by all descendants. When a file system wants to use more space, it must check against its own quota and all its ancestors' quotas. Reservations are also checked at each level. Space available in a reservation will be used first to satisfy a space request. If the space request cannot be satisfied by existing reservations, a final check will be made at the pool root against the pool's available free space. If the space request can be satisfied, the space change is applied to the file system. If the change is over the reservation, the change is applied recursively to the parent.
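The check described above can be sketched as a walk from the file system up to the pool root. This is a simplified model, not the ZFS implementation: the `Dataset` class and its methods are invented for illustration, the pool root is treated as a file system whose quota equals the pool size, and freeing space or shrinking a reservation is left out.

```python
import math

class Dataset:
    """Toy model of one file system in the hierarchy (sizes in GB)."""

    def __init__(self, name, parent=None, quota=math.inf, reservation=0):
        self.name, self.parent, self.quota = name, parent, quota
        self.used = 0          # own data plus what descendants are charged
        self.reservation = 0
        if reservation:
            self.set_reservation(reservation)

    def charged(self):
        # What this file system costs its parent: usage or reservation,
        # whichever is larger.
        return max(self.used, self.reservation)

    def _charge(self, delta):
        node, pending = self, []
        while node is not None and delta > 0:
            if node.used + delta > node.quota:
                raise OSError(f"out of space at {node.name}")
            pending.append((node, delta))
            # Only the part that exceeds the reservation reaches the parent.
            delta = max(node.used + delta, node.reservation) - node.charged()
            node = node.parent
        for node, amount in pending:   # every check passed, so apply
            node.used += amount

    def allocate(self, amount):
        self._charge(amount)

    def set_reservation(self, new_reservation):
        if new_reservation < self.reservation:
            raise NotImplementedError("shrinking is not modelled here")
        delta = max(self.used, new_reservation) - self.charged()
        if delta > 0 and self.parent is not None:
            self.parent._charge(delta)
        self.reservation = new_reservation

tank = Dataset("tank", quota=36)
fs1 = Dataset("tank/fs1", tank, quota=10)
fs2 = Dataset("tank/fs2", tank, reservation=10)
fs2.allocate(10)       # satisfied entirely from the reservation
print(tank.used)       # 10: the pool was already charged for the reservation
```

Collecting the changes in `pending` and applying them only after every level has passed its check keeps a failed request from leaving the hierarchy half-updated.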
OK, Let's See Some Examples:
Steve the administrator has a pool with 500GB of space. He has 6 users working on 3 projects. Using the zfs(1M) command, he creates home and 6 file systems under home (one for each user). He also creates project and 3 file systems under it (one for each project):

    # zfs list -o name,used,available,reservation,quota
    NAME                USED    AVAIL  RESERV  QUOTA
    pool                162K    498G   none    none
    pool/home           59.5K   498G   none    none
    pool/home/ahrens    8K      498G   none    none
    pool/home/billm     8K      498G   none    none
    pool/home/bonwick   8K      498G   none    none
    pool/home/marks     8K      498G   none    none
    pool/home/maybee    8K      498G   none    none
    pool/home/perrin    8K      498G   none    none
    pool/project        33.5K   489G   none    none
    pool/project/dmu    8K      498G   none    none
    pool/project/spa    8K      498G   none    none
    pool/project/zpl    8K      498G   none    none

Steve does not want users to be putting everything in their home directories. The bulk of their files should end up in their project directories. So he decides to set a 100GB quota on pool/home:
    # zfs set quota=100g pool/home
    # zfs list -o name,used,available,reservation,quota
    NAME                USED    AVAIL  RESERV  QUOTA
    pool                162K    489G   none    none
    pool/home           59.5K   100G   none    100G
    pool/home/ahrens    8K      100G   none    none
    pool/home/billm     8K      100G   none    none
    pool/home/bonwick   8K      100G   none    none
    pool/home/marks     8K      100G   none    none
    pool/home/maybee    8K      100G   none    none
    pool/home/perrin    8K      100G   none    none
    pool/project        33.5K   489G   none    none
    pool/project/dmu    8K      489G   none    none
    pool/project/spa    8K      489G   none    none
    pool/project/zpl    8K      489G   none    none

Note that although each user's home directory now shows 100G available, if the combined usage by all users reaches 100G, no user will be able to create any more files.
One of Steve's users, bonwick, tends to be a space hog. So he further limits him with an individual quota:
    # zfs set quota=20g pool/home/bonwick
    # zfs list -o name,used,available,reservation,quota pool/home/bonwick
    NAME                USED    AVAIL  RESERV  QUOTA
    pool/home/bonwick   8K      20.0G  none    20.0G

The quota on pool/home is intended to prevent the users from consuming all of the pool space with files in their home directories. But Steve also wants to make sure that there is a reasonable amount of the pool available for home directory use (i.e., it isn't all used by the project file systems). So he reserves some space for home directories:
    # zfs set reservation=60g pool/home
    # zfs list -o name,used,available,reservation,quota
    NAME                USED    AVAIL  RESERV  QUOTA
    pool                60.0G   429G   none    none
    pool/home           59.5K   100G   60.0G   100G
    pool/home/ahrens    8K      100G   none    none
    pool/home/billm     8K      100G   none    none
    pool/home/bonwick   8K      20.0G  none    20.0G
    pool/home/marks     8K      100G   none    none
    pool/home/maybee    8K      100G   none    none
    pool/home/perrin    8K      100G   none    none
    pool/project        33.5K   429G   none    none
    pool/project/dmu    8K      429G   none    none
    pool/project/spa    8K      429G   none    none
    pool/project/zpl    8K      429G   none    none

As you can see, reserving space for pool/home has decreased the available space for the projects under pool/project.
Finally, although the spa project is going to start out small, it's already known that it's going to need a lot of space. So Steve reserves 150GB just for that project:
    # zfs set reservation=150G pool/project/spa
    # zfs list -o name,used,available,reservation,quota
    NAME                USED    AVAIL  RESERV  QUOTA
    pool                210.0G  279G   none    none
    pool/home           59.5K   100G   60.0G   100G
    pool/home/ahrens    8K      100G   none    none
    pool/home/billm     8K      100G   none    none
    pool/home/bonwick   8K      20.0G  none    20.0G
    pool/home/marks     8K      100G   none    none
    pool/home/maybee    8K      100G   none    none
    pool/home/perrin    8K      100G   none    none
    pool/project        33.5K   279G   none    none
    pool/project/dmu    8K      279G   none    none
    pool/project/spa    8K      429G   150G    none
    pool/project/zpl    8K      279G   none    none

Note again that reserving space has decreased the generally available space in the pool.
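The AVAIL column in that last listing can be reproduced with a small model. This is a sketch under stated assumptions rather than how zfs list computes its numbers: sizes are rounded to whole gigabytes, the few kilobytes of real data are treated as zero, the pool root is given the 489GB of usable space the later listings report, and only a few of the file systems are included. The names `DATASETS`, `used`, and `available` are illustrative.

```python
import math

# name -> (parent, quota in GB, reservation in GB)
DATASETS = {
    "pool":              (None,           489,      0),
    "pool/home":         ("pool",         100,      60),
    "pool/home/bonwick": ("pool/home",    20,       0),
    "pool/home/maybee":  ("pool/home",    math.inf, 0),
    "pool/project":      ("pool",         math.inf, 0),
    "pool/project/spa":  ("pool/project", math.inf, 150),
    "pool/project/zpl":  ("pool/project", math.inf, 0),
}

def used(name):
    """Space charged to `name`: each child counts as max(used, reservation)."""
    return sum(
        max(used(child), reservation)
        for child, (parent, _, reservation) in DATASETS.items()
        if parent == name
    )

def available(name):
    parent, quota, reservation = DATASETS[name]
    in_use = used(name)
    from_parent = available(parent) if parent else math.inf
    # Unused reservation is extra space that only this file system can draw on.
    from_parent += max(0, reservation - in_use)
    return max(0, min(quota - in_use, from_parent))

for name in DATASETS:
    print(f"{name:<20}{available(name):>6}G")
```

Under those assumptions the model gives 279G for pool, pool/project, and pool/project/zpl, 429G for pool/project/spa (the 279G of free pool space plus its own untouched 150G reservation), 100G for pool/home and pool/home/maybee, and 20G for pool/home/bonwick, in line with the listing.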
Where Can I Find Out More?
There's a lot more information about Quotas, Reservations, and all the other cool features of ZFS in the admin guide. Heck, you can even get the source on OpenSolaris!
I quickly scanned the admin guide, and thought about a "what if" scenario...
Pg 27 states
"In fact, the file deletion can end up consuming more disk space, since a new version of the directory will need to be created to reflect the new state of the namespace."
What if the administrator wants to remove the file permanently, and have it cascade through all snapshots?
Thanks
Posted by Amit Kulkarni on November 16, 2005 at 06:10 PM MST #
Posted by Jason Santos on November 17, 2005 at 09:29 PM MST #
Posted by Don't Get Out Much? on November 18, 2005 at 09:41 PM MST #
In regards to the "removing a file from snapshots": At the moment, snapshot content is immutable. The only way to "permanently" remove a file from the file system is to remove all snapshots that reference the file. It's an intriguing concept to remove part of a snapshot, but it violates one of its fundamental properties.
About edge performance: Yes, like all file systems, we are susceptible to performance degradation when there is very little space left. As you surmise, this is partially due to fragmentation of the storage. We will be addressing this in a future release. In the meantime, it's best to leave a few MB of space in the pool as a buffer (or throw another disk into the pool when you get close to maximum capacity).
About innovation: Yes, it's true, not every feature of ZFS is innovative. Many of the features we offer are the expected/required features of a modern file system. But we also offer innovation. Indeed, I believe that even many of the expected features are implemented in innovative ways and so offer scalability and performance beyond anything else out there. As a package, I believe that ZFS is truly innovative (but of course I work for Sun and helped build this product)! So I beg to differ with you. We have not just dragged Solaris into the mid-1990s, we have brought Solaris well into the 21st century.
Posted by Mark Maybee on November 19, 2005 at 10:12 AM MST #
Posted by S on November 27, 2005 at 07:47 PM MST #
Posted by bluecube on April 19, 2007 at 11:11 AM MDT #