Tuesday September 21, 2004
Confessions of an operating systems junkie
Val Henson's weblog
Farewell for now

As of October 4th, I will be a Linux developer again. After a months-long search for a Linux job that was better than my Sun job (no small task), I finally accepted an offer from IBM.

I learned a few things over the past few months. Sun is a great working environment; most other companies score much higher on the Dilbert scale. That explains why I know so many people who left Sun and then came back a few years later (an option I will consider). After deciding I wanted to quit, I stopped doing all the things I didn't want to do - and ended up being more productive. Finally, a little patience (and negotiating skill) will go a long way (unfortunately, these are two qualities I am notoriously short on).

I feel lucky to have had the chance to work on ZFS. It really is a groundbreaking new file system - and so easy to use. Now I just hope that it gets ported to Linux soon, so I can use it.

I may start a new weblog; if so, it will be linked to from my home page:

After leafing through the stack of computer science papers on my desk, looking for one last systems paper to talk about, I find I can't pick just one. Here for your reading enjoyment, then, are four of my favorite systems papers.

Checking System Rules Using System-Specific Programmer-Written Compiler Extensions

Aside from the awful title, this is a fantastic paper. The authors managed to automate (in a practical, easy-to-understand manner) checking for common, simple bugs such as forgetting to drop a lock on error exit from a function. They found many bugs in widely-used real-world systems with relatively little human effort. I've seen their stuff in action; it's good. And as usual for an Engler paper, it's well-written and a pleasure to read.

Sensing User Intention and Context for Energy Management

Don't you hate it when your laptop's screensaver starts up in the middle of a presentation? Angela Dalton and Carla Ellis explore using low-power sensors to make power-saving smarter.
Angela had a great demo at HotOS: a laptop with an attached camera that turned off the display when no human face was visible. All day long, I saw people sitting in front of a laptop and then suddenly ducking to the side, again and again, testing her demo. I like that they are taking a creative new approach to power-saving, instead of fiddling with timeouts and mathematical models.

Lots of research has been done with the idea that users will annotate files with performance hints using some obscure system-specific interface (yeah, right!). In this paper, Daniel Ellard shows that programs are already giving the system useful performance hints - by the names they give to files. Even more exciting, filename patterns and usage patterns can be automatically correlated by a modeler program.

I don't really need to describe this paper, do I? I just wanted to explain why I like this paper so much. Many people have tried to write a distributed file system that is correct, generic, and performs well in all cases - and are still trying. What I like about GFS is that they picked a very specific problem (large scale distributed processing using a queuing system) and solved it in a specific way.

I hope you enjoyed the Confessions; thanks for reading!

(2004-09-21 15:27:46.0)

ZFS is the front page story on http://www.sun.com today: ZFS--the last word in file systems

It's an excellent and accurate article. Also, I turned off comments for now, after getting some blogspam. Email me with your comments and I'll respond here.

v a l . h e n s o n ( a t ) s u n ( d o t ) c o m

(2004-09-14 16:16:43.0)

ZFS FAQs, FREENIX CFP, new systems paper

Today's post is a bit long, but I promise you'll find the entire post interesting even if you are only interested in reading about ZFS. Also, apologies in advance for the wacky fonts - the defaults don't seem to work quite right.
FREENIX CFP

FREENIX is coming! Submit to FREENIX! I'm especially encouraging all of you userland/application types to submit papers. FREENIX is intended to publish work on open source projects that isn't being published anywhere else, and while most kernel programmers think that the most trivial snippet of code is worth publication, userland programmers can easily write wildly popular million-lines-of-code systems with nary a thought about publishing. The deadline for submissions is October 22, 2004.
Cool systems paper: WAFL

The cool systems paper for today is File System Design for an NFS File Server Appliance, by Dave Hitz, et al. This paper describes the WAFL file system (WAFL stands for Write Anywhere File Layout), used internally by NetApp filers. In my opinion, this paper describes the most significant improvement in file system design since the original FFS in 1978.

The basic idea behind WAFL is that all data is part of a tree of blocks, each pointing to the block below it. All updates to the file system are copy-on-write - each block is written to a new location when it is modified. This allows easy transactional updates to the file system.

The WAFL paper is a prime example of the kind of paper I'd like to see published more often (N.B. It was published in the Winter 1994 USENIX conference). From an academic standpoint, the paper is unacceptable due to style and format. From the standpoint of great new ideas, full implementation, and advancing the state of the art in practical file system design and implementation, it's a gem.

Unfortunately, some people conclude that because NetApp filers use NVRAM to get acceptable (nay, excellent!) performance while holding to the NFS standard, the design ideas behind WAFL aren't useful for general purpose UNIX file systems. I say they're wrong - but read the paper and form your own opinion. The ZFS team thinks that a copy-on-write, transactionally updated general purpose UNIX file system is not only feasible but an excellent idea - which is why we wrote ZFS.
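To make the copy-on-write tree idea concrete, here is a minimal Python sketch. It is not WAFL's or ZFS's actual code or on-disk format: the `Block` class and the `cow_write` helper are hypothetical names invented for illustration, and immutable in-memory objects stand in for disk blocks written to new locations.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Block:
    """Hypothetical block: a data block (payload) or an indirect block (children)."""
    payload: bytes = b""
    children: Tuple["Block", ...] = ()


def cow_write(root: Block, path: Tuple[int, ...], payload: bytes) -> Block:
    """Return a new root with the leaf at `path` replaced; `root` is never modified."""
    if not path:
        # Reached the data block: "write it to a new location" by creating a new object.
        return Block(payload=payload)
    index, rest = path[0], path[1:]
    new_child = cow_write(root.children[index], rest, payload)
    # The parent must also be rewritten, because its pointer to the child changed.
    children = root.children[:index] + (new_child,) + root.children[index + 1:]
    return Block(children=children)


# Build a small two-level tree with four data blocks (assumed layout).
leaves = tuple(Block(payload=bytes([i])) for i in range(4))
old_root = Block(children=(Block(children=leaves[:2]), Block(children=leaves[2:])))

# Modify one data block. Only the blocks on the path to it are copied.
new_root = cow_write(old_root, (1, 0), b"new")
```

In this sketch, switching from `old_root` to `new_root` plays the role of committing a transaction: until the switch, the old tree is complete and consistent, and keeping `old_root` around afterwards amounts to a snapshot. Subtrees that were not touched are shared between the two roots rather than copied.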
ZFS FAQs

Matt Ahrens, one of the primary architects and implementors of ZFS, stepped up to the plate and wrote about ZFS in his blog. Read Matt's blog entry introducing ZFS for a simple introduction to ZFS. Reading today's cool systems paper will also help you understand ZFS, since the basic philosophy behind some parts of WAFL and ZFS is similar. I'll add to what Matt has written and answer some of the most common questions people have asked me about ZFS.

Q. ZFS is just another dumb local file system. What is new about ZFS?

A lot of things! I'll try to hit the high points.
These are only the top three features of ZFS. ZFS has a million nifty little features - compression, self-healing data, multiple (and automatically selected) block sizes, unlimited constant-time snapshots - but these are the biggies.

Q. Why isn't ZFS a clustered/multi-node/distributed file system? Isn't the local file system problem solved?

Speaking from around 8 years of system administration experience, I can say that the local file system problem has most emphatically not been solved! Whenever someone asks me this question, I have to wonder if they ever ran out of space on a partition (especially frustrating on a disk with a lot of free space in other partitions), damaged their root file system beyond repair by tripping on the power cable, attempted to use any volume manager at all, spent a weekend (only a weekend if you are lucky) upgrading disks on a file server, tried to grow or shrink a file system, typed the wrong thing in /etc/fstab, ran into silent data corruption, or waited for fsck to finish on their supposedly journaling (and fsck-free) file system. Between me and two or three of my closest friends (none of whom are even sysadmins), we have run into all of these problems within the last year, on state-of-the-art file systems - ext3, VxFS, logging UFS, you name it.

As far as I can tell, most people have simply become accustomed to the inordinate amount of pain involved in administering file systems. We're here to say that file systems don't have to be complex, fragile, labor-intensive, and frustrating to use. Creating a decent local file system turned out to be more than a big enough problem to solve all by itself; we'll leave designing a distributed file system for another day.

Q. What is ZFS's performance? Can I see some ZFS benchmarks?

ZFS is still under development, and the benchmark numbers change day by day. Any benchmark results published now would only be a random snapshot in time of a wildly varying function and not particularly useful for deciding whether to use ZFS, the released product. However, we can tell you that performance is secondary only to correctness for the development team and we are evaluating ZFS in comparison with many different file systems on Solaris and Linux. We can also tell you about some of the architectural features of ZFS that will help make ZFS performance scream.
Q. Isn't copy-on-write of every block awfully expensive? The changes in block pointers will ripple up through the indirect blocks, causing many blocks to be rewritten when you change just one byte of data.

If we only wrote out one set of changes at a time, it would be very slow! Instead, we aggregate many writes together, and then write out the changes to disk together (and very carefully allocate and schedule them). This way the cost of rewriting indirect blocks will be amortized over many writes.

Feel free to ask more questions in the comments; I'll do my best to answer them.

(2004-09-09 20:29:37.0)
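To put rough numbers on the amortization argument in the copy-on-write answer, here is a toy Python model. Everything in it is an assumption made for illustration (the `blocks_rewritten` helper, the fanout of 4, the tree depth of 3, and the 16 adjacent writes); it is not how ZFS actually allocates or schedules writes.

```python
def blocks_rewritten(leaf_ids, fanout, depth):
    """Count the distinct blocks (data blocks plus indirect ancestors)
    that a single copy-on-write flush of `leaf_ids` has to write."""
    dirty = set()
    for leaf in leaf_ids:
        dirty.add((0, leaf))          # the data block itself
        node = leaf
        for level in range(1, depth + 1):
            node //= fanout           # index of the parent one level up
            dirty.add((level, node))  # a shared ancestor is counted only once
    return len(dirty)


FANOUT, DEPTH = 4, 3        # hypothetical tree: 4**3 = 64 data blocks
writes = list(range(16))    # 16 writes to adjacent data blocks

# Flushing after every single write rewrites the whole path to the root each time.
one_at_a_time = sum(blocks_rewritten([w], FANOUT, DEPTH) for w in writes)

# Aggregating the writes rewrites each dirty indirect block once.
batched = blocks_rewritten(writes, FANOUT, DEPTH)

print(one_at_a_time, batched)  # 64 22
```

In this model, flushing each write on its own costs 4 block writes per change (1 data block plus 3 indirect blocks), 64 in total, while one aggregated flush writes the same 16 data blocks plus only 6 indirect blocks. The saving depends on the dirty blocks sharing ancestors, which is one reason careful allocation matters.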