Views on software from Bryan Cantrill's deck chair The Observation Deck

Monday Jul 26, 2004

One of the downsides of being an operating systems developer is that the demos of the technology that you develop often suck. ("Look, it boots! And hey, we can even run programs and it doesn't crash!") So it's been a pleasant change to develop DTrace, a technology that packs a jaw-dropping demo. In demonstrating DTrace for customers around the world, I have had the distinct (and rare) pleasure of impressing the most technically adept (and often jaded) audiences. My typical demonstration is on my Solaris x86 laptop, where I use DTrace to instrument the running system -- exploring with the audience the peculiarities that exist even on an idle laptop. (This usually involves discovering and understanding the unnecessary work being done by acroread, dhcpagent, sendmail, etc.) This ad hoc demo shows DTrace as it's meant to be used: dynamically answering questions that themselves were formed on-the-fly.

And when I demonstrate DTrace, I always do so on the absolute latest Solaris 10 build. Our mantra in Solaris Kernel Development is "FCS Quality All the Time" -- we believe that the product should always be ready to be run in production. And if we're going to tell a customer that it's ready to be run in production, we damn well better run it in production ourselves. This has the added advantage that we tend to run into bugs before our customers do, allowing us to ship a final product that is that much more solid. Over the past year, I have given hundreds of DTrace demonstrations in front of customers running latest bits, and before last week, it had always gone off without a hitch...1

Last week, I had the opportunity to give a DTrace demonstration for a highly technical -- and highly influential -- audience at a Fortune 100 company. When I demonstrate DTrace, I typically do a couple of invocations on the command line before things become sufficiently complicated to merit writing a DTrace script. And it was when I went to run the first such script (a script that explored the activity of xclock) that it happened:

# dtrace -s ./xclock.d
Segmentation Fault (core dumped)
#
If you've never had it, there's no feeling quite like having a demo blow up on you: it's as if you peed your pants, failed an exam and were punched in the gut -- all at the same horrifying instant. It's a feeling that every software developer should have exactly once in their lives: that unique rush of shock, and then humiliation and then despair, followed by the adrenal surge of a fight-or-flight reaction. In the time it takes a single process to dump core, you go from an (over)confident technologist to a frightened woodland creature, transfixed by the light of an oncoming freight train. For the woodland creature, at least it all ends mercifully quickly; the creature is spared the suffering of trying to explain away its foolishness. The hapless technologist, on the other hand, is left with several options:
  1. Pretend that you didn't write the software: "Boy, will you get a load of those fancy-pants software engineers? Overpriced underworked morons, every last one!"

  2. Explain that this is demo software and isn't expected to work: "Well, that's why we haven't shipped it yet! I mean, what fool would run this stuff anyway? Other than me, that is."

  3. Make light of it: "Hey, knock knock! Who's there? Not my software, that's for sure! Wocka wocka wocka!"

  4. Suck it up: "That's a serious problem. If you can excuse me for a second, let me get a handle on what we've got here that we can demo."

I always aim for this last option, but on the rare occasion that this has happened to me (and this is -- honest -- probably the worst that a customer-facing demo has gone for me) I usually end up with some combination of the last three, often with plenty of stuttering, some mild swearing ("Damn! Damn!") and profuse sweating.

In my particular case, the worst part was not knowing the exact pathology of the bug that I had just run into. Was there something basic that was broken or toxic about my machine? Would all scripts that I tried to run dump core? And if this was broken, what else was broken? Would I panic the machine or crash a target app if I continued? (Much more serious problems, both.) In an effort to get a handle on it, I did a quick pstack on the core file:

 0804718f ???????? (8046604, 2)
 d137c839 dt_instr_size (82d051a, 8067320, 223, d1380fe2) + 59
 d137c0c2 dt_pid_create_return_probe (81651b8, 8067320, 8046af0, 8047170, 80472d
 d137370d dt_pid_per_sym (80472ac, 8047170, d087b02c) + 15b
 d13739ae dt_pid_sym_filt (80472ac, 8047170, d087b02c, 804715c) + 7c
 d13152ca Psymbol_iter_com (81651b8, ffffffff, 8069060, 1, 407, 1) + 1e0
 d13153ae Psymbol_iter_by_addr (81651b8, 8069060, 1, 407, d1373932, 80472ac) + 1
 d1373b81 dt_pid_per_mod (80472ac, 82cf600, 8069060) + 191
 d1373d56 dt_pid_mod_filt (80472ac, 82cf600, 8069060) + a3
 d1314fe4 Pobject_iter (81651b8, d1373cb3, 80472ac) + 4f
 d13740b4 dt_pid_create_probes (82cafa0, 8067320) + 344
 d1353af8 dt_setcontext (8067320, 82cafa0) + 42
 d13537d4 dt_compile_one_clause (8067320, 82be430, 82cdae0) + 32
 d1353a9c dt_compile_clause (8067320, 82be430) + 26
 d1354d66 dt_compile (8067320, 16a, 3, 0, 80, 1) + 3d9
 d1355263 dtrace_program_strcompile (8067320, 8047ec2, 3, 80, 1, 8066848) + 23
 080526ef ???????? (8066e48)
 0805370e main     (3, 8047df8, 8047e08) + 8fc
 0805177a ???????? (3, 8047eb8, 8047ebf, 8047ec2, 0, 8047edf)
This was dying in the code that analyzes a target binary as part of creating pid provider probes. There was at least a chance that this problem was localized to something specific about the xclock program text -- it was worth trying a similar script on a different process. Fortunately, I was able to stave off total panic long enough to write such a script and -- even better -- this one worked. The problem did indeed seem to be localized to something specific in xclock. And thanks to my coreadm settings, the core file from the seg faulting dtrace had been stashed away for later analysis; the best thing I could do at that point was drive on with the rest of the demo.

And this is what I did. The rest of the demo went well, and the audience was ultimately impressed with the technology. And while I never quite regained my stride (in part because my mind was racing about which change to DTrace could have introduced the problem), I was at least sufficiently effective -- we achieved the goals of the meeting.2 On the plane back home, I root-caused the problem and developed a fix. The next day, I integrated the fix into Solaris -- and I don't think I've ever been so relieved to put latest bits on my laptop!

In the end, having the demo blow up certainly wasn't a pleasant experience -- but I wouldn't change my decision to demo on the latest bits. Not only did we discover a serious bug, we discovered the hole in our test suite that prevented us from finding the bug before it integrated. So who am I to get upset about a little personal humiliation if the upshot is a better product? ;)


1 This is a slight exaggeration. I had actually run into DTrace bugs in front of customers, but they were always sufficiently small that only a trained eye would realize that something was amiss -- things like slightly incorrect error messages.

2 The primary goal of such a demo is often to get the customer sufficiently excited about Solaris 10 to download Solaris Express (usually for x86) and start playing around with the technology themselves. We are nearly always successful in this -- and I have even had a few customers start downloading Solaris Express before the end of the meeting!

Sunday Jul 25, 2004

I've been following with interest this thread on the linux-kernel mailing list. The LTT folks have apparently given up on the claim that they've got "basically almost everything [DTrace] claims to provide." They now acknowledge the difference in functionality between LTT and DTrace, but they're using it to continue an attack on the Linux development model. (Or more accurately, to attack how that model was applied to them.) The most interesting paragraph is this one in a post by Karim Yaghmour:
As for DTrace, then its existence and current feature set is only a
testament to the fact that the Linux development model can sometimes have
perverted effects, as in the case of LTT. The only reason some people can
rave and wax about DTrace's capabilities and others, as yourself, drag
LTT through the mud because of not having these features, is because the
DTrace developers did not have to put up with having to be excluded from
their kernel for 5 years. As I had said earlier, we would be eons ahead
if LTT had been integrated into the kernel in one of the multiple attempts
that was made to get it in in the past few years. Lest you think that
DTrace actually got all its features in a single merge iteration ...
Some points of clarification: we actually did get most of our features in our initial integration into Solaris (on September 3, 2003). In Solaris, projects that integrate are expected to be in a completed state; if there is follow-on work planned, fine -- but what integrates into the gate must be something that is usable and complete from a customer's perspective. So contrary to Karim's assertion, most of DTrace came in that first (giant) putback. As a consequence of this, DTrace spent a long time living the same way LTT has lived: outside the gate, trying to keep in sync while development was underway. Admittedly, DTrace did this for two years -- not five. And this is Solaris, not Linux; it's easier to keep in sync if only because there is only one definition of the latest Solaris. (The newest DTrace build was always based off of the latest Solaris 10 build.) Still: we didn't let the fact that we had not yet integrated prevent us from developing DTrace, nor did we let it prevent us from building a user community around DTrace. By the time DTrace (finally!) integrated into Solaris 10, we had hundreds of internal users, and a long list of actual problems that were found only with the help of DTrace. Not that DTrace would have been unable to integrate without these things, but having them certainly accelerated the process of integration.

More generally, though, I'm getting a little tired of this argument that LTT would be exactly where DTrace is had they only been allowed into the Linux kernel five years ago. I believe that there is some fundamental innovation in DTrace that LTT simply did not anticipate. For insight into what LTT did anticipate, look at the LTT To Do List from 2002. In that document, you will find many future directions, but not so much as a whisper of the need for DTrace basics like aggregations, speculative tracing, thread-local variables or associative arrays -- let alone DTrace arcana like stability or translators. Would LTT be further along now had it been allowed to integrate into Linux five years ago? Surely. Would it be anywhere near where DTrace is today? In the immortal words of the Magic 8-Ball, "very doubtful."

Sunday Jul 18, 2004

Recently, Karim Yaghmour posted the following to the linux-kernel mailing list:

As I noted when discussing this with Andrew, we've been trying to get
LTT into the kernel for the past five (5) years. During that time we've
repeatedly encountered the same type of arguments for not including it,
and have provided proof as to why those arguments are not substantiated.
Lately I've at least got Andrew to admit that there were no maintenance
issues with the LTT trace statements (given that they've literally
remained unchanged ever since LTT was introduced.) In an effort to
address the issues regarding the usefulness of such a tool, I direct
those interested to this article on DTrace, a trace utility for Solaris:
http://www.theregister.co.uk/2004/07/08/dtrace_user_take/

<rant>
With LTT and DProbes, we've basically got almost everything this tool
claims to provide, save that we would be even further down the road if
we did not need to spend so much time updating patches ...
</rant>

Karim
-- 
Author, Speaker, Developer, Consultant

Now, Karim's really only interested in DTrace in that it helps him make his larger point that his project has been unfairly (or unwisely) denied entry into the Linux kernel. His is a legitimate point, and something that is often lost in the assertions that Linux is developed faster than other operating systems: for all of its putative development speed, Linux has a surprising number of otherwise valuable projects that have been repeatedly denied entry for reasons that seem to be petty and non-technical. DProbes/LTT is certainly one example of such a project, and LKCD is probably another.

But what of Karim's assertion that LTT and DProbes "basically [have] everything [DTrace] claims to provide"? This claim is false, and indicates that while Karim may have scanned The Register article, he didn't bother to browse even our USENIX paper -- let alone our documentation. From these, one will note that while LTT lacks many DTrace niceties, it also lacks several vital features. Two among these are aggregations and thread-local variables -- two features that are not syntactic sugar or bolted-on afterthoughts, but rather are core to the DTrace architecture. These features turn out to be essential in using DTrace to quickly resolve problems. For an example of how these features are used, see Section 9 of our USENIX paper -- and note that every script that we wrote to debug that problem used aggregations, and that several critical steps were only possible with thread-local variables.
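
To make those two features concrete, here is a toy model in Python. It is not DTrace and says nothing about how DTrace implements either feature; the event list, the probe names and the helper functions are all invented for illustration. The per-thread dictionary plays the role that a thread-local variable (self->ts in D) plays in a script, and the nested counter plays the role of an aggregation keyed by program name with power-of-two buckets, in the spirit of quantize().

```python
from collections import defaultdict

# Hypothetical trace events: (thread_id, probe, timestamp_ns, execname).
EVENTS = [
    (1, "read:entry", 100, "xclock"),
    (2, "read:entry", 120, "sendmail"),
    (1, "read:return", 400, "xclock"),
    (2, "read:return", 1300, "sendmail"),
    (1, "read:entry", 2000, "xclock"),
    (1, "read:return", 2090, "xclock"),
]


def bucket(value):
    """Return the largest power of two <= value (0 for values below 1)."""
    return 0 if value < 1 else 1 << (value.bit_length() - 1)


def trace(events):
    """Aggregate read latencies per program, bucketed by power of two."""
    thread_local = {}  # entry timestamp remembered separately for each thread
    aggregation = defaultdict(lambda: defaultdict(int))
    for tid, probe, ts, execname in events:
        if probe == "read:entry":
            thread_local[tid] = ts
        elif probe == "read:return" and tid in thread_local:
            # Only this thread's own entry timestamp is consulted.
            elapsed = ts - thread_local.pop(tid)
            aggregation[execname][bucket(elapsed)] += 1
    return {name: dict(buckets) for name, buckets in aggregation.items()}


for name, buckets in trace(EVENTS).items():
    print(name, sorted(buckets.items()))
```

The point of the sketch is that the two features work together: without per-thread state, the interleaved entries from threads 1 and 2 would be matched against the wrong returns, and without aggregation the output would be one line per event rather than a summary.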

And fortunately, you don't even have to take my word for it: Red Hat developer Daniel Berrangé has posted a comparison of DTrace and DProbes/LTT that reaches roughly the same conclusions...

Tuesday Jul 13, 2004

Ted Leung noted the discussion that Werner and I have been having, and observed that we should consider Rob Pike's (in)famous polemic, "Systems Software Research is Irrelevant." I should say that I broadly agree with most of Pike's conclusions -- and academic systems software research has seemed increasingly irrelevant in the last five years. That said, I think that what Pike characterizes as "systems research" is far too skewed to the interface to the system -- which (tautologically) is but the periphery of the larger system. In my opinion, "systems research" should focus not on the interface of the system, but rather its guts: those hidden Rube Goldberg-esque innards that are rife with old assumptions and unintended consequences. Pike would perhaps dismiss the study of these innards as "phenomenology", but I would counter that understanding phenomena is a prerequisite to understanding larger systemic truths. Of course, the problem to date has been that much systems research has not been able to completely understand phenomena -- the research has often consisted merely of characterizing them.

As evidence that systems research has become irrelevant, Pike points to the fact that SOSP has had markedly fewer papers presenting new operating systems, observing that "a new language or OS can make the machine feel different, give excitement, novelty." While I agree with the sentiment that innovation is the source of excitement (and that such exciting innovation has been woefully lacking from academic systems research), I disagree with the implication that systems innovation is restricted to a new language or OS; a new file system, a new debugger, or a new way of virtualization can be just as exciting. So the good news is that work need not be a new system to be important systems work, but the bad news is that while none of these is as large as a new OS, they're still huge projects -- far more than a graduate student (or even a lab of graduate students) can be expected to complete in a reasonable amount of time.

So if even these problems are too big for academia, what's to become of academic systems research? For starters, if it's to be done by graduate students, it will have to be content with smaller innovation. This doesn't mean that it need be any less innovative -- just that the scope of innovation will be naturally narrower. As an extreme example, take the new nohup -p in Solaris 9. While this is a very small body of work, it is exciting and innovative. And yet, most academics would probably dismiss this work as completely uninteresting -- even though most could probably not describe the mechanism by which it works. Is this a dissertation? Certainly not -- and it's not even clear how such a small body of work could be integrated into a larger thesis. But it's original, novel work, and it solves an important and hard (if small) problem. Note, too, that this work is interesting because of the phenomenon that prohibited a naive implementation: any solution that doesn't address the deadlock inherent in the problem isn't actually an acceptable solution. This is an extreme example, but it should make the point that smaller work can be interesting -- as long as it's innovative, robust and thorough.

But if the problems that academic systems researchers work on are going to become smaller, the researchers must have the right foundation upon which to build their work: small work is necessarily more specific, and work is markedly less relevant if it's based on an obsolete system. And (believe it or not) this actually brings us to one of our primary motivations for open sourcing Solaris: we wish to provide complete access to a best-of-breed system that allows researchers to solve new problems instead of revisiting old ones. Will an open source Solaris single-handedly make systems research relevant? Certainly not -- but it should make for one less excuse...

Thursday Jul 08, 2004

Ashlee Vance of The Register has recently compared me to infamous kitchen gadget pitchman Ron Popeil. Let me clear up two misconceptions that have apparently arisen from this comparison. First, despite some claims to the contrary, DTrace cannot be used to make turkey-jerky. And second, the rumors of a DTrace infomercial starring Tom Vu are absolutely false. (That said, it is true that many have used DTrace to work their way up from lowly busboys to yacht-owning multi-millionaires...)

Werner Vogels, a member of the USENIX '04 Program Committee, has written very thoughtful responses to some of my observations. And it's clear that Werner and I see the same problem: there is insufficient industrial/academic cooperation in computer science systems research -- and the lack of cooperation is to the detriment of both groups.

That said, it's clear that there are some different perspectives as to how to address the problem. A common sentiment that I'm seeing in the comments is that it is up to industry to keep USENIX relevant (in Werner's words, "industry will need to be more pro-active in making researchers aware of what the problems are that they need to solve"). I don't entirely agree; in my opinion, the responsibility for keeping USENIX relevant doesn't lie exclusively with industry -- and it doesn't lie exclusively with academia, either. Rather, the responsibility lies with USENIX itself, for it is the mission of USENIX to encourage research with a "practical bias." As such, it is up to USENIX to assemble a Program Committee that will reflect this mission, and it is up to both academia and industry to participate as requested. This means that USENIX cannot simply wait for volunteers from industry to materialize -- USENIX must seek out people in industry who understand both the academic and the industrial sides of systems research, and they must convince these people to work on a Program Committee. Now, I know that this has happened in the past -- and frankly I thought that the USENIX '04 Program Committee was a step in the right direction: where USENIX '03 had four (of sixteen) members from industry, USENIX '04 had six (of seventeen). But unfortunately, USENIX '05 seems to be a marked decline in industry participation, even from USENIX '03: the number from industry has dropped back to four (of eighteen). Worse, all four are from industry labs; where both USENIX '03 and USENIX '04 had at least one product-oriented member from industry, USENIX '05 has none.
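
For readers who want the proportions rather than the raw counts, here is a quick sketch in Python that turns the three committee figures quoted above into percentages. The dictionary and function names are mine, and the only data used are the counts given in the paragraph above.

```python
# Industry members / total members on each Program Committee, as quoted above.
COMMITTEES = {
    "USENIX '03": (4, 16),
    "USENIX '04": (6, 17),
    "USENIX '05": (4, 18),
}


def industry_share(committees):
    """Return the percentage of industry members per committee, to one decimal."""
    return {
        name: round(100 * industry / total, 1)
        for name, (industry, total) in committees.items()
    }


for name, pct in industry_share(COMMITTEES).items():
    print(f"{name}: {pct}% from industry")
```

By these counts the industry share rises from a quarter to roughly a third and then falls below where it started, even though the absolute number for USENIX '05 matches USENIX '03.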

Examining these three years of USENIX brings up an interesting question: what has the Program Committee composition looked like over time? That is, is the situation getting better or worse vis a vis industry participation? To answer this question, I looked at the Program Committee composition for the last nine years. The results are perhaps well-known, but they were shocking to me:

To me, this trend should be deeply disconcerting: an organization that has dedicated itself to research with a "practical bias" is clearly losing that bias in its flagship conference.

So what to do? First, we need some recognition from the USENIX side that this is a serious issue, and that it requires substantial corrective action. I believe that the USENIX Board should charter a committee that consists of academia and industry (both labs and product groups) in roughly equal measure. This committee should hash out some of the misconceptions that each group has of the other, clearly define the problems, develop some long-term (measurable) goals, and make some concrete short- and medium-term recommendations. The deliverable of the committee should be a report summarizing their findings and recommendations -- recommendations that the Board should consider but is obviously free to ignore.

The situation is serious, and there is much work to be done to rectify it -- but I am heartened by the amount of thought that Werner has put into this issue. If we can find more like him from both industry and academia, we can get the "practical bias" back into USENIX.

Tuesday Jul 06, 2004

As I mentioned earlier, I recently returned from USENIX '04, where we presented the DTrace paper. It was a little shocking to me that our paper was the only paper to come exclusively from industry: most papers had no industry connection whatsoever, and the papers that had any authors from industry were typically primarily written by PhD students interning at industry labs. The content of the General Session was thus academic in the strictest sense: it consisted of papers written by graduate students, solving problems in systems sufficiently small to be solvable by a single graduate student working for a reasonably short period of time. The problem is that many of these systems -- to me at least -- are so small as to not be terribly relevant. This is important because relevance is sufficiently vital to USENIX to be embodied in the Mission Statement: USENIX "supports and disseminates research with a practical bias." And of course, there is a more pragmatic reason to seek relevance in the General Session: most of the attendees are from industry, and most of them are paying full-freight. Given that relevance is so critical to USENIX, I was a little surprised that -- unlike most industry conferences I have attended -- there was no way to provide feedback on the General Session. How does the Steering Committee know if the research has a "practical bias" if they don't ask the question?

This leads to the more general question: how do we keep the "practical bias" in academic systems research? Before I try to answer that directly, it's worth looking at the way research is conducted by other engineering disciplines. (After all, one of the things that separates systems from the rest of computer science is its relative proximity to engineering.) To me, it's very interesting to look at the history of mechanical engineering at MIT. In particular, note the programs that no longer exist:

  • Marine engineering, stopped in 1913
  • Locomotive engineering, stopped in 1918
  • Steam turbine engineering, stopped in 1918
  • Engine design, stopped in 1925
  • Automotive engineering, stopped in 1949
Why did these programs stop? It's certainly not because there weren't engineering problems to solve. (I can't imagine that anyone would argue that a 1949 V8 was the ne plus ultra of internal combustion engines.) This is something of an educated guess (I'm not a mechanical engineer, so I trust someone will correct me if I'm grossly wrong here), but I bet these programs were stopped because the economics no longer made sense: it became prohibitively expensive to meaningfully contribute to the state-of-the-art. That is, these specialities were so capital and resource intensive, that they could no longer be undertaken by a single graduate student, or even by a single institution. By the time an institution had built a lab and the expertise to contribute meaningfully, the lab would be obsolete and the expertise would have graduated. Moreover, the disciplines were mature enough that there was an established industry that understood that research begat differentiated product, and differentiated product begat profit. Industry was therefore motivated to do its own research -- which is a good thing, because only industry could afford it.

And what has happened to, say, engine design since the formal academic programs stopped? Hard problems are still being solved, but the way those problems are solved has changed. For example, look at the 2001 program for the Small Engine Technology Conference. A roughly typical snippet:

  • G.P. BLAIR - The Queen's University of Belfast (United Kingdom)
    D.O. MACKEY, M.C. ASHE, G.F. CHATFIELD - OPTIMUM Power Technology (USA)
    Exhaust pipe tuning on a four-stroke engine; experimentation and simulation

  • G.P. BLAIR, E. CALLENDER The Queen's University of Belfast (United Kingdom)
    D.O. MACKEY - OPTIMUM Power Technology (USA)
    Maps of discharge coefficient data for valves, ports and throttles

  • V. LAKSHMINARASIMHAN, M.S. RAMASAMY, Y. RAMACHANDRA BABU TVS-Suzuki (India)
    4 stroke gasoline engine performance optimization using statistical techniques

  • K. RAJASHEKHAR SWAMY, V. HARNE, D.S. GUNJEGAONKAR TVS-Suzuki (India)
    K.V. GOPALKRISHNAN - Indian Institute of Technology (India)
    Study and development of lean burn systems on small 4-stroke gasoline engine

Note that there's some work exclusively by industry, and some work done in conjunction with academia. (There's some work done exclusively by academia, too -- but it's the exception, and it tends to be purely analytical work.) And here's the Program Committee for this conference: Of these, three are clearly academics, and seven are clearly from industry.

Okay, so that's one example of how a traditional engineering discipline conducts joint academic/industrial research. Let's get back to USENIX with a look at the Program Committee for USENIX '05. Note that the mix is exactly the inverse: twelve work for a university and five work for a company. Worse, of those five putatively from industry, all of them work in academic labs. In our industry, these labs have a long tradition of being pure research outfits -- they often have little-to-no product responsibilities. (Which, by the way, is just an observation -- it's not meant to be a value judgement.)

Even more alarming, the makeup of the FREENIX '05 program committee is almost completely from industry. This leads to the obvious question: is FREENIX becoming some sort of dumping ground for systems research deemed to be too "practically biased" for the academy? I hope not: aside from the obvious problem of confusing research problems with business models, having the General Session become strictly academic and leaving the FREENIX track to become strictly industrial effectively separates the academics from the practitioners. And this, in my opinion, is exactly what we don't need...

So how do we keep the "practical bias" in the academic work presented at USENIX? For starters, industry should be better represented on the Program Committee and on the Steering Committee. In my opinion, this is most easily done by eliminating FREENIX (as such), folding the FREENIX Program Committee into the General Session Program Committee, and then having an interleaved "practical" track within the General Session. That is, alternate between practical sessions and academic ones -- forcing the practitioners to sit through the academic sessions and vice versa.

That may be too radical, but the larger point is that we need to start having an honest conversation: how do we prevent USENIX from becoming irrelevant?

Saturday Jul 03, 2004

So I just got back from USENIX '04, and I had planned to spend the flight writing up some observations on the conference. Unfortunately, these observations -- as pithy as they no doubt will be -- will have to wait: I ended up spending the flight inhaling Bringing Down the House by Ben Mezrich. While the book itself is not very well written,1 the subject is fascinating: a well-disciplined (and apparently successful) card-counting team from MIT. The book was brain candy in the purest sense: it was exhilarating and fun -- but it definitely ruined my dinner.

If you're looking for something with a little more meat in it, check out Tom Bass's classic, The Eudaemonic Pie. Bass's subjects are more interesting to me, if only because the problem they're solving is so much harder: a group of physicists and computer scientists develop a device to give them an advantage over roulette. (After all, it's just Newtonian physics, right?) And if the idea sounds incredibly implausible, just wait until you see how they implemented it. And while the "Bringing Down the House" protagonists seem destined for a life of overpaid corporate consulting and/or 12 step programs, the leader of the "Eudaemonic" tribe, Doyne Farmer, now writes papers for academic journals like Quantitative Finance from his roost at The Santa Fe Institute. Meatier stuff, to be sure.


1 The author had an incredibly difficult time separating himself from the story -- I don't particularly care if a stripper was "on his lap" for an interview, and I care even less that he knew the principal protagonist through "a friend from Harvard." I didn't drop fifteen bones to read "The Making of 'Bringing Down the House'"...