Reflections on OS integration
Eric Schrock's Weblog
Musings about Fishworks, Operating Systems, and the software that runs on them.

Thursday Jun 17, 2004

Before I start looking at some of the problems we're addressing in Solaris, I want to step back and examine one of the more fundamental problems I've been seeing in the industry. In order to develop more powerful software quickly, we insert layers of abstraction to distance ourselves from the actual implementation (let someone else worry about it). There is nothing inherently wrong with this; no one is going to argue that you should write your business-critical web service in assembly language instead of Java using J2EE. The problem comes from the disturbing trend that programmers are increasingly less knowledgeable about the layers upon which they build.

Most people can sit down and learn how to program Java or C, given enough time. The difference between an average programmer and a gifted programmer is the ability to truly understand the levels above and below where one works. For the majority of us [1], this means understanding two things: our development platform and our customers. While understanding customer needs is a difficult task, a more tragic problem is the failure of programmers to understand their immediate development environment.

If you are a C programmer, it is crucial that you understand how virtual memory works, what exactly happens when you write to a file descriptor, and how threads are implemented. You should understand what a compiler does, how the dynamic linker works, and how the assembly code the compiler generates really works. If you are a Java programmer, you need to understand how garbage collection works, how Java bytecodes are interpreted, and what JIT compiling really does. Just because you don't need to know how the OS works doesn't mean you shouldn't. Knowing your environment encourages good software practice, as well as making you more effective at solving problems when they do occur.
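To make the file descriptor point concrete, here is a minimal sketch in C. It assumes a POSIX system, and `write_all` is a hypothetical helper name, not something from Solaris or from this post. It illustrates one detail that is easy to miss if you treat `write(2)` as a black box: the call may transfer fewer bytes than you asked for, or fail with `EINTR` if a signal arrives first, so a careful caller loops.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * write_all: hypothetical helper, shown for illustration only.
 * Keeps calling write(2) until all len bytes have been handed to the
 * kernel. Returns 0 on success, or -1 with errno set by write(2).
 */
int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(buf != NULL || len == 0);

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal; retry */
			return (-1);		/* real error; errno says why */
		}
		p += n;				/* short write: advance and go again */
		len -= (size_t)n;
	}
	return (0);
}
```

A caller would use it wherever a bare `write()` appears, for example `write_all(STDOUT_FILENO, msg, strlen(msg))`. Short writes are rare on regular files but routine on pipes, sockets, and terminals, which is exactly the kind of behavior you only anticipate if you know what sits underneath the descriptor.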

Unfortunately, not everything in the world is somebody else's fault. As these new layers are added, things tend to become less and less observable. Not to mention that poor documentation can ruin an otherwise great tool. You may understand how a large number of cross calls can be the product of a misbehaving application, but if you can't determine where they're coming from, what's the point? We in the Solaris group develop tools to provide useful layers of abstraction, as well as tools that rip the hood off [2] so you can see what's really happening inside.

Before I start examining some of these tools, I just wanted to point out that there is a large human factor involved that is out of our hands. The most powerful tool in the world can be useless in the hands of someone without a basic understanding of their system. Hopefully by providing these tools we can simultaneously expose the inner workings while sparking a desire to learn about them. So take some time to read a good book once in a while.


[1] For us kernel developers, the landscape is a little different. We have to work with a multitude of different hardware (development platforms) in order to provide an effectively limitless number of solutions for our customers. I like to think that kernel engineers are pound-for-pound the best group of programmers around because of this, but maybe that's just my ego talking.

[2] Scott is in the habit of talking about how we sell cars, not auto parts. Does that mean we also provide the Jaws of Life to save you after you crash your new Enzo?

So begins my first blog post ever.

I have been a Solaris Kernel Engineer for 10 months now after graduating from my alma mater. Since I joined so late in the Solaris 10 development cycle, I have not had the pleasure of working on one of the larger S10 projects such as DTrace, Zones (N1 Grid Containers), FMA (Predictive Self Healing), or ZFS. But this has given me the unique opportunity to attack bits and pieces of Solaris from all directions. In particular, I have spent more than a few lonely nights with mdb, procfs, and the ptools. I've enjoyed growing up in this playground built by Mike Shapiro, Adam Leventhal, Roger Faulkner, and those who came before me. More recently, I have been drafted into service for the AMD64 (Opteron) army, selflessly sacrificing my free time for the good of our porting effort.

From here, I will most likely continue to post about Solaris development as well as general software principles. You'll likely see a focus on software observability, debugging, and complexity. This comes with the territory, as you can see from Bryan's blog. It is not a coincidence that we kernel engineers share similar views and goals. It is an essential part of the philosophy that makes Solaris what it is today: a robust, reliable, manageable, serviceable, and observable operating system.