Reflections on OS integration Eric Schrock's Weblog
Musings about Fishworks, Operating Systems, and the software that runs on them.

Sunday Nov 21, 2004

So it's no secret that AMD and Intel are in a mad sprint to the finish for dual-core x86 chips. The official AMD roadmap and public demos have all shown AMD well on track. The latest tidbits of information indicate Linux is up and running on these dual-core systems. Very cool.

Given our close relationship with AMD and the sensitive nature of hardware plans, I'll refrain from saying what we may or may not have running in our labs. But Solaris has some great features that make it well-suited for these dual-core chips. First of all, Solaris 10 has had support for both Chip Multi Threading (hyperthreading) and Chip Multi Processing (multi core) for about a year and a half now. Solaris has also been NUMA-aware for much longer (with the current lgroups coming in mid-2001, or Solaris 9). I'm sure AMD has made these cores appear as two processors for legacy purposes, but with a few cpuid tweaks, we'll see them as sibling cores and get all the benefits inherent in Solaris 10 CMP.

Despite this, the NUMA system in Solaris is undergoing drastic change due to the Opteron memory architecture. While Solaris is NUMA-aware, it uses a simplistic memory hierarchy based on the physical architecture of Sun's high-end SPARC systems. We have the notion of a "locality group", which represents the logical relationship of CPUs and memory. Currently, there are only two notions of locality - "near" and "far". Solaris tries its best to keep logically connected memory and processes in the same locality group. On Opteron, things get a bit more complicated due to the integrated memory controller and HyperTransport layout. On 4-way machines the processors are laid out in a square, and on 8-way machines we have a ladder formation. Memory transfers must pass through neighboring memory controllers, so now memory could be "near", "far", or "farther". We're revamping the current lgroup system to support arbitrary memory hierarchies, which should produce some nice performance gains on 4- and 8-way Opteron machines. Hopefully one of the NUMA folks will blog some more detailed information once this project integrates.
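To make the "near", "far", "farther" idea concrete, here is a rough Python sketch that counts HyperTransport hops between sockets. It is only an illustration, not how lgroups are implemented: the socket numbering, the exact link layouts, and the helper names (`build_links`, `hop_distances`, `locality_levels`) are all assumptions made up for this example.

```python
from collections import deque

def build_links(edges):
    """Turn a list of undirected links into an adjacency map."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links

def hop_distances(links, start):
    """Breadth-first search: links crossed from start to every socket."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def locality_levels(links, cpu):
    """Group sockets (and their local memory) by hop count from one CPU."""
    levels = {}
    for node, hops in hop_distances(links, cpu).items():
        levels.setdefault(hops, []).append(node)
    return {hops: sorted(nodes) for hops, nodes in sorted(levels.items())}

# Assumed 4-way layout: four sockets on the corners of a square.
SQUARE = build_links([(0, 1), (1, 3), (3, 2), (2, 0)])

# Assumed 8-way layout: a plain 2x4 ladder (two rails joined by four rungs).
LADDER = build_links([
    (0, 1), (1, 2), (2, 3),          # top rail
    (4, 5), (5, 6), (6, 7),          # bottom rail
    (0, 4), (1, 5), (2, 6), (3, 7),  # rungs
])

print(locality_levels(SQUARE, 0))  # {0: [0], 1: [1, 2], 2: [3]}
print(locality_levels(LADDER, 0))
```

In the square, socket 0 sees its own memory at zero hops, two neighbors at one hop, and the opposite corner at two hops, which is exactly the three-level picture. In the assumed ladder the number of levels depends on which socket you start from, which is why a fixed two-level scheme stops being enough.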

In conclusion: Opterons are cool, but dual-core Opterons are cooler. And Solaris will rip on both of them.

Given that the amd64 ABI is nearly set in stone, and (as pointed out in comments on my last entry) future OpenSolaris ports could run into similar problems on other architectures (like PowerPC), you may wonder how we can make life easier in Solaris. In this entry I'll elaborate on two possibilities. Note that these are little more than fantasies at the moment - no real engineering work has been done, nor is there any guarantee that they will appear in a future Solaris release.

DWARF Support for MDB

Even though DWARF is a complex beast, it's not impossible to write an interpreter. It's just a matter of doing the work. The more subtle problem is designing it correctly, and making the data accessible in the kernel. Since MDB and KMDB are primarily kernel or post-mortem userland tools, this has not been a high priority. CTF gives us most of what we need, and including all the DWARF information in the kernel (or corefiles) is prohibitively expensive. That being said, there are those among us that would like to see MDB take a more prominent userland role (where it would compete with dbx and gdb), at which point proper DWARF support would be a very nice thing to have.

If this is done properly, we'll end up with a debugging library that's format-independent. Whether the target has CTF, STABS, or DWARF data, MDB (and KMDB) will just "do the right thing". No one argues that this isn't a cool idea - it's just a matter of engineering resources and business justification.
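As a very rough sketch of what "format-independent" could mean, the Python below hides each debug format behind one lookup interface. None of this reflects MDB's actual design: `TypeSource`, `TableSource`, and `Debugger` are hypothetical names, and the backends are plain dictionaries standing in for real CTF or DWARF parsers.

```python
from abc import ABC, abstractmethod

class TypeSource(ABC):
    """One debug-data format (hypothetical interface)."""
    name = "unknown"

    @abstractmethod
    def lookup(self, symbol):
        """Return a type description for symbol, or None if absent."""

class TableSource(TypeSource):
    """Stand-in backend that reads from a dict instead of a real parser."""
    def __init__(self, name, table):
        self.name = name
        self._table = table

    def lookup(self, symbol):
        return self._table.get(symbol)

class Debugger:
    """Asks each available backend in turn; callers never pick a format."""
    def __init__(self, sources):
        self._sources = list(sources)

    def type_of(self, symbol):
        for source in self._sources:
            found = source.lookup(symbol)
            if found is not None:
                return found, source.name
        return None, None

# Made-up sample data: CTF knows the kernel type, DWARF also knows main().
ctf = TableSource("CTF", {"proc_t": "struct proc"})
dwarf = TableSource("DWARF", {"proc_t": "struct proc",
                              "main": "int (int, char **)"})
dbg = Debugger([ctf, dwarf])

print(dbg.type_of("proc_t"))  # ('struct proc', 'CTF')
print(dbg.type_of("main"))    # ('int (int, char **)', 'DWARF')
```

The point of the shape is that a dcmd asking for a type never needs to know which format answered; the cheaper source is simply tried first.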

Programmatic Disassembler

The alternative solution is to create a disassembler library that understands code at a semantic level. Once you have a disassembler that understands the logical breakdown of a program, you can determine (via simulation) the original argument values to functions. Of course, it's not always guaranteed to work, but you'll always know when you're guessing (even DWARF can't be correct 100% of the time). This requires no debugging information, only the machine text. It will also help out the DTrace pid provider, which has to wrestle with jump tables and other weird compiler-isms. Of course, this is monumentally more difficult than a DWARF parser - especially on x86.
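Here is a toy Python sketch of the simulation idea. It is not the libdis prototype: the two-opcode instruction set (`mov`, `clobber`), the tuple encoding, and the function name `locate_argument` are all invented for illustration, and a real x86 decoder is vastly harder. The one real-world fact it leans on is that the amd64 ABI passes the first integer argument in `rdi`.

```python
def locate_argument(instructions, arg_reg="rdi"):
    """Symbolically run a straight-line instruction list and report which
    locations still hold the function's original first argument."""
    # Map each tracked location (register or stack slot) to a symbolic value.
    holds = {arg_reg: "arg0"}
    for op, dst, src in instructions:
        if op == "mov":
            # dst now holds whatever src held; untracked sources mean unknown.
            value = holds.get(src)
            if value is None:
                holds.pop(dst, None)
            else:
                holds[dst] = value
        elif op == "clobber":
            # Any other write (arithmetic, load, call result) destroys dst.
            holds.pop(dst, None)
        else:
            raise ValueError(f"unknown toy opcode: {op}")
    return sorted(loc for loc, value in holds.items() if value == "arg0")

# Hypothetical instruction stream between function entry and the current pc.
PROLOGUE = [
    ("mov", "rbx", "rdi"),      # save the argument in a callee-saved register
    ("mov", "[rbp-8]", "rdi"),  # spill a second copy to the stack
    ("clobber", "rdi", None),   # rdi is reused for an outgoing call
]

print(locate_argument(PROLOGUE))  # ['[rbp-8]', 'rbx']
```

An empty result is the interesting case: the simulator has lost track of the value, so a debugger built on it would know that anything it prints for that argument is a guess.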

This idea (along with a prototype) has been around for many years. The converted have prophesied that libdis will bring peace to the world and an end to world hunger. As with many great ideas, there just hasn't been justification for devoting the necessary engineering resources. But if it can get the arguments to functions on amd64 correct in 98% of the situations, it would be incredibly valuable.

OpenSolaris Debugging Futures

There are a host of other ideas that we have kicking around here in the Solaris group. They range from pretty mundane to completely insane. As OpenSolaris finishes getting in gear, I'm looking forward to getting these ideas out in the public and finding support for all the cool possibilities that just aren't high enough priority for us right now. The existence of a larger development community will also make good debugging tools a much better business proposition.