More on BIOSes
When I was building my current desktop box, I knew I was close to the bleeding edge. Today, with a lot of help from sethg, I finally found out that I was over the edge with the built-in AC97 audio device.
To start with, I built the box around the Gigabyte K8-NSPro motherboard. I had some initial issues with the chipset (nForce3-250) and not being able to see the plug-in PCI bus. Seth helped me with that as well (this was back when Solaris 10 build 63 was new), and I had a Sound Blaster card installed, so the audio seemed quite OK.
Then a few months ago I ripped out the Sound Blaster card to rebuild the box from the ground up for a fresh Nevada install. And the audio didn't really work too well. It was scratchy and interrupted, both with Sun's audio810(7D) driver and with Jurgen Keil's audioi810(7D) driver from tools.de.
Jurgen suggested I look at the interrupt mappings, which I did, and found that the AC97 and USB devices were using the same IRQ. Since I don't know how to remap this without going to the BIOS (there's magic to do it, which I don't know and don't really care to know), I did that... So now I had AC97 and USB using different IRQs. Still no good with the output.
Last night I tried running intrstat(1M) in an effort to see what was going on. It showed something on the order of 85000 interrupts/second for audio810#0. Surely that couldn't be correct? Then I fired up dtrace with /usr/demo/dtrace/intr.d.
BIG MISTAKE!!!
I encountered what could easily be mistaken for a hard hang, but was probably just my desktop box being kicked and pummelled and punched and ... which resulted in it getting the one-finger salute.
So after I posted to dtrace-discuss-AT-opensolaris-DOT-org asking for some assistance, sethg got in touch with me and poked around inside my kernel's ACPI tables for a few hours.
The end result is that we've discovered a new failure mode, and it looks like my motherboard ignores the laws of physics!
To quote Seth's response on dtrace-discuss-AT-opensolaris-DOT-org:
The wrong interrupt controller input was programmed with the wrong polarity, causing continuous interrupts to be sent to the CPU. Hey, at least now James knows the maximum theoretical # of interrupts / seconds his system can process... Specifically, the ACPI tables were lying about the interrupt polarity for a particular set of interrupts.
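For anyone wondering how a wrong polarity turns into "continuous interrupts", here is a toy simulation in Python. It is only a sketch under my own assumptions, not what the Solaris kernel or the nForce3 hardware actually does: the names `Polarity` and `count_interrupts`, the tick model, and the choice of an active-low device are all made up for illustration. The idea is that a level-triggered input fires whenever the line sits at whatever level the controller believes is "active", so if the controller is told the opposite polarity, the idle state of the line looks like a permanent interrupt request.

```python
from enum import Enum


class Polarity(Enum):
    """Line level that the interrupt controller treats as 'asserted'."""
    ACTIVE_LOW = 0
    ACTIVE_HIGH = 1


def count_interrupts(line_levels, polarity):
    """Count interrupts seen on a level-triggered input.

    line_levels: 0/1 samples of the physical line, one per tick.
    polarity:    the polarity the controller was programmed with.

    In this toy model the input fires on every tick where the line is at
    the programmed 'active' level.
    """
    return sum(1 for level in line_levels if level == polarity.value)


# Hypothetical active-low device: the line idles high (1) and is pulled
# low (0) only on the three ticks where the device really wants service.
ticks = [1] * 1000
for t in (100, 400, 700):
    ticks[t] = 0

correct = count_interrupts(ticks, Polarity.ACTIVE_LOW)
wrong = count_interrupts(ticks, Polarity.ACTIVE_HIGH)

print("programmed correctly:", correct)
print("programmed with wrong polarity:", wrong)
```

With the right polarity the model sees only the three real requests. With the wrong one it sees an interrupt on nearly every tick, which is the toy version of the storm intrstat was reporting.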
At least with ACPI we can provide the system with a new table for the kernel to blast in and make use of. So now I'm happily listening to non-scratchy and uninterrupted audio, and thanking the dedication of a fellow engineer on the other side of the world. Thanks Seth!