Prior to Solaris 10, the kernel's cyclic subsystem on x86, and with it the
high-resolution user-level real-time timer (see timer_create(3RT)), had a
default resolution of 10 milliseconds.
Then there was a bug report (4830628) where system time as reported by
mpstat on idle CPUs was much greater than expected on multi-processor x86
systems.
At the time, our timer interrupts on x86 worked like this: only the boot
CPU was programmed to periodically generate timer interrupts (by default
at 10 millisecond intervals). Upon handling such an interrupt on the boot
CPU, the cyclic backend would then send a broadcast cross call to the
other CPUs. The time spent handling the cross-call interrupt on the other
CPUs was accounted as system time in mpstat.
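
To put rough numbers on the old scheme, here is a minimal sketch in C. It is
only a model of the description above, not kernel code: TICK_NS,
xcalls_per_sec and periodic_fire_time are hypothetical names, and the model
assumes that a timer expiring between two ticks is only noticed at the next
tick.

```c
#include <stdint.h>

/* Assumed default tick interval of the old scheme: 10 ms, in nanoseconds. */
#define TICK_NS 10000000ULL

/*
 * Cross-call interrupts handled per second, summed over all non-boot CPUs,
 * when the boot CPU broadcasts one cross call per tick.
 */
static uint64_t
xcalls_per_sec(unsigned ncpus)
{
	uint64_t ticks_per_sec = 1000000000ULL / TICK_NS;

	if (ncpus <= 1)
		return (0);
	return (ticks_per_sec * (ncpus - 1));
}

/*
 * Earliest time a timer due at expire_ns can fire with a periodic tick:
 * the expiry is rounded up to the next tick boundary.
 */
static uint64_t
periodic_fire_time(uint64_t expire_ns)
{
	return (((expire_ns + TICK_NS - 1) / TICK_NS) * TICK_NS);
}
```

Under these assumptions a 4-CPU system handles 300 cross-call interrupts per
second on its non-boot CPUs even when idle, and a timer due at 12 ms does not
fire until the 20 ms tick.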
I could have fixed the bug in several ways. Re-implementing the timer
interrupt mechanism, as originally suggested by Jonathan, was the best one.
It needed to be done anyway because it brought a lot of benefits: in
addition to fixing the bug, we'd get a higher resolution for the kernel's
cyclic subsystem and for the high-resolution user-level real-time timer.
The new implementation uses a per-CPU timer source. Each CPU generates its
own timer interrupts instead of relying on a cross call from the boot CPU.
I took advantage of the local APIC's support for one-shot programming:
instead of programming the timer to generate periodic interrupts, one-shot
mode is used to schedule just the next interrupt. This way the resolution
for getting the next interrupt for the cyclic subsystem was increased to
match the resolution of the local APIC, e.g. on the order of
10 nanoseconds :-). That is just great. We also no longer have the overhead
of cross-call traffic on the bus.
The cyclic backend already had the infrastructure to support this. I
updated the interface to our pcplusmp module, a platform-specific module
(psm) on x86 that deals with the APIC. I added four new functions to the
psm_ops structure:
(see
psm_types.h)
called to generate an interrupt at the time specified by the passed in
argument.
psm_timer_enable and psm_timer_disable enable and disable the timer
interrupts on the CPU they are called on. Since the cyclic setup on x86 is
intertwined with the psm setup for the timer, psm_post_cyclic_setup was
introduced so that the psm module itself