Alan Hargreaves' Weblog
The ramblings of an Australian SaND TSC* Principal Field Technologist
* Solaris and Network Domain Technology Support Centre - The group I work for
Sunday Jul 31, 2005
How Solaris Calculates %user, %system and %idle
A year or so back I wrote an infodoc that described how we calculated the %iowait (or %wio) number. I had always intended to create a companion document outlining the broader question of how we do the %user, %system and %idle numbers. A few misconceptions that I've seen have prompted me to do this as a 'blog first.
The first thing that must be noted is that with Solaris 10 and microstate accounting we completely changed how the numbers are arrived at. I'll go over the pre-Solaris 10 method first, then discuss the current method along with links into the OpenSolaris source tree.
Before Solaris 10
There is an array in cpu_t called cpu_stats.sys.cpu[]. This array contains counters for:
CPU_USER
CPU_SYSTEM
CPU_WAIT
CPU_IDLE
The various array entries contain a count based on a sample (taken at fixed intervals) of what each cpu is doing at the time of the sampling. In order to determine usage, we must take two snapshots of these counters and look at the differences.
If we sum these differences, we get a count of how many samples were taken for this particular cpu. We then simply calculate a percentage for each of the figures.
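The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation, not the code that any Solaris tool actually runs; the function name and both snapshots are made up for the example.

```python
# Counter names follow the cpu_stats.sys.cpu[] entries listed above.
STATES = ("CPU_USER", "CPU_SYSTEM", "CPU_WAIT", "CPU_IDLE")

def usage_percentages(before, after):
    """Turn two snapshots of the sample counters into percentages."""
    # Each counter only ever grows, so the difference is the number of
    # samples that landed in that state between the two snapshots.
    deltas = {s: after[s] - before[s] for s in STATES}
    samples = sum(deltas.values())  # total samples taken for this cpu
    if samples == 0:
        raise ValueError("no samples were taken between the snapshots")
    return {s: 100.0 * deltas[s] / samples for s in STATES}

# Hypothetical snapshots taken 1000 samples apart (10 seconds at 10ms).
before = {"CPU_USER": 5196, "CPU_SYSTEM": 11557, "CPU_WAIT": 0, "CPU_IDLE": 373121}
after = {"CPU_USER": 5296, "CPU_SYSTEM": 11607, "CPU_WAIT": 0, "CPU_IDLE": 373971}

result = usage_percentages(before, after)
print(result)  # 10% user, 5% system, 0% wait, 85% idle
```

Note that a single snapshot on its own only tells you the split since boot, which is why two snapshots are needed for a "right now" figure.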
Okay, so how do we do the sampling?
Usually, the function clock() is called every 10ms [1]. We do the sampling in here. For each cpu, we look at what it is currently executing and increment the appropriate counter. Note that in Solaris 9 and earlier we still have a counter for I/O wait. This number is only calculated if the cpu is idle. See infodoc 75659 for more explanation of this.
The values are accessible through the cpu_stat kstat module as idle, user, kernel and wait, e.g.
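As a rough model of that sampling pass, here is a Python sketch. It is not the kernel code: the function name, the way a cpu's activity is described, and in particular the io_pending flag used to pick wait over idle are simplifying assumptions for illustration (the real rules for wait are in the infodoc).

```python
from collections import Counter

def clock_tick(cpu_activity, counters):
    """Model of one sampling pass: bump exactly one counter per cpu."""
    for cpu_id, (running, io_pending) in cpu_activity.items():
        if running == "user":
            state = "CPU_USER"
        elif running == "kernel":
            state = "CPU_SYSTEM"
        elif io_pending:
            # Assumption: an idle cpu is charged to wait when I/O is
            # outstanding. Wait is only ever considered for an idle cpu.
            state = "CPU_WAIT"
        else:
            state = "CPU_IDLE"
        counters[cpu_id][state] += 1

# Two hypothetical ticks on a two-cpu machine.
counters = {0: Counter(), 1: Counter()}
clock_tick({0: ("user", False), 1: (None, True)}, counters)
clock_tick({0: ("kernel", False), 1: (None, False)}, counters)
print(counters)
```

The point to take away is that each tick adds one to exactly one counter per cpu, no matter what happened between ticks.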
$ kstat -m cpu_stat -s '/^(wait|user|kernel|idle)$/'
module: cpu_stat instance: 0
name: cpu_stat0 class: misc
idle 373121
kernel 11557
user 5196
wait 0
1. If we set hires_tick to a non-zero value in /etc/system, then clock() will be called every millisecond.
Solaris 10 and Beyond
In general, the sampling method gives us a pretty good number. It would be an unusual thread that consumes a significant amount of cpu time and yet manages to be off cpu every time that clock() runs. However, implementing microstate accounting gave us the opportunity to make it even better.
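To make the "unusual thread" concrete, here is a small Python simulation. The thread is entirely hypothetical and deliberately pathological: it runs for 5ms out of every 10ms, but always between ticks. The function names are mine, and time is modelled in whole milliseconds for simplicity.

```python
TICK_MS = 10  # clock() interval from the text

def on_cpu(t_ms):
    """Hypothetical thread: on cpu from 1ms up to (not including) 6ms
    of every 10ms period, so never at the instant clock() fires."""
    return 1 <= t_ms % TICK_MS < 6

def sampled_busy_fraction(duration_ms):
    """What tick-based sampling reports: only looks at t = 0, 10, 20, ..."""
    ticks = range(0, duration_ms, TICK_MS)
    hits = sum(1 for t in ticks if on_cpu(t))
    return hits / len(ticks)

def actual_busy_fraction(duration_ms):
    """What the thread really used, counted millisecond by millisecond."""
    busy = sum(1 for t in range(duration_ms) if on_cpu(t))
    return busy / duration_ms

print(sampled_busy_fraction(1000))  # 0.0, sampling never sees the thread
print(actual_busy_fraction(1000))   # 0.5, yet it used half the cpu
```

Real workloads rarely line up with the tick like this, which is why the sampled numbers were usually fine, but measuring the time actually spent in each state removes the possibility altogether.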
The raw numbers are now kept in an array in cpu_t called cpu_acct[]. This contains entries for:
CMS_USER
CMS_SYSTEM
CMS_IDLE
There is a state called CMS_DISABLED, but it's used for something else and there is not an array element for it.
So what are the numbers? We don't sample anymore. Each number is the accumulated time, in nanoseconds from the high resolution timer, that the cpu has spent in that state, measured from the moment the cpu entered the state to the point where we are about to change it. The values are calculated in new_cpu_mstate().
The current state is saved in cpu->cpu_mstate. On a state change, the high resolution time is stored in cpu->cpu_mstate_start.
new_cpu_mstate() reads the high resolution timer once at the beginning of the routine. This time is used for the end of the period being measured and the start of the new period so we do not lose small numbers of cycles.
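The bookkeeping described above can be modelled in Python as follows. This is a sketch of the idea only, not a translation of the kernel routine: the Cpu class, the numeric values of the CMS_ constants and the injected timer function are all assumptions made for the example. The field names mirror the cpu_t members mentioned in the text.

```python
# Arbitrary values for illustration; only used as array indexes here.
CMS_USER, CMS_SYSTEM, CMS_IDLE = range(3)

class Cpu:
    """Stand-in for the parts of cpu_t discussed in the text."""
    def __init__(self, now_ns, state=CMS_IDLE):
        self.cpu_acct = [0, 0, 0]        # nanoseconds spent in each state
        self.cpu_mstate = state          # current state
        self.cpu_mstate_start = now_ns   # when we entered it

def new_cpu_mstate(cpu, new_state, read_timer):
    # Read the timer exactly once. The same value closes the old period
    # and opens the new one, so no time falls between the two.
    now = read_timer()
    cpu.cpu_acct[cpu.cpu_mstate] += now - cpu.cpu_mstate_start
    cpu.cpu_mstate = new_state
    cpu.cpu_mstate_start = now

# Drive it with a fake timer returning hypothetical timestamps (ns).
timestamps = iter([1000, 4000, 4500])
read_timer = lambda: next(timestamps)

cpu = Cpu(now_ns=0)                        # idle at t=0
new_cpu_mstate(cpu, CMS_USER, read_timer)    # idle for 1000ns
new_cpu_mstate(cpu, CMS_SYSTEM, read_timer)  # user for 3000ns
new_cpu_mstate(cpu, CMS_IDLE, read_timer)    # system for 500ns
print(cpu.cpu_acct)  # [3000, 500, 1000]
```

The three buckets add up to 4500, the last timestamp, which is the property the single timer read is there to guarantee.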
This function is called whenever we change state. It's called directly from idle_enter() and idle_exit(). The other state changes are handled from new_mstate(), which also updates per-lwp statistics. The following functions and macros call new_mstate():
SEMA_BLOCK()
cv_block()
fp_precise()
fpu_trap()
lwp_block()
lwp_cond_wait()
lwp_mutex_timedlock()
lwp_mutex_trylock()
lwp_park()
lwp_rwlock_lock()
sched()
shuttle_resume()
shuttle_sleep()
shuttle_swtch()
stop()
term_mstate()
trap()
turnstile_block()
The upshot of this is that you can probably place more reliance on the figures now, whereas the previous figures were a little more coarsely grained.
The kstats for the previous figures still exist as distinct structure elements in cpu->cpu_stats.sys, the difference being that they are now calculated from the microstate accounting generated figures.
The new figures can be accessed through the new cpu kstat module which has a grouping called sys, containing the statistics cpu_nsec_idle, cpu_nsec_kernel and cpu_nsec_user.
$ kstat -n sys -s 'cpu_nsec*'
module: cpu instance: 0
name: sys class: misc
cpu_nsec_idle 3626708012091
cpu_nsec_kernel 113348790642
cpu_nsec_user 50875403788
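As a quick illustration of what can be done with these counters, the Python sketch below parses the three statistic lines from the output above and turns them into percentages. The function names are mine, and the parsing assumes the simple "name value" layout shown here rather than any particular kstat output option.

```python
KSTAT_OUTPUT = """\
cpu_nsec_idle 3626708012091
cpu_nsec_kernel 113348790642
cpu_nsec_user 50875403788
"""

PREFIX = "cpu_nsec_"

def parse_nsec(text):
    """Pick out the cpu_nsec_* lines and return {state: nanoseconds}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith(PREFIX):
            stats[fields[0][len(PREFIX):]] = int(fields[1])
    return stats

def nsec_percentages(stats):
    """Share of the accumulated time spent in each state."""
    total = sum(stats.values())
    return {name: 100.0 * ns / total for name, ns in stats.items()}

pct = nsec_percentages(parse_nsec(KSTAT_OUTPUT))
for name, value in pct.items():
    print(f"{name:8s}{value:6.2f}%")
```

Because the counters are cumulative, this gives the split since they started counting. For a figure covering a particular interval, take two snapshots and run the same calculation on the differences.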
Posted at 09:14AM Jul 31, 2005 by Alan Hargreaves in Solaris | Comments[3]


Posted by Michael Ernest on August 04, 2005 at 11:30 PM EST #
You're very welcome Michael, and I learned something while researching it too, which is also good :)
Alan.
Posted by Alan Hargreaves on August 05, 2005 at 08:02 AM EST #
Posted by Sandeep Thakur on August 11, 2005 at 07:32 AM EST #