Dhanaraj M

Tuesday Aug 25, 2009

Solaris - Tickless Kernel Architecture

To save power, a CPU can be kept in a low-power state while it is idle.
A tickless kernel architecture makes this possible: the clock cyclic fires
only when an event requires it, rather than periodically on every tick.
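
The difference can be illustrated with a small simulation. This is only a sketch, not Solaris code: the function names, the 10 ms tick, and the event times are assumptions chosen for illustration. It counts how often an idle CPU would be woken in a one-second window by a periodic clock versus an event-driven one.

```python
def periodic_wakeups(window_ns, tick_ns):
    """A periodic clock wakes the CPU once per tick, whether or not work is due."""
    return window_ns // tick_ns


def event_wakeups(window_ns, event_times_ns):
    """An event-driven clock wakes the CPU only at the distinct times work is due."""
    return len({t for t in event_times_ns if 0 <= t < window_ns})


TICK_NS = 10_000_000        # assumed 10 ms tick (100 Hz), for illustration only
WINDOW_NS = 1_000_000_000   # one-second observation window
EVENTS_NS = [250_000_000, 250_000_000, 900_000_000]  # hypothetical timer expirations

print(periodic_wakeups(WINDOW_NS, TICK_NS))  # 100
print(event_wakeups(WINDOW_NS, EVENTS_NS))   # 2
```

With no pending events, the event-driven count drops to zero, so the CPU can stay in its low-power state for the whole window.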

For more details about the project, visit here
CR 6567390 clock efficiency optimizations ('tickless' clock)
CR 6875377 clock() service decomposition

This project is divided into several tasks:
# Callout / Timeout Re-implementation
    - Integrated into snv_103, tracked by CR 6565503 (discussed in my previous blog post)
# clock() decoupled lbolt / lbolt64
# Event based historical load average implementation
# Software PLL time adjustment / NTP, tod migration
# Tick processing, Tick accounting
    - CR 6860423 thread tick accounting and time slice enforcement needs to be tickless

Monday Aug 24, 2009

How scalable is Solaris?

Recently, we have started seeing customers run multi-threaded applications that use
CPUs and disks heavily. Increasing the number of CPUs speeds up processing significantly.
How is the OS going to support this? Madhavan.T Venkataraman contributed two scalability projects to
the Solaris community, and the details are given below.

In Solaris, on every tick, clock() performs the following bookkeeping and accounting activities:
   - Lbolt updates
   - Load average computations
   - User thread Tick accounting
   - Callouts
   - Miscellaneous activities

Tick accounting needs to be made scalable (CR 6619224)

On every tick, the following tasks are performed as part of tick accounting:
 - The user thread running on a CPU is charged with one tick
 - CPU time used by the user thread is accounted for
 - The time quantum used by the thread is updated
 - Dispatching decisions are made using this information
 - LWP interval timers (virtual/profiling timers) are processed
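
The per-thread part of these steps can be sketched as follows. This is a simplified model, not the kernel's code: the `Thread` class, its fields, and `charge_tick()` are hypothetical names, and LWP interval timers are left out.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    quantum_left: int            # ticks remaining in the current time slice
    ticks_charged: int = 0       # total CPU time charged so far, in ticks
    needs_resched: bool = False  # set when the dispatcher should pick another thread


def charge_tick(thread):
    """Charge one tick to the thread running on a CPU and check its quantum."""
    thread.ticks_charged += 1
    thread.quantum_left -= 1
    if thread.quantum_left <= 0:
        # Time slice used up: flag the thread so a dispatching decision is made.
        thread.needs_resched = True


t = Thread("worker", quantum_left=2)
charge_tick(t)
charge_tick(t)
print(t.ticks_charged, t.needs_resched)  # 2 True
```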

The clock() handler walks the CPU list and performs tick accounting for each CPU.
As the number of CPUs increases, the tick accounting loop takes longer.
Because only one CPU does this work, tick accounting is single-threaded.

In this project, CPU sets are created and a cross-trap is sent to one of the CPUs in each set.
As a result, the clock() handler schedules tick accounting on multiple CPUs.
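
A rough sketch of the idea follows. The post does not spell out how sets are sized, which CPU in a set receives the cross-trap, or exactly which CPUs that target accounts for, so the set size, the choice of the first CPU as target, and the function names below are all assumptions made for illustration.

```python
def make_cpu_sets(cpu_ids, set_size):
    """Split the CPU list into consecutive sets of at most set_size CPUs."""
    return [cpu_ids[i:i + set_size] for i in range(0, len(cpu_ids), set_size)]


def schedule_tick_accounting(cpu_ids, set_size):
    """Return {target_cpu: cpus_it_accounts_for}, one cross-trap target per set."""
    plan = {}
    for cpu_set in make_cpu_sets(cpu_ids, set_size):
        target = cpu_set[0]  # assumption: the first CPU in the set gets the cross-trap
        plan[target] = cpu_set
    return plan


plan = schedule_tick_accounting(list(range(8)), set_size=4)
print(plan)  # {0: [0, 1, 2, 3], 4: [4, 5, 6, 7]}
```

In this model the clock CPU only hands out work; the accounting loops themselves run in parallel, one per set.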

Callout processing is single threaded, throttling applications that rely on scalable callouts (CR 6565503)

The callout() types are:
1) Real-time callout()
     Consumers -> Interval timers/sleep()/nanosleep()/timed wait
2) Normal callout()
     Consumers -> Error conditions in the networking layer/drivers

Callouts are currently processed in a single-threaded manner on the clock CPU.
The more CPUs there are in the system, the more severe the problem becomes.
This causes mutex contention, throttles applications, and leaves the
clock CPU wholly consumed by system activity.

This project implements per-CPU callout tables and cyclics. Furthermore, the implementation
is event-based, as opposed to polling for expired callouts every tick.
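
As a minimal sketch of what per-CPU, event-based callout processing could look like: each CPU owns its own table, and the table reports the next expiration so a timer can be programmed for exactly that time instead of polling every tick. The `CalloutTable` class, its methods, and the use of a min-heap are assumptions for illustration, not the actual Solaris data structures.

```python
import heapq


class CalloutTable:
    """Hypothetical per-CPU callout table backed by a min-heap of expirations."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal expirations run in insertion order

    def timeout(self, expire_ns, func):
        """Register func to run at expire_ns."""
        heapq.heappush(self._heap, (expire_ns, self._seq, func))
        self._seq += 1

    def next_expiration(self):
        """Time the per-CPU timer should next fire, or None if nothing is pending."""
        return self._heap[0][0] if self._heap else None

    def expire(self, now_ns):
        """Run every callout due at or before now_ns; return how many ran."""
        ran = 0
        while self._heap and self._heap[0][0] <= now_ns:
            _, _, func = heapq.heappop(self._heap)
            func()
            ran += 1
        return ran


# One table per CPU, so CPUs do not contend on a single shared lock or list.
tables = {cpu: CalloutTable() for cpu in range(2)}
fired = []
tables[0].timeout(500, lambda: fired.append("a"))
tables[0].timeout(200, lambda: fired.append("b"))
tables[1].timeout(300, lambda: fired.append("c"))

print(tables[0].next_expiration())  # 200
tables[0].expire(now_ns=200)
print(fired)                        # ['b']
print(tables[0].next_expiration())  # 500
```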
