Thursday June 16, 2005
It is common for a kernel programmer to postpone processing of some tasks and delegate their execution to another kernel thread. There may be several reasons for doing this:
In all these cases the programmer, in essence, needs to execute a piece of code (a task) in a different context, where context usually means another kernel thread with a different set of locks held and, possibly, a different priority.
Until the introduction of task queues in Solaris 8 there was no generic OS facility for such an in-kernel context change. Every subsystem used its own ad-hoc mechanisms, usually utilizing "worker threads" together with a list of jobs to give them. The task queues interface abstracts common code out of these mechanisms and provides a simple way of scheduling asynchronous tasks.
A task queue consists of a list of tasks, together with one or more threads to service the list. If a task queue has a single service thread, all tasks are guaranteed to execute in the order they were dispatched; otherwise they can be executed in any order. Note that since tasks are placed on a list, the execution of one task should not depend on the execution of another task, or a deadlock may occur.
Kernel users should use the documented DDI interface for all taskq operations. These interfaces are defined in the usr/src/uts/common/sys/sunddi.h header file. The exported interface consists of the following functions:
Every taskq created in the system keeps a set of kstat counters associated with it. Try running the following command on your system:
$ kstat -c taskq
module: unix                            instance: 0
name:   ata_nexus_enum_tq               class:    taskq
        crtime                          53.877907833
        executed                        0
        maxtasks                        0
        nactive                         1
        nalloc                          0
        priority                        60
        snaptime                        258059.249256749
        tasks                           0
        threads                         1
        totaltime                       0

module: unix                            instance: 0
name:   callout_taskq                   class:    taskq
        crtime                          0
        executed                        13956358
        maxtasks                        4
        nactive                         4
        nalloc                          0
        priority                        99
        snaptime                        258059.24981709
        tasks                           13956358
        threads                         2
        totaltime                       120247890619
...
The kstat information above includes:
You can use the interval and count arguments of the kstat command to observe how a counter increases over time:
$ kstat -p unix:0:callout_taskq:tasks 1 5
unix:0:callout_taskq:tasks 13994642
unix:0:callout_taskq:tasks 13994711
unix:0:callout_taskq:tasks 13994784
unix:0:callout_taskq:tasks 13994855
unix:0:callout_taskq:tasks 13994926
...
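Because the tasks counter only ever grows, the useful number is usually the difference between consecutive samples. The following is a minimal user-space sketch in C of that arithmetic; the function name counter_deltas and the array tasks_samples are made up for this illustration and are not part of any Solaris interface. The sample values are copied from the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fill deltas[i] with samples[i + 1] - samples[i].
 * Returns the number of deltas written: n - 1, or 0 when n < 2.
 * The caller must provide room for at least n - 1 deltas.
 */
size_t
counter_deltas(const uint64_t *samples, size_t n, uint64_t *deltas)
{
    if (n < 2)
        return 0;
    for (size_t i = 0; i + 1 < n; i++)
        deltas[i] = samples[i + 1] - samples[i];
    return n - 1;
}

/* The five readings of unix:0:callout_taskq:tasks shown above. */
const uint64_t tasks_samples[] = {
    13994642, 13994711, 13994784, 13994855, 13994926
};
```

Applied to the five samples above, the deltas are 69, 73, 71 and 71. Since the samples were taken one second apart, callout_taskq was handling roughly 70 tasks per second at the time.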
The taskq implementation also provides several useful SDT probes: All the probes described below have two arguments: the taskq pointer and a pointer to the taskq_ent_t structure. The latter can be used to extract the task function and its argument in a D script.
Developers can use these probes to collect precise timing information about individual task queues and the individual tasks executed through them. For example, the following script prints which functions were scheduled via task queues during each 10-second interval:
#!/usr/sbin/dtrace -qs

sdt:genunix::taskq-enqueue
{
        this->tq = (taskq_t *)arg0;
        this->tqe = (taskq_ent_t *)arg1;
        @[this->tq->tq_name,
            this->tq->tq_instance,
            this->tqe->tqent_func] = count();
}

tick-10s
{
        printa("%s(%d): %a called %@d times\n", @);
        trunc(@);
}
Running this on my desktop produced the following output [1]:
callout_taskq(1): genunix`callout_execute called 51 times
callout_taskq(0): genunix`callout_execute called 701 times
kmem_taskq(0): genunix`kmem_update_timeout called 1 times
kmem_taskq(0): genunix`kmem_hash_rescale called 4 times
callout_taskq(1): genunix`callout_execute called 40 times
USB_hid_81_pipehndl_tq_1(14): usba`hcdi_cb_thread called 256 times
callout_taskq(0): genunix`callout_execute called 702 times
kmem_taskq(0): genunix`kmem_update_timeout called 1 times
kmem_taskq(0): genunix`kmem_hash_rescale called 4 times
callout_taskq(1): genunix`callout_execute called 28 times
USB_hid_81_pipehndl_tq_1(14): usba`hcdi_cb_thread called 228 times
callout_taskq(0): genunix`callout_execute called 706 times
callout_taskq(1): genunix`callout_execute called 24 times
USB_hid_81_pipehndl_tq_1(14): usba`hcdi_cb_thread called 141 times
callout_taskq(0): genunix`callout_execute called 708 times
Suppose that two friends, Bob and Alice, are standing in the cafeteria line, with Alice behind Bob. The cashier checks Bob's tray and it turns out that Bob doesn't have enough money, so he wants to borrow from Alice. But Alice is not sure whether she has enough cash until she knows the cost of her lunch. This is a typical deadlock situation: neither Bob nor Alice can make any forward progress because each is waiting for the other. The same kind of deadlock may occur if two tasks A and B are placed on a queue served by a single thread when there is a resource dependency between A and B. One way to prevent such a deadlock is to guarantee that A and B are processed by two different threads, so that when A stalls waiting for B, the thread processing A simply blocks until B makes enough progress to provide the needed resource to A.
Dynamic task queues provide exactly such a deadlock-free way of scheduling potentially dependent tasks on the same queue. They guarantee that every task is processed by a separate thread. Since the number of tasks that can be scheduled at the same time is not known in advance, dynamic task queues maintain a dynamic thread pool that grows when the workload increases and shrinks when the workload dries up.
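To make the "separate thread per task" idea concrete, here is a small user-space sketch using POSIX threads. It is only a model of the situation described above, not the kernel's dynamic taskq code; the names struct shared, task_a, task_b and run_dependent_tasks are invented for this example. Task A is started first and waits for a resource that only task B provides. Because each task has its own thread, A just sleeps until B has run. On a single service thread, A would occupy that thread forever and B would never start.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* State shared by the two tasks (hypothetical, for illustration only). */
struct shared {
    pthread_mutex_t lock;
    pthread_cond_t ready_cv;
    int ready;                  /* set by task B: the resource is available */
    int a_done;                 /* set by task A once it got the resource */
};

/* Task A: blocks until task B has provided the resource. */
void *task_a(void *arg)
{
    struct shared *s = arg;

    pthread_mutex_lock(&s->lock);
    while (!s->ready)
        pthread_cond_wait(&s->ready_cv, &s->lock);
    s->a_done = 1;
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Task B: provides the resource and wakes task A. */
void *task_b(void *arg)
{
    struct shared *s = arg;

    pthread_mutex_lock(&s->lock);
    s->ready = 1;
    pthread_cond_signal(&s->ready_cv);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/*
 * Start A before B, each on its own thread.
 * Returns 1 if A completed, -1 if A's thread could not be created.
 */
int run_dependent_tasks(void)
{
    struct shared s = { .ready = 0, .a_done = 0 };
    pthread_t ta, tb;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.ready_cv, NULL);

    if (pthread_create(&ta, NULL, task_a, &s) != 0)
        return -1;
    if (pthread_create(&tb, NULL, task_b, &s) == 0)
        pthread_join(tb, NULL);
    else
        task_b(&s);             /* run B here so that A is not left waiting */
    pthread_join(ta, NULL);

    pthread_cond_destroy(&s.ready_cv);
    pthread_mutex_destroy(&s.lock);
    return s.a_done;
}
```

Calling run_dependent_tasks() returns 1: both tasks finish even though A was dispatched first and depends on B.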
Dynamic task queues cannot (yet) be used via the DDI interfaces. Some kernel subsystems use the internal taskq calls directly to create and use dynamic task queues. The system also maintains one shared dynamic task queue called system_taskq. It can be used by specifying system_taskq as the taskq argument to the taskq_dispatch() function. It is a good idea to also add "TQ_NOSLEEP | TQ_NOQUEUE" to the flags when using system_taskq.
Each taskq is implemented as a list of tasks protected by a per-taskq lock. One or more worker threads take tasks off the list one by one, execute each of them by calling f(a), where f is the task function and a is its argument, and then sleep, waiting for new entries. A taskq created with a single servicing thread has an important property: it guarantees that all its tasks are executed in the order they are scheduled. When a task queue is created with several servicing threads, task execution order is not predictable.
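The structure just described (a locked list, workers that call f(a) and then sleep) can be modelled in user space with POSIX threads. The sketch below is a simplified illustration under that assumption and is not the Solaris implementation; every name in it (struct utaskq, utaskq_create, utaskq_dispatch, utaskq_destroy, demo_single_thread_order) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define UTQ_MAXTHREADS 8
struct task {                   /* one queued task: function f, argument a */
    void (*f)(void *);
    void *a;
    struct task *next;
};

struct utaskq {                 /* hypothetical; not the kernel's taskq_t */
    pthread_mutex_t lock;       /* per-queue lock for the fields below */
    pthread_cond_t work_cv;     /* workers sleep here while the list is empty */
    struct task *head, **tailp; /* FIFO list; tailp points at the last next */
    int shutdown, nthreads;
    pthread_t threads[UTQ_MAXTHREADS];
};

static void *utaskq_worker(void *arg)
{
    struct utaskq *tq = arg;
    pthread_mutex_lock(&tq->lock);
    for (;;) {
        while (tq->head == NULL && !tq->shutdown)
            pthread_cond_wait(&tq->work_cv, &tq->lock);
        if (tq->head == NULL)
            break;              /* shutting down and nothing left to run */
        struct task *t = tq->head;
        if ((tq->head = t->next) == NULL)
            tq->tailp = &tq->head;
        pthread_mutex_unlock(&tq->lock);
        t->f(t->a);             /* run the task without holding the lock */
        free(t);
        pthread_mutex_lock(&tq->lock);
    }
    pthread_mutex_unlock(&tq->lock);
    return NULL;
}

struct utaskq *utaskq_create(int nthreads)
{
    struct utaskq *tq = calloc(1, sizeof (*tq));
    if (tq == NULL || nthreads < 1 || nthreads > UTQ_MAXTHREADS) {
        free(tq);
        return NULL;
    }
    tq->tailp = &tq->head;
    pthread_mutex_init(&tq->lock, NULL);
    pthread_cond_init(&tq->work_cv, NULL);
    while (tq->nthreads < nthreads && pthread_create(
        &tq->threads[tq->nthreads], NULL, utaskq_worker, tq) == 0)
        tq->nthreads++;         /* may end up with fewer workers than asked */
    return tq;
}

int utaskq_dispatch(struct utaskq *tq, void (*f)(void *), void *a)
{
    struct task *t = malloc(sizeof (*t));
    if (t == NULL)
        return -1;
    *t = (struct task){ .f = f, .a = a, .next = NULL };
    pthread_mutex_lock(&tq->lock);
    *tq->tailp = t;             /* append to the end of the list */
    tq->tailp = &t->next;
    pthread_cond_signal(&tq->work_cv);  /* wake one sleeping worker */
    pthread_mutex_unlock(&tq->lock);
    return 0;
}

void utaskq_destroy(struct utaskq *tq)  /* drain, stop workers, free */
{
    pthread_mutex_lock(&tq->lock);
    tq->shutdown = 1;
    pthread_cond_broadcast(&tq->work_cv);
    pthread_mutex_unlock(&tq->lock);
    for (int i = 0; i < tq->nthreads; i++)
        pthread_join(tq->threads[i], NULL);
    pthread_mutex_destroy(&tq->lock);
    pthread_cond_destroy(&tq->work_cv);
    free(tq);
}

static long acc;                /* demo state; only safe with one worker */
static void push_digit(void *a) { acc = acc * 10 + (long)(intptr_t)a; }

long demo_single_thread_order(void)
{
    struct utaskq *tq = utaskq_create(1);
    if (tq == NULL)
        return -1;
    acc = 0;
    for (intptr_t d = 1; d <= 4; d++)   /* dispatch the digits 1..4 in order */
        utaskq_dispatch(tq, push_digit, (void *)d);
    utaskq_destroy(tq);         /* joins the worker, so acc is final */
    return acc;
}
```

With a single worker, demo_single_thread_order() returns 1234, because the tasks run in exactly the order they were dispatched. With more than one worker the same demo would be unsafe: the order would no longer be guaranteed and push_digit would need its own locking.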
If you want to look at the actual implementation you need to look at the following files:
The first taskq implementation was done by Jeff Bonwick for Solaris 8. It was successfully used to replace many calls to the low-level thread_create() function. I added Dynamic Task Queues in Solaris 9 and used them to completely re-implement the STREAMS scheduler. In Solaris 10 I added DDI interfaces for task queues and also added kstat counters and DTrace probes.
[1] For curious minds: the callout_taskq is used to handle system timers. As an exercise in your DTrace skills, try to figure out what actual timers are firing on each CPU. Hint: use the callout-start SDT probe, which has a pointer to the callout_t structure as its sole argument.