Stephan Grell's Weblog

Thursday, July 21, 2005

N1GE 6 - Scheduler Hacks: Sorting queues

I just received a question about how to use the queue sequence numbers and what to do with them. I will give a short overview in this blog and hope to give enough pointers for one's own experiments. According to the documentation, the scheduler can sort the queue instances in two ways:

  • load based (from the hosts)
  • sequence number based (from the queues)
Load-based sorting is configured by default, including load adjustments. During the scheduling cycle, the load adjustment is added to the load of the host that will run the job. This ensures that one gets a kind of round-robin job distribution. The load adjustment wears off over time and is replaced by the real load value when the host reports its load (host load report interval). The important configuration values for queue sorting are (scheduler configuration, qconf -msconf):

queue_sort_method                 load
job_load_adjustments              np_load_avg=0.50
load_adjustment_decay_time        0:7:30
load_formula                      np_load_avg

This setting uses the load for sorting. For each started job it adds 0.5 to the load of that host, and the added load decays over 7.5 minutes.
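The effect of these settings can be sketched in a few lines of Python. This is only an illustration, not Grid Engine code: the function names and the host loads are made up, the hosts are treated as single-processor machines, and the linear decay of the adjustment is an assumption about how load_adjustment_decay_time is applied.

```python
# Sketch of load-based queue sorting with job load adjustments.
# Assumption: the adjustment decays linearly from its full value to zero
# over load_adjustment_decay_time. All names here are illustrative only.

JOB_LOAD_ADJUSTMENT = 0.50   # job_load_adjustments np_load_avg=0.50
DECAY_TIME = 7.5 * 60        # load_adjustment_decay_time 0:7:30, in seconds


def adjusted_load(reported_load, job_start_times, now):
    """Reported load plus the not yet decayed adjustment of each recent job."""
    total = reported_load
    for started in job_start_times:
        age = now - started
        if 0 <= age < DECAY_TIME:
            total += JOB_LOAD_ADJUSTMENT * (1 - age / DECAY_TIME)
    return total


def dispatch(hosts, n_jobs, now=0.0):
    """Place n_jobs one by one on the host with the lowest adjusted load."""
    started = {name: [] for name in hosts}
    placement = []
    for _ in range(n_jobs):
        best = min(hosts, key=lambda h: adjusted_load(hosts[h], started[h], now))
        started[best].append(now)
        placement.append(best)
    return placement


if __name__ == "__main__":
    # Hypothetical reported loads for three hosts.
    hosts = {"host1": 0.03, "host2": 0.10, "host3": 0.05}
    print(dispatch(hosts, 6))
```

Because every dispatched job makes its host look 0.5 busier, the next job goes to a different host, which gives the round-robin behaviour described above.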

Hint:
If a host has more than one slot, the load adjustment can lead to not all slots on that host being used, because the next job might overload the host. qstat -j <job_id> shows the reasons why a job was not dispatched, including the hosts that will not be used due to load adjustments. If np_load_avg is used for the load adjustments and the load formula, the number of processors in a machine is taken into account.

 Example (using job_load_adjustments np_load_avg=1.5). As one can see, not all slots are used.
es-ergb01-01% qstat -f
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
all.q@host1                     BIP   1/5       0.03     lx24-amd64
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 8
----------------------------------------------------------------------------
all.q@host2                    BIP   3/5       0.78     sol-sparc64
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 5
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 7
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 11
----------------------------------------------------------------------------
all.q@host3                   BIP   2/5       0.28     sol-sparc64
    103 0.55500 job        sg144703     t     07/21/2005 09:10:04     1 6
    103 0.55500 job        sg144703     t     07/21/2005 09:10:04     1 12
----------------------------------------------------------------------------
all.q@host4                    BIP   1/5       0.16     sol-x86
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 10
----------------------------------------------------------------------------
all.q@host5                    BIP   0/5       0.01     sol-x86
----------------------------------------------------------------------------
test.q@host1                    BIP   1/5       0.03     lx24-amd64
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 2
----------------------------------------------------------------------------
test.q@host2                   BIP   0/5       0.78     sol-sparc64   D
----------------------------------------------------------------------------
test.q@host3                   BIP   2/5       0.28     sol-sparc64
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 3
    103 0.55500 job        sg144703     t     07/21/2005 09:10:04     1 9
----------------------------------------------------------------------------
test.q@host4                    BIP   1/5       0.16     sol-x86
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 4
----------------------------------------------------------------------------
test.q@host5                    BIP   1/5       0.01     sol-x86
    103 0.55500 job        sg144703     r     07/21/2005 09:10:04     1 1

############################################################################
 PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
    103 0.00000 job        sg144703     qw    07/21/2005 09:10:02     1 13-20:1

qstat -j 103
scheduling info:
                            queue instance "test.q@ori" dropped because it is overloaded: np_load_avg=2.511719 (= 0.011719 + 2.50 * 1.000000 with nproc=1) >= 1.75
                            queue instance "all.q@ori" dropped because it is overloaded: np_load_avg=2.511719 (= 0.011719 + 2.50 * 1.000000 with nproc=1) >= 2.05
                            queue instance "all.q@carc" dropped because it is overloaded: np_load_avg=2.515000 (= 0.015000 + 2.50 * 2.000000 with nproc=1) >= 2.05
                            queue instance "test.q@carc" dropped because it is overloaded: np_load_avg=2.515000 (= 0.015000 + 2.50 * 2.000000 with nproc=1) >= 1.75
                            queue instance "test.q@gimli" dropped because it is overloaded: np_load_avg=1.945312 (= 0.070312 + 2.50 * 3.000000 with nproc=1) >= 1.75
                            queue instance "all.q@nori" dropped because it is overloaded: np_load_avg=2.580078 (= 0.080078 + 2.50 * 2.000000 with nproc=1) >= 2.05
                            queue instance "test.q@nori" dropped because it is overloaded: np_load_avg=2.580078 (= 0.080078 + 2.50 * 2.000000 with nproc=1) >= 1.75
                            queue instance "all.q@es-ergb01-01" dropped because it is overloaded: np_load_avg=2.070312 (= 0.195312 + 2.50 * 3.000000 with nproc=1) >= 2.05
                            queue instance "all.q@gimli" dropped because it is overloaded: np_load_avg=2.570312 (= 0.070312 + 2.50 * 4.000000 with nproc=1) >= 2.05

As we can see, this configuration can be a very powerful tool to set up rather complicated environments. However, there are cases where one would like to ensure that a certain queue is used before another queue. (I am using "queue" here to refer to cluster queues and queue instances together.) In these cases, one can assign a sequence number to the queues via qconf -mq <cluster queue name>:

seq_no                0


This sequence number is used when the scheduler configuration is changed to:

queue_sort_method                 seqno


After this change, queue instances with a low seq_no will be chosen first. If there are multiple queue instances with the same sequence number, the configured load value is used to determine which queue instance to pick. This means that if all queue instances have the same seq_no and the scheduler is configured to sort by seq_no, it ultimately sorts by the load of the hosts.

Example:
"test.q" has a sequence number of 0
"all.q" has a sequence number of 2

queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
test.q@host1                   BIP   2/5       0.26     lx24-amd64
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 4
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 8
----------------------------------------------------------------------------
test.q@host2                   BIP   0/5       0.58     sol-sparc64   D
----------------------------------------------------------------------------
test.q@host3                   BIP   4/5       0.44     sol-sparc64
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 3
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 5
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 7
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 9
----------------------------------------------------------------------------
test.q@host4                   BIP   2/5       0.08     sol-x86
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 2
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 6
----------------------------------------------------------------------------
test.q@host5                   BIP   2/5       0.01     sol-x86
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 1
    108 0.55500 job        sg144703     r     07/21/2005 09:24:44     1 10
----------------------------------------------------------------------------
all.q@host1                    BIP   0/5       0.26     lx24-amd64
----------------------------------------------------------------------------
all.q@host2                   BIP   0/5       0.58     sol-sparc64
----------------------------------------------------------------------------
all.q@host3                    BIP   0/5       0.44     sol-sparc64
----------------------------------------------------------------------------
all.q@host4                    BIP   0/5       0.08     sol-x86
----------------------------------------------------------------------------
all.q@host5                     BIP   0/5       0.01     sol-x86

As one can see, only test.q was used, and within test.q the load values had an effect.
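The ordering rule behind this result can be written as a two-part sort key. The following Python sketch is only an illustration with hypothetical names. It uses the load_avg column of the example as the load value, which is an assumption (the configured load formula is np_load_avg), and it ignores queue states such as the disabled test.q@host2.

```python
# Sketch of queue_sort_method seqno: sort by seq_no first, then by load.
# QueueInstance and sort_by_seqno are illustrative names, not Grid Engine APIs.
from typing import NamedTuple


class QueueInstance(NamedTuple):
    name: str
    seq_no: int
    load: float


def sort_by_seqno(instances):
    """Lowest seq_no first; equal seq_no values are ordered by load."""
    return sorted(instances, key=lambda qi: (qi.seq_no, qi.load))


# Loads taken from the load_avg column of the example above.
HOST_LOAD = {"host1": 0.26, "host2": 0.58, "host3": 0.44,
             "host4": 0.08, "host5": 0.01}
SEQ_NO = {"test.q": 0, "all.q": 2}

INSTANCES = [QueueInstance(f"{queue}@{host}", seq, load)
             for queue, seq in SEQ_NO.items()
             for host, load in HOST_LOAD.items()]

if __name__ == "__main__":
    for qi in sort_by_seqno(INSTANCES):
        print(qi.name, qi.seq_no, qi.load)
```

All test.q instances sort ahead of every all.q instance, and within each queue the least loaded host comes first.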


( Jul 21 2005, 09:35:42 AM CEST )

