
Tuesday, April 25, 2006
N1GE 6 - Scheduler Hacks: Separate master host for pe jobs
When distributing pe jobs over a range of hosts, the pe provides a set
of allocation rules. These rules allow the admin to specify that a host
should be filled up first before another is used, that each host is
used before any host runs a second task, or that the job uses a fixed
number of slots on each host it is using. This covers most of the use
cases around pe jobs.
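The three rules described above can be illustrated with a small toy
model. This is only a sketch of the idea, not N1GE's implementation:
the function name allocate, the host names hostA/hostB/hostC and their
slot counts are hypothetical, and hosts are simply considered in the
order given.

```python
from typing import Dict, Union


def allocate(hosts: Dict[str, int], slots_needed: int,
             rule: Union[str, int]) -> Dict[str, int]:
    """Toy model: hosts maps a host name to its free slots, in preference order."""
    alloc = {host: 0 for host in hosts}
    remaining = slots_needed

    if rule == "$fill_up":
        # Fill each host completely before moving on to the next one.
        for host, free in hosts.items():
            take = min(free, remaining)
            alloc[host] = take
            remaining -= take
    elif rule == "$round_robin":
        # Give every host one task before any host receives a second one.
        progress = True
        while remaining > 0 and progress:
            progress = False
            for host, free in hosts.items():
                if remaining > 0 and alloc[host] < free:
                    alloc[host] += 1
                    remaining -= 1
                    progress = True
    elif isinstance(rule, int) and rule > 0:
        # Fixed number of slots on every host that is used.
        if slots_needed % rule != 0:
            raise ValueError("slot count must be a multiple of the fixed rule")
        for host, free in hosts.items():
            if remaining > 0 and free >= rule:
                alloc[host] = rule
                remaining -= rule
    else:
        raise ValueError(f"unknown allocation rule: {rule!r}")

    if remaining > 0:
        raise RuntimeError("not enough free slots for this job")
    return {host: n for host, n in alloc.items() if n > 0}


# Hypothetical cluster: one host with 4 free slots, two hosts with 1 each.
free_slots = {"hostA": 4, "hostB": 1, "hostC": 1}
print(allocate(free_slots, 5, "$fill_up"))      # hostA is filled first
print(allocate(free_slots, 5, "$round_robin"))  # one task per host, then repeat
print(allocate(free_slots, 3, 1))               # exactly one slot per host
```

None of the three branches can express "exactly one task on the first
host, fill-up everywhere else", which is the gap the rest of this post
is about.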
In this post I would like to sketch out a scenario that cannot be
addressed with the existing allocation rules: the master host runs only
the master task of the job, while all other hosts are used according to
the fill-up allocation rule. This comes in handy if the master task of
a job requires a lot of memory while the slave tasks do the
computation, and only one machine with a lot of memory is available.
The big machine can and should run multiple master tasks of this kind
of job.
There are two solutions to the problem. One could separate the
memory-intensive computation out into an extra job and work with job
dependencies, or one could configure N1GE to handle the above use case
as specified without any job modifications.
I have the following setup:
qstat -f
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
all.q@big                      BIP   0/4       0.02     sol-sparc64
----------------------------------------------------------------------------
small.q@small1                 BIP   0/1       0.00     lx24-amd64
----------------------------------------------------------------------------
small.q@small2                 BIP   0/1       0.02     sol-sparc64
And a pe configured in all queue instances:
qconf -sp make
pe_name           make
slots             999
user_lists        NONE
xuser_lists       NONE
start_proc_args   NONE
stop_proc_args    NONE
allocation_rule   $fill_up
control_slaves    TRUE
job_is_first_task FALSE
urgency_slots     min
We now go ahead and set load_thresholds in the all.q@big queue
instance to a load value that is not used as a threshold in the other
queue instances, such as:
qconf -sq all.q
qname             all.q
hostlist          big
seq_no            0
load_thresholds   NONE,[big=load_avg=4]
The load value used in the threshold has to be a real load value; it
cannot be a fixed or consumable value.
The next step to make our environment work is to change the scheduler
configuration to the following:
qconf -ssconf
algorithm                  default
schedule_interval          0:2:0
maxujobs                   0
queue_sort_method          load
job_load_adjustments       load_avg=4.000000
load_adjustment_decay_time 0:0:1
By configuring the scheduler to use job_load_adjustments like this, it
will add an artificial load to each host that will run a task. With
this configuration we can start only one task on the big machine in
each scheduling run: once the first task has been placed there, the
artificial load of 4 brings the load_avg of the big host up to its load
threshold. Since the load_adjustment_decay_time is only 1 second, the
scheduler has forgotten about the artificial load by the next
scheduling run and can start a new task on the big host. This way, we
achieve what we have been looking for.
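The interplay of load threshold, load adjustment and decay time can be
played through with a small simulation. This is a deliberately
simplified sketch and not the real scheduler: the function
schedule_run is hypothetical, the model assumes the big host is always
considered first, and it ignores that the reported load of a host
changes once tasks are actually running on it. The slot counts, load
values and the threshold are taken from the setup shown above.

```python
from typing import Dict, List, Optional, Tuple

LOAD_ADJUSTMENT = 4.0  # job_load_adjustments load_avg=4.000000

# host -> (slots, reported load_avg, load threshold or None for NONE)
HOSTS: Dict[str, Tuple[int, float, Optional[float]]] = {
    "big": (4, 0.02, 4.0),
    "small1": (1, 0.00, None),
    "small2": (1, 0.02, None),
}


def schedule_run(task_count: int, used: Dict[str, int]) -> List[str]:
    """Place the tasks of one job within a single scheduling run.

    Every run starts from the reported load again, because the artificial
    load (decay time 1 second) is gone long before the next run
    (schedule_interval 2 minutes).
    """
    adjusted = {host: load for host, (_, load, _) in HOSTS.items()}
    placement: List[str] = []
    for _ in range(task_count):
        for host, (slots, _, threshold) in HOSTS.items():
            overloaded = threshold is not None and adjusted[host] >= threshold
            if used[host] < slots and not overloaded:
                used[host] += 1
                adjusted[host] += LOAD_ADJUSTMENT  # artificial load for this run
                placement.append(host)
                break
        else:
            raise RuntimeError("no host can take another task in this run")
    return placement


used_slots = {host: 0 for host in HOSTS}
# Run 1: a job with three tasks. Only the first task fits on big.
print(schedule_run(3, used_slots))
# Run 2: the artificial load is forgotten, big accepts another task.
print(schedule_run(1, used_slots))
```

In this model the first run places one task on big and the remaining
two on the small hosts, and the second run places another task on big,
which matches the behaviour described above.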
One important note:
The big machine is only allowed to have one queue instance, or all
queue instances of the big machine have to share the same load
threshold. If that is not the case, this setup will not work.
( Apr 25 2006, 10:37:37 AM CEST )