Richard Hierlmeier's Weblog
Getting Grid Engine Scheduler to schedule jobs despite high load
While preparing an SDM demo that uses the Cloud Adapter to manage OpenSolaris zones, my colleague ran into a problem: the Grid Engine scheduler did not schedule jobs onto the zones because the overall load on the host was too high.
He installed the system on an OpenSolaris image running inside VirtualBox. He submitted 1000 sleeper jobs to the Grid Engine cluster, and the MaxPendingJobsSLO started to produce resource requests. The zones service (Cloud Adapter) started up zones, and they were assigned to the Grid Engine service. A small number of sleeper jobs were scheduled onto the zones. Then, suddenly, no more jobs were scheduled and the zones were unassigned from the Grid Engine service.
The problem was that his setup led to a high load on the host (5.6). By default, Grid Engine defines a load threshold of np_load_avg=1.75 in a queue. If the load on a host is higher than this threshold, the corresponding queue instance goes into alarm state:
# qstat -f
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@z2 BIP 0/0/1 4.16 sol-x86 ad
---------------------------------------------------------------------------------
all.q@z3 BIP 0/0/1 4.11 sol-x86 ad
---------------------------------------------------------------------------------
all.q@z4 BIP 0/0/1 4.09 sol-x86 ad
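With many zones it is easier to let a small filter pick out the alarmed queue instances than to read the listing by eye. The following is only a sketch: `alarmed_queues` is a helper name made up for this post, and it assumes the six-column `qstat -f` layout shown above, where the last column holds the states and a lowercase `a` marks the load alarm.

```shell
# List queue instances whose state column contains "a" (load alarm),
# together with their reported load_avg.
# Reads `qstat -f` output on stdin; assumes the six-column layout shown above.
alarmed_queues() {
  awk 'NF == 6 && $1 ~ /@/ && $6 ~ /a/ { print $1, $4 }'
}

# Sample input copied from the listing above. On a live cluster one would
# pipe `qstat -f` into the function instead.
sample='queuename qtype resv/used/tot. load_avg arch states
all.q@z2 BIP 0/0/1 4.16 sol-x86 ad
all.q@z3 BIP 0/0/1 4.11 sol-x86 ad
all.q@z4 BIP 0/0/1 4.09 sol-x86 ad'

printf '%s\n' "$sample" | alarmed_queues
```

Queue instances without any state have only five columns and are skipped, as are the header and the job lines.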
No more jobs could be scheduled on the host. The MaxPendingJobsSLO saw that the scheduler was no longer getting jobs onto the hosts, so the resources no longer received usage from the MaxPendingJobsSLO, and SDM decided to move the zones away from the Grid Engine service.
To fix this problem, he increased the load threshold in all.q:
# qconf -mq all.q
qname all.q
hostlist @allhosts
seq_no 0
load_thresholds np_load_avg=10
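`qconf -mq` opens the queue configuration in an editor. For a scripted demo setup, the same change can probably be made non-interactively with `qconf -mattr`, which modifies a single attribute. The wrapper below is only a sketch: `set_load_threshold` and the `DRY_RUN` variable are names invented for this example, and by default the command is printed rather than executed, so it can be checked before it touches a real cluster.

```shell
# Sketch: raise the load threshold of a queue without opening an editor.
# DRY_RUN is a variable made up for this example. It defaults to 1, so the
# qconf command is only printed; set DRY_RUN=0 on a real cluster to run it.
set_load_threshold() {
  queue=$1
  value=$2
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "qconf -mattr queue load_thresholds np_load_avg=$value $queue"
  else
    qconf -mattr queue load_thresholds "np_load_avg=$value" "$queue"
  fi
}

set_load_threshold all.q 10
```

The value 10 is the one used above. It suits a demo on an overloaded VirtualBox image, not a production cluster.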
Afterwards, the alarm state of the queue instances was cleared. Some jobs had also gone into error state, and he had to clear these as well:
# qstat -f
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@z2 BIP 0/1/1 4.04 sol-x86
5 0.55500 sleep rh r 07/17/2009 13:55:50 1 4
############################################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
4 0.55500 sleep rh Eqw 07/17/2009 13:39:20 1
5 0.55500 sleep rh Eqw 07/17/2009 13:47:50 1 1-3:1
5 0.00000 sleep rh qw 07/17/2009 13:47:50 1 5-1000:1
# qmod -cj "*"
rh cleared error state of job 4
rh cleared error state of job-array task 5.1
rh cleared error state of job-array task 5.2
rh cleared error state of job-array task 5.3
Job-array task 5.4 is not in error state
# qstat -f
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@z2 BIP 0/1/1 4.04 sol-x86
5 0.55500 sleep rh r 07/17/2009 13:55:50 1 4
############################################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
4 0.55500 sleep rh qw 07/17/2009 13:39:20 1
5 0.55500 sleep rh qw 07/17/2009 13:47:50 1 1-3:1
5 0.55500 sleep rh qw 07/17/2009 13:47:50 1 5-1000:1
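Clearing with `qmod -cj "*"` touches every job. To see first which jobs are actually in error state, the job lines can be filtered on their state column. This is a sketch under the assumption that the state is the fifth column, as in the listings above; `error_job_ids` is a helper name made up for this post.

```shell
# Print the IDs of jobs whose state column contains "E" (error).
# Reads qstat job lines on stdin; assumes the state is the fifth column,
# as in the listings above. Array tasks of one job are reported once.
error_job_ids() {
  awk '$1 ~ /^[0-9]+$/ && $5 ~ /E/ { print $1 }' | sort -un
}

# Sample input copied from the pending-jobs listing before the cleanup.
sample='4 0.55500 sleep rh Eqw 07/17/2009 13:39:20 1
5 0.55500 sleep rh Eqw 07/17/2009 13:47:50 1 1-3:1
5 0.00000 sleep rh qw 07/17/2009 13:47:50 1 5-1000:1'

printf '%s\n' "$sample" | error_job_ids
```

The resulting IDs could then be handed to `qmod -cj` one by one instead of using the wildcard.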
Posted at 02:09PM Jul 17, 2009 by rhierlmeier in Grid Engine