
Tuesday, April 25, 2006
N1GE 6 - Monitoring the qmaster
With update 7 of the N1GE 6 software we added a way to monitor the qmaster. The
qmaster monitoring provides statistics for each thread, showing
what it has been busy with and how much time it spent on it. There
are two switches to control the statistics output:
qconf -mconf
qmaster_params   Monitor_Time=0:0:20 LOG_Monitor_Message=1
MONITOR_TIME
Specifies the time interval at which the monitoring information is
printed. Monitoring is disabled by default and is
enabled by specifying an interval. The monitoring is done per thread and is
written to the messages file or displayed by the "qping -f" command
line tool. Example: MONITOR_TIME=0:0:10 generates and prints the monitoring
information roughly every 10 seconds. The specified
time is a guideline and not a fixed interval. The interval actually used is
printed and can be anything between 9 and 20 seconds in this example.
LOG_MONITOR_MESSAGE
By default the monitoring information is logged to the messages file.
In addition it is made available to qping, which can request
it. The messages file can become quite big if monitoring is
enabled all the time. Setting this switch to 0 therefore disables the
logging to the messages file, and the monitoring data is then only
available via "qping -f".
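To make the two settings concrete, here is a minimal Python sketch that splits a qmaster_params value into its NAME=VALUE pairs and converts the h:m:s interval into seconds. The helper names are hypothetical, and upper-casing the names is only a convenience for the lookup in this sketch; it says nothing about how the qmaster itself parses the line.

```python
def parse_interval(value: str) -> int:
    """Convert an h:m:s string such as "0:0:20" into seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_qmaster_params(line: str) -> dict:
    """Split a whitespace-separated NAME=VALUE list into a dict.

    Names are upper-cased so they can be looked up uniformly
    (an assumption of this sketch, not documented qmaster behavior).
    """
    params = {}
    for token in line.split():
        name, _, value = token.partition("=")
        params[name.upper()] = value
    return params


params = parse_qmaster_params("Monitor_Time=0:0:20 LOG_Monitor_Message=1")
interval = parse_interval(params["MONITOR_TIME"])
log_to_messages = params["LOG_MONITOR_MESSAGE"] == "1"
print(interval, log_to_messages)
```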
A description of the output format can be found here.
Example output in the qmaster messages file
($SGE_ROOT/<CELL>/spooling/qmaster/messages):
04/25/2006 19:06:17|qmaster|scrabe|P|EDT: runs: 1.20r/s (clients: 1.00 mod: 0.05/s ack: 0.05/s blocked: 0.00 busy: 0.00 | events: 0.05/s added: 0.05/s skipt: 0.00/s) out: 0.00m/s APT: 0.0001s/m idle: 99.99% wait: 0.00% time: 19.98s
04/25/2006 19:06:17|qmaster|scrabe|P|MT(2): runs: 0.25r/s (execd (l:0.00,j:0.00,c:0.00,p:0.00,a:0.00)/s GDI (a:0.05,g:0.00,m:0.00,d:0.00,c:0.00,t:0.00,p:0.00)/s event-acks: 0.05/s) out: 0.05m/s APT: 0.0002s/m idle: 100.00% wait: 0.00% time: 20.10s
04/25/2006 19:06:18|qmaster|scrabe|P|MT(1): runs: 0.19r/s (execd (l:0.00,j:0.00,c:0.00,p:0.00,a:0.00)/s GDI (a:0.05,g:0.00,m:0.05,d:0.00,c:0.00,t:0.00,p:0.00)/s event-acks: 0.00/s) out: 0.05m/s APT: 0.0001s/m idle: 100.00% wait: 0.00% time: 21.15s
04/25/2006 19:06:27|qmaster|scrabe|P|TET: runs: 0.67r/s (pending: 9.00 executed: 0.67/s) out: 0.00m/s APT: 0.0205s/m idle: 98.63% wait: 0.00% time: 21.00s
04/25/2006 19:06:37|qmaster|scrabe|P|EDT: runs: 1.60r/s (clients: 1.00 mod: 0.05/s ack: 0.05/s blocked: 0.00 busy: 0.00 | events: 1.10/s added: 1.10/s skipt: 0.00/s) out: 0.05m/s APT: 0.0002s/m idle: 99.97% wait: 0.00% time: 20.00s
04/25/2006 19:06:39|qmaster|scrabe|P|MT(1): runs: 0.37r/s (execd (l:0.00,j:0.00,c:0.00,p:0.00,a:0.00)/s GDI (a:0.14,g:0.00,m:0.05,d:0.00,c:0.00,t:0.05,p:0.00)/s event-acks: 0.05/s) out: 0.32m/s APT: 0.0024s/m idle: 99.91% wait: 0.00% time: 21.55s
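If you want to track these numbers over time, the lines are regular enough to parse. The following Python sketch pulls the thread name and the common trailing fields (runs, idle, wait, time) out of one message; the regular expression is derived only from the sample output shown here, so treat it as an assumption rather than a specification of the format. The function name is hypothetical.

```python
import re

# Pattern built from the sample lines above; other qmaster versions may differ.
PATTERN = re.compile(
    r"^(?P<stamp>\d\d/\d\d/\d{4} \d\d:\d\d:\d\d)\|(?P<daemon>[^|]+)\|"
    r"(?P<host>[^|]+)\|(?P<level>[^|]+)\|"
    r"(?P<thread>[A-Z]+(?:\(\d+\))?): runs: (?P<runs>[\d.]+)r/s .*"
    r"idle: (?P<idle>[\d.]+)% wait: (?P<wait>[\d.]+)% time: (?P<time>[\d.]+)s$"
)


def parse_monitor_line(line):
    """Return a dict for one monitoring message, or None if it does not match.

    Whitespace (including line breaks from wrapped output) is collapsed first.
    """
    match = PATTERN.match(" ".join(line.split()))
    if match is None:
        return None
    record = match.groupdict()
    for key in ("runs", "idle", "wait", "time"):
        record[key] = float(record[key])
    return record


LINE = ("04/25/2006 19:06:27|qmaster|scrabe|P|TET: runs: 0.67r/s "
        "(pending: 9.00 executed: 0.67/s) out: 0.00m/s APT: 0.0205s/m "
        "idle: 98.63% wait: 0.00% time: 21.00s")

record = parse_monitor_line(LINE)
print(record["thread"], record["idle"], record["time"])
```

The thread-specific part in parentheses is skipped on purpose, since its layout differs between EDT, MT and TET.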
If we use the following settings:
qconf -mconf
qmaster_params   Monitor_Time=0:0:20 LOG_Monitor_Message=0
We need to use qping to access the monitoring
messages. This should be the preferred way because we get the
statistics from the communication layer together with the statistics from the
qmaster. Here is an example:
04/25/2006 19:09:53:
SIRM version:             0.1
SIRM message id:          3
start time:               04/25/2006 08:45:06 (1145947506)
run time [s]:             37487
messages in read buffer:  0
messages in write buffer: 0
nr. of connected clients: 3
status:                   0
info:                     TET: R (1.99) | EDT: R (0.99) | SIGT: R (37486.73) | MT(1): R (3.99) | MT(2): R (0.99) | OK
Monitor:
04/25/2006 19:09:47 | TET: runs: 0.40r/s (pending: 9.00 executed: 0.40/s) out: 0.00m/s APT: 0.0001s/m idle: 100.00% wait: 0.00% time: 20.00s
04/25/2006 19:09:37 | EDT: runs: 1.00r/s (clients: 1.00 mod: 0.00/s ack: 0.00/s blocked: 0.00 busy: 0.00 | events: 0.00/s added: 0.00/s skipt: 0.00/s) out: 0.00m/s APT: 0.0001s/m idle: 99.99% wait: 0.00% time: 20.00s
04/25/2006 08:45:07 | SIGT: no monitoring data available
04/25/2006 19:09:36 | MT(1): runs: 0.15r/s (execd (l:0.04,j:0.04,c:0.04,p:0.04,a:0.00)/s GDI (a:0.00,g:0.00,m:0.00,d:0.00,c:0.00,t:0.00,p:0.00)/s event-acks: 0.00/s) out: 0.00m/s APT: 0.0002s/m idle: 100.00% wait: 0.00% time: 26.86s
04/25/2006 19:09:39 | MT(2): runs: 0.14r/s (execd (l:0.00,j:0.00,c:0.00,p:0.00,a:0.00)/s GDI (a:0.00,g:0.00,m:0.00,d:0.00,c:0.00,t:0.00,p:0.00)/s event-acks: 0.00/s) out: 0.00m/s APT: 0.0000s/m idle: 100.00% wait: 0.00% time: 21.04s
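The "info:" field of the qping output packs one status entry per thread plus an overall state into a single line. As a rough sketch, this is how it could be split in Python. The entry layout is taken only from the sample above, the function name is hypothetical, and the number in parentheses is kept as an uninterpreted value because its meaning is not explained here.

```python
import re

# One entry looks like "MT(1): R (3.99)" in the sample output.
ENTRY = re.compile(
    r"^(?P<thread>[A-Z]+(?:\(\d+\))?): (?P<state>\S+) \((?P<value>[\d.]+)\)$"
)


def parse_info(info):
    """Split the info field into (overall_state, {thread: (state, value)})."""
    parts = [part.strip() for part in info.split("|")]
    overall = parts[-1]          # the trailing "OK" in the sample
    threads = {}
    for part in parts[:-1]:
        match = ENTRY.match(part)
        if match:
            threads[match["thread"]] = (match["state"], float(match["value"]))
    return overall, threads


INFO = ("TET: R (1.99) | EDT: R (0.99) | SIGT: R (37486.73) | "
        "MT(1): R (3.99) | MT(2): R (0.99) | OK")

overall, threads = parse_info(INFO)
print(overall, sorted(threads))
```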
( Apr 25 2006, 07:14:12 PM CEST )
N1GE 6 - Scheduler Hacks: Separate master host for pe jobs
For the distribution of pe jobs over a range of hosts, the pe provides
a set of allocation rules. These rules allow the admin to specify that
a host should be filled up first before another is used, that each host
is used before any host runs a second task, or that the job uses a
specified number of slots on each host it is using. This covers most of
the use cases around pe jobs.
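To illustrate the difference between the three rules, here is a small Python sketch that distributes a number of slots over hosts with a given number of free slots. It is only a toy model: the function names are hypothetical, hosts are visited in the order they are listed, and the real scheduler takes many more factors into account.

```python
def fill_up(free, wanted):
    """Fill each host completely before moving on to the next one."""
    alloc = {}
    for host, slots in free.items():
        if wanted == 0:
            break
        take = min(slots, wanted)
        if take:
            alloc[host] = take
            wanted -= take
    return alloc if wanted == 0 else None


def round_robin(free, wanted):
    """Give every host one task before any host gets a second one."""
    alloc = {host: 0 for host in free}
    while wanted > 0:
        progressed = False
        for host, slots in free.items():
            if wanted > 0 and alloc[host] < slots:
                alloc[host] += 1
                wanted -= 1
                progressed = True
        if not progressed:       # no host has a free slot left
            return None
    return {host: n for host, n in alloc.items() if n}


def fixed(free, wanted, per_host):
    """Use exactly per_host slots on every host that is used."""
    if wanted % per_host:
        return None
    alloc = {}
    for host, slots in free.items():
        if wanted == 0:
            break
        if slots >= per_host:
            alloc[host] = per_host
            wanted -= per_host
    return alloc if wanted == 0 else None


free = {"a": 4, "b": 2, "c": 2}
print(fill_up(free, 4))       # {'a': 4}
print(round_robin(free, 4))   # {'a': 2, 'b': 1, 'c': 1}
print(fixed(free, 4, 2))      # {'a': 2, 'b': 2}
```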
In this comment I would like to sketch out a scenario that cannot be
addressed with the existing allocation rules: the exclusive use of the
master host by the master task, while all other hosts use the
fill-up allocation rule. This can come in handy if the master task of a
job requires a lot of memory while the slave tasks do the computation
and only one machine with a lot of memory is available. The big machine
can and should run multiple master tasks of this kind of job.
There are two solutions to the problem. One could split the memory-intensive
computation out into an extra job and work with job
dependencies, or one configures N1GE to handle the above use case as
specified without any job modifications.
I have the following setup:
qstat -f
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
all.q@big                      BIP   0/4       0.02     sol-sparc64
----------------------------------------------------------------------------
small.q@small1                 BIP   0/1       0.00     lx24-amd64
----------------------------------------------------------------------------
small.q@small2                 BIP   0/1       0.02     sol-sparc64
And a configured pe in all queue instances:
qconf -sp make
pe_name           make
slots             999
user_lists        NONE
xuser_lists       NONE
start_proc_args   NONE
stop_proc_args    NONE
allocation_rule   $fill_up
control_slaves    TRUE
job_is_first_task FALSE
urgency_slots     min
We now go ahead and change the load_thresholds of the all.q@big queue
instance to use a load value that is not used in the other queue
instances, such as:
qconf -sq all.q
qname                 all.q
hostlist              big
seq_no                0
load_thresholds       NONE,[big=load_avg=4]
The load threshold used has to be a real load value and cannot be a
fixed or consumable value.
The next step to make our environment work is to change the scheduler
configuration to the following:
qconf -ssconf
algorithm                         default
schedule_interval                 0:2:0
maxujobs                          0
queue_sort_method                 load
job_load_adjustments              load_avg=4.000000
load_adjustment_decay_time        0:0:1
By changing the scheduler configuration to use
job_load_adjustments like this, the scheduler adds an artificial load to each
host that will run a task. With this configuration we can start one
task on the big machine in each scheduling run. Since the
load_adjustment_decay_time is only 1 second, the scheduler has
forgotten about the artificial load by the next scheduling run and can
start a new task on the big host. This way, we achieve what we have
been looking for.
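The effect of the artificial load can be followed in a small simulation. The Python sketch below is a simplified model, not the scheduler's real algorithm: it assumes that hosts are tried in the listed order, that a queue instance stops accepting tasks once its adjusted load reaches the threshold, and that the adjustment has fully decayed before the next run. All names are hypothetical; the numbers come from the configuration above.

```python
ADJUSTMENT = 4.0   # job_load_adjustments load_avg=4.000000


def schedule_run(hosts, wanted):
    """Place up to `wanted` tasks in one scheduling run; return the host names used."""
    # The artificial load starts from the real load in every run,
    # because the 1 second decay time is far shorter than the schedule interval.
    adjusted = {host["name"]: host["load"] for host in hosts}
    placement = []
    for host in hosts:
        free = host["free"]
        while wanted > 0 and free > 0:
            threshold = host["threshold"]
            if threshold is not None and adjusted[host["name"]] >= threshold:
                break                            # load threshold reached
            placement.append(host["name"])
            adjusted[host["name"]] += ADJUSTMENT  # artificial load per task
            free -= 1
            wanted -= 1
    return placement


hosts = [
    {"name": "big", "free": 4, "load": 0.02, "threshold": 4.0},
    {"name": "small1", "free": 1, "load": 0.00, "threshold": None},
    {"name": "small2", "free": 1, "load": 0.02, "threshold": None},
]

first_run = schedule_run(hosts, 3)
print(first_run)    # ['big', 'small1', 'small2']

# Next run: the small hosts are full, but big accepts one task again.
hosts[0]["free"], hosts[1]["free"], hosts[2]["free"] = 3, 0, 0
second_run = schedule_run(hosts, 1)
print(second_run)   # ['big']
```

Without the threshold on the big host, the first run in this model would place all three tasks there, which is exactly what the configuration is meant to prevent.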
One important note:
The big machine is only allowed to have one queue instance, or all
queue instances of the big machine have to share the same load
threshold. If that is not the case, it will not work.
( Apr 25 2006, 10:37:37 AM CEST )