Melvin Koh's Weblog

I'm just a contractor


Thursday May 10, 2007


Sun Grid Engine 6.1 - Name changes

Sun Grid Engine (note that the N1 has been dropped) has a new version. The latest version, 6.1, contains several new features, but the one I anticipated most is the new resource quota capability. We had customers asking for this feature a long time ago, and the only way to achieve it was with plenty of scripting. Now SGE 6.1 supports defining fine-grained resource limits at the user, queue and host level. A new command, "qquota", has been introduced for this feature. For more details about resource quotas, see here.
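To give a rough idea of what such a limit looks like, here is a minimal sketch that prints a resource quota set capping every user at a fixed number of slots. The rule set name max_slots_per_user and the limit of 5 are made up for illustration, and the layout is an assumption based on the sge_resource_quota(5) format, so check it against your own installation.

```shell
#!/bin/sh
# Print a resource quota set that limits each user to a given number of slots.
# The rule set name and the default limit are hypothetical examples.
make_rqs() {
    limit="${1:-5}"
    cat <<EOF
{
   name         max_slots_per_user
   description  "Cap the slots any single user can occupy"
   enabled      TRUE
   limit        users {*} to slots=$limit
}
EOF
}

make_rqs 5
```

Saving the output to a file and loading it with "qconf -Arqs <file>" should add the rule set, after which "qquota -u <user>" should report that user's consumption against it.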


Posted by melvin (May 10 2007, 02:26:44 PM SGT)

Tuesday September 19, 2006


N1GE Scheduler

Stephan Grell from the N1GE team has left Sun. Although I've never met him, I've read a lot of his postings on the mailing list, which were always very helpful and knowledgeable, especially about N1GE scheduling. Stephan posted plenty of N1GE tips in his blog, so before his blog account gets deleted, I'll post some of his past entries here.

From Stephan Grell's profiling blog entry:

N1GE 6 - Profiling

The Grid Engine software provides a profiling facility to determine where the qmaster and the scheduler spend their time. This was introduced long before the N1GE 6 software. With the development of N1GE 6 it was greatly improved, and its improvement continued over the different updates we had for the N1GE 6 software. It was used very extensively to analyse bottlenecks and find misconfigurations in existing installations. Until now, the source code was the only documentation for the output format, which might change with every new update and release. Lately a document was added to the source repository to give a brief overview of the output format and the different switches. The document is not complete, though it is a good start.

Profiling document

Posted by melvin (Sep 19 2006, 11:29:25 AM SGT)

Tuesday July 04, 2006


Manage complex using hostgroups

A customer of mine is complaining that managing the host complexes in their cluster is extremely tedious: they have ~200 exec hosts, and each host has to be modified individually. Since their hosts are already grouped in hostgroups, why not use the hostgroups to ease the management?

E.g. setcomplex @hostgroupA mycomplex=5
will automatically set the complex on each of the hosts listed in @hostgroupA. I hacked up a simple script, written in Perl:

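#!/usr/bin/perl
use strict;
use warnings;

# Usage: setcomplex @hostgroup complex=value
die "Usage: $0 \@hostgroup complex=value\n" unless @ARGV == 2;
my ($group, $complex) = @ARGV;

# Expected output: a "group_name" line, then "hostlist host1 host2 \ ..."
my @lines = `qconf -shgrp $group`;
die "qconf -shgrp $group failed\n" if $?;
shift(@lines);                      # drop the group_name line
my $string = join("", @lines);
$string =~ s/\\//g;                 # remove every line-continuation backslash
my @hosts = split(" ", $string);
shift(@hosts);                      # drop the "hostlist" keyword

foreach my $host (@hosts) {
    # List form of system() avoids shell quoting problems
    system("qconf", "-mattr", "exechost", "complex_values", $complex, $host);
}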


Of course, we don't have to stop here. We can extend the script to perform many other host-specific management tasks. The flexibility of hostgroups allows us to define many hostgroups for many purposes.
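Before letting a script like this loose on ~200 hosts, it helps to see which commands it would run. The following is a minimal shell sketch of a dry run, assuming that "qconf -shgrp" prints a group_name line followed by a hostlist that may be continued with backslashes. The function name hostgroup_hosts and the node names are hypothetical, and nothing is modified.

```shell
#!/bin/sh
# Read "qconf -shgrp" style output on stdin and print one host per line.
hostgroup_hosts() {
    tr -d '\\' | awk '
        $1 == "group_name" { next }
        { for (i = 1; i <= NF; i++) if ($i != "hostlist" && $i != "NONE") print $i }
    '
}

# Dry run on sample text (hypothetical host names); nothing is changed.
sample='group_name @hostgroupA
hostlist node1 node2 \
         node3'
printf '%s\n' "$sample" | hostgroup_hosts | while read -r host; do
    echo "qconf -mattr exechost complex_values mycomplex=5 $host"
done
```

On a real cluster the sample text would be replaced by "qconf -shgrp @hostgroupA | hostgroup_hosts". Nested hostgroups (entries beginning with @) are passed through unresolved in this sketch.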

Posted by melvin (Jul 04 2006, 05:49:13 PM SGT)

Thursday June 29, 2006


Qmaster Monitoring

Stephan has described a very detailed way of monitoring the qmaster, which will be useful for performance tuning.

Qmaster Monitoring

Posted by melvin (Jun 29 2006, 10:36:52 AM SGT)

Monday June 12, 2006


Avoid oversubscription for overlapping queues

It is common to have multiple queues for different purposes when designing an N1GE cluster. A large cluster that I designed for AIST, the F32 cluster, uses 4 specific queues and 1 general queue (all.q). The 4 queues have different ACLs for different groups of users, and all.q overlaps these queues. Since each host has 2 processors, the queues are configured with 2 slots per host, so it is possible for 4 jobs to run on a single host (2 in all.q + 2 in a specific queue). Here is a tip on how to prevent oversubscription in the overlapping queues.

The trick is to assign the slots complex to each execution host and set its value to the number of processors the host has. E.g.

qconf -me <exec_host>
..
complex_values slots=<number_of_cpus_or_slots>
...


Now the total number of jobs across the queues running on this host will not be more than the value assigned.
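To confirm that a host actually carries the limit, the exec host configuration can be inspected. Below is a minimal sketch, assuming that "qconf -se <exec_host>" prints a complex_values line holding a comma-separated list on a single line. The function name host_slot_cap and the sample host are hypothetical.

```shell
#!/bin/sh
# Read "qconf -se" style output on stdin and print the slots value, if any.
host_slot_cap() {
    awk '$1 == "complex_values" {
        n = split($2, kv, ",")
        for (i = 1; i <= n; i++)
            if (kv[i] ~ /^slots=/) { sub(/^slots=/, "", kv[i]); print kv[i] }
    }'
}

# Sample text standing in for real qconf output (hypothetical host).
printf '%s\n' \
    'hostname              node1' \
    'load_scaling          NONE' \
    'complex_values        slots=2,mycomplex=5' | host_slot_cap
```

On a real cluster this would be used as "qconf -se node1 | host_slot_cap". An empty result means no slots limit is set on that host.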


Posted by melvin (Jun 12 2006, 11:32:33 AM SGT)

Monday June 05, 2006


Failover using Shadow Host

A) Pre-installation environment setup for N1GE6

1. Copy all the necessary N1GE binary files onto the system, unzip them and put them together in one directory (e.g. /opt/n1ge6u8 = $SGE_ROOT)

2. Ensure all the services and configuration are set up before the actual N1GE installation (the services must be available at boot)

i) Ensure that the NFS servers and NFS clients are configured correctly
ii) Ensure that the required users (sgeadmin and the normal users) are created and are able to write to their own directories
iii) Ensure the hostnames of all machines are in /etc/hosts with the appropriate IP addresses if they are not in the DNS
iv) Ensure the port numbers for the N1GE qmaster and execution daemons are added to /etc/services (sge_qmaster 536/tcp, sge_execd 537/tcp)
v) Ensure the RPC services (server and client) are set up correctly (e.g. rpcinfo -p, /sbin/service portmap status, /sbin/service nfs start).
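The /etc/services check in step iv) is easy to script so that it can be repeated on every machine. A minimal sketch, using the port numbers given above; the function name has_service is hypothetical.

```shell
#!/bin/sh
# Succeed if the services file maps the given name to the given port/protocol.
# $1 = service name, $2 = port/protocol, $3 = file to check (default /etc/services)
has_service() {
    awk -v name="$1" -v port="$2" \
        '$1 == name && $2 == port { found = 1 } END { exit !found }' \
        "${3:-/etc/services}"
}

for entry in "sge_qmaster 536/tcp" "sge_execd 537/tcp"; do
    set -- $entry
    if has_service "$1" "$2"; then
        echo "ok: $entry"
    else
        echo "missing: $entry"
    fi
done
```

Commented-out entries are not counted, because the first field of such a line is "#" rather than the service name.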

B) Installation of Berkeley DB Spooling Server

1. Run the './inst_sge -db' command on the server that you have assigned as the RPC spooling server. Note that the DB spooling server must not be the qmaster server. Accept the default for every option and write down the values of these two fields after installation:
- Spooling server name
- DB spooling directory

2. Verify that:
- the sgebdb startup script /etc/rc.d/init.d/sgebdb has been created
- the sgebdb daemon is running on the spooling DB server (ps -ef | grep sge)

C) Installation of Qmaster, Execution Host

1. Install the qmaster by invoking ./install_qmaster

i) Select the Berkeley DB option
ii) Choose “Y” when you are prompted to use the DB spooling server
iii) Specify the spooling server name and DB spooling directory when prompted for them

2. Verify that the qmaster is installed successfully by typing the command “ps -ef | grep sge” and checking that sge_qmaster and sge_schedd are running

D) Installation of Shadow Host

1. Type ./inst_sge -sm

2. Verify that:
- the sge_shadowd daemon is running (ps -ef | grep sge)
- there is an entry in the $SGE_ROOT/$SGE_CELL/common/shadow_masters file
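The second check can be scripted. This is a minimal sketch, assuming the shadow_masters file lists one host name per line; the function name is_shadow_master is hypothetical, and the fallback values /opt/n1ge6u8 and "default" are only placeholders for when $SGE_ROOT and $SGE_CELL are not set.

```shell
#!/bin/sh
# Succeed if the host appears on its own line in the shadow_masters file.
# $1 = host name, $2 = path to shadow_masters
is_shadow_master() {
    grep -Fxq "$1" "$2"
}

# Placeholder defaults; on a real system SGE_ROOT and SGE_CELL are already set.
file="${SGE_ROOT:-/opt/n1ge6u8}/${SGE_CELL:-default}/common/shadow_masters"
if [ -r "$file" ] && is_shadow_master "$(hostname)" "$file"; then
    echo "$(hostname) is listed in $file"
else
    echo "$(hostname) is not listed in $file (or the file is unreadable)"
fi
```

The match is exact, so a file that holds fully qualified names will not match a short host name, and vice versa.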

E) Important Environment Variables


To change how long the shadow host waits before taking over after the master host goes down, set the following environment variables:

SGE_CHECK_INTERVAL – controls the interval at which sge_shadowd checks the heartbeat file (60 seconds by default)

SGE_GET_ACTIVE_INTERVAL – controls the interval at which a sge_shadowd instance tries to take over when the heartbeat file has not changed

SGE_DELAY_TIME – controls the interval for which sge_shadowd pauses if a takeover bid fails. It is used only when there is more than one shadow host
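These variables have to be present in the environment from which sge_shadowd is started, for example in its startup script. A minimal sketch follows; the values are arbitrary examples in seconds, not recommendations.

```shell
#!/bin/sh
# Hypothetical values, in seconds; tune them for your own site.
SGE_CHECK_INTERVAL=30        # how often sge_shadowd checks the heartbeat file
SGE_GET_ACTIVE_INTERVAL=120  # takeover attempt interval once the heartbeat stops changing
SGE_DELAY_TIME=300           # pause after a failed takeover bid (multiple shadow hosts only)
export SGE_CHECK_INTERVAL SGE_GET_ACTIVE_INTERVAL SGE_DELAY_TIME

echo "check=$SGE_CHECK_INTERVAL active=$SGE_GET_ACTIVE_INTERVAL delay=$SGE_DELAY_TIME"
```

Shorter intervals make the shadow host react faster, at the cost of a higher risk of a takeover during a brief NFS or network hiccup.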

F) Verification of Failover

To verify that the shadow host setup is correct, we need to simulate a qmaster failure so that the shadow daemon will be activated.

Note: A common mistake when simulating the failure is to stop the qmaster daemon using "sgemaster stop" or even a plain "kill". These commands shut the qmaster down gracefully, which is equivalent to a normal shutdown of the service, and the shadow host will not take over under these circumstances. When the qmaster shuts down normally, it creates an empty "lock" file under the "$SGE_ROOT/$SGE_CELL/spool/qmaster/" directory. If the shadow daemon sees this file, it will never activate the failover. Thus, the proper way to test the failover is to stop the qmaster daemon with "kill -9". It is fine to kill the "sge_schedd" daemon as well, although it is not really necessary.

i) Verify that the shadow daemon is running on the shadow host (ps -fe | grep sge)
ii) Kill the qmaster (kill -9 on the sge_qmaster process)
iii) Wait for the interval specified for the shadow host to take over (the default is about 10 minutes)
iv) Verify that the qmaster and scheduler daemons are started on the shadow host (ps -fe | grep sge)
v) The handover messages are logged in the following files under the $SGE_ROOT/$SGE_CELL/spool/qmaster directory:
- messages_qmaster
- messages_shadowd
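If the shadow host never takes over during the test, a leftover "lock" file from an earlier graceful shutdown is the first thing to rule out. A minimal sketch of that check; the function name takeover_blocked is hypothetical, and the fallback values /opt/n1ge6u8 and "default" are only placeholders for when $SGE_ROOT and $SGE_CELL are not set.

```shell
#!/bin/sh
# Report whether a "lock" file in the qmaster spool directory would block takeover.
# $1 = qmaster spool directory
takeover_blocked() {
    [ -e "$1/lock" ]
}

# Placeholder defaults; on a real system SGE_ROOT and SGE_CELL are already set.
spool="${SGE_ROOT:-/opt/n1ge6u8}/${SGE_CELL:-default}/spool/qmaster"
if takeover_blocked "$spool"; then
    echo "lock file present in $spool: the shadow daemon will not take over"
else
    echo "no lock file in $spool"
fi
```

While waiting for the takeover, "tail -f" on messages_shadowd in the same directory shows the handover as it happens.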

Posted by melvin (Jun 05 2006, 11:23:06 PM SGT)


This is a personal weblog, I do not speak for my employer.