sge6u4 and mpich integration for solaris 10 x64
Posted at 09:09PM Aug 07, 2005 by hstsao in N1 Grid | Comments[1]
lustre and N1GE6 experience
Posted at 08:51PM Aug 07, 2005 by hstsao in N1 Grid | Comments[6]
rocks and v20z
Since I have only one node, I could not try out the compute node.
This version still contains sge5p3.
It will support any flavor of RHEL4, e.g. CentOS.
Since this initial posting, Rocks 4 now also has an sge6u4 roll and a new, improved roll from Scalable Systems.
Rocks 4 also works with rhel4u1, which has dual-core Opteron support.
To access the SP of the v40z, one might need the IPMItool roll so that one can have in-band or out-of-band IPMI control.
Posted at 07:41PM Jun 13, 2005 by hstsao in N1 Grid | Comments[4]
ARCo for sles9sp1 and n1ge6u6
We put down some observations on installing ARCo with SLES9SP1 and N1GE6U6.
There is a patch for U6 for AMD and common.
The main point is that in a full installation of SLES9SP1, postgresql-7.4.6-02 is already installed.
Posted at 09:14PM Jun 04, 2005 by hstsao in N1 Grid | Comments[2]
integration of schrodinger with SGE
We present a POC of integrating Schrodinger with SGE.
I received good advice from help@schrodinger.com.
Schrodinger can use the MPI environment to run on multiple CPUs across multiple systems.
In a shared grid environment, one would like to submit jobs through a queue system.
Schrodinger supports the queue systems NQS, PBS and LSF, so we need to create a similar environment so that users can use SGE.
Under the Schrodinger root directory there is a queues directory, and it contains NQS, PBS and LSF.
We use cp -a NQS SGE to copy the NQS environment.
There are five files:
cancel, config, status.pl, submit and templates.sh
First, one needs to update the variables:
QPATH=/opt/gridengine/bin/lx24-amd64   # changed
QDEL=qdel     # same as NQS
QSUB=qsub     # same as NQS
QSTAT=qstat   # same as NQS
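Before submitting anything, it can help to confirm that the commands named in config really exist under QPATH. The following is only a minimal sketch: check_queue_cmds is a hypothetical helper name of my own, not something shipped with Schrodinger or SGE.

```shell
#!/bin/sh
# Hypothetical helper (not part of Schrodinger or SGE): report every
# queue command that is not an executable file under the given directory.
check_queue_cmds() {
    qpath=$1
    shift
    missing=0
    for cmd in "$@"; do
        if [ ! -x "$qpath/$cmd" ]; then
            echo "missing: $qpath/$cmd"
            missing=1
        fi
    done
    # Returns 0 when every command was found, 1 otherwise.
    return $missing
}

# Example call with the values from config:
# check_queue_cmds /opt/gridengine/bin/lx24-amd64 qsub qdel qstat
```

A non-zero return means at least one of qsub, qdel or qstat would not be found by the queue scripts.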
templates.sh got the most updates:
#!/bin/sh
#$ -N %NAME%
#$ -o %LOGDIR%/%JOBID%.qlog
#$ -j y
#$ -pe mpich %NPROC%
QPATH=/opt/gridengine/bin/lx24-amd64
curdir=`echo $0 |sed -e 's#/[^/]*$##'`
if [ -f "$curdir/config" ]; then
. $curdir/config
fi
PATH=$QPATH:$PATH
export SCHRODINGER_BATCHID
SCHRODINGER_BATCHID=$JOB_ID   # from SGE
SCHRODINGER_NODEFILE=$TMPDIR/machines   # from SGE
export SCHRODINGER_NODEFILE
%ENVIRONMENTS%
%COMMAND%
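To get a feel for what the finished job script looks like, one can fill in the %...% placeholders by hand. The sed-based render_template below is only an illustration with made-up values. How Schrodinger itself expands the template is not covered in this post, and the function name is hypothetical.

```shell
#!/bin/sh
# Illustration only: substitute a few of the template placeholders.
# Reads the template on stdin and writes the rendered script to stdout.
# $1 = job name, $2 = log directory, $3 = job id, $4 = processor count
# Assumes none of the values contains a '#' (used as the sed delimiter).
render_template() {
    sed -e "s#%NAME%#$1#g" \
        -e "s#%LOGDIR%#$2#g" \
        -e "s#%JOBID%#$3#g" \
        -e "s#%NPROC%#$4#g"
}

# Example:
# render_template myjob /tmp/logs 42 4 < templates.sh
```

With %NPROC% set to 4, the PE request line becomes "#$ -pe mpich 4", which is what SGE reads when the script is submitted.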
The schrodinger_hosts file needs an updated entry:
name: localhost
schrodinger: /opt/schrodinger35
env: SCHRODINGER_RSH=ssh
env: SCHRODINGER_RCP=scp
name: testcluster
host: testcluster.local
hostname: testcluster.local
processors: 16
tmpdir: /state/partition1
name: sge
host: testcluster.local
hostname: testcluster.local
Queue: SGE
Qargs: ""
processors: 16
tmpdir: /state/partition1
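A quick way to see which entries go through a queue is to scan the file for Queue: lines. This is a rough sketch that assumes the simple "key: value" layout shown above. queue_entries is a hypothetical name, and the real file format may allow more than this handles.

```shell
#!/bin/sh
# Sketch: print "entry-name -> queue" for every entry that has a Queue: line.
# Assumes one "key: value" pair per line and that each entry starts
# with a "name:" line, as in the listing above.
queue_entries() {
    awk -F': *' '
        $1 == "name"  { name = $2 }
        $1 == "Queue" { print name " -> " $2 }
    ' "$1"
}

# Example:
# queue_entries schrodinger_hosts
```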
Posted at 09:35AM Apr 09, 2005 by hstsao in N1 Grid | Comments[5]
Integration of gaussian 03 with SGE
Recently, in a POC at a customer site, we used the ROCKS cluster software.
One of the requirements for the POC is running Gaussian 03 under SGE.
In a Linux environment, g03 can use the Linda parallel environment to use multiple processors on multiple nodes.
For simplicity we just use SGE's PE mpi.
The hardest part of this exercise is understanding how to run g03 and how g03 works with a queue system.
I worked with the customer to create an input file and learn how to run g03l, read through documentation and examples on the web covering g03 with PBS and g03 with SGE, and got an example g03 script without SGE from Barbara Perz.
I then created two scripts:
one is for the customer to change the input file and the number of nodes for the g03 jobs;
one is the driver for the g03 input file.
Our s.csh is very simple.
It takes two arguments:
$1=number of %Nprocl +1
$2=input file
qsub -pe mpi
Under g03l one can specify the number of processes used by Linda.
One can add %Nprocl to the input file, and since Linda itself also uses one processor, we allocate %Nprocl + 1 slots from SGE's PE mpi environment.
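The slot arithmetic can be sketched as a small sh function that only builds the qsub command line, without submitting anything. This is not the original s.csh: build_qsub_cmd and the driver name g03driver.csh are placeholders I made up for illustration.

```shell
#!/bin/sh
# Sketch of the wrapper idea. Prints the qsub command instead of running it.
# $1 = number of slots to request (%Nprocl + 1, as described above)
# $2 = Gaussian input file
# g03driver.csh is a hypothetical name for the driver script.
build_qsub_cmd() {
    slots=$1
    input=$2
    # One slot goes to Linda itself, so at least 2 are needed
    # for %Nprocl to be 1 or more.
    if [ "$slots" -lt 2 ]; then
        echo "need at least 2 slots (%Nprocl + 1)" >&2
        return 1
    fi
    echo "qsub -pe mpi $slots g03driver.csh $input"
}

# Example: request 5 slots so that %Nprocl ends up as 4
# build_qsub_cmd 5 water.com
```

Inside the job, SGE exposes the granted slot count as NSLOTS, so the driver can recover %Nprocl as NSLOTS - 1.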
We use the user's input to change the input file, then submit the job to g03l:
#!/bin/csh -f
setenv g03root
source $g03root/g03/bsd/g03.login
# TMPDIR is SGE's per-job tmpdir
set scratch=$TMPDIR
setenv GAUSS_SCRDIR $scratch
#$ -cwd
#$ -j y
# SGE's nodefile, assigned by -pe mpi
set nodefile=$TMPDIR/machines
# NSLOTS comes from -pe mpi; one slot is reserved for Linda itself
set ncpus=`expr $NSLOTS - 1`
# input file given on the command line
set input=$1
# new input file
set tmpinp=tmp$$
echo "%nprocl=$ncpus" > $tmpinp
cat $input >> $tmpinp
# copy to a name with the .com extension
cp $tmpinp $tmpinp.com
# assign 2 processors per node
setenv GAUSS_LFLAGS "-vv -mp 2 -nodefile $nodefile"
g03l $tmpinp.com $input.$ncpus.log
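The input-preparation step of the driver (put a %nprocl line in front of the user's input) can also be tried on its own, outside of SGE. Here is a minimal sketch in plain sh. prepend_nprocl is a hypothetical name, and the file names in the example are placeholders.

```shell
#!/bin/sh
# Sketch: write a new Gaussian input file that starts with a %nprocl line
# followed by the unchanged contents of the original input.
# $1 = number of Linda workers, $2 = original input, $3 = new input file
prepend_nprocl() {
    ncpus=$1
    input=$2
    output=$3
    {
        printf '%%nprocl=%s\n' "$ncpus"
        cat "$input"
    } > "$output"
}

# Example:
# prepend_nprocl 4 water.com tmp_water.com
```

Writing to a separate output file leaves the user's original input untouched, which is also what the driver does with its temporary file.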
Posted at 07:08PM Apr 07, 2005 by hstsao in N1 Grid | Comments[3]
certificate renewal in SGE 5.3
A question on certificate renewal in SGE 5.3.
After some investigation, the following are some observations
Posted at 11:40PM Jan 26, 2005 by hstsao in N1 Grid | Comments[1]
n1ge6 installation in Fedora Core 2
I need to do a workshop at a customer site that is using FC2, so I need to install N1GE6 on FC2.
The following is my experience.
I install FC2 in the desktop environment.
Since FC2 uses kernel 2.6, inst_sge -m fails and also complains that "strings" is not found.
I install binutils-2.15.90.0.3-5.i386.rpm to get strings.
I create links named lx26-x86 under bin, utilbin and lib, pointing to lx24-x86.
Now the installation is fine, and it is actually running the lx24-x86 binaries.
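The link step can be sketched as below, assuming the goal is a lx26-x86 name that resolves to the existing lx24-x86 directory in each of bin, utilbin and lib. link_arch_dirs is a hypothetical helper name, and the SGE root in the example is a placeholder.

```shell
#!/bin/sh
# Sketch: under bin, utilbin and lib of the given SGE root, create a
# relative symlink lx26-x86 -> lx24-x86.
link_arch_dirs() {
    sge_root=$1
    for d in bin utilbin lib; do
        # Skip when lx24-x86 is absent or lx26-x86 already exists.
        if [ -d "$sge_root/$d/lx24-x86" ] && [ ! -e "$sge_root/$d/lx26-x86" ]; then
            ln -s lx24-x86 "$sge_root/$d/lx26-x86"
        fi
    done
}

# Example (placeholder path):
# link_arch_dirs /opt/n1ge
```

The function can be run more than once safely, because existing links are left alone.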
For qmon to work I need to install openmotif21-2.1.30-9.i386.rpm.
I also install ypserv-2.12.1-2.i386.rpm on the master host, which will be the NIS master.
For the ARCo installation, I basically follow my SPARC installation blog post, with some differences.
Posted at 06:06AM Dec 07, 2004 by hstsao in N1 Grid |
Install N1Grid Engine 6 and compile postgresql on Sparc
To use the ARCo feature of N1GE6, one needs a database to store the data. At this point in time N1GE6 supports PostgreSQL and Oracle.
To use PostgreSQL on a SPARC system, one needs to download the source code and compile it.
Since most GNU-licensed software prefers gcc, one needs to install the Solaris 9 companion CD first.
N1GE6 has an update, u1, in the form of patches for SPARC:
The latest patch requires postgresql-7.4.2. Since I do not know much about PostgreSQL, we download that version, postgresql-7.4.2.tar.gz, from http://www.postgresql.org/
I run gunzip -c postgresql-7.4.2.tar.gz | tar xvf -
cd postgresql-7.4.2
./configure
and it fails.
Examining config.log,
it complains about libreadline.so.4 and about the version of bison.
Even though I set LD_LIBRARY_PATH to include /opt/sfw/lib, and libreadline.so.4 is under /opt/sfw/lib,
ld still cannot find the library (this means I do not know much about how gcc works), so I copy libreadline.so.4 to /usr/lib.
(patrick@zill.net suggested that I run crle -u -l /opt/sfw/lib.
Actually, after I rebooted the machines all was well, and I did not need to copy the library to /usr/lib :-) )
need to run ldconfig /opt/sfw/lib so ld can include /opt/sfw/lib
It works. I finish ./configure and run make.
To be safe, I also download bison-1.876d.tar.gz from http://sunfreeware.com/, then compile and install the new bison under /usr/local/bin.
I re-run ./configure, and it no longer complains about the version of bison.
I run make clean, make and make install.
I now have postgresql-7.4.2 installed under /usr/local/pgsql.
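For reference, the steps that ended up mattering can be collected in one place. The sketch below only prints the commands, so they can be reviewed before running anything. pgsql_build_steps is a hypothetical name, and the crle line is the suggestion quoted above, which may or may not be needed on a given system.

```shell
#!/bin/sh
# Sketch: print the build steps for a given PostgreSQL version.
# Nothing is executed here; the output is meant to be read first.
pgsql_build_steps() {
    version=$1
    cat <<EOF
crle -u -l /opt/sfw/lib
gunzip -c postgresql-$version.tar.gz | tar xvf -
cd postgresql-$version
./configure
make
make install
EOF
}

# Example:
# pgsql_build_steps 7.4.2
```

With no --prefix given to ./configure, make install puts the result under /usr/local/pgsql, which matches where the installation ended up here.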
The following is my experience in setting up ARCo.
It follows chapter 8 of the installation guide very closely, with minor modifications.
On step 10:
The ARCo web application connects to the database
with a user which has restricted access.
The name of this database user is needed to grant
him access to the sge tables.
Please enter the name of this database user [arco_read] >>
Upgrade to database model version 1 ... Install version 6.0 (id=0) -------
Create table sge_job
Create index sge_job_idx0
Create index sge_job_idx1
create table sge_job_usage
Create table sge_job_log
Create table sge_job_request
Create table sge_queue
Create index sge_queue_idx0
Create table sge_queue_values
Create index sge_queue_values_idx0
Create table sge_host
Create index sge_host_idx0
Create table sge_host_values
Create index sge_host_values_idx0
Create table sge_department
Create index sge_department_idx0
Create table sge_department_values
Create index sge_department_values_idx0
Create table sge_project
Create index sge_project_idx0
Create table sge_project_values
Create index sge_project_values_idx0
Create table sge_user
Create table sge_user_values
Create index sge_user_values_idx0
Create table sge_group
Create index sge_group_idx0
Create table sge_group_values
Create index sge_group_values_idx0
Create table sge_share_log
Create view view_accounting
Create view view_job_times
Create view view_jobs_completed
Create view view_job_log
Create view view_department_values
Create view view_group_values
Create view view_host_values
Create view view_project_values
Create view view_queue_values
Create view view_user_values
revoke privileges from sge_department
revoke privileges from sge_department_values
revoke privileges from sge_group
revoke privileges from sge_group_values
revoke privileges from sge_host
revoke privileges from sge_host_values
revoke privileges from sge_job
revoke privileges from sge_job_log
revoke privileges from sge_job_request
revoke privileges from sge_job_usage
revoke privileges from sge_project
revoke privileges from sge_project_values
revoke privileges from sge_queue
revoke privileges from sge_queue_values
revoke privileges from sge_share_log
revoke privileges from sge_user
revoke privileges from sge_user_values
grant privileges to view_accounting
grant privileges to view_department_values
grant privileges on sge_department to arco_read
grant privileges on sge_department_values to arco_read
grant privileges on sge_group to arco_read
grant privileges on sge_group_values to arco_read
grant privileges on sge_host to arco_read
grant privileges on sge_host_values to arco_read
grant privileges on sge_job to arco_read
grant privileges on sge_job_log to arco_read
grant privileges on sge_job_request to arco_read
grant privileges on sge_job_usage to arco_read
grant privileges on sge_project to arco_read
grant privileges on sge_project_values to arco_read
grant privileges on sge_queue to arco_read
grant privileges on sge_queue_values to arco_read
grant privileges on sge_share_log to arco_read
grant privileges on sge_user to arco_read
grant privileges on sge_user_values to arco_read
grant privileges on view_job_log to arco_read
grant privileges on view_job_times to arco_read
grant privileges on view_jobs_completed to arco_read
grant privileges on view_project_values to arco_read
grant privileges on view_queue_values to arco_read
grant privileges on view_user_values to arco_read
commiting changes
version 6.0 (id=0) successfully installed
Install version 6.0u1 (id=1) -------
Create table sge_version
Update view view_job_times
Update version table
commiting changes
version 6.0u1 (id=1) successfully installed
OK
At this time the console will not start because nothing has been registered.
Posted at 06:33AM Sep 19, 2004 by hstsao in N1 Grid | Comments[2]
HA-GRID: N1Grid Engine 6 Edition with Berkeley DB spooling
N1GE6 introduces spooling with Berkeley DB. There are new issues to consider when we want to provide an HA N1GE6 service with the Java ES Cluster service.
If we assume that SGE_ROOT=/opt/n1ge and cell=default:
The start script: $SGE_ROOT/$cell/common/sgemaster start
The stop script: $SGE_ROOT/$cell/common/sgemaster stop
Using the SunPlex agent builder one can easily create a sge_qmaster agent and a SUNWmsge package.
We also need another agent to modify the host_aliases file.
In this case we install Berkeley DB on a different node pair.
This node pair will run the HA-NFS and HA-BDB agents.
HA-NFS comes with SunCluster and it will serve the
HA-BDB will serve Berkeley DB failover.
Once BDB is installed with the inst_sge -db command,
we use the SunPlex agent builder to build an HA agent that controls the start and stop of the BDB.
The start script: $SGE_ROOT/$cell/common/sgebdb start
The stop script: $SGE_ROOT/$cell/common/sgebdb stop
The BDB spool directory: $SGE_ROOT/$cell/spooldb
From a previous blog posting one can create a SUNWbdb agent using this information.
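The start and stop commands that the agent needs can be derived from SGE_ROOT and the cell name. The following is a small sketch that only prints the command it would use. bdb_ctl is a hypothetical name, and the defaults are the ones assumed earlier in this post (/opt/n1ge and default).

```shell
#!/bin/sh
# Sketch: build the sgebdb start/stop command from SGE_ROOT and SGE_CELL.
# Prints the command rather than running it.
bdb_ctl() {
    action=$1
    sge_root=${SGE_ROOT:-/opt/n1ge}
    cell=${SGE_CELL:-default}
    case $action in
        start|stop)
            echo "$sge_root/$cell/common/sgebdb $action"
            ;;
        *)
            echo "usage: bdb_ctl start|stop" >&2
            return 1
            ;;
    esac
}

# Example:
# bdb_ctl start
```

Keeping the path construction in one function makes it easy to reuse the same values for both the start and the stop method of the agent.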
Posted at 04:24PM Jul 13, 2004 by hstsao in N1 Grid |
HA-GRID HowTo
In a future post we will describe how to use the SunCluster agent builder to create a custom agent.
Posted at 11:07AM Jun 08, 2004 by hstsao in N1 Grid |