Collection of mostly unrelated personal thoughts, experiences and opinions. Mike's Weblog

Thursday Jun 25, 2009

Introduction


The following is a PREVIEW of a detailed how-to for using xVM Ops Center 2.1 (xVMOC) to deploy the Sun HPC Software, Linux Edition 2.0 for RHEL 5.3. Why would you want to use xVMOC to deploy a cluster that will run Sun HPC Software, Linux Edition 2.0, since it already has its own provisioning tools? xVMOC can do detailed hardware monitoring and management, along with OS monitoring, all in a browser-based user interface. Provisioning the Sun HPC Software with xVMOC provides the integration that lets xVMOC deliver these monitoring and management features. Check back for a final paper in the coming weeks.


To learn more about the Sun HPC Software, Linux Edition 2.0, visit the web site at http://www.sun.com/software/products/hpcsoftware/. To learn about xVM Ops Center 2.1, visit http://www.sun.com/software/products/xvmopscenter/index.jsp.


An important note: This how-to goes beyond the limits of support for either software offering. One such notable limit is the use of RHEL 5.3 as the Operating System for the xVMOC server. This how-to is also not intended to be a comprehensive guide for using either software product. Refer to the appropriate product documentation as needed.


Installation of xVMOC 2.1 on RHEL 5.3


Configure RHEL on xVMOC server


Using a private network interface for the provisioning network is ideal. For example, the xVMOC server might access host service processors over eth0 and provision hosts over eth1.



  • Install RHEL 5.3. Skip the RHEL registration part (say No).


    • xVMOC wants at least a 70GB / partition

    • Be sure to have both the Developer and Web Server packages installed
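Since the installer complains later if the root partition is too small, it can be worth checking the size up front. The following is only a minimal sketch and not part of xVMOC: the function names fs_size_gb and check_root_size are made up for illustration, and the 70GB default simply mirrors the requirement noted above.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not shipped with xVMOC.

# fs_size_gb MOUNTPOINT: print the total size of that filesystem in whole GB.
fs_size_gb() {
    # df -P guarantees one line per filesystem; -k reports 1K blocks.
    df -Pk "$1" | awk 'NR==2 { printf "%d\n", $2 / 1024 / 1024 }'
}

# check_root_size [MIN_GB]: warn (and return 1) if / is smaller than MIN_GB.
check_root_size() {
    min="${1:-70}"
    size=$(fs_size_gb /)
    if [ "$size" -lt "$min" ]; then
        echo "WARNING: / is ${size}GB, xVMOC wants at least ${min}GB"
        return 1
    fi
    echo "OK: / is ${size}GB"
}

# Example: check_root_size 70
```

On a test system where the installer will be told to continue anyway, the warning is informational only.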



After RHEL 5.3 has been installed and is up and running at the command line, turn off the firewalls and disable SELinux:


  • Turn off firewall:

  • # chkconfig iptables off
    # chkconfig ip6tables off

  • Disable SELinux:

  • # echo "SELINUX=disabled" > /etc/selinux/config
    # chmod 644 /etc/selinux/config

  • Set umask for xVMOC:

  • Edit /etc/bashrc and change umask from 002 to 022.

  • Set a hostname if this was not done during the installation of RHEL.

  • # hostname hostname

  • Edit /etc/hosts and add an entry so that the hostname resolves to a regular IP address.

  • Reboot to make the settings take effect:

  • # shutdown -r now

  • The RHEL 5.3 and HPC Software 2.0 ISO images need to be accessible so that they can be mounted. Mount both ISO images under /var/www/html so they can be served via HTTP later. The mount point names must match the repo URLs used in the post installation scripts (rhel and sun_hpc_linux):

  • # mkdir -p /var/www/html/rhel
    # mkdir -p /var/www/html/sun_hpc_linux
    # mount -o loop rhel-server-5.3-x86_64-dvd.iso /var/www/html/rhel
    # mount -o loop sun-hpc-linux-rhel-2.0.iso /var/www/html/sun_hpc_linux

  • Create a yum repo entry for the RHEL 5.3 image that was just mounted:

  • # cat > /etc/yum.repos.d/rhel.repo << EOF
    [rhel]
    name=RHEL DVD
    baseurl=file:///var/www/html/rhel/Server
    enabled=0
    gpgcheck=0
    EOF

  • Install missing RHEL components needed by xVMOC:

  • # yum -y --enablerepo=rhel install expect
    # yum -y --enablerepo=rhel install ncompress
    # yum -y --enablerepo=rhel install dhcp
    # yum -y --enablerepo=rhel install xinetd
    # yum -y --enablerepo=rhel install tftp-server
    # yum -y --enablerepo=rhel install perl-DBI
    # yum -y --enablerepo=rhel install perl-DBD-Pg
    # yum -y --enablerepo=rhel install perl-XML-Parser
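One thing to be aware of in the SELinux step above: the echo overwrites the whole of /etc/selinux/config, including any SELINUXTYPE line. If you would rather keep the rest of the file, something like the following sketch changes only the SELINUX= line. The helper name disable_selinux is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: set SELINUX=disabled while keeping the other lines.
# Usage: disable_selinux [path]   (path defaults to /etc/selinux/config)
disable_selinux() {
    cfg="${1:-/etc/selinux/config}"
    tmp="${cfg}.tmp.$$"
    if grep -q '^SELINUX=' "$cfg" 2>/dev/null; then
        # Rewrite only the SELINUX= line; SELINUXTYPE= and comments are kept.
        sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && mv "$tmp" "$cfg"
    else
        # No SELINUX= line (or no file) yet: append one.
        echo "SELINUX=disabled" >> "$cfg"
    fi
    chmod 644 "$cfg"
}

# Example (as root): disable_selinux
```

As with the original command, a reboot is still needed for the change to take effect.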

    Run the xVMOC installer



    • Run the xVMOC install script from the untarred xVMOC tarball.

    • If there is not enough RAM, the installer will show an error. If this is for test purposes, hit c to continue.

    • If there is not ~70GB of free disk, the installer will complain. If this is for test purposes, hit c to continue.

    • The installer will complain about the RHEL release because RHEL 5.3 is not supported:

      • Error: Cannot determine Linux version. Error: Check that /etc/redhat-release or /etc/SuSE-release is present and that this OS and version is supported.
        Hit c to continue.

    • After the installer is done, finish the setup by going to the web browser address shown by the installer.

    • For test purposes, accept the defaults, but be sure to enable the Local Proxy services.

    After completing the xVM OC installation via the BUI, configure the local proxy service:


    # /opt/sun/xvmoc/bin/proxyadm stop -w
    # /opt/sun/xvmoc/bin/proxyadm configure -D isc -I eth1
    (eth1 would be the provisioning network interface)
    # /opt/sun/xvmoc/bin/proxyadm start -w

  • Set up web services so that the clients to be provisioned by xVMOC can access the yum repos.

  • xVM OC takes up several web service ports. To run an additional web service, different ports need to be configured. Below are the ports used in this example:

  • In /etc/httpd/conf/httpd.conf, change the Listen port from 80 to another port (1972 in this example).

  • In /etc/httpd/conf.d/ssl.conf, change the Listen port from 443 to another port (1443 in this example).

  • Start the http services:

  • # /etc/init.d/httpd start

  • Make sure httpd starts after a reboot:

  • # chkconfig httpd on

    This will provide yum repo access via http://hostname:1972/rhel and http://hostname:1972/sun_hpc_linux.
    Note that the port numbers need to be the same when creating the post installation scripts described later.
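The two Listen changes above can also be scripted instead of edited by hand. This is just a sketch: set_listen_port is a made-up helper name, and it assumes each file contains a plain "Listen 80" or "Listen 443" line, which is worth confirming on your own system first.

```shell
#!/bin/sh
# Hypothetical helper: change "Listen OLD" to "Listen NEW" in an Apache config.
# Usage: set_listen_port FILE OLD NEW
set_listen_port() {
    file="$1"; old="$2"; new="$3"
    # Keep a backup next to the original (files ending in .orig are not
    # picked up by the conf.d/*.conf include).
    cp -p "$file" "${file}.orig"
    sed "s/^Listen[[:space:]][[:space:]]*${old}[[:space:]]*\$/Listen ${new}/" \
        "${file}.orig" > "$file"
}

# Example with the ports used in this post (run as root):
#   set_listen_port /etc/httpd/conf/httpd.conf 80 1972
#   set_listen_port /etc/httpd/conf.d/ssl.conf 443 1443
```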


    Build OS image in xVM OC


    Creating the RHEL5.3 base OS image in xVM OC


    This example shows the command line method for creating an OS image. The BUI can also be used if preferred.


      Enter the n1sh shell:

      # /opt/sun/n1gc/bin/n1sh

      Create the OS image from the n1sh shell

      n1sh> create os rhel53 file /mnt/iso/rhel-server-5.3-x86_64-dvd.iso

      /mnt/iso/rhel-server-5.3-x86_64-dvd.iso is the path to the RHEL5.3 iso


      A job number will be given. Monitor this job until it is complete from the n1sh shell by:

      n1sh> show job hostname.jobnumber


    These commands can also be executed from the regular user shell:


    # /opt/sun/n1gc/bin/n1sh create os rhel53 file /mnt/iso/rhel-server-5.3-x86_64-dvd.iso
    # /opt/sun/n1gc/bin/n1sh show job hostname.jobnumber

    While the OS image is being created, gear to be managed can be discovered.


    Discover gear


    There are several methods for discovering gear, which are covered in the xVMOC manual.


    Build post installation scripts for the HPC Linux Software Stack


    These must be accessible to xVM OC. The following are examples for a head node, Lustre server, and a compute server.


    Head node post installation script:


    #!/bin/sh
    # located /root/hpc_linux_head_server
    # install HPC stack components for a head server utilizing the online stack repo

    echo "SELINUX=disabled" > /etc/selinux/config

    cat > /etc/yum.repos.d/hpc_stack.repo << EOF
    [hpc-stack]
    name=HPC Linux Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux
    enabled=1
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/hpc_stack_lustre.repo << EOF
    [hpc-stack-lustre]
    name=HPC Linux Lustre Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux/lustre
    enabled=1
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/rhel.repo << EOF
    [rhel]
    name=RHEL media repo
    baseurl=http://192.168.5.1:1972/rhel/Server
    enabled=1
    gpgcheck=0
    EOF

    # remove a few things
    rpm -e openib

    # add hpc stack items for head server

    echo -ne "\n\n### Install SunHPC OFED Infiniband Packages ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC OFED Infiniband Packages"
    echo -ne "\n\n### Install SunHPC Default MPI Packages ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Default MPI Packages"
    echo -ne "\n\n### Install SunHPC Cluster Verification Tools ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Cluster Verification Tools"
    echo -ne "\n\n### Install SunHPC Management Tools ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Management Tools"
    echo -ne "\n\n### Install SunHPC Provisioning Tools ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Provisioning Tools"
    echo -ne "\n\n### Install SunHPC Cluster Monitoring ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Cluster Monitoring"
    echo -ne "\n\n### Install SunHPC Slurm Scheduler ###\n\n"
    /usr/bin/yum install -y munge slurm slurm-munge
    echo -ne "\n\n### Install SunHPC Lustre Client ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Lustre Client"
    echo -ne "\n\n### Install Sun HPC Release ###\n\n"
    /usr/bin/yum install -y sunhpc-release.noarch

    cat > /etc/yum.repos.d/hpc_stack.repo << EOF
    [hpc-stack]
    name=HPC Linux Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux
    enabled=0
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/hpc_stack_lustre.repo << EOF
    [hpc-stack-lustre]
    name=HPC Linux Lustre Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux/lustre
    enabled=0
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/rhel.repo << EOF
    [rhel]
    name=RHEL media repo
    baseurl=http://192.168.5.1:1972/rhel/Server
    enabled=0
    gpgcheck=0
    EOF


    Lustre server post installation script. Note that the ssh key shown here must be replaced with the correct one; instructions follow later on.


    #!/bin/sh
    # located /root/hpc_linux_lustre_server
    # install HPC stack components for a Lustre server utilizing the online stack repo hosted from dlc

    echo "SELINUX=disabled" > /etc/selinux/config

    cat > /etc/yum.repos.d/hpc_stack.repo << EOF
    [hpc-stack]
    name=HPC Linux Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux
    enabled=1
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/hpc_stack_lustre.repo << EOF
    [hpc-stack-lustre]
    name=HPC Linux Lustre Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux/lustre
    enabled=1
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/rhel.repo << EOF
    [rhel]
    name=RHEL media repo
    baseurl=http://192.168.5.1:1972/rhel/Server
    enabled=1
    gpgcheck=0
    EOF

    # remove a few things
    rpm -e openib

    # add hpc stack items for lustre server

    echo -ne "\n\n### Install sg3 utils ###\n\n"
    /usr/bin/yum install -y sg3_utils
    /usr/bin/yum install -y sg3_utils-libs
    echo -ne "\n\n### Install gmond utils ###\n\n"
    /usr/bin/yum install -y ganglia-gmond
    echo -ne "\n\n### Install e2fsprogs-libs utils ###\n\n"
    /usr/bin/yum install -y e2fsprogs-libs
    echo -ne "\n\n### Install OFED packages ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC OFED Infiniband Packages"
    echo -ne "\n\n### Install Lustre Server packages ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Lustre Server"
    /usr/bin/yum groupinstall -y "SunHPC Management Tools"

    # configure ssh keys, must be done after headnode has been built and keys created
    mkdir -p /root/.ssh
    cat > /root/.ssh/authorized_keys << EOF
    ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAgEA8X9iZ8pLuLovQis9iYvOov3pfmu1ID5J22IUqglajiSIl3fVeoLfF+TJWZ6UU7G9OWZsQ/4e0FycWgTbzTvba2ALePPckuqEXgR8Fp15g+azHV3m5tdy3bmorfPSAKCCOnds2ktZndqIoarR5fISKyaY5AAc9fgKMlrWTZgqhpvwrhlDFJFCxnp3ixgDd8S/2NmJQ/RQ4JyCocvo0WSZYPwnIgNBl715d0osHaM9YIW9bP95zwHCF7/kDo25WLdycpMlI3YBi/bBXuEkWlrug/zKGLmXLKE/YE8e/3PLszogGkwa/q5vt8l2hsc3BSQJrCgoot9m0FWzRktC6aAx0MEqKPBKZSvto/9u2NsbEM78hw7Gtr12sFWi1OqIe1wALRkYsDNkV0tK4nK1ca9onwSEDCutjplJvH/mu5pCDo4YnkH/c40tsJwMdkjZMVeUleflxvnd+HNkz5XKifrKCgXmLZ9EcwsZ3zgHaBuWu2PK3PETnRH6Rfosy9A50HKkdQTjWVhK6I7pckoybIVzx4WYqcTONv13sFl88DZT1H1TXc1clA3/0U4bBN2jltc5JPzJukPKTTFlrTD5sAZiiMEvhZPeeXdw4ep6zlawi5AUSFcDMELjGXjERNPzAPVj0m+sDkHlTyCVZsn9YcxVtRzEsk4u8rhMW0zkBQ5k= root@headnode
    EOF

    cat > /etc/yum.repos.d/hpc_stack.repo << EOF
    [hpc-stack]
    name=HPC Linux Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux
    enabled=0
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/hpc_stack_lustre.repo << EOF
    [hpc-stack-lustre]
    name=HPC Linux Lustre Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux/lustre
    enabled=0
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/rhel.repo << EOF
    [rhel]
    name=RHEL media repo
    baseurl=http://192.168.5.1:1972/rhel/Server
    enabled=0
    gpgcheck=0
    EOF


    Compute (client) server post installation script. Note that the ssh key shown here must be replaced with the correct one.


    #!/bin/sh
    # located /root/hpc_linux_compute_server
    # install HPC stack components for client servers utilizing the online stack repo

    echo "SELINUX=disabled" > /etc/selinux/config

    cat > /etc/yum.repos.d/hpc_stack.repo << EOF
    [hpc-stack]
    name=HPC Linux Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux
    enabled=1
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/hpc_stack_lustre.repo << EOF
    [hpc-stack-lustre]
    name=HPC Linux Lustre Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux/lustre
    enabled=1
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/rhel.repo << EOF
    [rhel]
    name=RHEL media repo
    baseurl=http://192.168.5.1:1972/rhel/Server
    enabled=1
    gpgcheck=0
    EOF

    # remove a few things
    rpm -e openib

    # add hpc stack items for compute server

    echo -ne "\n\n### Install pdsh ###\n\n"
    /usr/bin/yum install -y pdsh
    /usr/bin/yum install -y pdsh-rcmd-ssh
    /usr/bin/yum install -y pdsh-mod-dshgroup
    /usr/bin/yum install -y pdsh-mod-machines
    echo -ne "\n\n### Install modules and env switcher ###\n\n"
    /usr/bin/yum install -y modules
    /usr/bin/yum install -y env-switcher
    echo -ne "\n\n### Install gmond ###\n\n"
    /usr/bin/yum install -y ganglia-gmond
    echo -ne "\n\n### Install SunHPC OFED Infiniband Packages ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC OFED Infiniband Packages"
    echo -ne "\n\n### Install SunHPC SLURM ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC SLURM"
    echo -ne "\n\n### Install SunHPC Cluster Verification Tools ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Cluster Verification Tools"
    echo -ne "\n\n### Install SunHPC Lustre Client ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC Lustre Client"
    echo -ne "\n\n### Install SunHPC OpenMPI Packages ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC OpenMPI Packages"
    echo -ne "\n\n### Install SunHPC MVAPICH Packages ###\n\n"
    /usr/bin/yum groupinstall -y "SunHPC MVAPICH Packages"
    /usr/bin/yum groupinstall -y "SunHPC Management Tools"

    # configure ssh keys, must be done after headnode has been built and keys created
    mkdir -p /root/.ssh
    cat > /root/.ssh/authorized_keys << EOF
    ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAgEA8X9iZ8pLuLovQis9iYvOov3pfmu1ID5J22IUqglajiSIl3fVeoLfF+TJWZ6UU7G9OWZsQ/4e0FycWgTbzTvba2ALePPckuqEXgR8Fp15g+azHV3m5tdy3bmorfPSAKCCOnds2ktZndqIoarR5fISKyaY5AAc9fgKMlrWTZgqhpvwrhlDFJFCxnp3ixgDd8S/2NmJQ/RQ4JyCocvo0WSZYPwnIgNBl715d0osHaM9YIW9bP95zwHCF7/kDo25WLdycpMlI3YBi/bBXuEkWlrug/zKGLmXLKE/YE8e/3PLszogGkwa/q5vt8l2hsc3BSQJrCgoot9m0FWzRktC6aAx0MEqKPBKZSvto/9u2NsbEM78hw7Gtr12sFWi1OqIe1wALRkYsDNkV0tK4nK1ca9onwSEDCutjplJvH/mu5pCDo4YnkH/c40tsJwMdkjZMVeUleflxvnd+HNkz5XKifrKCgXmLZ9EcwsZ3zgHaBuWu2PK3PETnRH6Rfosy9A50HKkdQTjWVhK6I7pckoybIVzx4WYqcTONv13sFl88DZT1H1TXc1clA3/0U4bBN2jltc5JPzJukPKTTFlrTD5sAZiiMEvhZPeeXdw4ep6zlawi5AUSFcDMELjGXjERNPzAPVj0m+sDkHlTyCVZsn9YcxVtRzEsk4u8rhMW0zkBQ5k= root@headnode
    EOF

    cat > /etc/yum.repos.d/hpc_stack.repo << EOF
    [hpc-stack]
    name=HPC Linux Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux
    enabled=0
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/hpc_stack_lustre.repo << EOF
    [hpc-stack-lustre]
    name=HPC Linux Lustre Repo
    baseurl=http://192.168.5.1:1972/sun_hpc_linux/lustre
    enabled=0
    gpgcheck=0
    EOF

    cat > /etc/yum.repos.d/rhel.repo << EOF
    [rhel]
    name=RHEL media repo
    baseurl=http://192.168.5.1:1972/rhel/Server
    enabled=0
    gpgcheck=0
    EOF
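All three post installation scripts repeat the same repo stanzas twice, once enabled and once disabled. If you prefer less copy and paste, the stanzas could be factored into a small helper along these lines. This is only a sketch: write_repo, set_repos, REPO_DIR and REPO_HOST are names invented for the example, and the host, port and paths are the ones used in the scripts above.

```shell
#!/bin/sh
# Hypothetical helpers for the post installation scripts.
REPO_DIR="${REPO_DIR:-/etc/yum.repos.d}"
REPO_HOST="${REPO_HOST:-192.168.5.1:1972}"

# write_repo FILE ID NAME PATH ENABLED
# Writes $REPO_DIR/FILE.repo pointing at http://$REPO_HOST/PATH.
write_repo() {
    cat > "${REPO_DIR}/$1.repo" << EOF
[$2]
name=$3
baseurl=http://${REPO_HOST}/$4
enabled=$5
gpgcheck=0
EOF
}

# set_repos 1|0: write all three repo files, enabled (1) or disabled (0).
set_repos() {
    write_repo hpc_stack hpc-stack "HPC Linux Repo" sun_hpc_linux "$1"
    write_repo hpc_stack_lustre hpc-stack-lustre "HPC Linux Lustre Repo" sun_hpc_linux/lustre "$1"
    write_repo rhel rhel "RHEL media repo" rhel/Server "$1"
}

# In a post script: call "set_repos 1" before the yum commands
# and "set_repos 0" at the very end.
```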


    Create provisioning profiles for xVMOC


    Create the provisioning profiles for xVM OC for each type of system that will be provisioned by xVM OC. In this case we are creating profiles for Sun HPC SW Head, Lustre, and Compute servers.


    Create HPC stack headnode profile


    The head node needs to be created and provisioned before the other types of clients, because the public id_rsa key created on it has to be included on the other clients.


    # /opt/sun/n1gc/bin/n1sh "create osprofile hpc-headnode os rhel53 rootpassword='changeme' description='HPC Stack Headnode'"
    # /opt/sun/n1gc/bin/n1sh "set osprofile hpc-headnode language=en_US timezone=America/Denver"
    # /opt/sun/n1gc/bin/n1sh "set osprofile hpc-headnode clearmbr=true existingpartition=all initdisklabel=true md5=true rebootafterinstall=true shadowpassword=true"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-headnode distributiongroup Base"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-headnode distributiongroup Core"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-headnode partition / device=hda type=ext3 sizeoption=free"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-headnode partition swap device=hda type=swap sizeoption=fixed size=2048"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-headnode script /mnt/iso/xvmoc-kspost/hpc_linux_head_server type=post"

    Provision hpc-head node


    Nodes can be provisioned via the command line or the BUI. The command line method seems incomplete compared to the BUI method, so the BUI method is recommended at this point. Refer to the xVM OC manual for instructions.


    This is an example of what the command line might look like:


    # n1sh load server headnode osprofile hpc-headnode networktype=static bootip=192.168.5.20 ip=192.168.5.20 bootnetworkdevice=eth0 networkdevice=eth0 installprotocol=http

    If during discovery xVM OC recognizes the discovered clients, xVM OC will automatically power cycle and netboot the headnode system. If the system is not recognized, a manual boot process will be needed. Refer to the xVM OC documentation.


    Create ssh keys from the headnode


    When the HPC head node is up, create ssh keys. See the Sun HPC SW documentation section "Setting up ssh keys".


    # ssh-keygen -t rsa -b 4096
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa):
    Created directory '/root/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    60:65:a3:53:93:ae:04:35:a0:4b:aa:a0:ec:13:41:b0 root@headnode

    Copy the contents of id_rsa.pub into each post install script for the compute clients and Lustre servers. The sample post install scripts were provided previously; the key goes in the section of the Lustre and Compute post installation scripts that starts with # configure ssh keys.
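Pasting a 4096-bit key by hand is easy to get wrong, since the whole key has to stay on a single line. As an alternative, the copy step could be scripted roughly as follows. This is a sketch only: embed_pubkey is a made-up helper, and it assumes the post script contains the authorized_keys here-document exactly as in the samples above (a "cat > /root/.ssh/authorized_keys << EOF" line followed by the key and a closing EOF).

```shell
#!/bin/sh
# Hypothetical helper: replace the key inside the authorized_keys
# here-document of a post installation script with the key from PUBKEY_FILE.
# Usage: embed_pubkey SCRIPT PUBKEY_FILE
embed_pubkey() {
    script="$1"; pub="$2"
    awk -v keyfile="$pub" '
        /authorized_keys << EOF/ {
            print                                   # keep the "cat > ... << EOF" line
            while ((getline line < keyfile) > 0) print line
            close(keyfile)
            skip = 1                                # drop the old key lines
            next
        }
        skip && /^[ \t]*EOF$/ { skip = 0 }          # closing EOF ends the skipped part
        !skip { print }
    ' "$script" > "${script}.new" &&
    cat "${script}.new" > "$script" &&              # keeps the original file permissions
    rm -f "${script}.new"
}

# Example, run wherever the post scripts and a copy of the head node's
# public key are available:
#   embed_pubkey /mnt/iso/xvmoc-kspost/hpc_linux_lustre_server /tmp/id_rsa.pub
```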


    Create HPC stack lustre-server profile


    After the ssh key has been copied into the HPC SW Lustre and Compute server post installation scripts, the xVM OC provisioning profiles can be created for these systems:


    # /opt/sun/n1gc/bin/n1sh "create osprofile hpc-lustre-serverb os rhel53 rootpassword='changeme' description='HPC Stack Lustre Server'"
    # /opt/sun/n1gc/bin/n1sh "set osprofile hpc-lustre-serverb language=en_US timezone=America/Denver"
    # /opt/sun/n1gc/bin/n1sh "set osprofile hpc-lustre-serverb clearmbr=true existingpartition=all initdisklabel=true md5=true rebootafterinstall=true shadowpassword=true"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-lustre-serverb distributiongroup Base"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-lustre-serverb distributiongroup Core"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-lustre-serverb partition / device=hda type=ext3 sizeoption=free"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-lustre-serverb partition swap device=hda type=swap sizeoption=fixed size=2048"

    *** First, don't forget to modify the ssh key for the authorized hosts in the kspost file before the next command.

    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-lustre-serverb script /mnt/iso/xvmoc-kspost/hpc_linux_lustre_server type=post"

    Create HPC stack lustre-client profile


    # /opt/sun/n1gc/bin/n1sh "create osprofile hpc-compute-server os rhel53 rootpassword='changeme' description='HPC Stack Compute Server'"
    # /opt/sun/n1gc/bin/n1sh "set osprofile hpc-compute-server language=en_US timezone=America/Denver"
    # /opt/sun/n1gc/bin/n1sh "set osprofile hpc-compute-server clearmbr=true existingpartition=all initdisklabel=true md5=true rebootafterinstall=true shadowpassword=true"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-compute-server distributiongroup Base"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-compute-server distributiongroup Core"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-compute-server partition / device=hda type=ext3 sizeoption=free"
    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-compute-server partition swap device=hda type=swap sizeoption=fixed size=2048"

    *** First, don't forget to modify the ssh key for the authorized hosts in the kspost file before the next command.

    # /opt/sun/n1gc/bin/n1sh "add osprofile hpc-compute-server script /mnt/iso/xvmoc-kspost/hpc_linux_compute_server type=post"
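The three profiles differ only in name, description and post script, so the command sequence could be wrapped in a function. The sketch below reuses exactly the n1sh subcommands shown above; make_profile and the N1SH variable are inventions for this example, and setting N1SH=echo gives a dry run that only prints what would be sent to n1sh.

```shell
#!/bin/sh
# Hypothetical wrapper around the n1sh profile commands from this post.
# Set N1SH=echo for a dry run.
N1SH="${N1SH:-/opt/sun/n1gc/bin/n1sh}"

# make_profile NAME DESCRIPTION POST_SCRIPT
make_profile() {
    name="$1"; desc="$2"; script="$3"
    "$N1SH" "create osprofile $name os rhel53 rootpassword='changeme' description='$desc'"
    "$N1SH" "set osprofile $name language=en_US timezone=America/Denver"
    "$N1SH" "set osprofile $name clearmbr=true existingpartition=all initdisklabel=true md5=true rebootafterinstall=true shadowpassword=true"
    "$N1SH" "add osprofile $name distributiongroup Base"
    "$N1SH" "add osprofile $name distributiongroup Core"
    "$N1SH" "add osprofile $name partition / device=hda type=ext3 sizeoption=free"
    "$N1SH" "add osprofile $name partition swap device=hda type=swap sizeoption=fixed size=2048"
    "$N1SH" "add osprofile $name script $script type=post"
}

# Example:
#   make_profile hpc-compute-server "HPC Stack Compute Server" \
#       /mnt/iso/xvmoc-kspost/hpc_linux_compute_server
```

Remember that the post script is attached in the last step, so the ssh key in it should already be updated before the function is called for the Lustre and Compute profiles.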

    Provision the rest of the cluster nodes


    Use the xVM OC profiles just created. Again, refer to the xVM OC manual for guidance.


    Configure the HPC SW head node


    Now that the head node for the HPC software stack is up and running, its basic management services need to be configured. First, cfengine needs to be configured. The following cfengine configuration is based on the script provided by the HPC SW stack:

    NOTE: Since provisioning is being handled by xVM OC, do not run sunhpc_setup from the HPC SW. Cobbler and oneSIS should not be configured.


    Configure cfengine. The site-specific values, such as MYDOMAIN, HOST_IP and CFNETWORK at the top of the script, need to be adjusted for the current setup:


    #!/bin/bash
    # modified from /usr/sbin/setup_cfengine

    MYDOMAIN="sunhpc"
    HOST_IP=192.168.5.20
    CFNETWORK=192.168.5.20/24
    LOGFILE=/var/tmp/setup_cfengine.log

    # check for domainname in /etc/hosts and add one if needed

    HOSTENTRIES=$(grep $HOST_IP /etc/hosts)
    #DOMAINLIST=$(for i in $(echo $HOSTENTRIES | sed -e "s/^[0-9\. ]* //") ; { echo $i | grep "\." | sed "s/[^\.]*\.//" ; }| head -1)
    if [[ "$HOSTENTRIES" = "" ]] ; then
        HOSTNAME=$(hostname -s)
        NEWHOSTENTRIES="$HOST_IP $HOSTNAME.$MYDOMAIN"
        cp /etc/hosts /etc/hosts.setup_cfengine
        echo $NEWHOSTENTRIES >>/etc/hosts
    else
        if [[ "$HOSTENTRIES" != *$MYDOMAIN* ]] ; then
            HOSTNAME=$(echo $HOSTENTRIES | sed -e "s/.* \([^\.]*\) .*/\1/")
            NEWHOSTENTRIES="$HOSTENTRIES $HOSTNAME.$MYDOMAIN"
            cp /etc/hosts /etc/hosts.setup_cfengine
            cat /etc/hosts.setup_cfengine | sed -e "s/$HOSTENTRIES/$NEWHOSTENTRIES/" >/etc/hosts
        fi
    fi

    echo " done"
    echo -n "fix cfengine settings in gtdb... "
    gtt settings --edit --service cfengine --component policyhost --value $HOST_IP >$LOGFILE 2>&1
    gtt settings --edit --service cfengine --component domain --value $MYDOMAIN >>$LOGFILE 2>&1
    gtt settings --edit --service cfengine --component cfnetwork --value $CFNETWORK >>$LOGFILE 2>&1
    gtt settings --edit --service slurm --component ControlAddr --value $HOST_IP >>$LOGFILE 2>&1
    echo " done"
    echo -n "update configuration files... "

    gtt config --update all >>$LOGFILE 2>&1

    # setup cfservd on head node

    echo " done"
    echo -n "set up cfengine... "
    if [ -L /var/cfengine/bin/cfagent ] ; then
        rm -f /var/cfengine/bin/cfagent
        cp -f /usr/sbin/cfagent /var/cfengine/bin/cfagent
    fi
    # copy initial cfengine config files
    cp -f /var/lib/sunhpc/cfengine/var/cfengine/inputs/* /var/cfengine/inputs/
    cp -f /var/lib/sunhpc/cfengine/var/cfengine/inputs/* /var/cfengine/masterfiles/inputs/
    mkdir -p /var/lib/sunhpc/cfengine/etc/munge
    cp -f /etc/localtime /var/lib/sunhpc/cfengine/etc/
    cp -f /etc/munge/munge.key /var/lib/sunhpc/cfengine/etc/munge/
    cp -f /var/cfengine/ppkeys/localhost.pub /var/cfengine/ppkeys/root-${HOST_IP}.pub
    cat /root/.ssh/*.pub >>/root/.ssh/authorized_keys
    chkconfig --add cfservd >>$LOGFILE 2>&1
    /etc/init.d/cfservd restart >>$LOGFILE 2>&1
    cfagent -q -v update-only >>$LOGFILE 2>&1
    cfagent -qv >>$LOGFILE 2>&1

    echo " done"
    echo "complete logs are in $LOGFILE"


    Add xVM OC provisioned clients to HPC SW server head node.


    Refer to the documentation for the HPC SW stack (section 3C). Note that since hardware management is being done by xVM OC, some items are not necessary. It is up to the admin to decide which tools to use.


    Example:


    # gtt host --add --name lustre1 --network "hwaddr=08:00:27:B2:30:AD,ipaddr=192.168.5.30,device=eth0" --attribute "mds"
    Host added successfully: lustre1
    Network added successfully: eth0
    Attribute added successfully to lustre1: mds
    # gtt host --add --name lustre2 --network "hwaddr=08:00:27:2A:8B:0E,ipaddr=192.168.5.31,device=eth0" --attribute "mds"
    Host added successfully: lustre2
    Network added successfully: eth0
    Attribute added successfully to lustre2: mds
    # gtt host --add --name client1 --network "hwaddr=08:00:27:C8:27:9F,ipaddr=192.168.5.40,device=eth0" --attribute "compute"
    Host added successfully: client1
    Network added successfully: eth0
    Attribute added successfully to client1: compute
    # gtt host --add --name client2 --network "hwaddr=08:00:27:B9:9F:2C,ipaddr=192.168.5.41,device=eth0" --attribute "compute"
    Host added successfully: client2
    Network added successfully: eth0
    Attribute added successfully to client2: compute
    # gtt host --remove --name lustre2
    Host removed successfully: lustre2
    # gtt host --add --name lustre2 --network "hwaddr=08:00:27:2A:8B:0E,ipaddr=192.168.5.31,device=eth0" --attribute "oss"
    Host added successfully: lustre2
    Network added successfully: eth0
    Attribute added successfully to lustre2: oss

    # gtt config --update all
    # cfagent -q


    Now the hosts can be managed by the HPC SW head node via ssh. Run cfagent -q on all the client servers once they are up and running. This can be done via various means. See the documentation for the HPC SW stack.
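With more than a handful of nodes, typing each gtt host --add by hand gets tedious. Here is a rough sketch of how the same command could be driven from a list. The add_hosts function, the GTT variable and the input format (name, MAC address, IP address and attribute per line) are assumptions made for this example; only the gtt options shown in the example above are used, and device=eth0 is hardcoded as in that example.

```shell
#!/bin/sh
# Hypothetical helper: read "name mac ip attribute" lines from stdin and
# run one "gtt host --add" per line. Set GTT=echo for a dry run.
GTT="${GTT:-gtt}"

add_hosts() {
    while read -r name mac ip attr; do
        # Skip blank lines and comment lines.
        case "$name" in ''|'#'*) continue ;; esac
        "$GTT" host --add --name "$name" \
            --network "hwaddr=$mac,ipaddr=$ip,device=eth0" \
            --attribute "$attr" < /dev/null
    done
}

# Example, where hosts.txt holds lines such as
#   lustre1 08:00:27:B2:30:AD 192.168.5.30 mds
#   client1 08:00:27:C8:27:9F 192.168.5.40 compute
#
#   add_hosts < hosts.txt
```

Afterwards, update the configuration and run cfagent as shown above.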

    Comments:

    This rocks! Thanks for putting this together, Mike!

    Posted by Steve Wilson on June 25, 2009 at 11:52 PM MDT #
