Captain Jack

Wednesday, September 23, 2009

Hacking AI installation process to install OpenSolaris to iSCSI device

First of all, what follows is a hack rather than the official OpenSolaris solution for iSCSI boot devices. The final solution will come from the caiman project rather than from here; AFAIK caiman is actively working on this support and a draft plan may be available soon.


Currently OpenSolaris can't be installed to an iSCSI boot device from the liveCD. The major issue is that the iSCSI initiator module is not included in the liveCD, which limits access to the iSCSI target.


However, by customizing (in other words, hacking) the AI process, it is not that difficult to try out iSCSI boot with OpenSolaris.


Requirements:


For x86, the OpenSolaris build number should be 104 or later, and the machine should have at least two NICs: one to support PXE and the other to support iBFT.


For sparc, the OpenSolaris build number should be 127 or later (per the current plan), and the machine should have an updated OBP (which should be coming out soon) to support iSCSI boot.


Steps:


1. First, configure an AI server by following the AI instructions.


2. Modify the default manifest to specify the iSCSI target info by adding a harmless comment, e.g.,


<!--
        iscsi-target-name=iqn.1986-03.com.sun:02:1234567890abcdef
        iscsi-target-ip=129.158.144.200
        iscsi-lun=1
-->
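Before serving the manifest, it may be worth checking that the comment really carries all three keys, since the client-side hack fails late if one is missing. The following is only a sketch, assuming a POSIX shell on the AI server; the function names `manifest_value` and `check_iscsi_manifest` are hypothetical helpers, not part of AI.

```shell
#!/bin/sh
# manifest_value FILE KEY
# Print the value of the first "KEY=value" line found in FILE.
manifest_value() {
    sed -n "s/^[[:space:]]*$2=[[:space:]]*\([^[:space:]]*\).*/\1/p" "$1" | head -n 1
}

# check_iscsi_manifest FILE
# Return non-zero if any of the three iSCSI keys is missing or empty.
check_iscsi_manifest() {
    rc=0
    for key in iscsi-target-name iscsi-target-ip iscsi-lun; do
        if [ -z "$(manifest_value "$1" "$key")" ]; then
            echo "missing $key in $1" >&2
            rc=1
        fi
    done
    return $rc
}

# Usage (the manifest path is up to you):
#   check_iscsi_manifest /path/to/default.xml && echo "manifest looks fine"
```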


3. Configure a default 'target_device' in the manifest. This can be inserted before the <ai_pkg_repo_default_authority> section.


            <ai_target_device>
               <target_device_name>0</target_device_name>
               <target_device_install_slice_number>0</target_device_install_slice_number>
            </ai_target_device>
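The '0' in <target_device_name> is only a placeholder that the hacked auto-installer later overwrites with the real iSCSI disk name. The substitution used further down replaces exactly one character after the opening tag, so it depends on the placeholder being a single character. As a sketch of a slightly more forgiving variant (the function name `set_target_device` is hypothetical), the whole element content can be replaced instead:

```shell
#!/bin/sh
# set_target_device FILE DISK
# Replace the content of the <target_device_name> element in FILE with DISK,
# whatever the current placeholder is. Assumes the element sits on one line.
set_target_device() {
    sed "s|<target_device_name>[^<]*</target_device_name>|<target_device_name>$2</target_device_name>|" \
        "$1" > "$1.new" && mv "$1.new" "$1"
}

# Usage inside the auto-installer hack:
#   set_target_device "$AISC_MANIFEST" "$name"
```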



4. Also, don't forget to specify the iSCSI initiator package along with IDM in the manifest by adding the following items to the <ai_install_packages> section.
        <pkg name="SUNWiscsi"/>
        <pkg name="SUNWiscsidm"/>


5. Customize (hack) the auto-installer in the microroot.


For x86, assuming the AI target directory on the AI server is /export/home/ai_server, the microroot can be customized as follows.


# cd /export/home/ai_server/boot


# gzcat x86.microroot >/tmp/miniroot


# lofiadm -a /tmp/miniroot


/dev/lofi/1


# mount /dev/lofi/1 /mnt


Then open /mnt/lib/svc/method/auto-installer with your preferred editor and locate the following paragraph.


===============================================


echo "Automated Installation started" | $TEE_LOGTOCONSOLE
echo "The progress of the Automated Installation can be followed by viewing " \
     "the logfile at /tmp/install_log" | $TEE_LOGTOCONSOLE


===============================================


Note that this is a shell script executed on the client side, so here we need to add some custom commands to:


1) Establish the connection to the iSCSI target


2) Identify the iSCSI disk OS name


3) Update the manifest to include the iSCSI disk.


One way to do this is to add the following commands just below the paragraph shown above.


# ========Below will add iscsi configuration for AI client. ============

echo "begin iSCSI configuration..." | $TEE_LOGTOCONSOLE

# get target name
input=`cat $AISC_MANIFEST | grep iscsi-target-name=`
target_name=`echo $input | awk -F"=" '{print$2}' `

# get target ip address
input=`cat $AISC_MANIFEST | grep iscsi-target-ip=`
target_ip=`echo $input | awk -F"=" '{print$2}' `

# get lun number
input=`cat $AISC_MANIFEST | grep iscsi-lun=`
lun=`echo $input | awk -F"=" '{print$2}' `
lun="LUN: $lun"

echo "Destination LUN from manifest is $lun on target $target_name" | $TEE_LOGTOCONSOLE

# add the static-config and enable the discovery
/usr/sbin/iscsiadm add static-config $target_name,$target_ip
/usr/sbin/iscsiadm modify discovery -s enable

# wait here for a while
sleep 10
/usr/sbin/devfsadm -C
sleep 30

/usr/sbin/iscsiadm list target -S >/tmp/client_target.out

test=`cat /tmp/client_target.out | grep "$lun" | wc -l`
if [ $test = "0" ] ; then
        echo "can't find $lun on target $target_name" | $TEE_LOGTOCONSOLE
        exit $SMF_EXIT_ERR_FATAL
fi

# get the os device name of the LUN
i=`sed -n -e /"$lun"/= /tmp/client_target.out`
line=`expr "$i" "+" 3`
string=`sed -n -e ${line}p /tmp/client_target.out`
tmp=`echo ${string} | awk -F"/" '{print$4}' `
name=`echo $tmp | sed 's/..$//' `
echo "Get $name from local disk table for installation" | $TEE_LOGTOCONSOLE
# replace the device name in the manifest
cat $AISC_MANIFEST | sed "s+<target_device_name>.+<target_device_name>${name}+" >/tmp/ai_combined_manifest.xml.2
mv /tmp/ai_combined_manifest.xml.2 $AISC_MANIFEST

echo "iSCSI configuration completes" | $TEE_LOGTOCONSOLE

# =============end of getting iscsi configuration ==================
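The device lookup in the script above depends on the "OS Device Name" line sitting exactly three lines below the "LUN:" line, and grep "LUN: 1" would also match "LUN: 10". As a sketch only, the same lookup can be done by tracking the LUN stanza instead of counting lines. The function name `lun_to_disk` is hypothetical, and the layout of the `iscsiadm list target -S` listing is assumed to be a "LUN: n" line followed later by an "OS Device Name: /dev/rdsk/..." line. On Solaris this would likely need nawk rather than the old /usr/bin/awk, because of the -v option.

```shell
#!/bin/sh
# lun_to_disk FILE LUN
# FILE holds saved output of: iscsiadm list target -S
# Prints the disk name for the given LUN with the slice suffix removed,
# or prints nothing if the LUN is not found.
lun_to_disk() {
    awk -v lun="$2" '
        $1 == "LUN:" { in_lun = ($2 == lun) }   # entering a LUN stanza
        in_lun && /OS Device Name:/ {
            n = split($NF, parts, "/")           # /dev/rdsk/c2t1d0s2 -> c2t1d0s2
            dev = parts[n]
            sub(/s[0-9]+$/, "", dev)             # c2t1d0s2 -> c2t1d0
            print dev
            exit
        }
    ' "$1"
}

# Usage inside the auto-installer hack (lun_number is the bare number
# read from the manifest, before the "LUN: " prefix is added):
#   name=`lun_to_disk /tmp/client_target.out "$lun_number"`
```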


6. Save the script, then unmount the file system and delete the lofi device.


# umount /mnt


# lofiadm -d /dev/lofi/1


7. Repack and replace the microroot.


# gzip /tmp/miniroot


# mv /tmp/miniroot.gz /export/home/ai_server/boot/x86.microroot


Now go ahead and install the client. Good luck!



Guidance for installing Solaris Nevada CE Sparc to iSCSI device

Installation Guide For Sparc - Solaris SXCE


Basically, the install process is very similar to installing Solaris x86 onto an iSCSI disk. The biggest difference is the firmware configuration: before booting Solaris, the x86 platform relies on iBFT-capable firmware (BIOS) to communicate with the iSCSI target, while the Sparc platform relies on OBP to do almost the same thing.


Before proceeding, please make sure the system is running OBP version 4.31 or later and that the 'show-iscsi' command is available.


Prerequisites


Collect the following items before starting the installation. Some of them will be used during the installation process, and may also be part of the boot argument to the 'boot' command in OBP.


* iSCSI Target IP/Port


* Router/Gateway IP if the iSCSI Target is on a different subnet


* The Ethernet interface to be used to access the iSCSI target


* The LUN number which will be used as the root disk


Installation Process


The installation process is very similar to the x86 case described in an earlier post, for both CD/network installation and desktop/console sessions. However, a few items need to be collected for later use when booting the OS.


* Target Name


* Root slice, if it is not the default 'a'


Also, specify CHAP via 'iscsiadm' if authentication is set up on the target side. For detailed steps, please refer to Chapter 14 of System Administration Guide: Devices and File Systems.


Postinstall Configuration


To perform iSCSI boot in OBP, a special boot device argument needs to be composed, in the following format:


'net:key=value[,...]'


The following keys are used to support iSCSI boot,


iscsi-target-ip      <Required>    iSCSI target IP address
iscsi-target-name    <Required>    iSCSI target name
host-ip              <Required>    Host IP address
router-ip            <Optional>    The gateway IP address. It may not be necessary if the host and the iSCSI target are within the same subnet.
iscsi-lun            <Optional>    The logical unit number required by iSCSI boot, in a hexadecimal dash-separated format; defaults to 0. An example of the fully specified number would be 2-0-0-0, though it is usually specified as '2'.
iscsi-port           <Optional>    iSCSI target IP port, a decimal integer from 1 to 65535; defaults to 3260.
iscsi-partition      <Optional>    The bootable partition on the iSCSI target; defaults to "a".


If you have used CHAP as the authentication method, you can set the CHAP user name and password at the ok prompt as follows:

{0} ok set-ascii-security-key chap-user <your chap name>
{0} ok set-ascii-security-key chap-password <your chap secret>


Note: bidirectional authentication is not available here.


An example of the full argument would be,


net:iscsi-target-name=iqn.1986-03.com.sun:2510.600a0b800049c94d00000000493c920b,host-ip=10.13.49.129,iscsi-lun=3-0-0-0,iscsi-target-ip=10.13.49.145,router-ip=10.13.49.1


A devalias is probably preferable for such an argument, which can then be passed to the 'boot' command in OBP.
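Since the argument is long and easy to mistype at the ok prompt, it can help to compose it on another machine and paste it in. The following is a sketch only: the function name `build_iscsi_bootarg` and the shell variable names are hypothetical, and the key order simply mirrors the example above (whether OBP cares about the order is not something this sketch assumes either way).

```shell
#!/bin/sh
# build_iscsi_bootarg
# Print a net:key=value[,...] boot device argument built from these
# (hypothetical) shell variables:
#   required: ISCSI_TARGET_NAME, ISCSI_TARGET_IP, HOST_IP
#   optional: ISCSI_LUN, ROUTER_IP, ISCSI_PORT, ISCSI_PARTITION
build_iscsi_bootarg() {
    if [ -z "${ISCSI_TARGET_NAME:-}" ] || [ -z "${ISCSI_TARGET_IP:-}" ] ||
       [ -z "${HOST_IP:-}" ]; then
        echo "iscsi-target-name, iscsi-target-ip and host-ip are required" >&2
        return 1
    fi
    arg="net:iscsi-target-name=$ISCSI_TARGET_NAME,host-ip=$HOST_IP"
    # Optional keys are emitted only when set, so OBP defaults apply otherwise.
    if [ -n "${ISCSI_LUN:-}" ]; then arg="$arg,iscsi-lun=$ISCSI_LUN"; fi
    arg="$arg,iscsi-target-ip=$ISCSI_TARGET_IP"
    if [ -n "${ROUTER_IP:-}" ]; then arg="$arg,router-ip=$ROUTER_IP"; fi
    if [ -n "${ISCSI_PORT:-}" ]; then arg="$arg,iscsi-port=$ISCSI_PORT"; fi
    if [ -n "${ISCSI_PARTITION:-}" ]; then arg="$arg,iscsi-partition=$ISCSI_PARTITION"; fi
    echo "$arg"
}

# Usage:
#   ISCSI_TARGET_NAME=iqn.1986-03.com.sun:02:1234567890abcdef \
#   ISCSI_TARGET_IP=10.13.49.145 HOST_IP=10.13.49.129 build_iscsi_bootarg
```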



Friday, November 21, 2008

iSCSI boot x86

iSCSI Boot for x86 Systems


The iSCSI boot feature initializes an operating system from a remote
location, such as a storage disk array, over the network. iSCSI boot is
typically loaded onto an initiator, or diskless client, while the hard
disk resides on a target attached to the network. Because iSCSI boot
uses standard Ethernet-based infrastructure, data, storage, and
networking traffic can be consolidated on a standard server's
networking system.


Remote booting over a storage area network provides the following advantages:



  • Server consolidation and virtualization reduces equipment costs.
    For example, diskless servers that can boot from an OS image over the
    network are important for rack-mounted servers, or blade servers in
    high-density clusters.

  • Simplified and centralized management reduces management costs.
    For example, provisioning new servers and managing and maintaining
    existing servers is simplified when installations, upgrades, and fixes
    are performed from a central location.

  • Diversified datacenter locations reduce the risk of data loss in the case of a disaster.
    For example, you can strategically separate mirrored databases, because
    iSCSI boot utilizes a standard Ethernet-based infrastructure. This
    provides protection from regional disasters, such as earthquakes,
    hurricanes, and tornados.

  • Improved availability. For example, recovery from a server failure is simplified when the spare server is booted and provisioned over the network.


Using iSCSI boot from an x86 system is different from booting an x86 system over the network using GRUB:



  • A GRUB based network boot requires a DHCP server that is
    configured for PXE clients. This is not necessary for iSCSI boot,
    however using a DHCP server with iSCSI boot is an option.

  • PXE requires a boot server to provide the miniroot/ramdisk. This is not necessary for iSCSI boot.



PSARC Cases


PSARC 2008/427 iSCSI Boot



This project is to enable Solaris to boot off iSCSI luns via regular network adapters. Different approaches, iBFT/OBP, are adopted to implement this feature on x86/sparc platforms. This case supersedes PSARC/2007/450, iSCSI Software boot.

On the x86 platform, iSCSI boot depends on the NIC's firmware to implement its own iSCSI initiator and to support iBFT to pass boot info to the OS. That means the solution on x86 needs dedicated hardware/firmware. Currently Intel 1G/10G Pro series NICs support this feature, along with Broadcom in their high-end NICs.

On sparc platform, iSCSI boot depends on OBP to implement its own iSCSI stack to connect to the iSCSI target, load boot archive, and pass the boot info to Solaris OS via standard OBP properties. A suite of standard properties need to be defined in OBP.

iSCSI disk will still be incapable of being a dump device with this project.


Boot process on x86 (to support iBFT)



  1. Host is powered up/reset and the iBF (iSCSI Boot Firmware) is loaded

  2. iBF initializes and connects to the iSCSI target, presents iSCSI disk to BIOS

  3. BIOS uses INT 13 to load MBR and OS boot sectors from the iSCSI disk

  4. OS boot loader (grub) takes over the control from BIOS

  5. Grub loads Solaris kernel/ramdisk

  6. Grub transfers control to Solaris kernel

  7. Kernel scans iBFT, configures the boot NIC, TCP/IP and iSCSI initiator to enumerate the boot disk, and then mounts the rootfs

  8. Kernel loads the rest of the drivers/conf files, as when booting from a local disk


Boot process on sparc



  1. Kernel obtains the iSCSI boot information from OBP; the other steps are the same as in the x86 case


PSARC 2008/640 iSCSI With DHCP


Hacking dhcp agent to make it aware of the activity in iSCSI


iBFT


iSCSI Boot Firmware Table
iBFT is a method of communicating boot parameters.
A system's BIOS based iSCSI boot implementation fills in parameters.
This may also be done by a network loaded iSCSI boot loader, such as etherboot.
The operating system consumes the parameters.


So far, according to publicly available information, Intel and Broadcom implement iBFT, and only on selected NICs.
Nvidia was working on support, but stopped, partly due to lack of demand.
Linux and Windows support boot from iBFT NICs.


Supported Broadcom NICs: BCM5721, BCM5755, BCM5755M, BCM5754,
BCM5754M, BCM5714, BCM5714S, BCM5715, BCM5715S, BCM5780, BCM5780S,
BCM5756, BCM5722


Supported Intel NICs


Implementation in Solaris would be relatively simple. The change
would be to read the iBFT table at boot in the iSCSI driver.
There are probably issues with ensuring that networking is up
sufficiently for Solaris to mount the root device, which we'd have to
solve.


Tuesday, August 21, 2007

Jupiter Bus

Jupiter Bus is not actually a 'bus'; it is an internal network consisting of high-speed switches and crossbars that connect the CPU, MAC and IOC (host bridge).

It treats I/O as transactions and supports error detection and a retry mechanism (it behaves like the network it is). It is used on Sun's M-series Sparc Enterprise systems, where it is usually referred to as the 'jupiter bus' anyway.

Wednesday, August 15, 2007

Which distribution of Solaris did you install?

Or, in other words, which 'Solaris cluster'? Here is the command:

 $ cat /var/sadm/system/admin/CLUSTER


The CLUSTER file tells you which distribution was selected when the OS was installed. The possible values are as follows:

SUNWCreq: core
SUNWCuser: end user distribution
SUNWCprog: developer distribution
SUNWCall: entire distribution
SUNWCXall: entire distribution plus OEM support

For details, refer to the article from bigadm.

The distribution affects the upgrade/recovery procedure. This time, I can't upgrade my
existing snv56, installed as SUNWCall, to snv70, since the new GUI installer accepts SUNWCXall only...
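To check this on several machines before attempting an upgrade, the lookup can be wrapped in a small function. This is a sketch only: the function name `describe_cluster` is hypothetical, and it assumes the CLUSTER file holds a single line of the form CLUSTER=SUNWCall.

```shell
#!/bin/sh
# describe_cluster [FILE]
# Print the installed software cluster and its description.
# FILE defaults to /var/sadm/system/admin/CLUSTER and is assumed to
# contain a line such as: CLUSTER=SUNWCall
describe_cluster() {
    file="${1:-/var/sadm/system/admin/CLUSTER}"
    name=$(sed -n 's/^CLUSTER=//p' "$file" | head -n 1)
    case "$name" in
        SUNWCreq)  desc="core" ;;
        SUNWCuser) desc="end user distribution" ;;
        SUNWCprog) desc="developer distribution" ;;
        SUNWCall)  desc="entire distribution" ;;
        SUNWCXall) desc="entire distribution plus OEM support" ;;
        *)         desc="unknown" ;;
    esac
    echo "$name: $desc"
}

# Usage:
#   describe_cluster
```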


Monday, August 06, 2007

Sun Device Detection Tool 1.2 is released


Tuesday, May 29, 2007

Duplicated Entry from 'fcinfo'

Duplicate entries can appear if multiple HBA libraries exist in the system. The following is an example:

There is one Emulex FC port in the system, and it is shown twice by 'fcinfo'.
--------
-bash-3.00# fcinfo hba-port
HBA Port WWN: 10000000c94abfc8
OS Device Name:
Manufacturer: Sun Microsystems, Inc.
Model: LP10000-S
Firmware Version: 1.91x15
FCode/BIOS Version: 1.50a4
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 20000000c94abfc8

HBA Port WWN: 10000000c94abfc8
OS Device Name: /dev/cfg/c6
Manufacturer: Sun Microsystems, Inc.
Model: LP10000-S
Firmware Version: 1.91x15
FCode/BIOS Version: 1.50a4
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 20000000c94abfc8

The first output is from Emulex's HBA lib and the second is from Sun's.

-bash-3.00# cat /etc/hba.conf
#
# This file contains names and references to HBA libraries
#
# Format:
#
#
#
# The library name should be prepended with the domain of
# the manufacturer or driver author.
com.sun.fchba /usr/lib/libsun_fc.so.1
com.sun.fchba64 /usr/lib/64/libsun_fc.so.1
com.emulex.emulexapilibrary /usr/lib/libemulexhbaapi.so
com.emulex.emulexapilibrary /usr/lib/sparcv9/libemulexhbaapi.so
-------------------------------

The problem is the library from EMLXemlxu, the package 'Emulex LightPulse Fibre Channel Adapter Utilities (usr)'. It shouldn't enumerate an HBA branded by Sun, even though the HBA is OEMed from Emulex.
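A quick way to spot this situation is to look for port WWNs that 'fcinfo hba-port' reports more than once. The following is a sketch; the function name `duplicate_hba_ports` is hypothetical, and it only assumes that each port stanza starts with an "HBA Port WWN:" line as in the output above.

```shell
#!/bin/sh
# duplicate_hba_ports
# Read `fcinfo hba-port` output on stdin and print every HBA Port WWN
# that appears more than once (each duplicate WWN is printed one time).
duplicate_hba_ports() {
    awk '/^[ \t]*HBA Port WWN:/ { if (++seen[$NF] == 2) print $NF }'
}

# Usage:
#   fcinfo hba-port | duplicate_hba_ports
```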
Monday, April 23, 2007

Started to work with Xen

1. Currently there is a bug, so I can't specify a vif for a domU.
2. A user can have multiple consoles open at the same time via xm console; I'm not sure if this is intended.
3. Found a bug: if there is no entry for 'vdb' in /etc/name_to_major, dom0 will panic a while after 'xm create -c .....'. Max is fixing it.
4. After the first installation, the domU would reboot and the installation would start over. This is harmless but annoying.
5. Hit 6300863 while the domU was starting up.

---------------------------
root@dmgdom0xen:/$ xm console 11
SunOS Release 5.11 Version matrix-build-2007-04-19 64-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: dmgdomu0xen
/dev/rdsk/c0d0s6 is clean
/dev/rdsk/c0d0s3 is clean
/dev/rdsk/c0d0s4 is clean
t_optmgmt: System error: Cannot assign requested address
Starting Solaris Install Launcher in Command Line Mode.
Apr 23 01:43:49 dmgdomu0xen sendmail[486]: My unqualified host name (localhost) unknown; sleeping for retry
Apr 23 01:43:49 dmgdomu0xen sendmail[485]: My unqualified host name (localhost) unknown; sleeping for retry
Exiting launcher. File find_device.out does not exist.
The Solaris Install Launcher has terminated unexpectedly.
Press the Return key and a system reboot will take place on your machine.
Apr 23 01:44:49 dmgdomu0xen sendmail[486]: unable to qualify my own domain name (localhost) -- using short name
Apr 23 01:44:49 dmgdomu0xen sendmail[485]: unable to qualify my own domain name (localhost) -- using short name
----------------------------------

This seems to be an unsolved issue, and a workaround is to add -text in grub. However, how do I specify kernel arguments for a domU? Hm...