Sun Blog Brad Beadles

Thursday Jul 12, 2007

Knowing that I might be a little rusty after being away from the day-to-day work of systems administration, I decided to set up a T2000 with LDOMs.  So how rusty am I?  Well, rustier than I thought.  It reminded me of taking a long break from working out, then waking up one day and deciding to run a mile or two and hit the weights.  At first it isn't so bad, until you realize that things aren't going the way they did when you were doing this regularly.  It was still fun.  And at the end, you flex and say "Oh yeah, I've still got it," as if you've just defeated some worldly challenge.  Either way, you do feel good about accomplishing something.  I had almost forgotten this triumphant feeling.  Wow, it felt good.  Just ask my wife: I came downstairs from the office at 1:30am boasting about defeating this T2000 and LDOMs.

Let's take a look at this not-so-big challenge (if you're not rusty).  First, we'll start with an overview of the things you need to do, and then talk about the learning experiences that came from each of the major tasks in getting LDOMs up and running on the T2000.

Overview of Tasks:
  • Load Solaris 10 (11/06) on the T2000
  • Add the required OS Patches & Packages
  • Create a zpool & zfs filesystems for guest domain disk image files, etc.
  • Update the T2000 System Controller Firmware
  • Set Up the Control Domain
  • Create a Guest Domain
  • Jumpstart the Guest Domain
  • Sing "We Are the Champions"
Load Solaris 10 (11/06) on the T2000:

    Stuff you will need:
  • Here's where you can get LDOM Documentation.  Please read both docs first.
  • Here's where you can get the Beginner's Guide to LDOMS.  A good first read.
  • Solaris 10 (11/06) media or Jumpstart server with S10 (11/06).
  • LDOM Packages LDOMS_Manager-1_0-RR.zip.
  • Patches:
    • Solaris Kernel Patch 118833-36
    • LDOM Patch 124921-02
    • Solaris Patch 125043-01
    • System Firmware Patch - I used 126399-01 for a T2000
    • I also recommend the 10_Recommended Patch bundle, which will take care of your Solaris patches.  Double-check with showrev -p.
  • Here's where you can download LDOM 1.0
  • Here's where you can download Patches
  • Here's a good starting point for LDOM Reference materials.
I loaded Solaris 10 on the T2000 using a DVD.  I needed to get to the OK prompt so that I could boot from the DVD drive.  No problem: I hooked up the T2000 to port 6 of an already installed terminal concentrator.  Then I telnet'd to the IP address of the terminal concentrator's specific port and logged into the system controller port of the T2000.  I decided this was the time to go ahead and set up the system console to work over the network.

bb@hippo:~ >telnet 5006

Please login: admin
Please Enter password: admin

sc> setupsc
  NOTE:  Here I answered the prompts to enable networking, gave the sc an IP address and a gateway address, and changed the prompt.
sc> resetsc  
NOTE:  You need to reset the system controller for the changes to take effect.

arakeen-sc>showplatform 
NOTE:  Displays platform details and status

arakeen-sc>showhost 
NOTE: Shows flash firmware versions
Host flash versions:
   Hypervisor 1.4.1 2007/04/02 16:37
   OBP 4.26.1 2007/04/02 16:26
   POST 4.26.0 2007/03/26 16:45

arakeen-sc>showfaults 
NOTE: I did this because I had fault lights on the system's front panel.

arakeen-sc>clearfault <UUID>
NOTE:  I cleared the fault after plugging in the second power supply.  UUID is the ID of the failed component.

arakeen-sc>clearasrdb 
NOTE:  Clears the blacklisted ASR db (automatic system recovery)

arakeen-sc>  
NOTE: Hit control key and right bracket to get out of the telnet session to the terminal concentrator.

bb@hippo:~ >ssh arakeen-sc  NOTE: Decided to use the network port to get to the console via the system controller - less choppy.

arakeen-sc>console -f  NOTE: This gets you to the OK prompt if the host is not booted (console for the T2000).  -f forces write mode.

ok> boot cdrom  NOTE:  Boot from cdrom (okay it really is a DVD drive but cdrom is the alias to point to the DVD device) so that you can load the OS

The T2000 boots from the DVD and automatically starts the installation process.  Answer the questions in the text-based installer.  Note: I had to use Esc-2 for the F2 key due to my terminal emulation setup.

Once installed, I logged into the T2000 and took a look around.  Everything looked good, so off to the next step.

Add the required OS Patches & Packages:

Once the base OS was loaded, I decided to create a zpool and a zfs filesystem to store my downloads of the Patches and Packages.  I also used a zfs filesystem for my jumpstart server in the control domain, and a separate zfs filesystem for each of my LDOMs.  Separate zfs filesystems let me zfs snapshot and zfs clone a Gold LDOM disk image file, so I can easily create duplicate LDOMs without having to jumpstart.  All I have to do is a sys-unconfig for the cloned LDOM.  Now that is really, really cool!

Create a zpool & zfs filesystems for guest domain disk image files, etc.:

Here's what I did on the control domain arakeen:

root@arakeen:/> zpool create tank c1t0d0s4

root@arakeen:/> zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
tank                   44.2G   9.15G   35.1G    20%  ONLINE     -


root@arakeen:/>zfs create tank/Downloads
root@arakeen:/>zfs create tank/jumpstart
root@arakeen:/>zfs create tank/LDOMS
root@arakeen:/>zfs create tank/LDOMS/ldg1

root@arakeen:/>zfs list
NAME                   USED  AVAIL  REFER  MOUNTPOINT
tank                      9.15G  34.4G  28.5K  /tank
tank/Downloads         275M  34.4G   275M  /tank/Downloads
tank/LDOMS            5.00G  34.4G  25.5K  /tank/LDOMS
tank/LDOMS/ldg1       5.00G  34.4G  5.00G  /tank/LDOMS/ldg1
tank/jumpstart        3.88G  34.4G  3.88G  /tank/jumpstart

Next I downloaded the above Patches and Packages into my /tank/Downloads directory and unzipped them.  The plan was to patch in single-user mode, so, following my old ways, I shut down and booted into single-user mode.

root@arakeen:/> shutdown -i0 -g0 -y
ok boot -s

All was good: I entered the root password and was in single-user mode.  So I went to cd into /tank/Downloads.  OOPS, nothing there!  What in the world?  That's right, zfs filesystems don't get mounted in single-user mode.  So I hit Ctrl-D to go back to multi-user mode and copied the zip files over to /var/tmp.  I then installed all the patches.  NOTE:  I could probably have done a zfs mount tank/Downloads and been fine.  Call me RUSTY.
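In hindsight, mounting the dataset by hand from single-user mode would probably have been enough.  Here is a minimal sketch of that, assuming the pool and dataset names from above.  The function name mount_downloads is my own, and the command -v guard is only there so the snippet fails cleanly on a machine without ZFS:

```shell
#!/bin/sh
# Sketch: make a ZFS dataset available by hand in single-user mode.
# Assumption: dataset "tank/Downloads" exists as created earlier in this post.
mount_downloads() {
    dataset=${1:-tank/Downloads}
    if ! command -v zfs >/dev/null 2>&1; then
        echo "zfs command not found; nothing to mount" >&2
        return 1
    fi
    # Mount just this one dataset ("zfs mount -a" would mount all of them).
    zfs mount "$dataset"
}

# Usage, from the single-user shell:
#   mount_downloads tank/Downloads
```

I did not go back and test this on the T2000, so treat it as the thing to try first rather than a confirmed fix.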

Okay, maybe this is the time to talk a little about the T2000 configuration that I'm using.  For one, I was very lucky to find one in a lab, so I'm not complaining.  The T2000 has 8 cores, 4GB of memory, and one 73GB disk.  Not a really good candidate for running LDOMs, I know; but it was available, and I learned a bunch of stuff because of the limited memory and disk resources.  I will highlight those lessons as I get to them.

Update the T2000 System Controller Firmware:

First of all, check the readme file for this patch (126399-01) for installation instructions.  Basically, there are two ways to flash-update your T2000 system controller firmware: 1) from the Solaris console, using the sysfwdownload utility, and 2) using the flashupdate command on the system controller console.  As you might have guessed, the T2000 I was using would not support the first method.  So I had to use the second, which requires an ftp server that the system controller can reach as part of the flashupdate command.  Good thing I had already configured the system controller to use the network management port.  The only thing left was to set up my Solaris 10 laptop as an ftp server.  That was easy enough: just enable ftp via "svcadm enable ftp". 

This was really a non-event.  However, please read the readmes included with the patch.

What about LiveUpgrade?

Here's where I decided (hey, wouldn't it be cool?) to have a partition (slice) reserved for LiveUpgrade.  I should have done this the first time: it is the recommended practice for patching and upgrades, and it gives you a great failsafe alternate boot environment.  Besides, I now wanted to see what would happen if I did a LiveUpgrade to Solaris 10 U4, coming out in the next couple of months.  How would this affect my LDOMs configuration?  I'll save this for another blog.

Okay, to do this I'm going to have to repartition my disk.  Bummer.  Call me RUSTY: I should have done the best-practice layout in the first place.  So I went back and re-installed with better disk partitioning to accommodate LiveUpgrade.  But wait a minute, my boot cdrom hangs with "Assertion failed: nvlist_lookup_uint64(zhp->zpool_config, "pool_guid", &theguid) == 0, file ../common/libzfs_import.c, line 336, function pool_active".  Now that's nasty.  This set me back a few hours.  NOTE:  This is a bug when installing from DVD onto a disk that already has zfs on it.  There is a workaround where you break out of the install and then restart it.  I wasn't able to get that to work, so I just deleted the zpool with the zfs filesystems.  I should have searched sunsolve.sun.com and Googled the error message; I would have saved tons of time trying to figure out what I was doing wrong.

Here's what my partition table looks like:


root@arakeen:/>prtvtoc /dev/rdsk/c1t0d0s2
    .
  Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -  1648        8.00GB    (1649/0/0)    16780224    Root Filesystem
  1       swap    wu    1649 -  3297        8.00GB    (1649/0/0)    16780224    SWAP SPACE
  2     backup    wm       0 - 14086       68.35GB    (14087/0/0)  143349312    Whole Disk
  3 unassigned    wm    3298 -  4946        8.00GB    (1649/0/0)    16780224    Reserved for LiveUpgrade
  4 unassigned    wm    4947 - 14086       44.35GB    (9140/0/0)    93008640    Zpool space
  5 unassigned    wm       0                0         (0/0/0)              0
  6 unassigned    wm       0                0         (0/0/0)              0
  7 unassigned    wm       0                0         (0/0/0)              0

Note:  You might think 8GB of swap is high for 4GB of RAM.  It probably is; but 4GB isn't a lot of memory for running multiple LDOMs, and it isn't much for 32 CPUs (8 cores x 4 threads).
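As a sanity check on the partition table above, the Size column can be derived from the Blocks column, since each block is a 512-byte sector.  A small sketch in plain shell arithmetic (the function name blocks_to_mb is just for illustration):

```shell
#!/bin/sh
# Convert a count of 512-byte disk blocks to whole megabytes.
# 1 MB = 1048576 bytes = 2048 blocks of 512 bytes.
blocks_to_mb() {
    echo $(( $1 / 2048 ))
}

blocks_to_mb 16780224    # slices 0, 1 and 3: roughly 8GB each
blocks_to_mb 93008640    # slice 4, the zpool space: roughly 44GB
blocks_to_mb 143349312   # slice 2, the whole disk: roughly 68GB
```

Divide the results by 1024 to get the GB figures shown in the Size column.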

Setup the Control Domain:

Now we're on to the fun stuff, right?

So the first thing we need to do is start up the LDOM manager and set up the control domain.  I used the Admin Guide to walk me through the steps, which looked like this:

root@arakeen:/>svcadm enable ldmd  Note:  This turns on LDOMs.
root@arakeen:/>ldm  Note: This shows you the many command line options and parameters.  It also confirms that your path is set up.
root@arakeen:/>ldm ls -l Note:  This shows you all available resources.

Now it is time to set up the default services for the control domain (the control domain's name defaults to primary):
  • vdiskserver - virtual disk server
  • vswitch - virtual switch service
  • vconscon - virtual console concentrator service
On my first attempt, I set up the control domain resources before I set up the services.  I believe this cost me some wasted time when I then tried to set up the services, so I recommend setting up the services first.  Otherwise you may run into some weird behavior, which in my case showed up as error messages.  For the errors I got, the release notes said to restart ldmd (svcadm restart ldmd).  The other thing that happened was that I wasn't able to set up the vdiskserver.  What I finally ended up doing was resetting back to the factory-default configuration (ldm set-config factory-default), which was the easiest way to start over.  You will see this configuration later.  Remember, I only had to start over because I did things out of order.

root@arakeen:/>ldm add-vds primary-vds0 primary
root@arakeen:/>ldm add-vcc port-range=5000-5100 primary-vcc0 primary
root@arakeen:/>ldm add-vsw net-dev=e1000g0 primary-vsw0 primary

Let's see what the primary (control domain) looks like now:

root@arakeen:/>ldm list-services primary

Vldc:   primary-vldc0
Vldc:   primary-vldc3
Vds:    primary-vds0
                vdsdev: vol1    device=/tank/LDOMS/ldg1/bootdisk.img
Vcc:    primary-vcc0
                port-range=5000-5100
Vsw:    primary-vsw0
                mac-addr=0:14:4f:f8:92:db
                net-dev=e1000g0
                mode=prog,promisc


Services are set up, so now it is time to set up system resources for the control domain.  By default all resources are assigned to the primary (control domain).  So in order to create guest domains you need to release some of those resources for use by other domains.  I couldn't find any definitive recommendations for what the control domain's resources should be.  So here's what I would start out with, and why.
  • 4 vcpus - This is based on keeping the 4 threads per core aligned.  If you put 2 vcpus (threads) from the same core in 2 separate ldoms, those ldoms are sharing that core.  If possible, it is better not to share a core between multiple ldoms, to minimize any possible contention.  It all depends on the workload of the ldoms.  You can share a core across up to 4 ldoms, as there are 4 threads per core.  This is how you can get 32 ldoms on an 8 core T2000.
  • 4GB of memory - This is based on conversations and experimentation with different memory settings.  You can go lower than 4GB if you are not going to use ZFS in the control domain as virtual storage for guest domains.  Remember that the T2000 I was using only had 4GB of memory in total.  Initially I gave the control domain 2GB, and everything was good until I started jumpstarting my first guest domain: the jumpstart hung in the middle of adding packages to the zfs disk image file.  I increased the control domain to 3GB and was able to squeak through the jumpstart of the guest domain.  Others have indicated that if you are using ZFS you should have at least 4GB of memory in the control domain.
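To make the core alignment concrete, here is a small sketch.  It assumes the layout suggested by the ldm output in this post, where physical CPU IDs 0-3 share one core (and one MAU), 4-7 the next, and so on.  The function names are mine, and the 4-threads-per-core figure is specific to this class of machine:

```shell
#!/bin/sh
THREADS_PER_CORE=4   # assumption: 4 hardware threads (strands) per core, as on this T2000

# Print the core number that a physical strand (the "pid" column in ldm ls -l) belongs to.
core_of_strand() {
    echo $(( $1 / THREADS_PER_CORE ))
}

# Succeed if a vcpu count fills whole cores, so no core needs to be shared.
is_core_aligned() {
    [ $(( $1 % THREADS_PER_CORE )) -eq 0 ]
}

core_of_strand 3    # last strand on the control domain's core
core_of_strand 4    # first strand left over for a guest domain
is_core_aligned 4 && echo "4 vcpus: whole core"
is_core_aligned 2 || echo "2 vcpus: core would be shared"
```

With 4 vcpus in the control domain, the guests start on a fresh core and never compete with the primary for the same pipeline.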
So here's setting up Primary's (control domain's) resources:

root@arakeen:/> ldm set-mau 1 primary
root@arakeen:/>ldm set-vcpu 4 primary
root@arakeen:/>ldm set-memory 3G primary

Let's check the config:

root@arakeen:/> ldm ls -l primary

Name:   primary
State:  active
Flags:  transition,control,vio service
OS:    
Util:   1.0%
Uptime: 59m
Vcpu:   4
        vid    pid    util strand
        0      0      2.3%   100%
        1      1      1.0%   100%
        2      2      0.6%   100%
        3      3      0.2%   100%
Mau:    1
        mau cpuset (0, 1, 2, 3)
Memory: 3G
        real-addr        phys-addr        size           
        0x8000000        0x8000000        3G
Vars:   reboot-command=cr ." Ignoring auto-boot? setting for this boot." cr
IO:     pci@780 (bus_a)
        pci@7c0 (bus_b)
Vldc:   primary-vldc0   [num_clients=4]
Vldc:   primary-vldc3   [num_clients=7]
Vds:    primary-vds0    [num_clients=1]
                vdsdev: vol1    device=/tank/LDOMS/ldg1/bootdisk.img
Vcc:    primary-vcc0    [num_clients=1]
                port-range=5000-5100
Vsw:    primary-vsw0    [num_clients=1]
                mac-addr=0:14:4f:f8:92:db
                net-dev=e1000g0
                mode=prog,promisc
Vcons:  SP

Now it's time to store our config and we do this by:

root@arakeen:/> ldm add-config initial   NOTE:  This saves the configuration on the system controller (ALOM).
root@arakeen:/> ldm ls-config

factory-default {current}
initial [next]

Now reboot, and we can move on to setting up our first guest domain.

Create a Guest Domain:

I needed a boot disk for the guest domain, and I had already created a zfs filesystem for it above (tank/LDOMS/ldg1).  I decided to use a file on top of the zfs filesystem so that I could take a snapshot, clone the snapshot, and use the clone as a bootable disk image file for another domain.  I created a 5GB file via:

root@arakeen:/tank/LDOMS/ldg1> mkfile 5G bootdisk.img

Here are the commands I used to set up my guest domain ldg1:

root@arakeen:/> ldm add-domain ldg1

root@arakeen:/> ldm add-vcpu 8 ldg1

root@arakeen:/> ldm add-memory 396M ldg1  NOTE:  I used 396MB as I only have 4G total and needed 3GB min. for using ZFS in control domain.

root@arakeen:/> ldm add-vnet vnet1 primary-vsw0 ldg1

root@arakeen:/> ldm add-vdiskserverdevice /tank/LDOMS/ldg1/bootdisk.img vol1@primary-vds0

root@arakeen:/> ldm add-vdisk vdisk1 vol1@primary-vds0 ldg1

root@arakeen:/> ldm set-variable auto-boot\?=false ldg1

root@arakeen:/> ldm bind-domain ldg1

root@arakeen:/> ldm start-domain ldg1

root@arakeen:/> telnet localhost 5000

You should now be at the ok prompt, just like you would be on a physical system.  That's cool.  We need to set up our devaliases so that we can boot off the right devices.  Please refer to the Administration Guide and/or the Beginner's Guide for the details.  I set up a devalias called vdisk1 for my disk and vnet1 for my network, then changed my boot-device variable to vdisk1 vnet1.

Jumpstart the Guest Domain:

I used the control domain as a jumpstart server.  I did this by mounting the DVD and running:

root@arakeen:/cdrom/sol_10_1106_sparc/s0/Solaris_10/Tools>setup_install_server

Then I took a few shortcuts, knowing this isn't the best or recommended way to jumpstart a server.  I bypassed setting up a profile, sysidcfg file, etc.  I just wanted to get access to the Solaris bits and create a boot server so I could boot the guest domain and interactively install the bits on the virtual disk.  I know I should have taken the time to create a jumpstart server correctly.  Don't stone me!  Now I ran:

root@arakeen:/cdrom/sol_10_1106_sparc/s0/Solaris_10/Tools>add_install_client -e 0:14:4f:fa:b5:48 ldg1 sun4v

This did the necessary stuff for me to be able to:

ok boot vnet1 - install

Oh no!  It never got its IP address to start the boot process.  What's up with this?  I spent a lot of time trying to figure out why rarp wasn't working.  Well, if you remember reading the docs, the vswitch is a layer 2 switch, and by default the vnet can't communicate with the external network via the physical interface.  Okay, that's cool.  I'll just plumb up vsw0 in the control domain.  No, it didn't work.  The control domain's physical interface (e1000g0) still couldn't see the broadcast from vsw0.  Long story short, I had to unplumb e1000g0 and plumb up vsw0 in its place, per the install guide!
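For reference, the interface swap looks roughly like this.  It is a sketch of the procedure as I understand it from the guide, so check the guide for your release.  The function only prints the commands, so nothing changes until you paste them in.  The function name is mine, and the IP address and netmask are placeholders:

```shell
#!/bin/sh
# Print (not run) the commands that make vsw0 the control domain's primary
# interface in place of the physical NIC.
#   $1 = physical interface, $2 = IP address, $3 = netmask (all placeholders)
print_vsw_swap() {
    nic=$1
    ip=$2
    mask=$3
    echo "ifconfig vsw0 plumb"
    echo "ifconfig $nic down unplumb"
    echo "ifconfig vsw0 $ip netmask $mask broadcast + up"
    # Keep the change across reboots by renaming the hostname file.
    echo "mv /etc/hostname.$nic /etc/hostname.vsw0"
}

print_vsw_swap e1000g0 192.168.1.10 255.255.255.0
```

If you run the printed commands, do it from the console (console -f from the sc) rather than over ssh to the control domain, because the network connection drops when e1000g0 is unplumbed.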

Starting to feel good now!  I'm booting and waiting for the first install screen.  As you can tell, I did get booted and answered all the install questions.  And yes, I did a reboot after the install and I could log in.  Is it time to sing yet?  Let's try one more thing.  Let's clone the /tank/LDOMS/ldg1 filesystem, create a new guest domain, and boot it from the cloned filesystem.  Here's how it went:

root@arakeen:/>zfs snapshot tank/LDOMS/ldg1@july12-1920

Let's take a look to see what happened.

root@arakeen:/>zfs list
NAME                   USED  AVAIL  REFER  MOUNTPOINT
tank                  19.2G  24.4G  28.5K  /tank
tank/Downloads         275M  24.4G   275M  /tank/Downloads
tank/LDOMS            15.0G  24.4G  28.5K  /tank/LDOMS
tank/LDOMS/ldg1       5.00G  24.4G  5.00G  /tank/LDOMS/ldg1
tank/LDOMS/ldg1@july12-1920      0      -  5.00G  -                           <--- Here's the snap shot
tank/LDOMS/ldg2       33.2M  24.4G  5.00G  /tank/LDOMS/ldg2
tank/jumpstart        3.88G  24.4G  3.88G  /tank/jumpstart

Notice that the snapshot uses no space.  And if I look inside it:

root@arakeen:/>cd /tank/LDOMS/ldg1/.zfs/snapshot/july12-1920/
root@arakeen:/tank/LDOMS/ldg1/.zfs/snapshot/july12-1920>ls -l
total 10492946
-rw------T   1 root     root     5368709120 Jun 25 15:33 bootdisk.img
-rwxr-xr-x   1 root     root         646 Jun 25 15:29 fcksum
-rw-r--r--   1 root     root         512 Jun 25 15:30 label.bootdisk.img.070625_153008

Now let's clone it so that I can use it:

root@arakeen:/>zfs clone tank/LDOMS/ldg1@july12-1920 tank/LDOMS/ldg3
root@arakeen:/>zfs list
NAME                   USED  AVAIL  REFER  MOUNTPOINT
tank                  19.2G  24.4G  28.5K  /tank
tank/Downloads         275M  24.4G   275M  /tank/Downloads
tank/LDOMS            15.0G  24.4G  29.5K  /tank/LDOMS
tank/LDOMS/ldg1       5.00G  24.4G  5.00G  /tank/LDOMS/ldg1
tank/LDOMS/ldg1@july12-1920      0      -  5.00G  -
tank/LDOMS/ldg2       33.2M  24.4G  5.00G  /tank/LDOMS/ldg2
tank/LDOMS/ldg3           0  24.4G  5.00G  /tank/LDOMS/ldg3         <---- Note no space used!!  Yet!  Once I boot it and change the hostname etc., this will change.
tank/jumpstart        3.88G  24.4G  3.88G  /tank/jumpstart

So now I created an ldg2 guest domain:

root@arakeen:/> ldm add-domain ldg2
root@arakeen:/> ldm add-vcpu 8 ldg2
root@arakeen:/> ldm add-memory 396M ldg2  NOTE:  I used 396MB as I only have 4G total and needed 3GB min. for using ZFS in control domain.
root@arakeen:/> ldm add-vnet vnet1 primary-vsw0 ldg2
root@arakeen:/> ldm add-vdiskserverdevice /tank/LDOMS/ldg2/bootdisk.img vol1@primary-vds0
root@arakeen:/> ldm add-vdisk vdisk1 vol1@primary-vds0 ldg2
root@arakeen:/> ldm set-variable auto-boot\?=false ldg2
root@arakeen:/> ldm bind-domain ldg2
root@arakeen:/> ldm start-domain ldg2

root@arakeen:/>telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Connecting to console "ldg1" in group "ldg1" ....
Press ~? for control options ..

ldg1 console login:


Note that it says ldg1 and not ldg2.  That is because I didn't do a sys-unconfig before I cloned, so the new guest domain has the same identity as the one I cloned.  If I want both domains online at once, I just do a sys-unconfig in one of the domains, reboot, and answer the identification questions.

Also, at this time there is a bug where, if you unbind a domain and bind it again, you could lose the disk label of bootdisk.img (the first block inside the file; remember, it looks like a physical disk to the guest domain).  The workaround is to run fcksum after you unbind and before you bind again.
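If I were to do the cloning again, I'd script it along these lines.  Treat this as a sketch under a few assumptions rather than what I actually ran.  The function only prints the commands instead of executing them, the function name and the snapshot name "gold" are made up for illustration, and each clone gets its own volume name (vol2 here) rather than reusing vol1.  It also assumes you have already run sys-unconfig inside the gold guest, so the clone comes up asking for a new identity.

```shell
#!/bin/sh
# Print (not run) the commands to clone a gold guest's disk image and build
# a new guest domain on top of the clone.
#   $1 = gold domain, $2 = new domain, $3 = volume name for the new disk
print_clone_guest() {
    gold=$1
    new=$2
    vol=$3
    cat <<EOF
ldm stop-domain $gold
ldm unbind-domain $gold
zfs snapshot tank/LDOMS/$gold@gold
zfs clone tank/LDOMS/$gold@gold tank/LDOMS/$new
ldm add-domain $new
ldm add-vcpu 8 $new
ldm add-memory 396M $new
ldm add-vnet vnet1 primary-vsw0 $new
ldm add-vdiskserverdevice /tank/LDOMS/$new/bootdisk.img $vol@primary-vds0
ldm add-vdisk vdisk1 $vol@primary-vds0 $new
ldm set-variable auto-boot\?=false $new
ldm bind-domain $new
ldm start-domain $new
EOF
}

print_clone_guest ldg1 ldg2 vol2
```

On my 4GB box the gold domain has to stay unbound while the clone runs, since there isn't enough memory for both.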

NOW WE CAN SING "We Are the Champions"!!!!



Comments:

Hi, Can you tell me how the guest domain jumpstart install succeeded? I am still stuck getting an IP from the control domain for the guest MAC.... Regds, Vinod

Posted by vinod on July 23, 2007 at 07:01 AM CDT #

Vinod, The guest domain gets its IP as part of the jumpstart process. Basically, when you add your client for jumpstart, an entry gets created in /etc/ethers with the MAC address and IP for that MAC. The guest domain sends out a RARP request, which gets picked up by the rarpd service on the control domain (jumpstart server). Now, you have to have vsw0 plumbed in place of the e1000g0 interface or the communication won't take place. You may also have to restart the rarpd service ("svcadm restart rarpd"), etc. Brad......

Posted by Brad Beadles on July 29, 2007 at 05:02 PM CDT #

Hi Brad, I got the communication established, but learned of the constraint on communication between the control domain and guest domains, which does not allow jumpstarting guest domains from the control domain. I am planning to jumpstart from an external jumpstart server... What are your comments on this communication constraint from Sun Microsystems?

Posted by Vinod on July 30, 2007 at 07:37 AM CDT #

Vinod, If you have plumbed up vsw0 in place of the physical interface e1000g0, then you should be able to communicate between the control domain and the guest domain. So I don't know of, and haven't experienced, any communication constraints. When you say you got communication established, were you able to boot from the vnet? If so, the next step is making sure that things were correct when you ran the add client. Check /etc/bootparams. Make sure names can be resolved. Also make sure your jumpstart directory is shared. If you can't boot from the vnet, do you get an IP? If not, the rarp service is having problems: check svcs and check /etc/ethers. If you get an IP but nothing else, check /tftpboot. You might want to remove the client and then re-add it. Do a man on add_install_client for options. Hope this helps. Brad.....
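Here's a rough sketch of the /etc/ethers check as a couple of shell functions. The function names are mine, not part of Solaris. ldm and the OBP banner print MACs without leading zeros (0:14:4f:...), so a plain grep for a zero-padded address can miss the entry; the comparison below normalizes both sides first.

```shell
#!/bin/sh
# Normalize a MAC address: lowercase, with the leading zero of each octet removed.
normalize_mac() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f' |
        sed -e 's/^0\([0-9a-f]\)/\1/' -e 's/:0\([0-9a-f]\)/:\1/g'
}

# Look up a MAC in an ethers-format file ("MAC hostname" per line).
# Prints the hostname and succeeds if found; fails otherwise.
#   $1 = path to the file, $2 = MAC address in either notation
ethers_lookup() {
    want=$(normalize_mac "$2")
    while read -r mac host; do
        if [ "$(normalize_mac "$mac")" = "$want" ]; then
            printf '%s\n' "$host"
            return 0
        fi
    done < "$1"
    return 1
}

# Usage on the jumpstart server:
#   ethers_lookup /etc/ethers 0:14:4f:fa:b5:48
```

If the lookup fails, remove the client and re-add it with add_install_client -e.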

Posted by Brad Beadles on July 31, 2007 at 10:56 AM CDT #

Hi
I see that in setting up the clone you ran
ldm add-vdiskserverdevice /tank/LDOMS/ldg2/bootdisk.img vol1@primary-vds0

but you already had vol1@primary-vds0 in the first guest you created, so this most likely should fail, as vol1 is a unique identifier for the device being added to the disk server.

Enda

Posted by 192.18.1.36 on September 13, 2007 at 11:34 AM CDT #

Enda,

You would be correct if I had had ldg1 bound. But it was unbound, so there was no namespace issue.

However, great catch, as this would not be good practice in general. Also, note that I only had 396MB of memory to run a guest domain, so I would also have gotten an error about not having enough memory when trying to bind ldg2.

Thank you for the comment. I should have clarified that this isn't necessarily a good naming scheme; it's just a quick record of what I did when experimenting with LDOMs the first time out.

Brad....

Posted by Brad on September 14, 2007 at 06:21 PM CDT #

Stuck jumpstarting a guest domain. I'll figure it out. I just want to make sure what i'm doing is possible so it all comes down to one question.

Can I jumpstart a guest logical domain from an EXTERNAL jumpstart server ( not the control domain ) using a zfs disk image as the storage for the guest?

Thanks for input.

Jason

Posted by Jason on October 02, 2007 at 09:55 AM CDT #

Jason,

Yes, you can use an external jumpstart server to boot and load your guest domain.

A couple of things I ran into: in the control domain, I needed to plumb up the vsw in place of the physical NIC so that external traffic can be seen by the guest domain. Also, the external jumpstart server must be on the same physical network as the control domain's vsw interface for bootp to work, because rarp relies on broadcast.

Brad.....

Posted by Brad on October 02, 2007 at 11:19 AM CDT #

Sigh. External jumpstart isn't working.

I can see the rarp packet from the guest domain's boot vnet0 install hit the jumpstart server.

I can see the jumpstart server respond happily, both by in.rarpd in debug mode, and a truss of this process. "0xFEFFAE30: "/tftpboot/8081D1B0"

For some reason the packet is not getting to the guest domain.

1. For guest domains using a vsw, the vsw0 device has to have mac-addr set to e1000g0's, or you simply can't talk to the outside world, correct?

I created the vsw and vsw0 devices, but I think that is only for communicating between guest & control domains.

Any other passing thoughts?

Posted by Jason on October 05, 2007 at 11:49 AM CDT #

Jason,

You have to unplumb the e1000g0 interface and plumb the vsw0 in its place. This lets the guest doms that are on vsw0 talk to the outside world. The install guide should help clarify this a little more.

I ran into the same problem as described above. You are close.

Brad......

Posted by Brad on October 06, 2007 at 11:21 AM CDT #

Brad, I have a question. I'm following this well enough, however, I'm lost at the makefile
command in the tank zfs (makefile 5G bootdisk.img). There is no "makefile" command in Solaris 10 8_07:

# find / -name makefile
#

Am I missing something obvious?

Posted by Paul Mitchell on October 17, 2007 at 02:22 PM CDT #

Paul,

Sorry, this should be mkfile.

Brad.....

Posted by Brad on October 17, 2007 at 02:53 PM CDT #

My bad: mkfile, not makefile!

Posted by Paul Mitchell on October 17, 2007 at 02:55 PM CDT #

hi Brad

Did you try any performance tests with LDOMs? What's the impact of LDOMs on CPU utilization, network throughput, etc.?

Thanks

Posted by eric on October 18, 2007 at 07:43 PM CDT #

Hi Brad,
This is my first time trying to install Solaris 10 on the logical domain "ldg1".

I've created vdisk1 by running the following:

$ldm add-vdsdev /ldom-files/ldg1 vol1@primary-vds0
$ldm add-vdisk vdisk1 vol1@primary-vds0 ldg1

I've also assigned vCPUs and memory from the control domain.

I've burned a DVD with an .iso image made by downloading the Solaris 10 images and concatenating them. Could you please let me know how to load Solaris 10 on the logical domain "ldg1" without using the jumpstart process?

I really appreciate your help on this.

Thanks
Subba

Posted by Subba on January 29, 2008 at 02:57 PM CST #

Subba,

There is currently a limitation on loading guest doms from CD. You can only use the physical CD device in an I/O domain or the control domain.

To use it in an I/O domain, you will have to create a split PCI-E bus configuration where the CD is a physical device on the I/O domain's PCI-E bus.

Otherwise, you have to do a jumpstart.

Hope that helps.

Brad......

Posted by Brad on January 31, 2008 at 11:07 AM CST #

HI Brad,

I am going through the documents, but still have questions on setting up the ldoms. Here is my requirement. We have a 5220. I need to set up one DB and one APP server with SAN storage. What is the best approach for this? Is the I/O domain itself the DB domain, or does the I/O domain only work as a control domain? Do I need to install the ldom software on the I/O domain too?

Thanks
Teja

Posted by Teja on February 29, 2008 at 05:40 AM CST #

Teja,

You can run your DB in an I/O domain which will give you better i/o performance. By default your control domain is also an I/O domain.

Brad.....

Posted by Brad Beadles on March 02, 2008 at 08:02 PM CST #

Hello Brad,

Nice presentation. Regarding the comments about the Jumpstart process (not being able to get an IP address from the Jumpstart server), I am facing the same issue. I have been trying hard to solve it but am clueless. Can you please advise? Here are the details...

----------------------------------------------------------
1.
bash-3.00# telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Connecting to console "ldg-sunt1" in group "ldg-sunt1" ....
Press ~? for control options ..

~ ?
{0} ok
{0} ok boot vnet1 - install

Sun Fire(TM) T1000, No Keyboard
Copyright 2008 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.27.11, 1024 MB memory available, Serial #66692527.
Ethernet address 0:12:a5:f9:a5:af, Host ID: 83f9a5af.

Boot device: /virtual-devices@100/channel-devices@200/network@0e and args: - install
Requesting Internet Address for 0:13:4f:fa:20:e2
Requesting Internet Address for 0:13:4f:fa:20:e2
Requesting Internet Address for 0:13:4f:fa:20:e2

----------------------------------------------------------------
2. At Jumpstart server I get following rarp diagnosis messages

bash-3.00# /usr/sbin/in.rarpd -da
/usr/sbin/in.rarpd:[1] device nge0 lladdress 0:e0:83:5a:41:c6
/usr/sbin/in.rarpd:[1] device nge0 address 192.167.10.20
/usr/sbin/in.rarpd:[1] device nge0 subnet mask 255.255.255.0
/usr/sbin/in.rarpd:[3] starting rarp service on device nge0 address 0:e0:81:5a:40:c6
/usr/sbin/in.rarpd:[3] RARP_REQUEST for 0:13:4f:fa:20:e2
/usr/sbin/in.rarpd:[3] trying physical netnum 192.167.10.0 mask ffffff00
/usr/sbin/in.rarpd:[3] good lookup, maps to 192.167.10.177
/usr/sbin/in.rarpd:[3] immediate reply sent

--------------------------------------------------------------------
3. On the control domain, the snoop result is as below (meaning the guest is not able to get the IP address)

OLD-BROADCAST -> (broadcast) RARP C Who is 0:13:4f:fa:20:e2 ?

--------------------------------------------------------------------
4.
LDOM Config details

bash-3.00# ldm ls -l
NAME             STATE    FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active   -t-cv   SP      4     2G       0.1%  1d 1h 38m

SOFTSTATE
Openboot initializing

VCPU
    VID    PID    UTIL STRAND
    0      0      0.3%   100%
    1      1      0.1%   100%
    2      2      0.0%   100%
    3      3      0.0%   100%

MAU
    CPUSET
    (0, 1, 2, 3)

MEMORY
    RA               PA               SIZE
    0x8000000        0x8000000        2G

IO
    DEVICE           PSEUDONYM        OPTIONS
    pci@780          bus_a
    pci@7c0          bus_b

VDS
    NAME             VOLUME         OPTIONS          DEVICE
    primary-vds0     vol1                            /LDOM1

VCC
    NAME             PORT-RANGE
    primary-vcc0     5000-5100

VSW
    NAME             MAC               NET-DEV   DEVICE     MODE
    primary-vsw0     00:14:2f:20:e3:60 bge0      switch@0   prog,promisc

VCONS
    NAME             SERVICE                     PORT
    SP

------------------------------------------------------------------------------
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
ldg-sunt1 active -t--- 5000 4 1G 0.0% 40m

SOFTSTATE
Openboot initializing

VCPU
VID PID UTIL STRAND
0 4 100% 100%
1 5 0.0% 100%
2 6 0.0% 100%
3 7 0.0% 100%

MEMORY
RA PA SIZE
0x8000000 0x88000000 1G

VARIABLES
auto-boot?=false
boot-device=vdisk
local-mac-address?=true

NETWORK
NAME SERVICE DEVICE MAC
vnet1 primary-vsw0@primary network@0 0:13:4f:fa:20:e2

DISK
NAME VOLUME TOUT DEVICE SERVER
vdisk1 vol1@primary-vds0 disk@0 primary

VCONS
NAME SERVICE PORT
ldg-sunt1 primary-vcc0@primary 5000

--------------------------------------------------------------------

Please advise; this has been a real headache for a week.

Vijay

Posted by Vijay Upreti on March 23, 2008 at 02:47 AM CDT #

Vijay,

It sounds like you are not getting an IP address for the MAC address 0:13:4f:fa:20:e2. Check /etc/ethers on your Jumpstart server or, if you're using a DHCP server, check its configuration.

Brad.......

Posted by Brad Beadles on March 24, 2008 at 01:07 PM CDT #
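The /etc/ethers check suggested above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions: the function names `norm_mac` and `ethers_host` are made up for illustration, and the ethers file is assumed to use the usual "MAC hostname" layout. It normalises MAC addresses to the unpadded lowercase form that OpenBoot prints, so that an entry written as 00:13:4F:FA:20:E2 still matches 0:13:4f:fa:20:e2.

```shell
# Normalise a MAC address to the unpadded lowercase form OpenBoot prints,
# for example 00:13:4F:FA:20:E2 -> 0:13:4f:fa:20:e2.
norm_mac() {
    printf '%s\n' "$1" | tr 'ABCDEF' 'abcdef' |
        sed -e 's/^0\([0-9a-f]\)/\1/' -e 's/:0\([0-9a-f]\)/:\1/g'
}

# Print the hostname that an ethers-style file maps a MAC address to.
# $1 = MAC address, $2 = ethers file. Returns non-zero if there is no entry.
ethers_host() {
    want=$(norm_mac "$1")
    while read -r mac host rest; do
        # Skip blank lines and comments.
        case "$mac" in ''|'#'*) continue ;; esac
        if [ "$(norm_mac "$mac")" = "$want" ]; then
            echo "$host"
            return 0
        fi
    done < "$2"
    return 1
}
```

For example, `ethers_host 0:13:4f:fa:20:e2 /etc/ethers` prints the mapped hostname if an entry exists. That hostname must also resolve to an IP address (for instance through /etc/hosts) before in.rarpd can answer. Note that in the log above in.rarpd already reports a good lookup, so a missing entry may not be the whole story in that case.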

Just a couple of questions:

- What's the difference between assigning a ZFS volume to a virtual device instead of an .img file? You can snapshot the entire ZFS volume, can't you?

- I've read about a Linux kernel that supports LDOMs. Any experience? Is there any way to install without a network install system?

Thanks in advance,
Cesar

Posted by Cesar on May 17, 2008 at 03:12 PM CDT #

Cesar,

Please ask these questions on the OpenSolaris discussion alias: ldoms-discuss@opensolaris.org

Hope this helps.

Brad.....

Posted by Brad on May 21, 2008 at 02:33 PM CDT #

How do you recover the root password of an LDOM client?

Posted by Tabrez Khan on August 14, 2008 at 12:12 AM CDT #

Tabrez,
I would mount the guest LDOM's root filesystem and go through the normal recovery practice by editing /etc/passwd and /etc/shadow.

Brad......

Posted by Brad on August 18, 2008 at 02:36 PM CDT #
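The recovery step described above can be sketched as follows. This is only a sketch under assumptions: the guest's root filesystem is taken to be already mounted somewhere on the control domain (the /mnt path in the usage note is hypothetical), and the function name `clear_root_hash` is made up for illustration. On Solaris the root password hash lives in the second field of the root entry in /etc/shadow; blanking that field is the usual way to get back in, after which a new password is normally set with passwd once the guest boots.

```shell
# Blank the root password hash in a shadow-format file, keeping a backup.
# $1 = path to the shadow file (for example a guest's etc/shadow).
clear_root_hash() {
    shadow=$1
    # Keep the original alongside as <file>.bak, then rewrite the file in
    # place so its ownership and permissions are left unchanged.
    cp -p "$shadow" "$shadow.bak" &&
        sed 's/^root:[^:]*:/root::/' "$shadow.bak" > "$shadow"
}
```

Usage would look like `clear_root_hash /mnt/etc/shadow`, run while the guest domain is stopped.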

Hi Brad,
I am trying to install the OS on a T1000 server which is on the same subnet as the install server (V210). I have run setup_install_server, add_install_client and so on. No LDOM-related patches have been added to my install server.

The T1000 server displays the following:
Requesting Internet Address for 0:14:4f:e6:22:4c
Requesting Internet Address for 0:14:4f:e6:22:4c
Requesting Internet Address for 0:14:4f:e6:22:4c

and it's not detecting the image.

Please help me out, as the T1000 server doesn't have a DVD-ROM drive.

Thanks in advance.

Santosh.

Posted by santosh on September 08, 2008 at 05:45 AM CDT #

The tutorial was really helpful .. spent a good amount of time trying to jumpstart my guest domains .. thanks ..

Posted by INIT07 on November 11, 2008 at 05:08 AM CST #

INIT07,

Glad this helped. I also want to make sure that you are aware of the OpenSolaris discussion alias: ldoms-discuss@opensolaris.org. This is a very useful alias that is very active.

Brad.....

Posted by Brad on November 11, 2008 at 09:40 AM CST #

Really good article. Unfortunately, I am still not able to configure Jumpstart. This is the error I get. Can you please help with this?

{0} ok boot net - install
Boot device: /virtual-devices@100/channel-devices@200/network@0 File and args: - install
Requesting Internet Address for 0:14:4f:fb:6:b4
SunOS Release 5.10 Version Generic_127127-11 64-bit
Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
whoami: no domain name
WARNING: mountnfs3: pmap_kgetport RPC error 2 (RPC: Can't decode result).
WARNING: Unable to mount NFS root filesystem: error 6
Cannot mount root on /virtual-devices@100/channel-devices@200/network@0 fstype nfsdyn

panic[cpu0]/thread=180e000: vfs_mountroot: cannot mount root

000000000180b950 genunix:vfs_mountroot+32c (800, 200, 0, 18a5400, 18cf000, 18f7c00)
%l0-3: 00000000010d0800 00000000010d09e4 00000000018ab0b0 00000000011f9000
%l4-7: 00000000011f9000 00000000018fa000 0000000000000600 0000000000000200
000000000180ba10 genunix:main+98 (182b538, 1019c00, 18594e0, 18f4400, 0, 182b400)
%l0-3: 0000000070002000 0000000000000001 0000000070002000 000000000180c000
%l4-7: 000000000180e000 0000000070002000 000000000180c000 0000000000000000

skipping system dump - no dump device configured
rebooting...
WARNING: Unable to update LDOM Variable

Thanks

Posted by John on November 12, 2008 at 04:10 AM CST #

John,

I'm not sure what this error is, sorry. However, please try the OpenSolaris discussion alias: ldoms-discuss@opensolaris.org. This is a very useful alias that is very active.

Make sure that you are net booting from an update of Solaris that is supported for LDOMs and has the necessary virtual device drivers. This may be the cause: you are now getting past Jumpstart, so the problem appears to be OS related.

Brad......

Posted by Brad on November 12, 2008 at 08:41 AM CST #

This issue is caused by the SUNWjass software. You will need to unconfigure SUNWjass (which hardens the system) and, once the Jumpstart setup is complete, apply the security hardening again.

Posted by Vijay Upreti on November 12, 2008 at 11:22 PM CST #

Thanks Brad, Vijay

I unconfigured the SUNWjass tool, but it is still showing the same error. I have posted in the LDOM forum. Is there any small but important tip I should know?
As of now I have set up Jumpstart on the control domain, unplumbed e1000g0, plumbed vsw0 and removed vsw0.

Thanks,
John

Posted by John on November 13, 2008 at 12:27 AM CST #

Update for my last post .. removed sunwjass instead of vsw0 ..

Posted by John on November 13, 2008 at 12:43 AM CST #

I hope these patches will resolve your problem. Please look for the latest patches on sunsolve.com.

Solaris Kernel Patch 118833-36
LDOM Patch 124921-02
Solaris Patch 125043-01
System Firmware Patch - 126399-01 for a T2000

Posted by Vinodh K on November 13, 2008 at 12:49 AM CST #

Did you uninstall or unconfigure SUNWjass? The point is not to uninstall SUNWjass but to unconfigure it, which unhardens the system. (This assumes that your Jumpstart server is in the control domain.) Use the command "/opt/SUNWjass/bin/jass-execute -u" to unharden the system and, once the guest domain is installed, use "/opt/SUNWjass/bin/jass-execute -q -d ldm_control-secure.driver" to harden the system again.

Vijay

Posted by Vijay Upreti on November 13, 2008 at 12:53 AM CST #
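The unharden and re-harden sequence described above can be wrapped in two small shell functions. This is a sketch only: the two jass-execute command lines are taken verbatim from the comment, while the `run`, `unharden` and `reharden` helpers and the `DRY_RUN` variable are assumptions added for illustration so the commands can be previewed before they touch the control domain.

```shell
# Path to the Solaris Security Toolkit driver, as quoted in the comment above.
JASS=${JASS:-/opt/SUNWjass/bin/jass-execute}

# Print the command instead of running it when DRY_RUN is set to any value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Undo the hardening before jumpstarting the guest domain.
unharden() { run "$JASS" -u; }

# Re-apply the control domain hardening once the guest is installed.
reharden() { run "$JASS" -q -d ldm_control-secure.driver; }
```

With `DRY_RUN=1` set, calling `unharden` and then `reharden` only prints the two commands, which makes it easy to confirm the sequence first.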

Vinodh,
Thanks. I will try to get my hands on these patches and will post an update here.

Vijay,
I did unconfigure it with "/opt/SUNWjass/bin/jass-execute -u". Now the error is gone, but the boot is getting stuck.

{0} ok boot net - install
Boot device: /virtual-devices@100/channel-devices@200/network@0 File and args: - install
Requesting Internet Address for 0:14:4f:fb:6:b4
SunOS Release 5.10 Version Generic_127127-11 64-bit
Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
whoami: no domain name

Here is the snoop dump from the control domain
darkstar -> ldom1 NFS R GETATTR3 OK
darkstar -> ldom1 NFS R GETATTR3 OK
darkstar -> ldom1 NFS R GETATTR3 OK
darkstar -> ldom1 NFS R GETATTR3 OK
ldom1 -> darkstar TCP D=2049 S=1023 Ack=2967170980 Seq=2899351166 Len=0 Win=49640
ldom1 -> darkstar NFS C LOOKUP3 FH=8932 initpipe
darkstar -> ldom1 NFS R LOOKUP3 No such file or directory
darkstar -> ldom1 NFS R LOOKUP3 No such file or directory
ldom1 -> darkstar NFS C MKNOD3 FH=8932 (Named pipe) initpipe
ldom1 -> darkstar NFS C GETATTR3 FH=ADDE
darkstar -> ldom1 NFS R MKNOD3 Read-only file system
darkstar -> ldom1 NFS R MKNOD3 Read-only file system
ldom1 -> darkstar NFS C LOOKUP3 FH=8932 initpipe
darkstar -> ldom1 TCP D=1023 S=2049 Ack=2899351670 Seq=2967171248 Len=0 Win=49640
darkstar -> ldom1 TCP D=1023 S=2049 Ack=2899351670 Seq=2967171248 Len=0 Win=49640
darkstar -> ldom1 NFS R GETATTR3 OK
darkstar -> ldom1 NFS R GETATTR3 OK

I googled the NFS messages; most results suggested that this may be caused by not using anon=0 when sharing. But since I have used it, the problem seems to be something else.

Thanks for your time,
John

Posted by 203.187.143.161 on November 13, 2008 at 01:20 AM CST #
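For readers who want to double-check the anon=0 setting mentioned above, here is a minimal sketch that inspects a dfstab-style file. The function name `shares_with_anon0` and the /export/install path in the usage note are assumptions for illustration; the check simply looks for a `share` line whose last field is the exported path and whose options include anon=0.

```shell
# Succeed if a dfstab-style file shares the given path with anon=0.
# $1 = exported path, $2 = file to inspect (for example /etc/dfs/dfstab).
shares_with_anon0() {
    awk -v path="$1" '
        # A matching line starts with "share", ends with the path, and
        # carries anon=0 among its options.
        $1 == "share" && $NF == path && /[ ,]anon=0[ ,]/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$2"
}
```

A line such as `share -F nfs -o ro,anon=0 /export/install` would pass `shares_with_anon0 /export/install /etc/dfs/dfstab`. As noted above, having this option in place did not resolve the hang in this particular case.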

Vinodh,

The system is already patched and is up to date.

John

Posted by John on November 13, 2008 at 01:41 AM CST #

I had issues with Jumpstart and LDOMs and documented them here:
http://www.swissunixsupport.com/5120'sldoms

Posted by Gav on November 13, 2008 at 05:12 AM CST #

Hi Brad,
I currently have two guest LDOMs configured on a T5220. I used a Veritas filesystem on the control domain to create a volume and filesystem, which holds two big boot disk files for the LDOMs. I had issues doing Veritas mirroring inside the LDOMs for the boot disk. What is the best practice for full redundancy of the guest LDOMs' bootable disks?

I would really appreciate your response..

Thanks
Karthik

Posted by Karthik on August 28, 2009 at 12:30 PM CDT #
