Today on this ol' server

Tuesday Mar 24, 2009

Remember Ada Lovelace

"[W]e may say most aptly that the analytical engine weaves algebraical patterns just as the Jacquard loom weaves flowers and leaves." So wrote Ada Lovelace in the 1840s. It was a woman who took the concept of the analytical engine, the precursor to the modern computer, and refined the idea of programming it. The blogosphere celebrates Ada Lovelace Day by remembering her, cheering on the women already in technical disciplines, and welcoming newcomers to all technical disciplines. Yes, we can, you and I. We can compute, abstract, record, test and examine. With those basic elements anyone can participate in science and technology.

Wednesday Jan 07, 2009

No need to sync a mirrored swap volume

Today's quick tip is how to get SVM to not sync a mirrored swap volume.

Per best practices, hosts running Solaris versions that cannot mirror root with zfs should use Solaris Volume Manager (SVM). If you are still running a legacy version of Solaris or an older release of S10, you can tell SVM not to sync a mirrored swap volume with metaparam(1M). Setting the pass number to 0 with -p 0 makes SVM skip the resync of that mirror at boot.

root@thumper # metaparam -p 0 d20
root@thumper # metastat -p | grep d20 
d20 -m d21 d22 0
Before I made the change, boot spent about 30 seconds waiting for the disks to sync. Now, after the change, nothing is being written to d2*:
thumper% iostat -xnz 2
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.3    0.5    1.0    1.1  0.0  0.0    0.1    1.9   0   0 md/d10
    0.1    0.5    0.5    1.1  0.0  0.0    0.0    1.5   0   0 md/d11
    0.1    0.5    0.5    1.1  0.0  0.0    0.0    1.3   0   0 md/d12
    0.0    0.0    0.1    0.1  0.0  0.0    0.0   30.5   0   0 md/d20
    0.0    0.0    0.1    0.1  0.0  0.0    0.0   33.6   0   0 md/d21
    0.0    0.0    0.1    0.1  0.0  0.0    0.0   24.7   0   0 md/d22
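
To confirm the setting on several mirrors at once, the pass number can be pulled out of the metastat -p output. This is only a minimal sketch: mirror_pass is a hypothetical helper name, and it assumes the pass number is the last field on the mirror's line, as in the d20 output above.

```shell
# Print the resync pass number of one mirror.
# Reads `metastat -p` style lines on stdin.
# Assumption: the line looks like "<mirror> -m <submirror>... <pass>",
# with the pass number as the last field.
mirror_pass() {
    awk -v m="$1" '$1 == m && $2 == "-m" { print $NF }'
}
```

On the host it would be used as metastat -p | mirror_pass d20; a result of 0 means the mirror is skipped during the boot-time resync.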

Thursday Dec 04, 2008

Importing zfs pools

How to import disks recovered after a reinstall or upgrade. Run zpool import -d against the directory that holds your disk devices.

root@thumper #  zpool import -d /dev/dsk/
  pool: compilers
    id: 12162905211102209752
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        compilers    ONLINE
          raidz1     ONLINE
            /c0t0d0  ONLINE
            /c1t0d0  ONLINE
            /c4t0d0  ONLINE
            /c6t0d0  ONLINE
            /c7t0d0  ONLINE
Each pool has a unique numeric id. Run zpool import -d $DEVICE_DIRECTORY -f $ID, where ID is either the pool name or the numeric id. I used the numeric id when I did the import. Try it first without the -f option: if the pool was exported you won't need -f. If you're recovering from a system crash or hardware failure, the pool was probably not exported beforehand, and -f forces zfs to import the pool even if it thinks it's active.
root@thumper # zpool import -d /dev/dsk -f 12162905211102209752 
If the import was successful, zpool won't print any status message. Check that the pool was imported correctly with zpool status.
root@thumper # zpool status
  pool: compilers
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        compilers   ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
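
The "try without -f first" advice can be wrapped in a small function. This is a sketch, not a tested recovery procedure: import_pool and the ZPOOL override are hypothetical names added for illustration, and only the zpool import -d and -f options shown above are used.

```shell
# Try a plain import first; fall back to -f only if that fails.
# ZPOOL is a hypothetical override so the logic can be dry-run with a stub;
# it defaults to the real zpool command.
import_pool() {
    dir=$1
    pool=$2    # pool name or numeric id
    if ${ZPOOL:-zpool} import -d "$dir" "$pool"; then
        echo "imported $pool without force"
    elif ${ZPOOL:-zpool} import -d "$dir" -f "$pool"; then
        echo "imported $pool with -f"
    else
        echo "import of $pool failed" >&2
        return 1
    fi
}
```

For the pool above that would be import_pool /dev/dsk 12162905211102209752, followed by zpool status to check the result.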

Wednesday Jun 11, 2008

Please sir, could I have more Solaris?

With the latest release of OpenSolaris you now get a snazzy LiveCD image that conveniently fits on a CD-ROM, DVD or USB drive. With that you can try out all those Solaris-only features, like the new package manager, Service Management Facility, Fault Management, DTrace, zfs, zones and containers, and the older Solaris kernel profiling tools, without wiping out your existing OS. So go ahead, get some more Solaris, and while you're at it, bench test some DTrace with this hands-on lab.

Wednesday Feb 06, 2008

Where am I?

How do you know if the host you're logged in to is a zone or a global zone?

Tuesday Feb 06, 2007

Time Management for system administrators

This year started off with a bang. I did finally start to stick to that organizer. Being a system admin, I never really have time to leave a terminal. There's this one more ticket, there's this one more server, there's just this one more email, no, over here, this really important user, over there, this important manager's manager. The good news is sysadmintaffyosis can be treated.

I found Tom Limoncelli's "Time Management for System Administrators". Other than cycling, it is the thing that has absolutely saved my sanity. Tom presented to BayLISA a while back, and Google has some of the videos.

Tom's system admin blog is also a great resource, and Ben Rockwood recently updated his techniques.

I have been able to get more work done by implementing the Cycle System. Step one is to not read email when I first get to work. After 21 days, I've trained myself to resist email until I update my PAA and (manually) sync it with my electronic calendar. (For those of you who don't have the book and suffer from missedemailphobia: if something truly important were to happen, an email would go to my phone, or someone would call me.)

Way to go Tom for this awesome resource.

Also, if any of you readers really dig the videos, please email the good people at BayLISA so they will know to continue to tape and update the videos. Without feedback from the community, the BayLISA group will just go back to playing nethack.

Monday Jan 29, 2007

Resource Management on Solaris 10 and beyond

At the last OpenSolaris Silicon Valley user group meeting, Ben Rockwood lamented how hard it is to understand resource management.

That reminded me of how I really had to read through the docs and play with the commands to get the tunings right. In this forum post I explained how to set the tunings DB2 needed at the time.

A few key concepts can help with tunings:

  • Check that you really need to tune. If you're running the latest and greatest OS build, chances are the tunings are right, and if your hardware is sufficiently large, the default tunings will be sized up and should be correct for most apps.
  • Beware of documentation that says to add a setting to /etc/system. /etc/system isn't used in Solaris 10 and beyond for managing most resources. (There are some bug fixes that can only be set in /etc/system.) Most resource tuning and tweaking can only be done through the resource management commands and interfaces: projadd(1M), projmod(1M), project(4), rctladm(1M), setrctl(2), prctl(1).
  • Always use the documentation that came with the specific OS version you're running. The tunings vary between releases; the What's New guide and the Tunable Parameters Reference Manual will have the most accurate instructions.
  • Some tunings are set on a per-project basis; others are set on a per-process basis but are inherited by the user's processes in the project.
  • When in doubt, test: run newtask -p project-name command, and check the resulting resource tunings with prctl.
    bash-3.00$ newtask -p myproject sleep 100 &
    [1] 14027
    bash-3.00$ prctl $!
    process: 14027: sleep 100
    NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT
    process.max-port-events
            privileged      65.5K       -   deny                                 -
    ...
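
When comparing tunings between two projects, it can help to flatten the prctl output to one line per limit so the results can be diffed. The function below is a sketch under the assumption that the output keeps the layout shown above (control name on its own line, value rows indented beneath it); flatten_prctl is a hypothetical helper name.

```shell
# Flatten prctl(1) output on stdin to "name privilege value action" lines.
# Assumption: each resource control name sits on its own unindented line,
# followed by indented rows of privilege, value, flag, action, recipient.
flatten_prctl() {
    awk '
        /^process:/ || $1 == "NAME" { next }   # skip the two header lines
        /^[^ \t]/ { name = $1; next }          # unindented line: control name
        NF >= 4 && name != "" { print name, $1, $2, $4 }
    '
}
```

For example, prctl $! | flatten_prctl would print a line such as process.max-port-events privileged 65.5K deny for the listing above.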
    

Wednesday Jun 14, 2006

Fixing sound on an Ultra 20

Today we're going to look at debugging a sound problem on an Ultra 20.

A friend called me up saying he couldn't get sound working on his Ultra 20. I didn't have an Ultra 20 just lying around running Solaris 10 (build 74l2a), so he gave me access to his system via vpn.

I didn't know why the sound wasn't working. Sound was rumored to work, and he said he had upgraded and reinstalled all the sound drivers. I checked dmesg; there wasn't any system issue preventing the sound from working. I asked him to run sdtaudiocontrol, a sound program I was familiar with.

sdtaudiocontrol has a nice status panel which shows the playback sampling. He said sdtaudiocontrol would show the record feature, but the playback tab was grayed out. I confirmed the same output by redisplaying the program to my desktop.

Since I had no clue as to why this was not working, I had no choice but to watch sdtaudiocontrol via truss. (Yes, I would have used dtrace if I had more experience with it.) truss is my main tool of choice. It's bulky; there's loads of data to wade through when looking at the system calls of any medium or large app. I needed to be sure the sound program could access all of the libraries, devices, and other pieces of programs it interfaced with. I also didn't know which libraries or devices sound was using, and I wanted to see any weirdness unobstructed. truss will definitely do that, although some would argue the pertinent bits get obscured by the eyesore of digging through all that data.

I have, of course, since then found nice dtrace scripts to find file access issues, and pfiles would have shown me which files are in use.

Start a truss of /usr/dt/bin/sdtaudiocontrol:

truss -f -o /var/tmp/sound.truss.out /usr/dt/bin/sdtaudiocontrol
14398:  execve("/usr/bin/ksh", 0x08047D48, 0x08047D54)  argc = 2
14398:  resolvepath("/usr/bin/ksh", "/usr/bin/ksh", 1023) = 12
14398:  sysconfig(_CONFIG_PAGESIZE)                     = 4096
...
14419/1:        pipe()                                          = 11 [12]
14419/1:        open("/dev/audioctl", O_WRONLY)                 = 13
14419/1:        ioctl(13, I_SETSIG, S_MSG)                      = 0
...

The truss showed no errors when opening /dev/audioctl, and no other identifiable errors or wackiness.

Ok, it opened the device just fine. Let's see what's at the end of that device. Interestingly, /dev/audio points to a usb device, which is different from the OpenSolaris box down the hall.

bash-3.00# ls -l audio*
lrwxrwxrwx   1 root     root          10 May  2 11:50 audio -> usb/audio0
lrwxrwxrwx   1 root     root          18 May  2 11:50 audioctl -> usb/audio-control0
My friend confirms that there is no speaker off the usb hub.
Hmm. Let's look for raw audio devices.
bash-3.00# find /devices -name \*sound\* -ls 
2163736579  1 drwxr-xr-x   2 root     sys           512 Apr  7 16:31 /devices/pci@0,0/pci108e,5347@2/hub@1/device@3/sound@2
32505860    0 crw-------   1 root     sys       62,   0 May 24 16:35 /devices/pci@0,0/pci108e,5347@2/hub@1/device@3/sound@2:usb_as
2163474435    1 drwxr-xr-x   2 root     sys           512 Apr  7 16:31 /devices/pci@0,0/pci108e,5347@2/hub@1/device@3/sound-control@1
31981574    0 crw-------   1 user staff     61,   1 Apr  7 11:11 /devices/pci@0,0/pci108e,5347@2/hub@1/device@3/sound-control@1:sound,audioctl
31981572    0 crw-------   1 user staff     61,   0 Apr  7 11:11 /devices/pci@0,0/pci108e,5347@2/hub@1/device@3/sound-control@1:sound,audio
31981586    0 crw-------   1 root     sys       61,   7 May 24 16:35 /devices/pci@0,0/pci108e,5347@2/hub@1/device@3/sound-control@1:mux
19398662    0 crw-------   1 user staff     37,   1 May  2 09:44 /devices/pci@0,0/pci108e,5347@4:sound,audioctl
19398660    0 crw-------   1 user staff     37,   0 May  2 09:44 /devices/pci@0,0/pci108e,5347@4:sound,audio
The raw audio devices are definitely pointing to the usb hub. My friend says he doesn't have any speakers on the usb hub. He does have a webcam though. I ask him to remove the webcam. Maybe we can get devfsadm to find the correct devices.
bash-3.00# devfsadm -v
bash-3.00# 
Nope. No changes made by devfsadm. Ok, we can force devfsadm to make a move by moving those links out of the way (still working in /dev).
bash-3.00# mv audio audio.old 
bash-3.00# mv audioctl autioctl.old 
See if devfsadm picks up the right device
bash-3.00# devfsadm -v  
devfsadm[14694]: verbose: symlink /dev/audio -> usb/audio0
devfsadm[14694]: verbose: symlink /dev/audioctl -> usb/audio-control0
Nope, devfsadm put back exactly what we removed. So we'll remove it again.
bash-3.00# rm audio audioctl 
Ok, I'll change the links to point to the devices on the PCI hub like the OpenSolaris system down the hall.
bash-3.00# ln -s ../../devices/pci@0,0/pci108e,5347@4:sound,audio audio 
bash-3.00# ln -s ../../devices/pci@0,0/pci108e,5347@4:sound,audioctl audioctl
Check that devfsadm won't undo our work (-s suppresses changes to /dev; with -v it only reports what it would do):
bash-3.00# devfsadm -vs 
And as luck would have it, it left our links alone.
Check links:
bash-3.00# ls -l *au*
lrwxrwxrwx   1 root     root          48 May 24 16:40 audio -> ../../devices/pci@0,0/pci108e,5347@4:sound,audio
lrwxrwxrwx   1 root     root          10 May  2 11:50 audio.old -> usb/audio0
lrwxrwxrwx   1 root     root          51 May 24 16:40 audioctl -> ../../devices/pci@0,0/pci108e,5347@4:sound,audioctl
lrwxrwxrwx   1 root     root          18 May  2 11:50 autioctl.old -> usb/audio-control0
The driver loaded in the kernel was still using the old devices, so the new links did not take effect right away. After a reboot the system picked up the new device files, and my co-worker had sound. Success!

It's worth mentioning that mixer(7I) and usb_ac(7D) have further details on how to tweak usb audio. Also, /kernel/drv/usb_ac.conf will allow usb audio to be enabled or disabled.
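
The first useful clue in this hunt was where the /dev/audio link pointed. A quick check along those lines could look like the sketch below. audio_link_kind is a hypothetical helper name, and it assumes USB audio links have targets of the form usb/audio0, as in the listing above.

```shell
# Report whether a /dev audio link points at a USB device node.
# Assumption: USB audio links have targets like "usb/audio0".
# Parses `ls -l` rather than relying on a readlink command being installed.
audio_link_kind() {
    target=$(ls -l "$1" | sed 's/.* -> //')
    case "$target" in
        usb/*) echo "$1 -> $target (usb)" ;;
        *)     echo "$1 -> $target (not usb)" ;;
    esac
}
```

Running audio_link_kind /dev/audio and audio_link_kind /dev/audioctl before and after relinking shows at a glance whether the links still resolve to the usb devices.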

Dynamic Reconfiguration on a v1280

A while back I ran across a v1280 that threw some cpu errors and needed to have its system board replaced before it caused an unplanned outage. The server was running Solaris 10. We used dynamic reconfiguration to remove the system board while the server was running, to minimize downtime.

Today we'll look at dynamic reconfiguration on a v1280. We'll refer to dynamic reconfiguration as DR from here on. The same process will work for 38/48 and 6800s. The commands are firmware dependent, not OS specific. (Check the admin guide in the firmware docs for more details on DR, and use the most current versions of cfgadm, the core kernel, and fault management if applicable.) This works best when you are logged in to the System Controller and also have a shell on the system.
I don't have any x900 series to try the commands on, but the same caveats apply: the commands will work better with the latest software, and the newer the software, the more automatic and better integrated the DR features will be with SMF and FM. There is also a developer blog summary on cfgadm titled "A Little Bit About cfgadm(1M)".

  1. Identify the system board you need to replace. Send the applicable logs (dmesg, /var/adm/messages, showlogs from the SC, fmdump, fmadm faulty -a) to support. Continue once support has confirmed that the errors you sent in are valid and severe enough to have the board replaced.
  2. Check if the system board in question is holding the main memory and kernel.
    cfgadm -av | grep permanent
    N0.SB0::memory                 connected    configured   ok         base address 0x0, 8388608 KBytes total, 586400 KBytes permanent
    
    If it's not, you can DR with no impact to the running OS. If it is, you have to wait for the firmware to copy the kernel to the other board.
  3. Unconfigure the board so it is no longer usable by the system. (When I did it on this board, the copy took about three minutes to complete.)
    # cfgadm -c unconfigure N0.SB0
    System may be temporarily suspended, proceed (yes/no)? yes
    
  4. Check that system board SB0 shows up as unconfigured.
    # cfgadm -av
    Ap_Id                          Receptacle   Occupant     Condition  Information
    When         Type         Busy     Phys_Id
    N0.IB6                         connected    configured   ok powered-on, assigned
    Jun  3 09:30 PCI_I/O_Boa  n        /devices/ssm@0,0:N0.IB6
    N0.IB6::pci0                   connected    configured   ok         device
    /ssm@0,0/pci@19,700000
    Jun  3 09:30 io           n        /devices/ssm@0,0:N0.IB6::pci0
    N0.IB6::pci1                   connected    configured   ok         device
    /ssm@0,0/pci@19,600000, referenced
    Jun  3 09:30 io           n        /devices/ssm@0,0:N0.IB6::pci1
    N0.IB6::pci2                   connected    configured   ok         device
    /ssm@0,0/pci@18,700000, referenced
    Jun  3 09:30 io           n        /devices/ssm@0,0:N0.IB6::pci2
    N0.IB6::pci3                   connected    configured   ok         device
    /ssm@0,0/pci@18,600000, referenced
    Jun  3 09:30 io           n        /devices/ssm@0,0:N0.IB6::pci3
    N0.SB0                         connected    unconfigured ok
    powered-on, assigned
    Jun  3 10:16 CPU          n        /devices/ssm@0,0:N0.SB0
    N0.SB0::cpu0                   connected    unconfigured ok         cpuid
    0, speed 900 MHz, ecache 8 MBytes
    Jun  3 10:16 cpu          n        /devices/ssm@0,0:N0.SB0::cpu0
    N0.SB0::cpu1                   connected    unconfigured ok         cpuid
    1, speed 900 MHz, ecache 8 MBytes
    Jun  3 10:16 cpu          n        /devices/ssm@0,0:N0.SB0::cpu1
    N0.SB0::cpu2                   connected    unconfigured ok         cpuid
    2, speed 900 MHz, ecache 8 MBytes
    Jun  3 10:16 cpu          n        /devices/ssm@0,0:N0.SB0::cpu2
    N0.SB0::cpu3                   connected    unconfigured ok         cpuid
    3, speed 900 MHz, ecache 8 MBytes
    Jun  3 10:16 cpu          n        /devices/ssm@0,0:N0.SB0::cpu3
    N0.SB0::memory                 connected    unconfigured ok         base
    address 0x2000000000, 8388608 KBytes total
    Jun  3 10:16 memory       n        /devices/ssm@0,0:N0.SB0::memory
    
    N0.SB2                         connected    configured   ok
    powered-on, assigned
    Jun  3 09:30 CPU          n        /devices/ssm@0,0:N0.SB2
    N0.SB2::cpu0                   connected    configured   ok         cpuid
    8, speed 900 MHz, ecache 8 MBytes
    Jun  3 09:30 cpu          n        /devices/ssm@0,0:N0.SB2::cpu0
    N0.SB2::cpu1                   connected    configured   ok         cpuid
    9, speed 900 MHz, ecache 8 MBytes
    Jun  3 09:30 cpu          n        /devices/ssm@0,0:N0.SB2::cpu1
    N0.SB2::cpu2                   connected    configured   ok         cpuid
    10, speed 900 MHz, ecache 8 MBytes
    Jun  3 09:30 cpu          n        /devices/ssm@0,0:N0.SB2::cpu2
    N0.SB2::cpu3                   connected    configured   ok         cpuid
    11, speed 900 MHz, ecache 8 MBytes
    Jun  3 09:30 cpu          n        /devices/ssm@0,0:N0.SB2::cpu3
    N0.SB2::memory                 connected    configured   ok         base
    address 0x0, 8388608 KBytes total, 586400 KBytes permanent
    Jun  3 10:16 memory       n        /devices/ssm@0,0:N0.SB2::memory
    N0.SB4                         empty        unconfigured unknown    assigned
    Jun  3 09:30 unknown      n        /devices/ssm@0,0:N0.SB4
    c0                             connected    configured   unknown
    unavailable  scsi-bus     n        /devices/ssm@0,0/pci@18,700000/ide@3:scsi
    c0::dsk/c0t0d0                 connected    configured   unknown    TOSHIBA
    DVD-ROM SD-C2612
    unavailable  CD-ROM       n
    /devices/ssm@0,0/pci@18,700000/ide@3:scsi::dsk/c0t0d0
    c1                             connected    configured   unknown
    unavailable  scsi-bus     n        /devices/ssm@0,0/pci@18,600000/scsi@2:scsi
    c1::dsk/c1t0d0                 connected    configured   unknown    FUJITSU
    MAP3735N SUN72G
    unavailable  disk         n
    /devices/ssm@0,0/pci@18,600000/scsi@2:scsi::dsk/c1t0d0
    c1::dsk/c1t1d0                 connected    configured   unknown    FUJITSU
    MAG3182L SUN18G
    unavailable  disk         n
    /devices/ssm@0,0/pci@18,600000/scsi@2:scsi::dsk/c1t1d0
    c2                             connected    unconfigured unknown
    unavailable  scsi-bus     n        /devices/ssm@0,0/pci@18,600000/scsi@2,1:scsi
    c3                             connected    unconfigured unknown
    unavailable  fc           n
    /devices/ssm@0,0/pci@19,700000/SUNW,qlc@2/fp@0,0:fc
    
  5. Then run cfgadm with the disconnect option to power off the board. (This takes less than a minute to complete.)
    # cfgadm -c disconnect N0.SB0
    
  6. Next check the system board's status. This time it says disconnected and unconfigured. Also, if you run showboards from the SC prompt it will display Off, although I didn't do that step.
    # cfgadm -av
    Ap_Id                          Receptacle   Occupant     Condition  Information
    When         Type         Busy     Phys_Id
    ...
    
    N0.SB0                         disconnected unconfigured unknown    assigned
    Jun  3 10:19 CPU          n        /devices/ssm@0,0:N0.SB0
    
    N0.SB2                         connected    configured   ok
    powered-on, assigned
    Jun  3 09:30 CPU          n        /devices/ssm@0,0:N0.SB2
    N0.SB2::cpu0                   connected    configured   ok         cpuid
    8, speed 900 MHz, ecache 8 MBytes
    Jun  3 09:30 cpu          n        /devices/ssm@0,0:N0.SB2::cpu0
    N0.SB2::cpu1                   connected    configured   ok         cpuid
    9, speed 900 MHz, ecache 8 MBytes
    Jun  3 09:30 cpu          n        /devices/ssm@0,0:N0.SB2::cpu1
    N0.SB2::cpu2                   connected    configured   ok         cpuid
    10, speed 900 MHz, ecache 8 MBytes
    Jun  3 09:30 cpu          n        /devices/ssm@0,0:N0.SB2::cpu2
    N0.SB2::cpu3                   connected    configured   ok         cpuid
    11, speed 900 MHz, ecache 8 MBytes
    Jun  3 09:30 cpu          n        /devices/ssm@0,0:N0.SB2::cpu3
    N0.SB2::memory                 connected    configured   ok         base
    address 0x0, 8388608 KBytes total, 586400 KBytes permanent
    Jun  3 10:16 memory       n        /devices/ssm@0,0:N0.SB2::memory
    ...
    
    Now it's ready for replacement. No need to bring the system down. At the tail end of the unconfigure command, this message appeared on the system controller due to the reconfiguration.
    lom>{/N0/SB0/P0}     test case reset reason = 00000001.0404ff05
    {/N0/SB0/P0}     test case ecache_size=00000000.00800000,
    tag_size=00000000.00004000
    {/N0/SB0/P0}     test case Ecache Mode: 4:4:4
    {/N0/SB0/P0}     test case E$ control register = 00000000.07a34c00
    {/N0/SB0/P0} Controller PCI Config Space Test for aid 0x18
    {/N0/SB0/P0} Subtest: IDE Controller Bus Probe for aid 0x18
    {/N0/SB0/P0}    Removable ATAPI device, TOSHIBA DVD-ROM SD-C2612
    
    {/N0/SB0/P0} Subtest: SCSI Controller PCI Config Space Test for aid 0x18
    {/N0/SB0/P0} Subtest: SCSI Controller Register Test for aid 0x18
    {/N0/SB0/P0} Subtest: SCSI Controller SCRIPTS RAM Test for aid 0x18
    {/N0/SB0/P0} Subtest: SCSI Controller SCSI Timers Test for aid 0x18
    {/N0/SB0/P0} Subtest: SCSI Controller DMA Test for aid 0x18
    {/N0/SB0/P0} Subtest: PCI IO Controller Register Initialization for aid 0x19
    {/N0/SB0/P0} Subtest: PCI IO Controller IOMMU  TLB Compare Tests for aid 0x19
    {/N0/SB0/P0} Subtest: PCI IO Controller IOMMU TLB Flush Tests for aid 0x19
    {/N0/SB0/P0} Subtest: PCI IO Controller DMA loopback Tests for aid 0x19
    {/N0/SB0/P0} Subtest: PC    test case IoSram Add : 0000041c.00900000
    {/N0/SB0/P0}     test case Estate = 00000000.0000000b
    {/N0/SB0/P0}     test case Ecache control = 00000000.07a34c00
    {/N0/SB0/P0}     test case CPU features = 0000224f.004204ff
    {/N0/SB0/P0}     test case After setting CPU features, DCU = 0000ee00.0000000f
    {/N0/SB0/P0}     test case DCR = 00000000.0000103f
    {/N0/SB0/P1}     test case reset reason = 00000000.0404ff05
    {/N0/SB0/P1}     test case ecache_size=00000000.00800000,
    tag_size=00000000.00004000
    {/N0/SB0/P1}     test case Ecache Mode: 4:4:4
    {/N0/SB0/P1}     test case E$ control register = 00000000.07a34c00
    {/N0/SB0/P1} @(#) lpost         5.18.1      test case IoSram Add :
    0000041c.00900000
    {/N0/SB0/P1}     test case Estate = 00000000.0000000b
    {/N0/SB0/P1}     test case Ecache control = 00000000.07a34c00
    {/N0/SB0/P1}     test case CPU features = 0000224f.004204ff
    {/N0/SB0/P1}     test case After setting CPU features, DCU = 0000ee00.0000000f
    {/N0/SB0/P1}     test case DCR = 00000000.0000103f
    {/N0/SB0/P2} @(#) lpost         5.18.1  {/N0/SB0/P3} @(#) lpost
    5.18.1
    
  7. After the board is replaced verify that the new board is running the same firmware. From the SC run
    lom>showboards -p prom
    
    Component   Compatible Version
    ---------   ---------- -------
    SSC1        Reference  5.18.1 Build_01
    /N0/IB6     Yes        5.18.1 Build_01
    /N0/SB0     Yes        5.18.1 Build_01
    /N0/SB2     Yes        5.18.1 Build_01
    
    If they aren't all running the same version, you need to upgrade the firmware. (Use flashupdate from the LOM prompt; see the docs.)
  8. If they are the same, then from the OS run cfgadm with the configure option to bring the board into use.
    # cfgadm -c configure N0.SB0
    At the LOM prompt you will see a bunch of messages as the board runs through POST, before it's finally brought back online to be usable by Solaris.
    lom>{/N0/SB0/P0} Running CPU POR and Set Clocks
    {/N0/SB0/P2} Running CPU POR and Set Clocks
    {/N0/SB0/P1} Running CPU POR and Set Clocks
    {/N0/SB0/P3} Running CPU POR and Set Clocks
    {/N0/SB0/P0} @(#) lpost         5.18.1  2004/12/09 12:32
    {/N0/SB0/P2} @(#) lpost         5.18.1  2004/12/09 12:32
    {/N0/SB0/P1} @(#) lpost         5.18.1  2004/12/09 12:32
    {/N0/SB0/P3} @(#) lpost         5.18.1  2004/12/09 12:32
    {/N0/SB0/P0} Copyright 2001-2004 Sun Microsystems, Inc.  All rights reserved.
    {/N0/SB0/P1} Copyright 2001-2004 Sun Microsystems, Inc.  All rights reserved.
    {/N0/SB0/P2} Copyright 2001-2004 Sun Microsystems, Inc.  All rights reserved.
    {/N0/SB0/P3} Copyright 2001-2004 Sun Microsystems, Inc.  All rights reserved.
    {/N0/SB0/P0} Use is subject to license terms.
    {/N0/SB0/P2} Use is subject to license terms.
    {/N0/SB0/P1} Use is subject to license terms.
    ...
    
    Also there will be messages from the console ...
    Jun  3 10:16:48 j1hol-1280 genunix: [ID 408114 kern.info]
    /ssm@0,0/memory-controller@0,400000 (mc-us30) offline
    ...
    Jun  3 12:25:26 j1hol-1280 lw8: [ID 477720 kern.notice] SB0, hotplug
    status, SB0, module removed (9,16)
    Jun  3 12:58:29 j1hol-1280 lw8: [ID 328834 kern.notice] /N0/SB0, hotplug
    status, SB0, module inserted (9,17)
    Jun  3 13:00:07 j1hol-1280 sgsbbc: [ID 402060 kern.notice] NOTICE: Timed
    out waiting for SC response
    Jun  3 13:00:08 j1hol-1280 last message repeated 2 times
    Jun  3 13:00:08 j1hol-1280 picld[107]: [ID 653604 daemon.error] sgfru ioctl
    0xf handle 0xe failed: Connection timed out
    Jun  3 13:00:37 j1hol-1280 sgsbbc: [ID 402060 kern.notice] NOTICE: Timed
    out waiting for SC response
    Jun  3 13:02:03 j1hol-1280 last message repeated 1 time
    Jun  3 13:04:03 j1hol-1280 sgsbbc: [ID 402060 kern.notice] NOTICE: Timed
    out waiting for SC response
    Jun  3 13:05:18 j1hol-1280 unix: [ID 950921 kern.info] cpu0:
    UltraSPARC-III+ (portid 0 impl 0x15 ver 0x23 clock 900 MHz)
    ....
    Jun  3 13:05:37 j1hol-1280 sbdp: [ID 713682 kern.info] cpu3 initialization
    complete - restarted
    Jun  3 13:05:42 j1hol-1280 unix: [ID 700753 kern.info]
    kphysm_add_memory_dynamic: adding 8388608K at 0x2000000000
    Jun  3 13:05:46 j1hol-1280 unix: [ID 323408 kern.info]
    kphysm_add_memory_dynamic: mem = 16777216K (0x400000000)
    Jun  3 13:05:46 j1hol-1280 unix: [ID 401001 kern.info]
    kphysm_add_memory_dynamic: avail mem = 16237010944
    Jun  3 13:05:46 j1hol-1280 ssm: [ID 349649 kern.info] memory-controller0 at
    ssm0: Node 0 Safari id 0 0x400000 ...
    Jun  3 13:05:46 j1hol-1280 genunix: [ID 936769 kern.info] mc-us30 is
    /ssm@0,0/memory-controller@0,400000
    Jun  3 13:05:46 j1hol-1280 genunix: [ID 408114 kern.info]
    /ssm@0,0/memory-controller@0,400000 (mc-us30) online
    
And it's done. The domain did experience another pause when it copied the kernel resident memory back to system board 0.
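
The check in step 2 is easy to get wrong by eye, so here is a small sketch that names the board holding the permanent (kernel) memory. permanent_board is a hypothetical helper name; it reads cfgadm -av output on stdin and assumes the memory attachment points are named like N0.SB0::memory, as in the listings above.

```shell
# Print the system board whose memory holds the kernel ("permanent") pages.
# Reads `cfgadm -av` output on stdin. Works whether the Information column
# is on the same line as the Ap_Id or wrapped onto the next line.
permanent_board() {
    awk '
        $1 ~ /::memory$/ { ap = $1 }             # remember the last memory Ap_Id
        /permanent/ && ap != "" {
            sub(/::memory$/, "", ap)             # N0.SB2::memory -> N0.SB2
            print ap
            ap = ""
        }
    '
}
```

Running cfgadm -av | permanent_board before the unconfigure should print the board you are about to pull if a kernel copy is going to be needed, and a different board (or nothing) otherwise.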
