Sunday January 14, 2007
Dualboot into WinXP is hosed

So I can boot back into Solaris just fine, but if I select my Windows partition it will not boot. I get:

A disk read error occurred
Press Ctrl+Alt+Del to restart

Now I'm actually okay with this for right now. Normally I'd be a bit put off, since you are supposed to install WinXP first and then your other OS. But with the trick of booting into single user mode from the DVD and then using the installgrub tip we learned from Derek (Solaris 11 GRUB), I'm willing to try to fix the WinXP partition and then recover. If that fails, it just means we need to restart the install process.

The two paths are:

  1. Reinstall WinXP into the partition.
  2. Try to recover the WinXP partition.

Well, in my mind, even if the recovery fails, we are back to the first path. And before we then retry to install the Solaris partition, we can try to fix the MBR.

The evil thought in the back of my mind is that the WinXP registration process has nuked my install to teach me to not pirate software.

Okay, I did some reading: it could be a bad cable or a hard drive that is too large. Yes, in spite of Solaris booting okay. I'm going to add the ATA/133 card to the system. I'm unhappy with the DVD being on the same path as the ATA drive anyway.

We can see what a mess it is back there:

Not shown

The ribbon could be twisted too much for WinXP. Also, it tends to end up back in the cooler fan.

I take this time to add a Soundblaster card:

Not shown

I take the ATA drive off of the cable and I am neatly able to tuck it up in the unused space above the DVD:

Not shown

With the controller added, both PCI slots are now in use:

Not shown

And we can see the ribbon cables on the disks:

Not shown

At this point I am expecting two problems:

  1. The BIOS won't find the disk to boot from, and
  2. The Solaris partition will fail to boot because we've effectively changed the disk's device name.

And I am right on both counts. The first is easy to solve, thanks to the capabilities of the BIOS. The second I'll deal with when I fix the MBR.

But this did not fix the root problem, so I'm about to try and repair the WinXP partition. fixmbr worked, but chkdsk -r refused to do anything. I'll reboot to see if the MBR was fixed enough, but if not, time for a fresh install. Hmm, it booted into grub.

Okay, I'll add a new entry once I get WinXP reinstalled and I try to fix the Solaris booting.


Originally posted on Kool Aid Served Daily
Copyright (C) 2007, Kool Aid Served Daily
Creating a zfs pool, adding some accounts, etc

Okay, the system booted after being off all night. Yes, this is a concern for me because of the labeling problems. Last year this step failed.

We want to create a large pool, so we need to find out what is available to us:

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c1d0 <DEFAULT cyl 9565 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@c/ide@0/cmdk@0,0
       1. c2d0 <DEFAULT cyl 30398 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@d/ide@0/cmdk@0,0
       2. c3d0 <DEFAULT cyl 30398 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@d/ide@1/cmdk@0,0
       3. c4d0 <DEFAULT cyl 30398 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@d,1/ide@0/cmdk@0,0
       4. c5d0 <DEFAULT cyl 30398 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@d,1/ide@1/cmdk@0,0
Specify disk (enter its number): ^D

And now we can create a pool (with raidz) for playing with. Note that I've given the pool the entire disks and I don't have a spare. This isn't a production system. I'm also not worried about silent data loss. These are all things I would challenge in a setting where I cared about my data. But, if you think about it, most home desktops have been running this way for years.

# zpool create zoo raidz c2d0 c3d0 c4d0 c5d0
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
zoo                     928G    147K    928G     0%  ONLINE     -
# zfs create zoo/isos
# zfs create zoo/home
# zfs set mountpoint=/export/zfs zoo/home
# zfs set sharenfs=on zoo/home
# zfs set compression=on zoo/home
# zfs create zoo/home/nfsv2
# zfs create zoo/home/nfsv3
# zfs create zoo/home/nfsv4
# zfs create zoo/home/tdh
# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
zoo              376K   683G  38.2K  /zoo
zoo/home         189K   683G  42.6K  /export/zfs
zoo/home/nfsv2  36.7K   683G  36.7K  /export/zfs/nfsv2
zoo/home/nfsv3  36.7K   683G  36.7K  /export/zfs/nfsv3
zoo/home/nfsv4  36.7K   683G  36.7K  /export/zfs/nfsv4
zoo/home/tdh    36.7K   683G  36.7K  /export/zfs/tdh
zoo/isos        36.7K   683G  36.7K  /zoo/isos
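
The gap between the 928G from zpool list and the 683G from zfs list looks like raidz parity at work: as far as I understand it, zpool list counts the raw space on all four disks, while zfs list shows what is left once one disk's worth of parity and some overhead come off the top. Here is a rough sanity check using the geometry that format printed. The function name and the 512-byte sector size are my assumptions, and this is only a back-of-the-envelope sketch:

```shell
#!/bin/sh
# Rough raidz (single parity) capacity estimate from disk geometry.
# A sketch, not a ZFS tool: assumes 512-byte sectors and ignores
# labels, metadata, and any reserved space.
raidz_estimate() {
    cyl=$1 heads=$2 sectors=$3 disks=$4
    disk_bytes=$((cyl * heads * sectors * 512))
    raw_gib=$((disk_bytes * disks / 1024 / 1024 / 1024))
    usable_gib=$((disk_bytes * (disks - 1) / 1024 / 1024 / 1024))
    echo "raw=${raw_gib}G usable=${usable_gib}G"
}

# Geometry of c2d0..c5d0 from format: 30398 cyl, 255 heads, 63 sectors
raidz_estimate 30398 255 63 4
# prints: raw=931G usable=698G
```

Both numbers land within a few gigabytes of what the pool reports, so the difference is presumably bookkeeping rather than a missing disk.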

I didn't show it explicitly, but the NFS server was not yet enabled. You can do it with svcadm(1M) or count on the fact that either issuing a share(1M) command or setting the sharenfs property on a ZFS filesystem will cause the service and server to be started. We can check this on the server:

# share
-@zoo/home      /export/zfs   rw   ""
-@zoo/home      /export/zfs/nfsv2   rw   ""
-@zoo/home      /export/zfs/nfsv3   rw   ""
-@zoo/home      /export/zfs/nfsv4   rw   ""
-@zoo/home      /export/zfs/tdh   rw   ""
# zfs set sharenfs=on zoo/isos
# share
-@zoo/isos      /zoo/isos   rw   ""
-@zoo/home      /export/zfs   rw   ""
-@zoo/home      /export/zfs/nfsv2   rw   ""
-@zoo/home      /export/zfs/nfsv3   rw   ""
-@zoo/home      /export/zfs/nfsv4   rw   ""
-@zoo/home      /export/zfs/tdh   rw   ""

And we can see that NFS gets automatically enabled by checking from a client:

[tdh@adept ~/tmp]> showmount -e kanigix
Export list for kanigix:
/export/zfs       (everyone)
/export/zfs/nfsv2 (everyone)
/export/zfs/nfsv3 (everyone)
/export/zfs/nfsv4 (everyone)
/export/zfs/tdh   (everyone)
/zoo/isos         (everyone)
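
If you want to check the two views against each other without eyeballing them, something like the following would do. It only parses saved output from share and showmount -e; the function and file names are made up for this sketch:

```shell
#!/bin/sh
# Compare the server's share list with what a client sees via showmount.
# Both functions read saved command output on stdin; nothing here talks
# to NFS directly.

# The second column of `share` output is the shared path.
share_paths() {
    awk 'NF >= 2 { print $2 }' | sort
}

# `showmount -e` prints one header line, then "path (clients)" per export.
showmount_paths() {
    awk 'NR > 1 && NF >= 1 { print $1 }' | sort
}

# Usage sketch (file names are placeholders):
#   share > server.txt                   # on kanigix
#   showmount -e kanigix > client.txt    # on adept
#   [ "$(share_paths < server.txt)" = "$(showmount_paths < client.txt)" ] \
#       && echo "exports match"
```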

We can create some user accounts. First I add the following to /etc/group:

users:x:100:

And then the following users are created:

# useradd -m -u 1094 -g 100 -c "Mr. NFSv2" -d /export/zfs/nfsv2 nfsv2
# useradd -m -u 1813 -g 100 -c "Mr. NFSv3" -d /export/zfs/nfsv3 nfsv3
# useradd -m -u 3530 -g 100 -c "Mr. NFSv4" -d /export/zfs/nfsv4 nfsv4
# useradd -m -u 1066 -g 10 -c "Tom Haynes" -d /export/zfs/tdh tdh

Note I could connect to my NIS server to get this stuff, but I prefer some local accounts.

I forgot to set up my account so that I get tcsh as a shell. The command should have been:

useradd -m -u 1066 -g 10 -c "Tom Haynes" -s /bin/tcsh -d /export/zfs/tdh tdh

No biggie, I can edit that in /etc/passwd directly.
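
For the record, the login shell is the seventh colon-separated field of the passwd entry. The sketch below shows the edit on a made-up line shaped like my account; it only filters text and never touches /etc/passwd. The function name is hypothetical, and the starting shell of /bin/sh is an assumption. I believe usermod -s would do the same job on the live system.

```shell
#!/bin/sh
# Rewrite the login shell (7th colon-separated field) for one user in
# passwd-format text read from stdin. Illustration only: this filters
# text and does not modify /etc/passwd.
set_shell_field() {
    user=$1 shell=$2
    awk -F: -v OFS=: -v u="$user" -v s="$shell" '$1 == u { $7 = s } { print }'
}

# Example with a made-up entry shaped like the tdh account:
echo 'tdh:x:1066:10:Tom Haynes:/export/zfs/tdh:/bin/sh' |
    set_shell_field tdh /bin/tcsh
# prints: tdh:x:1066:10:Tom Haynes:/export/zfs/tdh:/bin/tcsh
```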

I use gid 10, staff, to grant sudo without having to provide a password. I then use gid 100, users, for accounts that must provide one. It lets me know when I'm in the wrong role. I've never learned the RBAC stuff.
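
I haven't shown my sudoers file, so take this as a guess at what such a two-group scheme looks like rather than a copy of mine. The sketch just prints a fragment; the rules themselves are the assumption, and a real change should go through visudo:

```shell
#!/bin/sh
# Print a sudoers fragment matching the two-group scheme described above.
# The exact rules are an assumption; the real sudoers file is not shown.
sudoers_fragment() {
    cat <<'EOF'
%staff ALL=(ALL) NOPASSWD: ALL
%users ALL=(ALL) ALL
EOF
}

sudoers_fragment
```

Members of staff would run sudo without a prompt, while members of users get asked for a password each time.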

Let's get my environment over there:

[tdh@adept ~]> scp .tcshrc kanigix:/export/zfs/tdh
Password:

Whoops, it won't take a blank password. Need to set one up:

# passwd tdh
New Password:
Re-enter new Password:
passwd: password successfully changed for tdh

And back on the other box:

[tdh@adept ~]> scp .tcshrc kanigix:/export/zfs/tdh
Password:
scp: /export/zfs/tdh/.tcshrc: Permission denied

What is up with that? Even if the uids are different on the two boxes, it shouldn't matter. ssh uses the string names. We need to look at the permissions on the server:

# ls -la /export/zfs
total 22
drwxr-xr-x   6 root     sys            6 Jan 14 14:11 .
drwxr-xr-x   4 root     sys          512 Jan 14 14:10 ..
drwxr-xr-x   2 root     sys            2 Jan 14 14:10 nfsv2
drwxr-xr-x   2 root     sys            2 Jan 14 14:10 nfsv3
drwxr-xr-x   2 root     sys            2 Jan 14 14:11 nfsv4
drwxr-xr-x   2 root     sys            2 Jan 14 14:11 tdh
# chown -R nfsv2:users /export/zfs/nfsv2
# chown -R nfsv3:users /export/zfs/nfsv3
# chown -R nfsv4:users /export/zfs/nfsv4
# chown -R tdh:staff /export/zfs/tdh
# ls -la /export/zfs
total 22
drwxr-xr-x   6 root     sys            6 Jan 14 14:11 .
drwxr-xr-x   4 root     sys          512 Jan 14 14:10 ..
drwxr-xr-x   2 nfsv2    users          2 Jan 14 14:10 nfsv2
drwxr-xr-x   2 nfsv3    users          2 Jan 14 14:10 nfsv3
drwxr-xr-x   2 nfsv4    users          2 Jan 14 14:11 nfsv4
drwxr-xr-x   2 tdh      staff          2 Jan 14 14:11 tdh
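
Since the mistake was forgetting the chown after creating the filesystem, the three steps are worth bundling. This is a sketch under my own naming: add_zfs_user, run, and the RUN switch are all made up here, and by default it only prints what it would do.

```shell
#!/bin/sh
# Create a ZFS-backed home directory and a matching account, including
# the chown step that was missed above. Prints the commands by default;
# set RUN=1 to execute them (as root, on a box with zfs and useradd).
# Note: the dry-run output is for reading, it is not re-quoted for pasting.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

add_zfs_user() {
    name=$1 uid=$2 gid=$3 comment=$4
    home=/export/zfs/$name
    run zfs create "zoo/home/$name"
    run useradd -m -u "$uid" -g "$gid" -c "$comment" -d "$home" "$name"
    run chown -R "$name:$gid" "$home"
}

add_zfs_user nfsv2 1094 100 "Mr. NFSv2"
```

The chown uses the numeric gid so it does not depend on the group name being in /etc/group yet.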

And now:

[tdh@adept ~]> scp .tcshrc kanigix:/export/zfs/tdh
Password:
.tcshrc                                                                                            100% 5417     5.3KB/s   00:00

Can we get there?

[tdh@adept ~]> ssh kanigix
Password:
Last login: Sun Jan 14 14:24:24 2007 from adept.internal.
Sun Microsystems Inc.   SunOS 5.11      snv_55  October 2007
[tdh@kanigix ~]> ls -la
total 16
drwxr-xr-x   2 tdh      staff          3 Jan 14 14:24 .
drwxr-xr-x   6 root     sys            6 Jan 14 14:11 ..
-rw-------   1 tdh      staff       5417 Jan 14 14:24 .tcshrc

What would have happened if we hadn't fixed the permissions?

# zfs create zoo/home/monster
# useradd -m -u 2025 -g 100 -c "The Monster" -s /bin/tcsh -d /export/zfs/monster monster
# ls -la /export/zfs/monster
total 8
drwxr-xr-x   2 root     sys            2 Jan 14 14:25 .
drwxr-xr-x   7 root     sys            7 Jan 14 14:25 ..
# passwd monster
New Password:
Re-enter new Password:
passwd: password successfully changed for monster

And from the client:

[tdh@adept ~]> ssh moster@kanigix
Password:
Password:
Password:

[tdh@adept ~]> ssh monster@kanigix
Password:
Last login: Sun Jan 14 14:27:11 2007 from adept.internal.
Sun Microsystems Inc.   SunOS 5.11      snv_55  October 2007
> touch foo
touch: foo cannot create

Notice there is no indication that moster is not a valid account.

I was expecting that perhaps we wouldn't be able to login to that directory. But the permissions allowed us in. If we play with them a bit:

# chmod go-rwx /export/zfs/monster
# ls -la /export/zfs/monster
total 8
drwx------   2 root     sys            2 Jan 14 14:25 .
drwxr-xr-x   7 root     sys            7 Jan 14 14:25 ..

We end up getting bounced:

[tdh@adept ~]> ssh monster@kanigix
Password:
Last login: Sun Jan 14 14:27:11 2007 from adept.internal.
Could not chdir to home directory /export/zfs/monster: Permission denied
Sun Microsystems Inc.   SunOS 5.11      snv_55  October 2007
> pwd
/
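
A quick way to catch this kind of thing is to scan a directory listing for home directories that are not owned by the user they are named after. The sketch below reads ls -la output and assumes that naming convention holds, as it does under /export/zfs; the function name is made up.

```shell
#!/bin/sh
# Flag home directories whose owner does not match the directory name.
# Reads `ls -la` output on stdin; assumes each home directory is named
# after its user, as under /export/zfs above.
misowned_homes() {
    awk '$1 ~ /^d/ && $NF != "." && $NF != ".." && $3 != $NF {
        print $NF " is owned by " $3
    }'
}

# Usage: ls -la /export/zfs | misowned_homes
```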

Back to the zfs stuff. Time to reboot and see if I still have the pool. Note, with any other set of disks, I wouldn't even question this part. But after my experiences with them, I'm a doubter.

And no problems. After the reboot:

# df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c1d0s0         63G   5.2G    57G     9%    /
/devices                 0K     0K     0K     0%    /devices
/dev                     0K     0K     0K     0%    /dev
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   4.3G   812K   4.3G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
/usr/lib/libc/libc_hwcap2.so.1
                        63G   5.2G    57G     9%    /lib/libc.so.1
fd                       0K     0K     0K     0%    /dev/fd
swap                   4.3G    40K   4.3G     1%    /tmp
swap                   4.3G    40K   4.3G     1%    /var/run
/dev/dsk/c1d0s7        6.7G   6.8M   6.6G     1%    /export/home
zoo/home               683G    44K   683G     1%    /export/zfs
zoo/home/monster       683G    36K   683G     1%    /export/zfs/monster
zoo/home/nfsv2         683G    36K   683G     1%    /export/zfs/nfsv2
zoo/home/nfsv3         683G    36K   683G     1%    /export/zfs/nfsv3
zoo/home/nfsv4         683G    36K   683G     1%    /export/zfs/nfsv4
zoo/home/tdh           683G    40K   683G     1%    /export/zfs/tdh
zoo                    683G    38K   683G     1%    /zoo
zoo/isos               683G    36K   683G     1%    /zoo/isos

BAD PBR sig and Solaris

Okay, Solaris is installed, and we reboot the system. When it comes back up, it hangs and my stomach drops. Okay, it can't get DHCP on nge0 - probably a missing driver. No biggie. When loading the devices, it complains about the labels again on the SATA drives. Again, not an issue. Just booting is a big win.

Okay, the system is not on the network.

I go into format and get the SATA drives into shape. Basically I think I did a backup to get the EFI label loaded. I then did an fdisk to change the type. I then ran partition to give the bulk of the data to the first slice. Note that if I didn't do the backup step, I got a funky partition table.

Also, the fact that these drives were messed up is something unique to me. Most people would not suffer what I am about to go through.

Okay, I'm not going to install zfs just yet. I want to reboot and see if these new labels are hunky-dory. I powered the system down (an issue I had with wont, so I wanted to make sure I had the power off) and took the DVD out. I also removed the post board. In my mind, I was getting close to wrapping it all up. I also added back the right side panel.

Power the puppy on, oh by the way, I went through will, phantom, corsair, and finally settled on kanigix as the name of the box. Okay, where were we? Oh yes, stuck on the dreaded BAD PBR sig. And did I mention I went back to my USB keyboard?

I got this on wont last year and I got sick. A quick search turned up: Re: [s-x86] bad PBR Sig. Okay, I rebooted and noticed that I couldn't see the boot drive in the list of attached drives. The cable was loose from when I took out the DVD. A quick fix and I still got the bad PBR sig. I also told the BIOS to no longer boot from the CDROM. No luck again.

I think I know what is going on. When I was fixing the SATA drives, they were being marked as the Active disk/partition. I'm not sure it mentions that disk is the boot disk. I'm pretty sure that the last SATA drive is what the system is trying to boot from.

Screw it all, I'm going to put that loud DVD drive in the case for right now. Okay, we need to pop off one of the black faceplates:

Not shown

And now we start twisting the metal plate:

Not shown

And now we have a place to put our drive:

Not shown

We need to line the rails up on the drive. We can figure out which row of screws by sliding the drive into the opening and eyeballing it.

Not shown

Okay, both rails are on. We want the latch to be about even with the edge of the drive:

Not shown

When we mount it, it looks like there is too much lip. Who cares, it will come out sooner or later.

Not shown

And we have to twist the hd ribbon to get the thing connected. Again, we don't want the DVD on the same chain as the HD. Oh well, for now it is okay.

Not shown

I don't want to reinstall everything. I don't want to use WinXP to change the boot drive. I want to use Solaris if possible. The MBR is being controlled by GRUB. And I don't want to fudge with that because I want to keep the system as dual-boot.

The version of GRUB I have doesn't seem to have a maintenance mode. Of course I find this out after I reboot and can't move about in the menu with my USB keyboard!

I needed to get into the grub menu and edit one of the selections. I wanted the kernel line to be this:

kernel /boot/multiboot kernel/unix -s

Which tells it to boot in single user mode.

I then had it mount the first drive. I went into format and visited all of the drives. Three of them were marked as Active. I fixed them all to not be Active except for the Solaris partition on the first drive.

Reboot and I get "No active partition". Well, I'm on the right track!
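
For anyone wondering what "Active" means on disk: the MBR partition table is four 16-byte entries starting at byte 446 of the first sector, an entry is active when its first byte is 0x80, and the sector ends with the signature bytes 0x55 0xAA at offset 510. The sketch below inspects a saved copy of that sector. The function names are made up, and so is the idea of grabbing the sector with dd from the whole-disk p0 device (something like /dev/rdsk/c1d0p0 on this box), which I have not tried here.

```shell
#!/bin/sh
# Inspect a saved 512-byte MBR image. A sketch: the image file is
# whatever you saved, e.g. (untested assumption on my part)
#   dd if=/dev/rdsk/c1d0p0 of=mbr.img bs=512 count=1

# Print the number (1-4) of every partition entry flagged active (0x80).
mbr_active() {
    img=$1
    i=0
    while [ "$i" -lt 4 ]; do
        off=$((446 + 16 * i))
        flag=$(od -An -tx1 -j "$off" -N 1 "$img" | tr -d ' \n')
        [ "$flag" = "80" ] && echo $((i + 1))
        i=$((i + 1))
    done
}

# Succeed if the sector ends with the 0x55 0xAA boot signature.
mbr_signature_ok() {
    sig=$(od -An -tx1 -j 510 -N 2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Usage: mbr_active mbr.img; mbr_signature_ok mbr.img && echo "signature ok"
```

An empty list from mbr_active on the boot disk would line up with the "No active partition" message.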

Regrub (press 'e' to edit any entry and press 'e' again to edit that line, hit return when done, and press 'b' to boot with your new change) and boot back up in single user mode. All of the drives are marked correctly in format.

Okay, can we rewrite the MBR? Derek thinks so here at Solaris 11 GRUB. (By the way, I forgot all about his great blog until Google.com revealed it again to me.)

# /sbin/installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0d0s0

Except I'm going to try:

# /a/sbin/installgrub -m /a/boot/grub/stage1 /a/boot/grub/stage2 /dev/rdsk/c1d0s1

Okay, another misstep. This time though, I select F12 at the bios prompt, which lets me pick the boot order and disk. So I get to the first disk and grub throws up a prompt. I think I messed up above and zapped the MBR.

Regrub and this time pay attention. When it boots in single user mode, it tells us that there is a Solaris installed in '/dev/dsk/c1d0s0'. So I was right to change the path, but not the slice. I do need:

# /a/sbin/installgrub -m /a/boot/grub/stage1 /a/boot/grub/stage2 /dev/rdsk/c1d0s0
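
To avoid typing the slice from memory again, the command can be built from the root slice that the single-user boot reports. This sketch only prints the command line; the function name is made up, and it assumes the installed system is mounted on /a as it was here.

```shell
#!/bin/sh
# Build the installgrub command line from the root slice reported at
# boot, swapping /dev/dsk for the raw /dev/rdsk device. Only prints
# the command; altroot defaults to /a as in the text above.
installgrub_cmd() {
    slice=$1 altroot=${2:-/a}
    raw=$(echo "$slice" | sed 's|^/dev/dsk/|/dev/rdsk/|')
    echo "$altroot/sbin/installgrub -m $altroot/boot/grub/stage1 $altroot/boot/grub/stage2 $raw"
}

installgrub_cmd /dev/dsk/c1d0s0
# prints: /a/sbin/installgrub -m /a/boot/grub/stage1 /a/boot/grub/stage2 /dev/rdsk/c1d0s0
```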

Reboot and we get the same message about the "No active partition". But, reboot again and use F12 to get to the correct disk. And I get a login prompt. I think the system is trying to boot from one of the DVD drives and failing. I think clearing all of the active partitions earlier fixed things as far as the "bad PBR sig" went and if the DVDs had not been mounted, it would have booted.

A quick test is to change the boot order and reboot. And I am wrong. Okay, I can get the system to boot if I use F12 to get me to the correct disk. Hmm, I wonder if I have to tell the BIOS which one is the boot disk? YES! YES! YES! I found it. And the ATA drive was behind all of the SATA drives.

My guess is that before I fixed the labels in Solaris, the drives had not been showing up as bootable to the BIOS. Who knows what it saw with the corrupt labels? Anyway, I fixed the order and the system now boots again. The network is not up, but that is a battle I can fix when I drag myself out of bed in the afternoon. It is 4AM here.

I lied, I did a sys-unconfig and now kanigix is on the net!

# uname -a
SunOS kanigix 5.11 snv_55 i86pc i386 i86pc

When I fixed the labels, I had several which had active partitions, i.e., places we could boot from. When the BIOS added these in front of the ATA drive, it picked one to boot from and found no MBR. If I had changed the boot order of the disks first, I wouldn't have had all of this fun! Who knew?

It looks like I could have set the USB mode to 1.1 instead of 2.0 in the BIOS. That may have been a way to get by the install issue I was seeing. I'll try that later.

I lied again. I tried this and it still would not see the USB DVD to install from. Well, with the system coming up and networked, I can get the system information needed to file a bug.

A lot went on here and I am actually quite happy. I learned a lot about my new system and I've got working SATA drives. When I did wont, I couldn't get past this part and that was a big reason the machine ended up with my son. That and he needed to kill Rebel Scum or Imperial Plastic Soldiers. Don't knock the power of Lucas Arts in this house.


Heat and noise, some rough estimates

As far as noise goes, the DVD is the loudest component in the system. It doesn't matter if it is the external one or the exposed internal one. When it isn't going, I can't hear the system over my desktop Shuttle: adept.

As far as heat, when it was installing and running WinXP, the CPU cooler was not warm to the touch. The graphics cooler was warm. I remember seeing the BIOS stating that the case temperature was 25C and the CPU core was at 27-28C. Both of the case fans and the cooler fan were going.

When I flipped it onto its side, the rear case fan and cooler fan did not turn on when I started to install Solaris. The HD ribbon was stopping the cooler fan. The CPU cooler was warm to the touch and so was the video cooler. After 5-10 minutes of the two fans being back on, the CPU cooler is not as warm to the touch.

I put a cheap digital thermometer over there - it had said room temperature was about 23C. After 10 minutes on top of the DVD drive, it said 26.7C. Hmm, the DVD drive is pretty warm - not as warm as the video cooler, but warm.

I'll find a way to get at the temperature from Solaris.

Anyway, the Solaris install is done and I'm off to play with it. It is also 30 minutes later and the CPU cooler is not warm. The fans are working.


WinXP registration is braindead

I realized that the Solaris DVD I was trying to install from was corrupt. I made sure that this time it made its way into the trash bucket. I tried a known good image and it failed the same way.

So I turned the machine off and flipped it on its side. Note, be sure to either unplug the power cord in the back or if possible turn the PSU power switch off. I managed to press the front power button on through the front door.

I started trying to cable the DVD in and realized that the data ribbon was too short. I was going to either try another cable or just pull the HD out as well. Too many things perched on the case would just be too much:

Not shown

Anyway, you can notice in the above picture that the ribbon cable is upside down. Now I hadn't wanted to either flip the data ribbon or put the devices in a set master or slave mode. I didn't know what that would do to the disk device names under Solaris after I was done. But I was willing to fix this before I installed Solaris. So, we pull it off and see the correct side for connecting two drives:

Not shown

We flip it over, reconnect everything and we have an ad-hoc installation.

Not shown

Now we shouldn't be upset that beta software didn't boot up correctly. Not only am I trying to install cutting edge beta bits, but I've got the set currently being tested internally (okay, I just saw b56 bits get posted late yesterday). You can be sure I'll feed my results back to the other developers.

But, here comes the braindead aspect of WinXP registration - I didn't get the Solaris DVD into the drive in time. I ended up back in WinXP. And it declared that my system had changed drastically since the installation. Let's see, possibly the HD changed locations on a chain, the external USB drive was gone, and I added an internal DVD drive. Yes, that would never happen unless someone was trying to pirate the OS.

Fine, I've got a legal copy of the software, I'll let it register itself again. And that means putting in the CD key again. And that means being insulted by the registration process which has decided I'm trying to install the license on too many machines in too short of a time. I can call Microsoft up to explain why I am not pirating software and please, could I get my registration reset. Please! Aargh is too polite for what I feel right now.

What are you supposed to do in a lab situation where you have to reinstall WinXP all the time on the same machine to get to a stock system? I guess you are supposed to ghost the drive and never upgrade?

For some good news, the Solaris installation is going along just fine. It saw the SATA drives and decided it did not like the labels. Hehe, I knew that from way back in b34. Anyway, I'm hoping once I get the system back up (hehe again, I'm hoping it boots with those drives in) that I can quickly put a ZFS filesystem in place on them.

