Grub booting
I find grub really quite interesting, so this will be another post I'll update regularly.
Sometimes questions come up on how to boot the machine, especially when the machine is having a bad day. This isn't a complete post by a long way yet, but it will get you off to what I believe will be a good start. Try not to get over-ambitious with grub to begin with; making mistakes on a lab/dev machine is better than a P45.
I suppose the prime assumption is that grub is already loaded; if not, then read the grub manual (the link is in this text).
General Grub Stuff
---o---
Once at the grub menu, hit "c". This will drop you to the grub> command line interface, from where you can:
1) find the most likely bootable disk
find /boot/grub/stage1 or findroot (depending on your version of Solaris and grub)
This returns a list of possible boot disks/slices in the (<disk>,<partition>,<slice>) format.
2) install grub itself (risky if you're not familiar with grub)
setup (<disk>) where <disk> is usually something like hd0, hd1, etc.
or
setup (<disk>,<partition>) which would look like (hd0,0) for the first disk and the first partition on that disk
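If you end up scripting around these device names, a small parser helps keep the disk, partition and slice apart. This is only a minimal sketch in Python: the names GrubDevice and parse_grub_device are my own inventions for illustration, and it only understands the hdN forms used in this post, not floppies or the signature form that findroot takes.

```python
import re
from typing import NamedTuple, Optional


class GrubDevice(NamedTuple):
    """One grub device name split into its parts (hypothetical helper type)."""
    disk: str                  # e.g. "hd0"
    partition: Optional[int]   # e.g. 0, or None when only a disk is named
    slice_name: Optional[str]  # e.g. "a" for a Solaris slice, or None


# Accepts (hd0), (hd0,0) and (hd0,0,a); a slice is only allowed after a partition.
_DEVICE_RE = re.compile(r"^\((hd\d+)(?:,(\d+)(?:,([a-z]))?)?\)$")


def parse_grub_device(text: str) -> GrubDevice:
    """Parse a device written in the (<disk>,<partition>,<slice>) format."""
    match = _DEVICE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a grub device name: {text!r}")
    disk, partition, slice_name = match.groups()
    return GrubDevice(disk, int(partition) if partition is not None else None, slice_name)


if __name__ == "__main__":
    for name in ("(hd0)", "(hd0,0)", "(hd0,0,a)"):
        print(name, "->", parse_grub_device(name))
```

Anything that does not match, such as hd0 written without parentheses, raises ValueError rather than being guessed at.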
---o---
Booting whatever will boot
---o---
This is useful. I've taken this directly from the grub manual, which can be found at the following link, since I couldn't explain it better. The grub manual is definitely worth a good read.
http://www.gnu.org/software/grub/manual/grub.txt and is Copyright (C) 2004 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02111, USA
Booting fallback systems
----------------------------
GRUB supports a fallback mechanism of booting one or more other entries if a default boot entry fails. You can specify multiple fallback entries if you wish.
Suppose that you have three systems, `A', `B' and `C'. `A' is a system which you want to boot by default. `B' is a backup system which is supposed to boot safely. `C' is another backup system which is used in case where `B' is broken.
Remember, though, that for ufs and zfs the root (hd0,0) entry and all that it relies on becomes root (hd0,0,a) or rootfs or similar (disk,partition,slice). I've written some examples of this in the boot and the boot archive section.
Then you may want GRUB to boot the first system which is bootable among `A', `B' and `C'. A configuration file can be written in this way:
default saved           # This is important!!!
timeout 10
fallback 1 2            # This is important!!!

title A
root (hd0,0)
kernel /kernel
savedefault fallback    # This is important!!!

title B
root (hd1,0)
kernel /kernel
savedefault fallback    # This is important!!!

title C
root (hd2,0)
kernel /kernel
savedefault
Note that `default saved' (*note default::), `fallback 1 2' and `savedefault fallback' are used. GRUB will boot a saved entry by default and save a fallback entry as next boot entry with this configuration.
When GRUB tries to boot `A', GRUB saves `1' as next boot entry, because the command `fallback' specifies that `1' is the first fallback entry. The entry `1' is `B', so GRUB will try to boot `B' at next boot time.
Likewise, when GRUB tries to boot `B', GRUB saves `2' as next boot entry, because `fallback' specifies `2' as next fallback entry. This makes sure that GRUB will boot `C' after booting `B'.
It is noteworthy that GRUB uses fallback entries both when GRUB itself fails in booting an entry and when `A' or `B' fails in starting up your system. So this solution ensures that your system is started even if GRUB cannot find your kernel or if your kernel panics.
However, you need to run `grub-set-default' (*note Invoking grub-set-default::) when `A' starts correctly or you fix `A' after it crashes, since GRUB always sets next boot entry to a fallback entry. You should run this command in a startup script such as `rc.local' to boot `A' by default:
# grub-set-default 0
where `0' is the number of the boot entry for the system `A'.
If you want to see what is current default entry, you can look at the file `/boot/grub/default' (or `/grub/default' in some systems). Because this file is plain-text, you can just `cat' this file. But it is strongly recommended *not to modify this file directly*, because GRUB may fail in saving a default entry in this file, if you change this file in an unintended manner. Therefore, you should use `grub-set-default' when you need to change the default entry.
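To see why grub-set-default matters, it can help to walk the saved entry forward by hand. The following is a rough Python model of the manual's A/B/C example, not grub's own code: the function names are made up for illustration, and it only tracks what gets saved for the next boot, ignoring the case where grub falls back within a single boot because it cannot load an entry.

```python
from typing import Dict, List, Optional

# The example configuration above: `fallback 1 2`, with entries A=0, B=1, C=2.
FALLBACK: List[int] = [1, 2]
# What each entry's savedefault line asks for (assumed encoding for this sketch).
SAVEDEFAULT: Dict[int, Optional[str]] = {0: "fallback", 1: "fallback", 2: "current"}


def next_saved_entry(current: int, fallback: List[int], mode: Optional[str]) -> Optional[int]:
    """Return the entry saved as the next default, or None if nothing is saved.

    mode is None (no savedefault line), "current" (plain `savedefault`)
    or "fallback" (`savedefault fallback`).
    """
    if mode is None:
        return None
    if mode == "current":
        return current
    if mode != "fallback":
        raise ValueError(f"unknown savedefault mode: {mode!r}")
    # Pick the fallback entry that follows the current one, or the first
    # fallback entry when the current entry is not in the fallback list.
    index = fallback.index(current) + 1 if current in fallback else 0
    if index >= len(fallback):
        raise ValueError(f"no fallback entry left after entry {current}")
    return fallback[index]


def boot_sequence(saved: int, boots: int) -> List[int]:
    """Entries booted over consecutive boots if nobody runs grub-set-default."""
    order: List[int] = []
    for _ in range(boots):
        order.append(saved)
        saved = next_saved_entry(saved, FALLBACK, SAVEDEFAULT[saved])
    return order


if __name__ == "__main__":
    print(boot_sequence(0, 4))
```

Starting from entry 0, the model walks A, then B, then C and stays on C, which is the drift that running grub-set-default 0 from a startup script is there to undo.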
---o---
Grub and ZFS
---o---
To boot from a ZFS root filesystem, the kernel$ or module$ commands must include "-B $ZFS-BOOTFS", which expands to the zfs-bootfs boot path.
If you see zfs_open failures, that may well mean that the pool is offline or a file in the pool is missing.
Example menu.lst entries for zfs boot:
title s10x_u6wos_07b_zfs
findroot (BE_s10x_u6wos_07b_zfs,0,a)
bootfs rpool/ROOT/s10x_u6wos_07b_zfs
kernel$ /platform/i86pc/multiboot -B $ZFS-BOOTFS
module /platform/i86pc/boot_archive
title s10x_u6wos_07b_zfs failsafe
findroot (BE_s10x_u6wos_07b_zfs,0,a)
bootfs rpool/ROOT/s10x_u6wos_07b_zfs
kernel /boot/multiboot kernel/unix -s
module /boot/x86.miniroot-safe
ZFS and UFS are supported in the grub provided by Sun; you cannot expect standard grub to do the same.
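If you have a long menu.lst, a quick script can show which entries actually carry $ZFS-BOOTFS on a kernel$ or module$ line. The sketch below is Python and the function name zfs_bootfs_flags is my own; it only reports what it finds and does not decide whether an entry is wrong, since the failsafe entry in the example above does not carry the flag.

```python
from typing import Dict, Optional


def zfs_bootfs_flags(menu_text: str) -> Dict[str, bool]:
    """Map each menu.lst title to whether a kernel$/module$ line has $ZFS-BOOTFS."""
    flags: Dict[str, bool] = {}
    title: Optional[str] = None
    for raw in menu_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        command = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        if command == "title":
            title = rest.strip()
            flags[title] = False
        elif title is not None and command in ("kernel$", "module$"):
            # Only the $-suffixed commands are checked, as described above.
            if "$ZFS-BOOTFS" in rest:
                flags[title] = True
    return flags


# The two example entries from this post.
MENU = """\
title s10x_u6wos_07b_zfs
findroot (BE_s10x_u6wos_07b_zfs,0,a)
bootfs rpool/ROOT/s10x_u6wos_07b_zfs
kernel$ /platform/i86pc/multiboot -B $ZFS-BOOTFS
module /platform/i86pc/boot_archive
title s10x_u6wos_07b_zfs failsafe
findroot (BE_s10x_u6wos_07b_zfs,0,a)
bootfs rpool/ROOT/s10x_u6wos_07b_zfs
kernel /boot/multiboot kernel/unix -s
module /boot/x86.miniroot-safe
"""

if __name__ == "__main__":
    for entry, has_flag in zfs_bootfs_flags(MENU).items():
        print(f"{entry}: {'has' if has_flag else 'no'} $ZFS-BOOTFS")
```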
---o---
Grub Boot Stages
---o---
Grub Stage1:
This attempts to read the bios and understand the disk geometry. It also finds the disk that stage2 can be loaded from, including the sector it starts at. It's also here that you might notice the first disk error (an i/o error of some sort). If you have made some sort of error in your bios regarding disk geometry (and not allowed the bios to read that information from the disk itself), you may also see sector read errors and a failure to load stage2. Always check the bios if you see read-beyond or sector-based errors before logging a call.
Grub Stage2:
Once stage1 has found the stage2 sector, it hands over entirely to stage2. At this point grub attempts to set up cdrom boot emulation; if this is a hard disk then cdrom boot emulation is discarded. Grub then goes on to assess the drive geometry again and also attempts to cope with a few peculiarities that the various bios manufacturers have. It's best not to believe that all bioses are equal, and certainly do not assume you have a bug-free bios (you almost certainly do not).
So, once grub stage2 has gotten past fixing the vagaries of the various bioses, it goes on to attempt its own version of booting (that is, getting an image ready to run). This is where multiboot is also set up, along with trying to understand the potential images to load and pass off to. Grub stage2 makes a really decent attempt at reading the images, and it does take time to both read and decompress large image files. Don't forget, these image files have a lot in them: the kernel itself, kernel modules and configuration files. They are almost always compressed (this is why the stage2 dots take so much longer than stage1's).
The grub stage2 boot also sets up any video output. Don't expect too many fancy things; there's a limited amount you can expect to find without operating system drivers available.
Here also grub stage2 makes the effort to work with both new and aged kernels and their issues, and it checks memory size too. Don't forget, though, that the known issues are only as current as the last grub patch or installation you applied.
Grub also provides for editing of the command line used to load the image. This is important, since it cannot possibly know all the ways in which you prefer to boot. This part can make or break a boot. It takes a lot to get here, and if you do not understand, or clumsily use, some parameter that you have spied on the internet somewhere, don't expect it to always work. The ethos here is that you "understand your kernel and the parameters you can use to boot in the most efficient way for you".
---o---
Great post, some of which I was familiar with from Linux days.
If you plan to continue the series, I'd like to suggest a deeper dive into the boot-archive. I've had a lot of problems with this recently. The Solaris admin manual only covers the basics of booting the failsafe option and recreating the boot-archive, but what if that doesn't fix the problem, as I have found twice now?
The first time I had to delete the existing archive first, then re-create it. The second time, a panic or the subsequent fsck had deleted /etc/driver_aliases in the failsafe partition, which the boot-archive update command needed to copy. I ended up copying it from my root filesystem.
Pete
Posted by Pete on September 18, 2008 at 09:38 AM BST