Andrew Rutz's blog
My opensolaris fixes ...
Here's a list of the code modifications I've contributed to OpenSolaris.
My largest contribution is this pseudo-driver written from "whole cloth". Imagine: ...a device-driver that actually contains comments.... who would have thought? :-)
Posted at 04:35PM Jul 26, 2007 by Andrew Rutz in Solaris | Comments[0]
One way to exit an infinite-reboot-loop
If you're in a situation where the system is panicking during boot, you can use
boot net -s (entered at the firmware prompt) to regain control of your system.
In my case, I'd added some diagnostic code to a (PCI) driver (that is used to boot the root filesystem). There was a bug in the driver that was hit on every boot, and so caused the system to panic each time:
...
000000000180b950 genunix:vfs_mountroot+60 (800, 200, 0, 185d400, 1883000, 18aec00)
  %l0-3: 0000000000001770 0000000000000640 0000000001814000 00000000000008fc
  %l4-7: 0000000001833c00 00000000018b1000 0000000000000600 0000000000000200
000000000180ba10 genunix:main+98 (18141a0, 1013800, 18362c0, 18ab800, 180e000, 1814000)
  %l0-3: 0000000070002000 0000000000000001 000000000180c000 000000000180e000
  %l4-7: 0000000000000001 0000000001074800 0000000000000060 0000000000000000
skipping system dump - no dump device configured
rebooting...
If you're logged in via the console, you can send a BREAK sequence in order to gain
control of the firmware's (OBP's) prompt. Enter Ctrl-Shift-[ in order to get the "telnet>" prompt. Once telnet has control, enter this:
telnet> send brk
You'll be presented with OBP's "OK prompt":
ok
You then enter the following in order to boot into single-user mode via the network:
ok boot net -s
Note that booting from the network under Solaris will implicitly cause the system to be INSTALLED with whatever software had last been configured to be installed. However, we are using boot net -s as a "handle" with which to get at the Solaris prompt. Once at that prompt, we can perform actions as root that will let us back out our buggy driver (ok... MY buggy driver :-)) ...and replace it with the original, non-buggy driver.
Entering the boot command produced the following output and left us at the Solaris prompt (in single-user mode):
Sun Blade 1500, No Keyboard
Copyright 1998-2004 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.16.4, 1024 MB memory installed, Serial #53463393.
Ethernet address 0:3:ba:2f:c9:61, Host ID: 832fc961.
Rebooting with command: boot net -s
Boot device: /pci@1f,700000/network@2 File and args: -s
1000 Mbps FDX Link up
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
4000 1000 Mbps FDX Link up
Requesting Internet address for 0:3:ba:2f:c9:61
SunOS Release 5.10 Version Generic_118833-17 64-bit
Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Booting to milestone "milestone/single-user:default".
Configuring devices.
Using RPC Bootparams for network configuration information.
Attempting to configure interface bge0...
Configured interface bge0
Requesting System Maintenance Mode
SINGLE USER MODE
#
Our goal is to now move to the directory containing the buggy driver and replace it with the original driver (that we had saved away before ever loading our buggy driver! :-)
However, since we booted from the network, the root filesystem ("/") is NOT mounted on one of our local disks. It is mounted on an NFS filesystem exported by our install server. To verify this, enter the following command:
# mount | head -1
/ on my-server:/export/install/media/s10u2/solarisdvd.s10s_u2dvd/latest/Solaris_10/Tools/Boot remote/read/write/setuid/devices/dev=4ac0001 on Wed Dec 31 16:00:00 1969
As a result, we have to create a temporary mount point and then mount the local disk onto that mount point:
# mkdir /tmp/mnt
# mount /dev/dsk/c0t0d0s0 /tmp/mnt
Note that your system will not necessarily have had its root filesystem on "c0t0d0s0". This is something that you should also have recorded before you ever loaded your.. er... "my" buggy driver! :-)
While the system is still booting normally from its local disk, one can find which disk slice is mounted as the root filesystem by entering:
# df -k /
Filesystem            kbytes    used    avail capacity  Mounted on
/dev/dsk/c0t0d0s0   76703839 4035535 71901266     6%    /
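Since the root device has to be known before anything goes wrong, it can help to pull it out of the df output and note it somewhere off the machine. A minimal sketch, assuming the two-line df -k layout shown above; the function name root_dev_from_df is hypothetical:

```sh
# Print the device column from `df -k /` output read on stdin.
# Assumes the layout shown above: one header line, then one data line.
root_dev_from_df() {
    awk 'NR == 2 { print $1 }'
}

# Typical use on the healthy system:
#   df -k / | root_dev_from_df
```

The function reads from stdin rather than running df itself, so it can also be pointed at a saved copy of the output.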
To continue with our example, we can now move to the directory containing the buggy driver in order to replace it with the original driver. Note that /tmp/mnt is prefixed to the path where we'd "normally" find the driver:
# cd /tmp/mnt/platform/sun4u/kernel/drv/sparcv9
# ls -l pci*
-rw-r--r-- 1 root root 288504 Dec 6 15:38 pcisch
-rw-r--r-- 1 root root 288504 Dec 6 15:38 pcisch.aar
-rwxr-xr-x 1 root sys 211616 Jun 8 2006 pcisch.orig
# cp -p pcisch.orig pcisch
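The restore step above can be wrapped in a small helper that refuses to run if no saved copy exists. This is only a sketch: the function name restore_driver and its argument order are made up for illustration, and it relies on the ".orig" naming used in this example.

```sh
# restore_driver MNT DRVDIR NAME
# Copies NAME.orig over NAME beneath the temporary mount point MNT.
# All three arguments come from the caller; nothing is hard-coded.
restore_driver() {
    dir="$1$2"
    if [ ! -f "$dir/$3.orig" ]; then
        echo "no saved copy: $dir/$3.orig" >&2
        return 1
    fi
    # Skip the copy if the driver already matches the saved original.
    if cmp -s "$dir/$3.orig" "$dir/$3"; then
        echo "$3 already matches $3.orig"
        return 0
    fi
    cp -p "$dir/$3.orig" "$dir/$3" && echo "restored $dir/$3"
}

# Example, using the paths from the walkthrough above:
#   restore_driver /tmp/mnt /platform/sun4u/kernel/drv/sparcv9 pcisch
```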
We can now synchronize any in-memory filesystem data structures with those on disk... and then reboot. The system will then boot correctly... as expected:
# sync;sync
# reboot
syncing file systems... done
Sun Blade 1500, No Keyboard
Copyright 1998-2004 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.16.4, 1024 MB memory installed, Serial #xxxxxxxx.
Ethernet address 0:3:ba:2f:c9:61, Host ID: yyyyyyyy.
Rebooting with command: boot
Boot device: /pci@1e,600000/ide@d/disk@0,0:a File and args:
SunOS Release 5.10 Version Generic_118833-17 64-bit
Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: my-host
NIS domain name is my-campus.Central.Sun.COM
my-host console login:
...so that's how it's done! Of course, the easier way is to never write a buggy-driver... but.. then.. we all "have an eraser on the end of each of our pencils"... don't we ? :-)
"...thank you... and good night..."
Posted at 07:05PM Dec 06, 2006 by Andrew Rutz in Solaris | Comments[0]
Using JumpStart to directly install from a CD/DVD
The following tutorial will save you time when installing Solaris via a CD or DVD. The default method is to do an Interactive Install, where one has to answer queries that the installer poses. This becomes tedious and error-prone if one has to install a large number of either machines or software versions. The solution is to use JumpStart.
However, I could not find any documentation describing how to use JumpStart when installing directly from a CD or DVD. In comparison, there are several places where "installing from a CD/DVD" is described, but they do not describe the desired solution (eg, they describe how to set up an install server using the CD/DVDs so that "network installs" can then be performed via the install server).
Note that there are two wonderful descriptions of how to create bootable CDs and DVDs ("Building a Bootable JumpStart Installation CD-ROM" and "Building a Bootable DVD to Deploy a Solaris Flash Archive" at this link) but they cater to the most general case: they provide an algorithm to modify a copy of the bootable image (and then write it to a CD/DVD).
However, for the cases where the bootable image does not need to be modified, the following
describes how to use JumpStart so that its normal files (sysidcfg, profile,
begin scripts, and finish scripts) can be referenced when directly booting from
a CD/DVD. This capability allows one to do an unattended install.
Setting up the install server - Yes, I know I said one will be directly
installing from the CD/DVD, but one has to get the JumpStart files from somewhere!
=:-). The install server is actually an rpc.bootparamd(1M)-server,
as we simply need a system other than the install client (eg, "target") that
will provide the JumpStart configuration information via /etc/bootparams. This
server must be on the same subnet as the target.
The trick is to hand-craft an /etc/bootparams entry for the target. The
entry must use only the sysid_config and install_config keywords.
The former identifies the directory where the sysidcfg file is located. The
latter identifies where the other JumpStart files reside (eg, rules.ok,
begin scripts, etc.).
You can use the following Korn Shell function to generate an /etc/bootparams
entry for a target named foo and a server named my_server.
foo's JumpStart files reside at /export/jump/foo on
my_server. Because the function uses $(pwd) for both paths, run it from that directory:
function bootpadd {
    f=/etc/bootparams
    cat >> $f <<EOF
$1 sysid_config=${2}:$(pwd) install_config=${2}:$(pwd)
EOF
}
# cd /export/jump/foo
# bootpadd foo my_server
# grep foo /etc/bootparams
foo sysid_config=my_server:/export/jump/foo install_config=my_server:/export/jump/foo
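If depending on the current directory feels fragile, the same entry can be built with the directory passed explicitly. A minimal sketch; the function name bootparams_entry is hypothetical, and it only prints the line, leaving it to the caller to append it to /etc/bootparams:

```sh
# bootparams_entry TARGET SERVER DIR
# Prints one /etc/bootparams line that uses only the sysid_config and
# install_config keywords, both pointing at DIR on SERVER.
bootparams_entry() {
    printf '%s sysid_config=%s:%s install_config=%s:%s\n' "$1" "$2" "$3" "$2" "$3"
}

# Example (run as root on the server):
#   bootparams_entry foo my_server /export/jump/foo >> /etc/bootparams
```

Printing instead of appending makes it easy to inspect the line before it touches the real file.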
All of the target's JumpStart files can be put in one directory on the server. You will
need a sysidcfg file, a rules file, and any file referenced by
the rules file. For example, for a target with the name my_target, one could have the following rules file. Accordingly,
a directory on the server would have these files:
# cd .../jumpstart/my_target
# cat rules
any - - prof finish_script
# ls -l
total 3952
-rwxr-xr-x 1 root other 2920 Sep 17 2003 finish_script
-rwxr-xr-x 1 root other 250 May 9 2005 prof
-rwxr-xr-x 1 root other 27 Sep 17 2003 rules
-rwxr-xr-x 1 root other 53 Nov 18 11:55 rules.ok
-rwxr-xr-x 1 root other 243 Nov 18 14:25 sysidcfg
Note that the rules.ok file is created by running check(1M)
on the files in the above directory. check(1M) is located on the server
in the filesystem containing the install images for the Solaris version you are installing
(from CD/DVD):
# .../Solaris_10/Misc/jumpstart_sample/check
Validating rules...
Validating profile prof...
The custom JumpStart configuration is ok.
Note that nothing in the contents of the above files changes because one is
installing directly from a CD/DVD. As expected, these files simply have to pass the
verifications done by check(1M). Also, note that the profile file
(prof in our example) is the one used when doing the JumpStart. That is, there is
a profile file on the CD/DVD, but it is not used if the install_config
keyword is present in the target's /etc/bootparams entry. If that keyword is not
present, an Interactive Install will be performed.
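The rule in the last two sentences can be checked before booting the target. The sketch below is only an illustration: the function name install_mode is hypothetical, and it assumes each /etc/bootparams entry sits on a single line (no backslash continuations).

```sh
# install_mode TARGET [FILE]
# Prints "jumpstart" if TARGET's entry in FILE (default /etc/bootparams)
# carries the install_config keyword, "interactive" otherwise.
install_mode() {
    file=${2:-/etc/bootparams}
    if awk '$1 == t && /install_config=/ { found = 1 }
            END { exit (found ? 0 : 1) }' t="$1" "$file"
    then
        echo jumpstart
    else
        echo interactive
    fi
}

# Example:
#   install_mode my_target
```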
Booting from the CD/DVD - All that is needed to use the above configuration is to
reach the SPARC OBP command line and enter this boot command:
# sync; sync
# eeprom | grep auto-boot
auto-boot?=false
# shutdown -i0 -y -g0
...
ok boot cdrom - install
...
Adding the "- install" suffix causes JumpStart to be used. JumpStart will
query our server for the target's /etc/bootparams entry. When you are
done installing, you can use this Korn Shell function to remove our target's
/etc/bootparams entry (as rm_install_client(1M) will not know
about it, because we never ran add_install_client(1M)):
function bootprm {
    f=/etc/bootparams
    sed "/^${1}[[:space:]]/d" $f > /tmp/bootparams.$$ && cp /tmp/bootparams.$$ $f
    rm -f /tmp/bootparams.$$
}
For example, this will remove my_target's entry:
# bootprm my_target
Posted at 06:52PM Nov 18, 2005 by Andrew Rutz in Solaris | Comments[0]
JumpStart and multiple root slices
Here's a nice two-step trick that lets one have multiple root slices
on a medium that is installed using JumpStart. Having multiple root slices on a
physical disk is a space- and time-saver. One uses the disk space
more efficiently, and the turnaround time on switching between Solaris
versions is determined by the time to shutdown(1M) the system and boot(1M) the
desired slice.
This example assumes that one is familiar with JumpStart. As a result, it
will focus on how to set up two profile files, as these are what
enable the multi-root slice functionality. Our profile
is named prof in our rules file:
# cd <my_jumpstart_dir>
# cat rules
any - - prof finish_script
Step One - This step is what "carves out" space for the desired number of root slices. In our example, there will be three root slices. As a result, Step One will be performed once, and Step Two will be performed twice (i.e., one less than the total number of root slices being installed). Note that if one later decides that the disk should have more root slices, Step One has to be repeated, and doing so destroys any root slices that are already installed.
Our profile file for this step is named prof.step1.
Create a file that has these contents:
# cat prof.step1
install_type initial_install
system_type standalone
partitioning explicit
# 'rootdisk' will be set to value of root_device
#root_device c0t0d0s0
#root_device c0t0d0s6
root_device c0t0d0s7
# root_device must match the filesys whose name is '/'
filesys rootdisk.s0 16000
filesys rootdisk.s1 free swap
filesys rootdisk.s6 16000
filesys rootdisk.s7 16000 /
Points to note:
- The slice referenced by root_device (e.g., s7) must match the filesys entry whose name is "/".
- All slices specify an explicit size (eg, "16000" MB) in their filesys entry.
- The two slices not being installed to (eg, s0 and s6) are not given a name in their filesys entry (eg, there is no identifier following their size argument).
Before running the normal JumpStart binaries (check(1M) and add_install_client(1M)),
create a symbolic link so that prof references our
profile file:
# ln -s prof.step1 prof
After logging in to the install client and executing
ok boot net - install
you will have space carved out for three root slices and
your Solaris version will be installed on s7.
Step Two
For the two remaining slices, s0 and s6,
perform the following. We'll use s0 in the example.
Create the following profile file:
# cat prof.step2
install_type initial_install
system_type standalone
partitioning explicit
# 'rootdisk' will be set to this
root_device c0t0d0s0
#root_device c0t0d0s6
#root_device c0t0d0s7
# '/' slice must match root_device; complementary devices
# must be 'ignore'
filesys rootdisk.s0 existing /
filesys rootdisk.s1 free swap
filesys rootdisk.s6 existing ignore
filesys rootdisk.s7 existing ignore
The differences between prof.step1 and prof.step2 are the active root_device line and the filesys entries for s0, s6, and s7.
Points to note:
- root_device references s0 and the filesys entry named "/" also references s0.
- The two slices not being installed to have a name of ignore.
- All three slices use a size of existing. (This has the meaning of: "use the size that was specified in Step One" (eg, "16000")).
To install your desired Solaris version onto s0, execute
this permutation of commands on the install server (with the appropriate arguments):
# rm prof
# ln -s prof.step2 prof
# check ...
# add_install_client ...
On the install client, execute our favorite JumpStart command:
ok boot net - install
Repeat Step Two in order to install your desired Solaris
version onto s6.
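Since the Step Two profile differs between repetitions only in which slice is the target, it can be generated rather than edited by hand. A minimal sketch: the function name step2_profile and its argument order are hypothetical, and the sketch assumes that the order of the filesys lines does not matter to JumpStart.

```sh
# step2_profile DISK TARGET SLICE...
# Prints a Step Two profile that installs onto TARGET and marks the other
# listed root slices as 'ignore'. DISK is e.g. c0t0d0; slices e.g. s0 s6 s7.
step2_profile() {
    disk=$1
    target=$2
    shift 2
    echo "install_type initial_install"
    echo "system_type standalone"
    echo "partitioning explicit"
    echo "root_device ${disk}${target}"
    echo "filesys rootdisk.s1 free swap"
    for s in "$@"; do
        if [ "$s" = "$target" ]; then
            echo "filesys rootdisk.$s existing /"
        else
            echo "filesys rootdisk.$s existing ignore"
        fi
    done
}

# Example: generate the profile for the s6 pass
#   step2_profile c0t0d0 s6 s0 s6 s7 > prof.step2
```

The generated file still has to go through check and add_install_client as described above.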
...now.......... get to WORK!!!! :-)
Posted at 11:27PM Nov 11, 2005 by Andrew Rutz in Solaris | Comments[0]
Solaris boot-time and time-complexity
The only problem with algorithms reified as software programs is that they have to run on a real computer. Wall Street and Main Street are not interested in the theoretical world of Turing machines. As experienced in the real world, an algorithm must have a "running time" that is "practical". As taught in computer science classes, "practicality" is expressed as time-complexity, which is written using "O notation", as in k * O(n) (for an algorithm whose execution time is linearly proportional to the number of inputs, n). It's that darn little k, though, that makes this blog more interesting.
Until Solaris supported platforms with very many resources (CPUs, main memory, IO devices), its boot algorithm, in general, was "conveniently agnostic" towards resource-dense systems. Until the last half of the 1990s, user and engineering focus had been given to measurements that flattered the system's steady-state capabilities (e.g., Dhrystones, Whetstones, MIPS, and bandwidth). However, a fairly recent industry focus on RAS (Reliability, Availability, and Serviceability) persuaded the system's internal and external customers that the system's non-steady-state lifecycle events (boot time, crash time, low-power-cycle time) are equally important. Mission-critical systems are nothing if not practical, and their return-on-investment and effectiveness were begrudgingly admitted to also be a function of the "punctuations" within a system's lifetime.
So, along comes Solaris 10 and its internal attack on significantly reducing boot time. Optimization of one function within the boot algorithm reduced boot time about 30% (on systems with a sparse IO device tree). cached_va_to_pa() is the optimized form of va_to_pa(). cached_va_to_pa() caches the last virtual- to physical-address translation so as to significantly decrease the translation cost.
va_to_pa() translates a kernel virtual address ("VA") into a physical address ("PA"). After Solaris has booted, the kernel uses its virtual memory (VM) metadata to quickly translate a VA to a PA. Early during boot, however, "the firmware" (OBP) is (still) responsible for this translation.
As any optimization in good-standing would be expected to do, cached_va_to_pa() leverages empirically-derived edge-cases to specialize a general purpose algorithm. Even though va_to_pa() is still one of the most popular dynamic call sites in Solaris, its generality during steady-state execution is its liability during boot-time.
cached_va_to_pa()'s lone static call site has this call-graph:
_start
  main
    startup
      startup_modules
        sfmmu_init_nucleus_hblks
          cached_va_to_pa
However, this lone static call-site is (dynamically) called millions of times during boot. It initializes one field within a key data structure used within Solaris's virtual memory implementation. At this point during boot, cached_va_to_pa() is able to leverage these facts:
- the kernel is single-threaded
- a physical page has a statically-known and fixed size (8KB)
- the caller is initializing a static array of small data structures whose virtual addresses are contiguous
These facts allow cached_va_to_pa() to quickly translate a VA to a PA. cached_va_to_pa() caches the last (starting) PA that was computed. If the next translation maps to that same physical page, then the PA is reused. If not, a call to va_to_pa() is made (and the computed (starting) PA is saved). Trivially, since the kernel is single-threaded when cached_va_to_pa() is called, one is guaranteed that the previous and current calls are from the same kernel thread. As a result, the hit ratio of this one-element lookup cache is optimal: only one in twenty-five calls (per 8KB physical page) requires a call to the "heavyweight" va_to_pa() implementation.
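The one-element cache described above can be imitated in a few lines of shell to watch the hit ratio emerge. This is a toy simulation, not the kernel's C code: the 320-byte structure size, the fake "add a fixed offset" translation, and the names slow_va_to_pa and simulate are all assumptions made for illustration.

```sh
PAGE_SIZE=8192      # fixed page size, as stated above
slow_calls=0        # how often the "heavyweight" path runs
cached_page=-1      # page number of the last translation (-1: none yet)
cached_pa=0         # starting PA of that page

# Stand-in for va_to_pa(); the real one asks OBP. The offset is made up.
slow_va_to_pa() {
    slow_calls=$((slow_calls + 1))
    pa=$(($1 + 1048576))
}

# One-element cache keyed on the page containing the VA; result is left in $pa.
cached_va_to_pa() {
    page=$(($1 / PAGE_SIZE))
    if [ "$page" -ne "$cached_page" ]; then
        slow_va_to_pa $((page * PAGE_SIZE))
        cached_page=$page
        cached_pa=$pa
    fi
    pa=$((cached_pa + $1 % PAGE_SIZE))
}

# simulate COUNT SIZE: translate COUNT contiguous structures of SIZE bytes,
# starting at VA 0.
simulate() {
    i=0
    while [ "$i" -lt "$1" ]; do
        cached_va_to_pa $((i * $2))
        i=$((i + 1))
    done
}

# Example:
#   simulate 100 320; echo "$slow_calls slow calls"
```

With 100 structures of 320 bytes, the addresses span four 8KB pages, so the slow path runs only 4 times out of 100 calls.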
So, even though the time-complexity of this part of Solaris's boot algorithm is still linear (k * O(n)), the constant number of instructions used per physical page, k, is significantly smaller.
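As a rough worked example of how k shrinks, assume (these symbols are not from the original measurements) that a call into va_to_pa() costs c_slow instructions, a cache hit costs c_fast instructions, and 25 translations fall on each 8KB page:

```latex
k_{\text{before}} = 25\,c_{\text{slow}}
\qquad
k_{\text{after}} = c_{\text{slow}} + 25\,c_{\text{fast}}
\qquad
\frac{k_{\text{after}}}{k_{\text{before}}} = \frac{1}{25} + \frac{c_{\text{fast}}}{c_{\text{slow}}}
```

When c_fast is tiny compared to c_slow, the per-page cost approaches one twenty-fifth of what it was, while n (the number of pages) is unchanged.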
The large reduction is due to not having to call OBP. OBP's binary interface requires its client (e.g., Solaris) to implement a call that is consistent with OBP's stack-based machine. As a result, a call occurs by having Solaris translate its "C" language implementation values into FORTH language tokens, pushing the token(s), executing the tokens, popping the tokens, and translating the tokens back into "C" from FORTH.
Posted at 09:16PM Aug 23, 2005 by Andrew Rutz in Solaris | Comments[0]