The BeleniX LiveCD Performance Story Part 3
Here is another update on the performance work we have been doing, which will be evident in the upcoming 0.4.3 release. With the latest optimizations I have been able to reduce the boot time by up to 40 seconds.
- Optimize the system/filesystem/root service. Some of the processing in this script is not relevant to a LiveCD, and at present not relevant to x86 either. One example is a check for the sun4v platform to get an optimized libc_psr library. Another is a check for "/usr" and "/boot" mounts in /etc/vfstab.
- Reduce processing in the system/identity/node service for setting the hostname. For a livecd, simply setting the hostname via uname is enough.
- BeleniX scans disk device nodes in devices-local using fstyp to try and identify supported filesystems that can be automatically mounted. Executing /usr/sbin/fstyp for each /dev/dsk node was found to be too heavyweight so a simplified shell function was implemented in the service script. In addition udfs fstyp support has been dropped for now. It was found from the DTrace output that udfs fstyp does lots of I/O.
- Change the dependency of the xserver service. It now has an "optional_all" dependency on sshd as well. Sshd was found to do a lot of I/O during startup, which created heavy disk contention with xserver, which also does a lot of I/O. With the dependency in place xserver waits for sshd to finish starting, and this has reduced contention in a big way. This change alone gives a reduction of up to 15 seconds.
Essentially the concept of parallel service startup to reduce boot time is turned on its head in a LiveCD. We actually need more or less serial startup rather than heavy parallelism, to reduce contention for slow CDROM access!
- The last change, which gave a reduction of about 20 seconds, is a new algorithm to sort file names from DTrace iosnoop output to achieve a more optimal file data ordering in mkisofs. Earlier we were just sorting by first access, which was inefficient. The new algorithm attempts to identify contention regions and tries to keep the affected files close to the contention regions and close to each other, with the objective of minimizing CDROM head movement. Here's the rundown:

Objective: Files being accessed piecemeal and in bunches should be close to each other and be close to the contention region.
Q: How do we define the limits of a contention region?
We took an arbitrary value of 5, measured in relative positions in the access list. Now loop through the list, keep track of every file access, and assign weights.
We wish to examine scattered accesses, so we are not interested in consecutive repeated access to blocks of the same file. We collapse all such consecutive accesses (uniq) and assign a weight of 1. In addition we also note the total number of bytes accessed for each consecutive access group.
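The collapsing step can be sketched in a few lines of Python. This is only an illustration, not the actual BeleniX script: the `(filename, nbytes)` record format, the function name `collapse_runs` and the sample file names are assumptions made for the example.

```python
from itertools import groupby

def collapse_runs(accesses):
    """Merge consecutive accesses to the same file into one group.

    accesses: iterable of (filename, nbytes) in access order
              (an assumed, simplified view of iosnoop output).
    Returns a list of (filename, total_bytes), one per consecutive run.
    """
    collapsed = []
    for name, run in groupby(accesses, key=lambda rec: rec[0]):
        # Sum the bytes of every access in this consecutive run.
        collapsed.append((name, sum(nbytes for _, nbytes in run)))
    return collapsed

# Hypothetical sample trace.
trace = [
    ("libc.so.1", 4096),
    ("libc.so.1", 8192),
    ("libX11.so.4", 4096),
    ("libc.so.1", 4096),
]
print(collapse_runs(trace))
```

Note that the second libc.so.1 access group stays separate from the first, because another file was read in between. Those scattered repeats are exactly what the next step looks at.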
So as an example, let's say libX11 is accessed at relative position 10. We create an entry at position 10 in a list and assign it a weight of 1. The last position pointer for libX11 is set to 10. Now there are accesses to other files, and after 3 entries libX11 is accessed again, at position 14. Since 14 - 10 = 4 <= 5, we increment the weight of the previous libX11 entry by 1 instead of creating a new entry. The last position pointer for libX11 is also updated to 14. We keep on doing this until current libX11 access position - last libX11 position > 5, at which point we create a new entry.
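Here is a rough Python sketch of that weighting pass, under some assumptions: the input is a list of already-collapsed `(filename, nbytes)` groups, the list index serves as the relative position, and bytes from merged accesses are added to the existing entry (the write-up above does not spell out that last detail). The names `weigh_accesses` and `WINDOW` are made up for the example.

```python
WINDOW = 5  # the arbitrary contention-region limit from the text

def weigh_accesses(groups, window=WINDOW):
    """groups: list of (filename, nbytes), consecutive repeats already collapsed.

    Returns a list of entries (dicts with name, pos, weight, bytes).
    A repeat access within `window` positions of the file's last access
    bumps the weight of the file's current entry instead of adding a new one.
    """
    entries = []
    current = {}   # filename -> index in `entries` of the entry being extended
    last_pos = {}  # filename -> position of the most recent access
    for pos, (name, nbytes) in enumerate(groups):
        if name in last_pos and pos - last_pos[name] <= window:
            entry = entries[current[name]]
            entry["weight"] += 1
            entry["bytes"] += nbytes  # assumption: accumulate bytes per entry
        else:
            current[name] = len(entries)
            entries.append({"name": name, "pos": pos, "weight": 1, "bytes": nbytes})
        last_pos[name] = pos
    return entries

# Hypothetical trace: libX11 at positions 0, 4 and 11.
groups = [("libX11", 100), ("a", 10), ("b", 10), ("c", 10), ("libX11", 200),
          ("d", 10), ("e", 10), ("f", 10), ("g", 10), ("h", 10), ("i", 10),
          ("libX11", 50)]
for e in weigh_accesses(groups):
    if e["name"] == "libX11":
        print(e)
```

In this sample the access at position 4 falls inside the window (4 - 0 = 4), so the first libX11 entry ends up with weight 2. The access at position 11 is 7 positions after the last one, so it starts a fresh entry with weight 1.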
The next step is to scan the new list and, for each file, retain only the entry with the highest weight, at its relative position. If two entries have the same weight then retain the entry having the greater number of bytes accessed. This tends to “pull” contending files closer to the heavy access region and closer to each other.
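A minimal sketch of this selection step in Python, assuming each entry is a dict with `name`, `pos`, `weight` and `bytes` keys. Ordering the survivors by their retained position to get the final file order is my reading of the description, and the function name `pick_placement` and the sample entries are hypothetical.

```python
def pick_placement(entries):
    """Keep one entry per file: highest weight, ties broken by bytes accessed.

    Returns file names ordered by the position of their retained entry.
    """
    best = {}
    for e in entries:
        cur = best.get(e["name"])
        # Tuple comparison: weight first, then bytes as the tie-breaker.
        if cur is None or (e["weight"], e["bytes"]) > (cur["weight"], cur["bytes"]):
            best[e["name"]] = e
    return [e["name"] for e in sorted(best.values(), key=lambda e: e["pos"])]

# Hypothetical weighted entries.
entries = [
    {"name": "libX11", "pos": 0,  "weight": 1, "bytes": 100},
    {"name": "libc",   "pos": 1,  "weight": 1, "bytes": 50},
    {"name": "libX11", "pos": 7,  "weight": 3, "bytes": 900},
    {"name": "libXt",  "pos": 8,  "weight": 2, "bytes": 300},
    {"name": "libc",   "pos": 20, "weight": 1, "bytes": 80},
]
print(pick_placement(entries))  # ['libX11', 'libXt', 'libc']
```

Here libX11 is placed at its heavy-access position 7 rather than at its first access, which lands it right next to libXt. The two libc entries tie on weight, so the one with more bytes wins.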
Essentially this is a sliding window algorithm that identifies contention regions by looking for file access repetitions that fall within the window as it slides over the iosnoop data.
Posted by Sriram Narayanan on April 28, 2006 at 04:24 AM PDT