Friday Aug 14, 2009

My colleague, Ed Clark, has made significant improvements to the Solaris 10 Recommended and Sun Alert patch clusters.  These improvements have just been released and are in the current clusters available to contract customers from the Patch Cluster & Patch Bundle Downloads on SunSolve.

Ed's improvements include:

  • Filtering out "false negatives" from the patch utility return codes, so that if the cluster install script returns "1", you know you've got a real problem which needs investigating.   As you may know, the Solaris patch utility, 'patchadd', can return errors for some acceptable situations - for example, if the patch is already applied to the system, or a later revision of the patch or a patch which obsoletes it is already applied to the system, or none of the packages in the patch are on the target system (e.g. because a reduced Install Metacluster was used to install it or the system has been security hardened by package removal), etc.   Such conditions are acceptable "errors" which do not usually require further investigation by the user.  By filtering these conditions out, if the 'installcluster' script returns "1", you know it isn't because of one of these acceptable "errors", and therefore you need to look at the logfiles to find out what's gone wrong.  For further information, please see the cluster README and Analyzing a patchadd or patchrm Failure in the Solaris OS.
  • The new 'installcluster' script will exit as soon as it encounters an unexpected failure - i.e. not one of the acceptable "errors" mentioned above.  This prevents potentially compounding issues by attempting to apply further patches.
  • The new 'installcluster' script includes context intelligence for patching operations.   It informs the user when zones need to be halted, and it provides phased installation to handle patches which absolutely require an immediate reboot before further patches can be applied.  Such interim reboots are only needed when patching a live boot environment on a system below Kernel patch 118833-36 (SPARC) / 118855-36 (x86) and well as the earlier interim reboot required on x86 related to 'libc.so' patches and Kernel patch 118844-14.  On systems below these patch levels, the 'installcluster' will stop at the appropriate point when patching the live boot environment, and inform the user to reboot and re-invoke the 'installcluster' script.  (In the old cluster install script, it simply tried to carry on blindly past such interim reboots, spewing out error messages, although code in the relevant patches prevented any harm from being done).  These interim reboots, when required, are dealt with relatively early in the cluster install sequence so that once completed, the Sys Admin can leave the rest of the installation to finish unattended and move onto other systems.
  • The new 'installcluster' script provides better integration with Solaris Live Upgrade as the user can now specify the Live Upgrade alternate boot environment to patch by name.
  • The new 'installcluster' script performs space checking prior to installing each patch, and will halt if it believes there is insufficient space to complete the installation successfully.  For example, this helps avoid non-global zones getting out of sync regarding patch levels with respect to the global zone.  This is an important enhancement as running out of space during patching can potentially leave the system in an inconsistent state and is to be avoided.  Even removing a patch requires space, so immediate removal of a patch which has failed to apply correctly due to space issues should be avoided until sufficient space is freed up and potential issues caused by its partial installation investigated - for example, was the undo.Z file successfully created to enable backout ? (Tip: It may be better to retry the patch installation once space has been freed up rather than patch removal in such circumstances.  Contact Sun Support for instructions if you encounter such issues.).   The space checking enhancements in the 'installcluster' script are designed to prevent such problems occurring.
  • The messages and log files produced by the 'installcluster' script are clear and well structured.  For example, a "failed" log is created if a patch fails to apply.  See the Cluster README for further information.
  • The 'patch_order' places patches in an optimal order for installation to avoid known issues - for example, the patch utilities patches are installed as early in the sequence as possible to avoid hitting patch installation bugs which are fixed in the patch utility patches, and the Kernel patch procedural script override patch, 125555 (SPARC) / 125556 (x86), is ordered prior to 137137-09 (SPARC) / 137138-09 (x86) to resolve some known issues.  When patching an alternate boot environment (which is recommended), a small sub-set of pre-requisite patches, primarily the patch utility patches, need to be applied to the live boot environment to ensure correct patching operation.  The 'installcluster' script will check for these pre-requisite patches are halt installation if they are not present, advising the user of the 'installcluster' script option to use to install these pre-requisite patches.   Further patches may need to be installed on the live boot environment to support Live Upgrade.  See the cluster README for further information.
  • The patches have been moved to a 'patches' sub-directory, to de-clutter the top level directory of the unzipped cluster.
  • Please see the cluster README file for further information.  Customers should read the cluster README file and look at the Special Install Instructions in the patches within the cluster prior to installation.

I really want to thank Ed Clark for the enormous amount of thought and effort he has put into improving the cluster installation experience.   The work he's done on the Solaris 10 Recommended and Sun Alert patch cluster is a continuation of his previous work on the Solaris Update Patch Bundles and the Solaris 10 Live Upgrade Zones Starter Patch Bundle.  Nice work, Ed!

While the 'installcluster' script is copyrighted, I am happy for customers to use it, and the 'patch_order' file, as a starting point for their own customized patch bundles, so long as it is for their own use and is not to be given to a 3rd party or used for commercial gain (e.g. by a 3rd party maintainer or 3rd party commercial automation tool).

We have also made significant improvements to the back end processes to ensure higher and more consistent cluster quality. 

Originally, the clusters were created by the Patch Operations and Distribution (POD) team after patch release.  The POD Cluster QA process left a lot to be desired, resulting in inconsistent cluster quality.   To plug this gap, my Patch System Test team have been testing the clusters for several years, but the old process only allowed us to test them in parallel with their release, which meant that we found issues at the same time that early downloaders of the cluster encountered them.  Although we ensured such issues were fixed as quickly as possible, it still obviously compromised our customers' experience.

In the new process, the clusters are routed to Patch System Test (PST) prior to release.  PST run a transformation script on them to optimize the patch installation order, etc.  The clusters will only be released once they have passed PST testing.  This should ensure higher and more consistent quality for customers.  Work is continuing to move the entire patch cluster generation process to PST, although these future backend enhancements in this regard should be invisible to customers.

Thursday Mar 26, 2009

Here's a Solaris patch presentation (updated July 28th 2009) which I hope will be of use to customers, partners, and field engineers.

It distills much of the information contained in this blog, including the recommended patching strategy, best practices, news on patching enhancements, background information on Solaris and the bug fix process, etc.

Patching isn't a one-size-fits-all operation, but I hope you find the information provided a useful guideline.

Tuesday Mar 10, 2009

My team and I have been working with the SunSolve team on improvements to the Patch Cluster pages on SunSolve.  These improvements went live on April 20, 2009.

The old "Recommended Patch Clusters" and "Recommended and Security Patches" pages have been combined into a single Patch Cluster & Patch Bundle Downloads page.

A Notice Board section at the top of the page will be used to alert customers to current issues.

Click on the cluster headings to see a brief description of the purpose of the cluster, with links to view the cluster README as well as a download link.   The date the cluster was last updated and the size of the cluster are also shown.

No change has been made to the underlying cluster file names, so scripts using 'wget' to access the patch clusters should be unaffected.

This is part of an ongoing effort to improve our patch presentation to customers.

As before, customers need a valid support contract in order to be able to access patch clusters. 

If you are not registered in Member Support Center, simply log into SunSolve and associated one or more support contracts with your Sun Online Account using the "Change Contract" option in the top right hand menu.

If you are registered in Member Support Center, your contracts will be automatically associated with your account (and the "Change Contract" option will not be shown when you log into SunSolve).

Thursday Mar 05, 2009

A fix to the unzip utility is available in recent patch utility patch revisions.  This fix is required in order to be able to successfully unzip very large files such as the Solaris 10 Recommended and Sun Alert Patch Clusters.

Please download the latest revision of the patch utilities patch first and install it, before attempting to unzip the Solaris 10 Recommended or Sun Alert Patch Clusters.

The fix was incorporated in the putback to CRs 6344676 and 6464056.

The following are the earliest revisions of the patch utilities containing the fix:

  • Solaris 10 SPARC: 119254-46 or above
  • Solaris 10 x86:        119255-46 or above
  • Solaris 9 SPARC:   112951-14 or above
  • Solaris 9 x86:          114194-11 or above
  • Solaris 8 SPARC:   108987-19 or above
  • Solaris 8 x86:          108988-19 or above

Without the fix to unzip provided by the above patches, the following error will be seen when attempting to unzip the Solaris 10 Patch Clusters:

# unzip -q 10_Recommended.zip

note:  didn't find end-of-central-dir signature at end of central dir.
  (please check that you have transferred or created the zipfile in the
  appropriate BINARY mode and that you have compiled UnZip properly) 

In addition, do not unzip Solaris patch clusters on Windows. Solaris patch clusters, and solaris patches more generally, can contain case-sensitive file names. Consequently clusters and patches must be unzipped on a case-sensitive filesystem (corruption can occur if unzipping on filesystems that are not case-sensitive). 

The above information is now published in Infodoc http://sunsolve.sun.com/search/document.do?assetkey=1-62-252447-1

This blog copyright 2009 by Gerry Haskins