Friday Sep 18, 2009

The new Patching Pre-flight Checks ('ppc') tool is now available to all customers who have a support contract.

The idea for this tool comes directly from customer feedback.  

The customer wanted to reduce the cost of patching Solaris systems by enabling more junior Sys Admins to successfully patch Solaris 10 zones systems.   Their concern was that potential zones patching issues in versions of Solaris prior to Solaris 10 8/07 (Update 4) meant that they needed to assign senior System Administrators to patch such systems to identify and resolve potential issues.  

Furthermore, the customer was concerned that such issues had the potential to derail planned maintenance windows - for example, if during the patching session an unexpected issue was encountered and the patching session couldn't be completed as planned.

To address these concerns, my colleague, Ronan O'Connor, has written the Patching Pre-flight Checks tool, 'ppc'.  It can be run prior to a planned patching session to check that the target system is in a clean state ready for patching.

It's important to understand the scope of the tool.  It checks a target system (and a patch set, if supplied) for a variety of inconsistencies which could cause problems.

It looks for left over lock files from previously aborted patching or packaging operations, inconsistencies in the contents database, IDRs installed on the target system, zones "mountability", space issues, etc.  Some of these issues can occur on early versions of Solaris 10, particularly in a Zones environment.  Many of the underlying causes of such issues are fixed in the latest versions of the patch utility patches (119254 SPARC / 119255 x86), which is why we always recommend you apply the latest patch utility patches before applying other patches.

If you have a directory of patches to be applied, 'ppc' checks the integrity of those patches, and cross-checks whether any of the patches patch pkgs which have been locked down by any IDRs on the system and warns if there is a conflict.

The 'ppc' Release Notes provide information to help interpret the messages produced.

The idea is that 'ppc' can be run by a junior Sys Admin prior to a planned patching session, and any potential issues uncovered can then be analyzed by a more experienced Sys Admin.  This helps avoid nasty surprises during patches sessions and also helps to reduce the level of expertise required to patch Solaris systems, leading to cost savings for customers.

It is outside the scope of the 'ppc' tool to do root cause analysis of why the inconsistency arose or what actions may be needed, if any, to correct the situation.

If 'ppc' returns without noting any problems, you can be pretty confident that the patching session will succeed.  If 'ppc' notes potential issues, they can be investigated prior to the planned maintenance window.

The next version of 'ppc' will include a Zones consistency check to check that all zones are at a consistent patch level.   It will also contain a more sophisticated space checking algorithm.  There's no planned release date yet for Version "2.0" yet as we're awaiting feedback on Version 1.0.x first.

Some of the ideas in 'ppc' may find their way back into 'patchadd', although it's probably appropriate to keep 'ppc' as a separate tool.

To download the Patching Pre-flight Checks tool, 'ppc', go to the 'ppc' thread on the Customer Patch Forum.  If you have not accessed the Customer Patch Forum before, please see my blog entry on the initial "secret-handshake" login. 

The Patching Pre-flight Checks tool, 'ppc', and the Customer Patch Forum are only available to customers with a support contract.

We're very interested in your feedback as to the usefulness of this tool and how you'd like to see 'ppc' develop going forward.

Many thanks to Ronan O'Connor for all his work on the tool!

Best Wishes,

Gerry Haskins
Director, Software Patch Services

Friday Aug 14, 2009

My colleague, Ed Clark, has made significant improvements to the Solaris 10 Recommended and Sun Alert patch clusters.  These improvements have just been released and are in the current clusters available to contract customers from the Patch Cluster & Patch Bundle Downloads on SunSolve.

Ed's improvements include:

  • Filtering out "false negatives" from the patch utility return codes, so that if the cluster install script returns "1", you know you've got a real problem which needs investigating.   As you may know, the Solaris patch utility, 'patchadd', can return errors for some acceptable situations - for example, if the patch is already applied to the system, or a later revision of the patch or a patch which obsoletes it is already applied to the system, or none of the packages in the patch are on the target system (e.g. because a reduced Install Metacluster was used to install it or the system has been security hardened by package removal), etc.   Such conditions are acceptable "errors" which do not usually require further investigation by the user.  By filtering these conditions out, if the 'installcluster' script returns "1", you know it isn't because of one of these acceptable "errors", and therefore you need to look at the logfiles to find out what's gone wrong.  For further information, please see the cluster README and Analyzing a patchadd or patchrm Failure in the Solaris OS.
  • The new 'installcluster' script will exit as soon as it encounters an unexpected failure - i.e. not one of the acceptable "errors" mentioned above.  This prevents potentially compounding issues by attempting to apply further patches.
  • The new 'installcluster' script includes context intelligence for patching operations.   It informs the user when zones need to be halted, and it provides phased installation to handle patches which absolutely require an immediate reboot before further patches can be applied.  Such interim reboots are only needed when patching a live boot environment on a system below Kernel patch 118833-36 (SPARC) / 118855-36 (x86) and well as the earlier interim reboot required on x86 related to 'libc.so' patches and Kernel patch 118844-14.  On systems below these patch levels, the 'installcluster' will stop at the appropriate point when patching the live boot environment, and inform the user to reboot and re-invoke the 'installcluster' script.  (In the old cluster install script, it simply tried to carry on blindly past such interim reboots, spewing out error messages, although code in the relevant patches prevented any harm from being done).  These interim reboots, when required, are dealt with relatively early in the cluster install sequence so that once completed, the Sys Admin can leave the rest of the installation to finish unattended and move onto other systems.
  • The new 'installcluster' script provides better integration with Solaris Live Upgrade as the user can now specify the Live Upgrade alternate boot environment to patch by name.
  • The new 'installcluster' script performs space checking prior to installing each patch, and will halt if it believes there is insufficient space to complete the installation successfully.  For example, this helps avoid non-global zones getting out of sync regarding patch levels with respect to the global zone.  This is an important enhancement as running out of space during patching can potentially leave the system in an inconsistent state and is to be avoided.  Even removing a patch requires space, so immediate removal of a patch which has failed to apply correctly due to space issues should be avoided until sufficient space is freed up and potential issues caused by its partial installation investigated - for example, was the undo.Z file successfully created to enable backout ? (Tip: It may be better to retry the patch installation once space has been freed up rather than patch removal in such circumstances.  Contact Sun Support for instructions if you encounter such issues.).   The space checking enhancements in the 'installcluster' script are designed to prevent such problems occurring.
  • The messages and log files produced by the 'installcluster' script are clear and well structured.  For example, a "failed" log is created if a patch fails to apply.  See the Cluster README for further information.
  • The 'patch_order' places patches in an optimal order for installation to avoid known issues - for example, the patch utilities patches are installed as early in the sequence as possible to avoid hitting patch installation bugs which are fixed in the patch utility patches, and the Kernel patch procedural script override patch, 125555 (SPARC) / 125556 (x86), is ordered prior to 137137-09 (SPARC) / 137138-09 (x86) to resolve some known issues.  When patching an alternate boot environment (which is recommended), a small sub-set of pre-requisite patches, primarily the patch utility patches, need to be applied to the live boot environment to ensure correct patching operation.  The 'installcluster' script will check for these pre-requisite patches are halt installation if they are not present, advising the user of the 'installcluster' script option to use to install these pre-requisite patches.   Further patches may need to be installed on the live boot environment to support Live Upgrade.  See the cluster README for further information.
  • The patches have been moved to a 'patches' sub-directory, to de-clutter the top level directory of the unzipped cluster.
  • Please see the cluster README file for further information.  Customers should read the cluster README file and look at the Special Install Instructions in the patches within the cluster prior to installation.

I really want to thank Ed Clark for the enormous amount of thought and effort he has put into improving the cluster installation experience.   The work he's done on the Solaris 10 Recommended and Sun Alert patch cluster is a continuation of his previous work on the Solaris Update Patch Bundles and the Solaris 10 Live Upgrade Zones Starter Patch Bundle.  Nice work, Ed!

While the 'installcluster' script is copyrighted, I am happy for customers to use it, and the 'patch_order' file, as a starting point for their own customized patch bundles, so long as it is for their own use and is not to be given to a 3rd party or used for commercial gain (e.g. by a 3rd party maintainer or 3rd party commercial automation tool).

We have also made significant improvements to the back end processes to ensure higher and more consistent cluster quality. 

Originally, the clusters were created by the Patch Operations and Distribution (POD) team after patch release.  The POD Cluster QA process left a lot to be desired, resulting in inconsistent cluster quality.   To plug this gap, my Patch System Test team have been testing the clusters for several years, but the old process only allowed us to test them in parallel with their release, which meant that we found issues at the same time that early downloaders of the cluster encountered them.  Although we ensured such issues were fixed as quickly as possible, it still obviously compromised our customers' experience.

In the new process, the clusters are routed to Patch System Test (PST) prior to release.  PST run a transformation script on them to optimize the patch installation order, etc.  The clusters will only be released once they have passed PST testing.  This should ensure higher and more consistent quality for customers.  Work is continuing to move the entire patch cluster generation process to PST, although these future backend enhancements in this regard should be invisible to customers.

Friday Jun 19, 2009

The Zones Parallel Patching enhancement for the Solaris 10 patch utilities was released this week giving customers a choice of how to improve zones patching performance.

In the Zones "Update On Attach" section of a previous blog posting, I mentioned that the Zones "Update On Attach" feature could also be used to improve Zones patching perfomance.

Zones Parallel Patching is a true patching solution utilizing the 'patchadd' utility.  

Whereas Zones "Update On Attach" uses zones functionality similar to that used during zones creation to provide a pseudo-patching solution that does not utilize 'patchadd'. 

So which one to choose ?

Let's look at the two options in more detail:

Zones Parallel Patching

Zones Parallel Patching is an enhancement to the standard Solaris 10 patch utilities and is delivered in the patch utilities patch, 119254-66 (SPARC) and 119255-66 (x86).

Simply install this patch, set the maximum number of non-global zones to be patched in parallel in the config file /etc/patch/pdo.conf, and away you go.

It works for all Solaris 10 systems. 

It also works well in conjunction with higher level patch automation tools such as xVM Ops Center. 

It can dramatically improve zones patching performance by patching non-global zones in parallel.  The global zone is still patched first.

While the performance gain is dependent on a number of factors, including the number of non-global zones, the number of on-line CPUs, the speed of the system, the I/O configuration of the system, etc., a performance gain of ca. 300% can typically be expected for patching the non-global zones - e.g. On a T2000 with 5 sparse root non-global zones.

See my previous Zones Parallel Patching blog entry for further information.

Since it's a pure enhancement to 'patchadd', it's normal 'patchadd' functionality.  You can subsequently remove patches using 'patchrm', etc.  Nothing has changed except that it's now much faster to patch non global Zones with Zones Parallel Patching invoked.

Zones "Update On Attach"

The primary purpose of Zones "Update on Attach" is Zones migration from one server to another.  

For example, a database instance in a non-global zone hosted on a server has grown to the extent that the Sys Admin wants to transfer it to a better spec'd server which can better handle the workload.   The Sys Admin can detach it from the old server (e.g. a Sun4u) and reattach it to the new server (e.g. a Sun4v) using Zones "Update On Attach".   This will bring the OS Software level on the non-global zone up to the same level as the new server's global zone.

Zones "Update On Attach" can certainly be used for patching but there are limitations you need to be aware of as outlined below.

For example, detach the non-global zones from a system, apply a bunch of patches to the global zone, reattach the non-global zones using "Update On Attach" and viola, the non-global zones will be brought up to the same software level as the global zone (for OS type packages), effectively patching the non-global zones without using 'patchadd' at all.   This is typically even faster than using Zones Parallel Patching.  But there are limitations to this approach which users must be aware of (see below).

My senior engineer, Enda O'Connor, has just published an interesting article on The Zones Update on Attach Feature and Patching in the Solaris 10 OS

Zones "Update On Attach" limitations as a patching aid

Zones "Update On Attach" only works for packages which are SUNW_PKG_ALLZONES=true - i.e. typically OS level packages, and not application packages.

So when to use Zones Parallel Patching in 'patchadd' and when to use Zones "Update On Attach" ?

Here's what my senior engineer, Enda O'Connor, says:

"The Zones Update on Attach Feature and Patching in the Solaris 10 OS document may help customers understand how the technology works, applying a cluster via patching and via zones Update On Attach is not quite the same really.

It really depends on the patches being applied, i.e. applying a firefox patch via Update On Attach would not work if you wanted it to apply to the global zone and all non-global zones as well.

One has to understand how Update On Attach works and then apply that to the list of patches to see if it gets them to a desirable state.

There is no black or white answer here.

I'd recommend Zones Parallel Patching using 'patchadd' as it has a known outcome all the time, whereas Update On Attach makes it's own internal determination based on a number of things, that can vary from system to system ( e.g. inherited directories ).

But if time to patch is critical then if the customer does proper testing to validate things, and are happy with the results, then by all means use Update On Attach.

But using Update On Attach without:

1. Understanding how it determines what packages to update

2. Not inspecting the patches being applied.

...will most likely lead to grief at some point."

And my other senior engineer, Ed Clark, says:

"In terms of giving guidance on which technology to use, there are a number of considerations -- two of these considerations are:

1. Using Update On Attach to update sparse zones can require significantly more disk storage space than would be needed by applying patches with 'patchadd' (3-4 times as much space would not be uncommon i think), due to Update On Attach copying fully populated global zone 'undo' files into the non-global zones, as opposed to having patchadd build sparsely populated 'undo' files in the non-global zones.

2. If a customer is really concerned about the ability to back out patches reliably, then 'patchadd' is a lower risk option than Update On Attach -- 'patchrm' of a patch from a non-global zone that has a copy of the global zones 'undo' pkg data (as is the case after Update On Attach) may potentially have unexpected side effects." [although we have yet to see any actual cases of negative results from this.]

Conclusion

In general, we recommend using the Zones Parallel Patching enhancement in the patch utilities rather than the Zones "Update On Attach" feature as Zones Parallel Patching is standard patching functionality, only faster, whereas Zones "Update On Attach" is really designed for migrating zones from one server as another and was not primarily designed to speed up patching.  

Because Zones "Update On Attach" uses Zones functionality similar to the zone creation functionality, rather than 'patchadd' functionality, limitations exist on what will be patched (typically the OS but not applications) and there's the potential for anomalies around things like the "undo" files which would be used by 'patchrm' if patches applied using Zones "Update On Attach" were subsequently removed from the non-global zones using 'patchrm' (although we have yet to see any actual cases of serious issues resulting from this).

So in patching situations where time is absolutely critical, Zones "Update On Attach" may provide a good option, as long as it's well tested in the customer environment prior to deployment on production systems.

Remember too, Live Upgrade is also your friend in such situations, enabling you to patch an inactive boot environment while the system is still in production.   So a combination of Live Upgrade and Zones Parallel Patching would be ideal.

I hope you find this helpful!

Best Wishes,

Gerry.

Thursday Jun 18, 2009

The Solaris 10 5/09 (Update 7) patch bundle is now available for download from the SunSolve Patch Cluster & Patch Bundle Download Page.  Click on the "Solaris Update Patch Bundles" link.

As with previous patch bundles, it contains the patches which are included in the corresponding Solaris Update, in this case Solaris 10 5/09 (Update 7).

This is useful for Sys Admins who wish to bring all their systems up to the same patch level as the Solaris Update without wanting to upgrade to the release - for example, due to change control policy restrictions in their organizations.

See previous blog entries for previous Solaris Update patch bundles for further information.

Wednesday Jun 17, 2009

The Zones Parallel Patching feature is now available in the latest Solaris 10 patch utilities patch, 119254-66 (SPARC) and 119255-66 (x86).

This is available for use on all Solaris 10 systems. 

Simply install this patch, set the maximum number of non-global zones to be patched in parallel in the config file /etc/patch/pdo.conf, and away you go.

Prior to this feature, each non-global zone was patched sequentially, leading to unnecessarily long patching times for zones systems.  (Sequential patching remains the default behavior unless the config file is edited to enable Zones Parallel Patching.)

With this feature invoked, the global zone continues to be patched first, but then the non-global zones can be patched in parallel, leading to significant performance gains in patching operations on Zones systems.

While the performance gain is dependent on a number of factors, including the number of non-global zones, the number of on-line CPUs, the speed of the system, the I/O configuration of the system, etc., a performance gain of ca. 300% can typically be expected for patching the non-global zones - e.g. On a T2000 with 5 sparse root non-global zones.

Here's the relevant note from the patch README file:

NOTE 10: 119255-66 is the first revision of the patch utilities to deliver "zones parallel patching".
          This new functionality allows multiple non-global zones to be patched in parallel by patchadd.
          Prior to revision 66, patchadd would patch all applicable non-global zones sequentially,
          that is one after another. With zones parallel patching, a sysadmin can now set the number
          of zones to patch in parallel in a new configuration file for patchadd called /etc/patch/pdo.conf.

         The two factors that affect the number of non-global zones that can be patched in parallel are
         1. Number of on-line CPUs
         2. The value of num_proc in /etc/patch/pdo.conf

          If the value of num_proc is less than or equal to 1.5 times the number of on line CPUs,
          then patchadd limits the maximum number of non-global zones that will be patched in
          parallel to num_proc. If the value of num_proc is greater than 1.5 times the number of on line CPUs,
          then patchadd limits the maximum number of non-global zones that will be patched in parallel
          to 1.5 times the number of on line CPUs. Note that patchadd will patch all applicable non-global
          zones on a system, the above description outlines only how patchaadd determines the
          maximum number of job slots to be used during parallel patching of non-global zones.

          An example of this in operation would be where:
          num_proc=8
          and number of on line CPU's is 4

          In this case the maximum setting for num_proc would be 6, that is the maximum number
          of zones that could be patched in parallel is 6.  If there are more than this number of non-global zones on the
          system, the first 6 will be patched in parallel, then the remaining non-global zones will be patched
          as processes finish patching the first 6 non-global zones.   Only one patch process will be used for each
          non-global zone, so if there are less than 6 non-global zones on the system, then only the number of processes
          equal to the number of non-global zones will be initiated.

          Please see comments in /etc/patch/pdo.conf for more details on setting num_proc.

I would like to thank Ed Clark and Enda O'Connor from my own team for all their work in developing and testing Zones Parallel Patching.

I would also like to thank Jon Bowman, Arindam Sarkar, and the rest of the RPE (Sustaining) Install team for all their work in getting this feature integrated into the patch utilities and delivered to production.

I would also like to thank our selected key customers who kindly Beta tested the feature for us.

I believe this feature is an important milestone in improving our customers' patching experience in a Zones environment as it addresses a long standing customer complaint on Zones patching performance.

Enjoy!

Thursday Dec 04, 2008

New title, same role, same me

I was promoted to Director, Software Patch Services in September.  The last couple of months have been quite hectic, as I've suddenly got a whole new bunch of buddies in Marketing and elsewhere who want some of my time.  That's a good thing, and I believe it will help me to drive and co-ordinate improvements for you, our customers, patching experience. 

Resources are limited and, as always, I'm interested in getting your thoughts as to what areas I should concentrate on next.  

Some of the stuff we're currently working on is outlined below as well as other information which I hope you will find useful.

Solaris 10 10/08 Patch Bundle

The Solaris 10 10/08 Patch Bundle, which delivers the equivalent set of patches to the Solaris 10 10/08 (Update 6) release image, is now available from SunSolve.  See my blog entry below on the Solaris 10 5/08 (Update 5) Patch Bundle for further information on why we produce it, what it contains, why you might wish to use it, how to download it, etc.

Recommended and Sun Alert patch cluster contents updated

I discussed the purpose of, and difference between, the Solaris Recommended and Sun Alert patch clusters in a previous blog posting. To recap:

The "Recommended" Cluster contains the latest revision of any Solaris OS patch which addresses a Sun Alert issue.  That is, a fix for a Security, Data Corruption, or System Availability issue.  The cluster also contains the latest revision of the patch utility patches to ensure correct patch application and any patch required by any other patch in the cluster.

The Sun Alert Cluster is newer, and contains the minimum revision of any Solaris OS patch which addresses a Sun Alert issue. The cluster also contains the latest revision of the patch utility patches to ensure correct patch application and any patch required by any other patch in the cluster.  Therefore, the Sun Alert Cluster provides the minimum amount of change to fix all Solaris OS Sun Alert issues. 

Both clusters are updated whenever a new patch meeting their inclusion criteria is released.  The Sun Alert Cluster changes less frequently than the "Recommended" Cluster as it contains only what is really needed to address Sun Alert issues and apply the patches.

One of my team members has been reconciling the cluster contents against the Sun Alert reports and the cluster contents have been updated as a result.  Some issues where found, largely to do with patches for things like GNOME which are also part of the Solaris OS.  A process has been put in place to ensure the cluster contents match the patches specified in the Sun Alert reports.   

Keeping as up to date as possible with the SunAlert or Recommended Cluster contents is advisable.   Remember also to keep firmware up to date.

BTW: The monthly EIS (Enterprise Installation Standards) patch baseline is based upon the Recommended Cluster contents but also includes ca. 150 additional patches to address irritants which are not Sun Alert fixes and includes patches for SunCluster, SunVTS, etc.  The monthly EIS patch baselines are available through xVM Ops Center and Sun Proactive Services.

I am planning to merge the Recommended and Sun Alert patch clusters into a single cluster using the Sun Alert cluster criteria as having two very similar clusters tends to confuse customers unnecessarily.  

I also intend to merge the two cluster pages on SunSolve as one is essentially a better formated subset of the other. 

ZFS and Zones features fully contained in patches

As I've mentioned previously, there's effectively a single customer visible code branch for each Solaris named release.  That means that there's one set of patches for all of Solaris 10, a separate set for Solaris 9, and a separate set for Solaris 8.  Within a named release, e.g. Solaris 10, the same set of patches will apply to any of the Solaris 10 releases, from the original Solaris 10 3/05 release right up to the current Solaris 10 10/08 (Update 6) release.  This simplifies System Administration and enables Sun to provide very long term support at reasonable cost for each Solaris named release. 

A consequence of effectively having a single code branch for each Solaris named release is that any change to pre-existing packages will be delivered in patch format.

New features are typically only added to the current Solaris named release, which is currently Solaris 10.  (They are also available via OpenSolaris.)

This means that if new features don't add any new packages, then the entire feature functionality is fully available in patches.  Customers can utilize the new features by simply applying the appropriate patches to their existing Solaris 10 system.  This is the case with all current Zones and ZFS* functionality, including neat features like ZFS Root, ZFS Boot, and Zones "Update on Attach".

Other features which deliver new packages are only available from the Solaris Update release in which they were first included.  So, for example, if a new package was first delivered in Solaris 10 8/07 (Update 4), then a customer wishing to use that feature would need to install or upgrade to the Solaris 10 8/07 (Update 4) or subsequent update release image.   Such features are not available in patches.

*OK, we cheated with ZFS.  ZFS does deliver new packages, but they are streamed into existence from a patch.  This type of patch is called a "genesis" patch, but they are hard to perfect, so we don't intend to release any more "genesis" patches.

Improving Zones Patching Performance

Zones Parallel Patching

My team has been working with those awfully nice folks in the Sustaining organization to deliver a Zones Parallel Patching enhancement to the patch utilities to dramatically improve Zones patching performance.  We have a fully stable prototype which has been given to selected Beta customers to trial. 

For a simple T2000 with 5 sparse non-global zones, the performance improvement is >3x.  On systems with optimized I/O (as Zones patching is primarily I/O bound), we expect the performance improvement to be even better.  A configuration file will allow users to select how many Zones to patch in parallel.  This will typically equate to the number of processors or threads available on the target system.

The general release of this feature is planned for April 2009.

Zones "Update on Attach" 

The Kernel patch associated with Solaris 10 10/08 (Update 6), 137137-09 (SPARC) / 137138-09 (x86) contains some cool new features, such as ZFS Root, ZFS Boot, and Zones "Update on Attach".  Beware, installing this patch requires significant free disk space to install!  See Sun Alert http://sunsolve.sun.com/search/document.do?assetkey=1-66-246207-1

Zones "Update on Attach" is a very cool feature indeed.

For example, if the patch level of non-global Zones is out-of-sync with respect to the global Zone, e.g. because the non-global Zones ran out of disk space during patch application, Zones "Update on Attach" provides a very neat way to bring the Zones back into sync.  Simply detach the affected non-global Zones, apply Kernel patch 137137-09 (SPARC) / 137138-09 (x86) to the global zones, and reattach the affected non-global Zones using 'zoneadm -z <zone-name> attach -u'.  The non-global Zones will be automagically updated to the same patch level as the global Zone.  Neat!

There are other interesting possibilities.  For example, detach all non-global Zones, apply an arbitrary set of patches to the global Zone (including 13713[78]-09), and reattach the non-global Zones using 'zoneadm -z <zone-name> attach -u'.  Viola!, the non-global Zones will be automagically updated with all of the patches applied to the global Zone.  Way neat!  And more importantly, way faster than even the Zones Parallel Patching solution we're working on.  And even better, it's available now!  This could be a key solution for customers having difficulty completing patching updates on Zones systems during tight maintenance windows.

We are working to explore potential caveats.  For example, when a patch is applied using 'patchadd' to a non-global zone, an "Undo.Z" file containing the data necessary to back out the patch is created specifically for each non-global zone to which the patch is applied.   Using Zones "Update on Attach" to patch non-global Zones will cause the "Undo.Z" file from the global Zone to be propagated to the non-global Zones.  This could theoretically cause issues if the patch is subsequently backed out (e.g. data from global Zone config files could potentially be merged into non-global Zone config files during patch backout which could potentially cause issues), although we've never actually encountered such an issue.  BTW: The same caveat applies to creating non-global Zones after the global Zone has been patched.  Again, we have yet to see this causing an actual issue, so it appears to be more of a theoretically caveat than a practical issue.

Improvements to 'smpatch' and Update Manager

The way the PatchPro analysis engine for 'smpatch' and Update Manager used to work was fine in theory, but in practice was what I call "a process with too many moving parts".   Too many steps had to happen correctly for the overall result to be correct.  In Six Sigma terms, there was too much error opportunity.  Occasionally, it would end up recommending a SPARC patch for an x86 system or a Solaris 8 patch for a Solaris 10 system.  Not surprisingly, its reputation suffered.

I'm pleased to say that a major overhaul to dramatically simplify the back end processing of 'smpatch' and Update Manager has just been rolled out by their engineering team.  The way 'smpatch' and Update Manager work is that Realization Detector(s) are associated with each patch.  These Realization Detectors determine whether it's appropriate to recommend a patch for application on a target system.  In the vast majority of cases, the Realization Detectors are simply comparing the packages contained in the patch to the packages installed on the system to see if the patch is applicable.  The enhancement is to replace these myriad Realization Detectors, which could potentially contain coding bugs, with a single Generic Realization Detector to map patch packages to packages on the target system.  It looks at the package name, package version, and package architecture fields (in pkginfo) for each package in the patch, and compares them to the same values for the packages installed on the target system.  If they match, the patch is recommended, else not.  Guess what, this is exactly how 'patchadd' decides whether a patch is applicable or not when installing a patch.  It's also how 'pca' works too in determining which patches to apply.

A few specialist Realization Detectors remain for a small number of patches which require special handling.

The changes to 'smpatch' and Update Manager should dramatically improve the reliability of these tools and the accuracy of their patching recommendations.

One remaining distinction between 'smpatch' / Update Manager and 'pca' is that 'pca' "knows" about all current Sun patches via the patchdiag.xref file, whereas 'smpatch' / Update Manager "knows" about all patches containing a 'patchinfo' file, including older patch revisions.  All Solaris OS and Java Enterprise System (middleware) patches contain a 'patchinfo' file.  These account for 49% of patches.  For patching the Solaris OS, the tools should produce similar results.  A decision was made not to "auto-include" all other patches for 'smpatch' and Update Manager, as it was felt that the explicit step of the patch creator including a non-blank PATCH_CORRECTS realization detector specification line in the 'patchinfo' file to signal that the patch was suitable for patch automation was potentially useful.  (Don't worry about what value the PATCH_CORRECTS field has.  This is overriden by the Generic Realization Detector in the vast majority of cases.  It has no meaning from a customer perspective.)

This enhancement is not an attempt to undermine 'pca'.  It's simply to improve 'smpatch' and Update Manager.  I will continue to work closely with Martin Paul to give him heads-ups on any initiative which may impact 'pca' and resolve any issues with patchdiag.xref.

One thing I want to do when I can free up some resources, is a comparative study of the patching recommendations of the various available patch automation tools, 'smpatch' / Update Manager, 'pca', UCE (a.k.a Sun Connection Satellite),  xVM Ops Center*, and TLP (Traffic Light Patching) which is used by Sun Proactive Services to provide tailored patching solutions for customers in conjunction with SRAS (Sun Risk Analysis Service) and the EIS (Enterprise Installation Standards) methodology, with a view to ensuring that the patching recommendations of the various tools are coherent and consistent, with the higher value tools providing more sophisticated analysis.  It's part of my efforts to co-ordinate patching improvements to improve our customers' patching experience.

*xVM OC also utilitizes the monthly EIS patch "baselines".

Same Patch Entitlement policy, new Patch Entitlement implementation

Solaris changed its business model a few years ago from selling Solaris and providing patches for free to a model of giving away the software releases for free and charging for patches. 

The policy is that patches delivering new security fixes will remain free to all customers, irrespective of whether or not they have a support contract, but most other patches require that customers have a valid support contract to access them.  (See my earlier blog entry on the subject.)

All fixes will all be available for free in the next Solaris Update release (and OpenSolaris), so customers not willing to pay for a support contract can still get the fixes by installing or upgrading to the next Solaris Update release.  They'll just need to wait for it to ship.  Alternatively, they can use OpenSolaris.

This policy is not changing.

What is changing is the implementation of patch entitlement to ensure it matches the policy.  Currently, circa 60% of Solaris patches are free, including most of the key patches.  Under the new entitlement implementation, 18% of Solaris patches will remain free, including the specific revision of all Solaris patches which include new security fixes.  The rest will require a valid support contract to access. 

Any of the following support contracts will provide access to all Solaris patches and patch clusters: a Solaris subscription, a Software Support Contract, a Sun System Service Plan for Solaris, a Sun Spectrum Storage Plan, or a Sun Spectrum Enterprise Service Plan.  Since the names of the support contracts change from time-to-time, this list may change.

The new implementation will roll out in Phases, starting this month.  The roll-out should be transparent to customers with valid support contracts.

Patch signing certificate renewal

The signing certificate used to sign Sun patches expires shortly.  A new signing certificate will be rolled out in January and instructions provided on how to adopt it.

Customers who download the unsigned patch versions will not need to take any action.

"Accumulation-only" patches

The "SplitGate" source code management model we first introduced in Solaris 10 8/07 (Update 4) has dramatically improved Solaris 10 patch quality.  A side-effect of the "SplitGate" model is that base PatchIDs (the first 6 digits) change at the end of each Update release.  See my earlier Solaris 10 Kernel PatchID Sequence posting.

In the "SplitGate" model, when building an Update release, we effectively have two parallel source code gates, one called the Sustaining Gate containing just the bug fixes we need to release to customers in patches asynchronous to the Update release, and the other called the Update Gate containing a superset of the the Sustaining Gate and as well as new features and less critical bug fixes which will be released as part of the Update release. 

The two gates remain separate (split) for the duration of the Update release build process.  Once the Update release has reached release quality, the Update Gate is promoted to become the new Sustaining Gate and the process repeats.  Since the Update Gate is always a strict superset of the Sustaining Gate, no regressions should result from the promotion of the Update Gate to become the new Sustaining Gate.  Each patch in the old Sustaining Gate is obsoleted by a corresponding patch from the Update Gate which has accumulated its contents.  When the Update is released, these new PatchIDs are released to SunSolve.  This is why you see the base PatchIDs changing after each Update release. 

If the Update Gate patch doesn't contain any additional code changes over the corresponding Sustaining Gate patch, then there's no need for customers to install the new Update Gate patch.  Such patches are called "accumulation-only" patches and can be identified as they have a different base PatchID (the first 6 digits) but don't contain any additional CR numbers over the Sustaining patch which they obsolete.

The reason Sun releases these "accumulation-only" patches is because some customers insist that all of the PatchIDs pre-applied into a Solaris Update release image be also available from SunSolve.

Monday Jun 09, 2008

Last week, the Solaris 10 05/08 (Update 5) Patch Bundle was released on SunSolve.  The patch bundle provides another option to customers when deciding their patching strategy to maintain their Solaris systems.

What is the Solaris 10 05/08 Patch Bundle ?

The Solaris 10 05/08 Patch Bundle contains the equivalent set of patches contained in the Solaris 10 05/08 (Update 5) release image.

Why use the Solaris 10 05/08 Patch Bundle ?

The Solaris 10 05/08 Patch Bundle was created as a result of direct customer feedback after the Solaris 10 08/07 (Update 4) release.  New hardware may require a specific minimum Solaris 10 Update release such as the Solaris 10 05/08 (Update 5) release.  Some customers may wish to bring their other existing Solaris 10 systems up to the same patch level as the new hardware running Solaris 10 05/08.  The recommended way to do this is to upgrade the existing systems to the Solaris 10 05/08 release using either regular Solaris Upgrade or Solaris Live Upgrade.  But some customers may have policies in place which make it difficult to upgrade but OK to patch a system.  The Solaris 10 05/08 Patch Bundle facilitates such customers to bring their existing systems up to the equivalent patch level to the Solaris 10 05/08 (Update 5) Release.  In theory, this should mean that pre-existing functionality on all of the customers' systems should react the same, warts and all. This makes for a more homogeneous environment which may help lower support costs.

The Solaris 10 Update releases are very intensely tested by a wide variety of QA teams within Sun.  Therefore, the functionality contained in the patches within the Solaris 10 05/08 Patch Bundle have been intensely tested as a unit through the testing performed on the Solaris 10 05/08 (Update 5) release image.  Additional testing of the Solaris 10 05/08 Patch Bundle has also been performed by the Patch System Test team.  Therefore, the Solaris 10 05/08 Patch Bundle provides a well tested "baseline" option on which to standardize systems.

So while the patch bundle may deliver more change than some other patching strategies, that change has been well tested as a unit and hence may actually reduce the risk of introducing regressions when compared to "dim sum" patching (i.e. choosing an arbitrary combination of patches).  Note that intensive processes are also in place to ensure "dim sum" patching works, and it's rare to encounter a problem caused by "dim sum" patching.

How does the Patch Bundle differ from the Solaris 10 05/08 (Update 5) release image ?

The Solaris 10 05/08 (Update 5) release is a complete Solaris release image.  It contains new packages to support new features in the Solaris 10 05/08 (Update 5) release as well as all Solaris patches which were available when the Update was built.  The patches are pre-applied into the Solaris 10 05/08 release image.  This means that one doesn't have to spend time adding the patches using 'patchadd'.  On the flipside, since the patches are pre-applied into the release image, they cannot be backed out using 'patchrm'.  This isn't generally a problem as the Solaris Update release images are very intensely tested.  One can do a fresh install of the Solaris 10 05/08 (Update 5) release, or upgrade to it from an earlier Solaris release.

The Solaris 10 05/08 Patch Bundle contains the equivalent set of patches to the Solaris 10 05/08 (Update 5) release. The patch bundle does not include the new packages contained in the Solaris 10 05/08 (Update 5) release.  Therefore, new features in Update 5 which depend upon new packages introduced in that release will not be available in the patch bundle.  However, as discussed in a previous blog entry, any change to pre-existing code is delivered in a patch.  This includes features as well as bug fixes.  Therefore some feature enhancements will be available in the patch bundle.  ZFS, for example, is typically self-contained in patches and hence ZFS enhancements will typically be available via the patch bundle as well as via the Update release image. So will most Zones enhancements.  The patch bundle is simply a collection of patches with an install order file (patch_order) and an install script wrapper (installbundle.sh) around 'patchadd'.  Patches in the patch bundle can be backed out using 'patchrm', so long as the '-d' (no save) option wasn't used when applying the patch bundle.

There are a number of "special" or "script" patches included in each Solaris Update release.  These patches are used to correct issues in how patches are pre-applied into the Solaris Update release image and have no purpose whatsoever outside of the Solaris Update release process.  Therefore, these "special" or "script" patches are not released to SunSolve and are not included in the patch bundle.  See the Solaris 10 05/08 Patch Bundle README file for further information on these and other minor differences between the patch set pre-applied in the Solaris 10 05/08 release image and the patch set included in the Solaris 10 05/08 Patch Bundle.

Access 

The Solaris 10 05/08 Patch Bundle is available from the usual patch cluster location. 

Log onto Sunsolve, click on Patches and Updates, then Recommended Patch Clusters and scroll down the box under "Recommended Solaris Patch Clusters, J2SE and Java Enterprise System Clusters" to the Solaris 10 SPARC 05/08 Patch Bundle and Solaris 10 x86 05/08 Patch Bundle entries.  

The cluster is chunked to aid download.  There are 2 chunks for x86 and 3 chunks for SPARC. 

Follow the download instructions to the right of the scroll-down box or read the README file for any chunk.

As with all patch clusters, you need a valid support contract to download the cluster.   The following support contracts include access entitlement to Solaris patches and Patch Clusters (BTW: Software Update = patch), plus a wide range of additional support services:  Solaris Subscriptions, which includes Basic, Standard, Premium, and Solaris Everywhere Service Plans (compare here); Sun Software Service Plans, including Basic, Standard, Premium, and Premium Plus; Sun System Service Plans for Solaris, which includes Bronze, Silver, Gold, or Platinum options (compare here); or a Sun Spectrum Enterprise Service Plan.  See also http://www.sun.com/servicelist/ for country specific details.

Installation

Read the Patch Bundle README file for full installation instructions.

The patch bundle can be installed either on the active boot environment (i.e. the live system) or an inactive boot environment. 

Patching an inactive boot environment is recommended as, depending on the starting patch level of the target system, it may involve less system downtime as only a single reboot is required at the end to activate the boot environment. 

If you patch the active boot environment (i.e. the live system), then depending on the starting patch level of the target system, you may need to reboot an x86 system up to three times (twice at specific points during the installation process and once at the end) and a SPARC system up to two times (once after installing Kernel patch 118833-36 and once at the end).  See the patch bundle README for details.

The Solaris 10 05/08 Patch Bundle includes a new install script, installbundle.sh, which guides users through the installation process. 

The patches are ordered in such a way as to process any reboots required when patching an active boot environment as near the start of the installation process as possible.  This is to facilitate System Administrators by allowing them to get over the interim reboots early in the process and kick off the final patching sequence and let the process complete. 

The screen output and logfiles produced are also designed to be as clear and self-explanatory as possible, providing both overview and drill-down capabilities.

Approximate Installation Time 

How long it will take to install the Solaris 10 05/08 Patch Bundle will depend upon a number of factors:

  • The speed of the hardware and its I/O.
  • Which Solaris 10 release is installed on the target system and what patch level the system is at.  The higher the Solaris 10 Update release or patch level, the quicker the patch bundle will apply.
  • Whether Zones are installed on the system and which type of Zone.  Currently, the time to apply the cluster to each whole root non-global Zone will be approximately linear - i.e. multiple the install time by the number of whole root non-global Zones on the system.  Sparse root non-global Zones will be a little faster. (BTW: Sparse root non-global Zones is the recommended option when creating non-global Zones.)  As mentioned in a previous blog posting, there is a project in development to improve Zones patching performance.

For example, I installed the Solaris 10 x86 05/08 Patch Bundle on a v65x running the original Solaris 10 3/05 "FCS" (First Customer Shipment) release with no additional patch applied (worst case) and no non-global Zones.  I applied the patch bundle to the active boot environment.  Installation took a total of 3 hours and 58 minutes plus 3 reboots (see the Patch Bundle README for an explanation of the reboots when patching an active boot environment).

Conclusion

The Solaris 10 05/08 Patch Bundle will not suit everyone.  It is a large collection of patches and hence is slow to download and install.

As described in a previous blog posting, the Sun Alert patch clusters (available from the same location on SunSolve - see above) provide the minimum amount of change to address the most critical Solaris issues.  The Sun Alert cluster contains all available Solaris patch fixes for Security, Data Corruption, and System Availability issues. New versions of the Sun Alert cluster are posted whenever a new patch to fix a Sun Alert issue becomes available.  Customers should try to keep as current as possible with the contents of the Sun Alert clusters.

For customers who want to bring all their systems to the Solaris 10 05/08 (Update 5) release patch level, installing or upgrading to the Solaris 10 05/08 (Update 5) Release image remains the recommended option where feasible.  The Solaris 10 05/08 Patch Bundle was simply created in response to a demand from customers for an alternative option where upgrading was not feasible due to internal customer policies.

Since Solaris Update releases are intensely tested, the patch bundle provides a good quality patch "baseline" on which to standardize systems.

From customer feedback to date, the next Patch Bundle for the equivalent set of patches for Update 6 is likely to also be a complete set of patches from Solaris 10 3/05 "FCS" (First Customer Shipment - i.e. the original Solaris 10 release) and not an incremental bundle just containing the patch set delta between Updates 5 and 6 as I had previously suggested.  Feel free to post a comment with your preference.

Enjoy!

Thursday Jan 10, 2008

As mentioned in my initial posting, there isn't a "one size fits all" patching strategy for all customers to use in all circumstances.

Perhaps the most common question which customers ask is, "What patches should I apply to my system ?"

The answer, unfortunately, is "It depends."

Many factors determine what patching strategy is appropriate for a particular system.  These may include:

  • Risk profile of the customer.  For example, Financial institutions tend to be very risk adverse.  Their change control processes can be onerous.
  • Criticality of the system.  Is it Life Critical, Mission Critical, Business Critical, or relatively expendable ?
  • Risk profile of the system.  For example, is it behind a firewall, is it vulnerable to Denial Of Service attacks, etc. ?
  • Cost of planned downtime (for proactive patching and maintenance) versus the cost of unplanned downtime (for reactive break-and-fix patching and maintenance).
  • Available Maintenance Windows
  • Upgrade strategy - Is the customer still on older versions such as Solaris 8 or Solaris 9 or is there a desire to leverage some of the cool new software features (e.g. Containers (Zones), ZFS, DTrace, etc.) or support for cool new hardware available in Solaris 10 ?
  • Desire to keep a relatively homogeneous Operating Environment across similar servers
  • etc.

While I can discuss some evolving thoughts on patching strategies here, please note that Sun Services offer comprehensive solutions tailored for the needs of specific customers.  The thoughts expressed here are not a substitute for careful analysis of the specific needs of individual customers.

Minimizing Risk

Risk minimization is a key consideration for many customers when deciding on a patching strategy.

Change implies risk.

There are industry studies which show that for every x number of bug fixes or lines of code changed, a new bug or regression is introduced.

One might logically conclude therefore, that the more change that is applied to a system, the more the risk of introducing a regression.  Hence, one might conclude that applying the minimum number of patches and hence the minimum amount of change would minimize risk.

But that's just one factor.

It's also important to consider the test coverage of the various change delivery options.  This includes test coverage by Sun as well as test coverage by the customer, channel partner, or other vendor.

Always install the latest patch utilities patch first

Always install the latest version of the Solaris patch utilities patch before installing any other patches.  This is important to ensure that you have all the latest fixes to the patch utilities.  The patch utility patches are always contained in the Solaris Recommended and SunAlert patch clusters and are always installed first, along with any patches which they themselves require.

The patch utilities patches are currently:

  Solaris 10 SPARC:    119254
  Solaris 10 x86:           119255
  Solaris 9 SPARC:      112951
  Solaris 9 x86:             114194
  Solaris 8 SPARC:      110380
  Solaris 8 x86:             110403

Depending on the OS version, several other patches may be required to avoid issues which can impact correct patch application.  Such patches are listed in the "Latest patch updates" section on the SunSolve home page.

Solaris Patch Management: Recommended Strategy 

The Solaris Patch Management: Recommended Strategy available from http://docs.sun.com and linked off the SunSolve "Patches and Updates" page is a good starting point. 

Perhaps surprisingly, it shows from an analysis of customer Explorer data that the more patches which are applied to a system, the more downtime that system will experience.  This is largely because, as discussed in the preceding posting, a number of patches require downtime in order to be installed on a live boot environment. 

However, in many cases the cost of unplanned downtime to fix issues is much, much higher than the cost of planned downtime to facilitate preventative patching to prevent issues from occurring in the first place. 

The trick is to know which patches are likely to prevent issues on a particular system.

Recommended and Sun Alert Patch Clusters 

When deciding what patches to apply to Solaris, the Recommended and the Sun Alert Patch Clusters, which are available from SunSolve to customers with a valid support contract, are a good starting point.  They provide:
  • The latest revision of patch utilities patch
  • Solaris patches which address Sun Alert issues - that is, patches which address Security, Data Corruption, or System Availability issues.
  • Any patch which is required by either of the above.

The main difference between the Recommended Patch Cluster and the newer Sun Alert Patch Cluster is that the Sun Alert Patch Cluster contains the lowest revision of patches which address Sun Alert issues while the Recommended Patch Cluster contains the latest available revision of such patches.  Both are good options.

Note, the Recommended and Sun Alert Clusters only contain patches for the Solaris OS.  They do not contain patches for middleware or application layer products such as Java ES, SunStudio, etc. 

Both the Recommended and the Sun Alert Patch Clusters come with an install_cluster script and a patch_order file listing the order in which the patches are to be installed.  See the Cluster README files linked off the "Patches and Updates" page on SunSolve for further information.   (On http://sunsolve.sun.com/show.do?target=patches/patch-access , "Solaris 10 x86" is the Solaris 10 x86 Recommended Cluster and "Solaris 10 x86 Sun Alert Patch Cluster" is self-explanatory.)

Applying the Solaris Recommended or Sun Alert patch cluster at each available maintenance window, plus any patches for fixes for bugs which you as a customer have filed yourself, is a good approach to proactive patching.

In between maintenance windows, monitor new Sun Alerts which are issued and determine whether your systems are vulnerable to the issue.  If the risk of the issue occurring is low or the consequences of the potential problem manageable, you may decide that it's OK to wait until the next maintenance window before applying the patch or taking whatever other action is recommended in the Sun Alert.  If the risk of the issue occurring is high and the potential problem severe, consider applying the patch or taking whatever other action is recommended in the Sun Alert as soon as possible.

Apart from these Solaris patch clusters, other patch clusters are available on http://sunsolve.sun.com/show.do?target=patches/patch-access for other products, such as J2SE and Java ES.

Installing or Upgrading to the latest Solaris Update Release

Each bi-weekly build of the next Solaris Marketing Release ( "Nevada" ) and the next Solaris 10 Update Release is intensely tested by a large number of test teams throughout Sun.  Each team has a particular focus, from functional testing of new features, regression testing of pre-existing features, performance improvement testing (each release should be faster than the last), new hardware testing, hardware regression testing, Desktop, Globalization, Accessibility, SunCluster, Java Enterprise System, patch testing, etc.

Due to the intensive testing of Update releases, installing or upgrading a system to the latest available Update Release should be seriously considered by customers wishing to minimize risk. 

While each Update release contains significant amount of code change, it has been intensely tested as a unit.  As previously mentioned, each Update contains all the available bug fixes at the time it was built.  Therefore, pre-existing functionality should be more stable and more performant in each successive Update.  The latest available Update should therefore provide a good stable baseline for customers. 

For example, Solaris 10 8/07 (Update 4) not only introduces cool new software features and support for cool new hardware, it also contains many fixes and enhancements to pre-existing functionality such as Containers (Zones), such as the ability to Upgrade Zones, significantly improving the maintainability of Zones.

The complexities of patching the live boot environment of a pre-Update 4 Zones systems can be avoided by upgrading to Update 4 instead.

"Dim Sum" Patching

As previously mentioned, new bug fixes as well as features "soak" in the next Marketing Release of Solaris under development (currently "Nevada";) to shake out any bugs in the code, before the code changes are allowed to be put back into an older release of Solaris for inclusion in a patch and, if the patch is for Solaris 10, included in the next Update Release.

 

In this way, patches leverage the intensive testing done on the Marketing and Update releases.  Indeed, Solaris 10 patches leverage this intensive testing twice: once in the Marketing Release "soak" test, and again when the bug fix is included in builds of the next Solaris 10 Update.

The bug fix in each patch is verified by the responsible engineers in-house using a test case and/or by providing the T-Patch (Test Patch) to escalating customers to verify that it fixes the issue.  In addition, the patch will be tested by the Patch System Test group and potentially by other test teams such as the Desktop QA or Hardware QA teams.  Only when all verification and testing has been successfully completed will the patch be released to SunSolve.  See an Overview of Patch Testing on SunSolve for further details.

Patch System Test test the patch both individually with any required patches, and cumulatively along with all other available patches. Testing the patch on its own helps ensure that all patch requirements have been correctly specified.  Testing the patch in combination with all other available patches helps ensure that there are no bad interactions between patches.  Testing these boundary conditions gives confidence that other patch combinations should work.

Extreme lengths are also taken in the code development and putback approval processes to ensure that patch requirements are correctly specified and that the change is compatible, well designed, and will not introduce regressions.

Nevertheless, if a customer takes individual patches which they feel are appropriate to their system, outside of a defined patch cluster such as the Recommended or Sun Alert Patch Cluster, they may end up running a code combination which has never been tested as a unit. 

The various checks and balances in the patch process should be fully sufficient to ensure this code combination is stable and functional.  But from a risk management perspective, running code which may not have been tested as a unit remains a finite risk.

This is what Bart Smaalders refers to as "Dim Sum" patching.

Most customers have practiced "Dim Sum" patching for years and, in general, it works very well.  Even with the massive amount of code changes included in Solaris 10 Update Releases compared to Solaris 8 or Solaris 9 Update Releases, there have been very few issues as a result of "Dim Sum" patching.

But using the latest available Solaris Update release or Recommended or SunAlert Cluster or EIS CD (via Sun Connection 1.1.1 Satellite or xVM Ops Center 1.0) or other set of patches as a baseline has the advantage that that baseline has been tested to varying degrees as a unit, with Solaris Update releases the most intensely tested of those options.

This is a case where taking more change by installing or upgrading to the latest Solaris Update may actually imply less risk.

Customer Testing

Testing by Sun is just one factor.  Testing by the customer, channel partner, or other vendor also plays a significant part in managing risk.

If the customer has a test set up which exactly mirrors their live production environment, with tests which mimic normal and peak loads, then their confidence level in any patching strategy they choose can be very high.

The less sophisticated the customer test environment, the more the customer is relying on Sun's Development and QA processes to  catch all the issues.

Patch Quality 

The good news is that the Sun's Development processes are meticulous and mature and the QA processes sophisticated and effective. 

That doesn't stop all issues escaping to customers but, in general, the quality of patches is very high. 

For example, out of approximately 4,500 patches released by Sun in 2007, only 70 have been subsequently withdrawn due to serious issues with them.

Patch Maturity

A number of customers like to wait a set period of time after a patch has been released before considering installing it to see if Sun or other customers find issues with it.

This is a reasonable strategy.

Some customers wait until 6 weeks after a patch has been released before applying it.  Data analysis shows that there is no particular significance in this time period.

Data analysis shows that on the rare occasion where patches which are withdrawn from SunSolve after their release due to serious issues with them, the length of time between when a patch was released and when it was withdrawn shows no clear period of time before which a patch can be considered unsafe or after which a patch can be considered safe.  Some patches are withdrawn within the first 3 weeks of release.  Others not until 18 months or 2 years later. 

However, it is reasonable to assume that patches which aren't withdrawn for a significant period of time only have a serious issue is a rare configuration which most customers won't encounter.

This blog copyright 2009 by Gerry Haskins