Ramblings from Richard's Ranch

/etc/system viruses

Wednesday Jun 16, 2004

There is a class of problems I call "/etc/system viruses", which occur when someone blindly copies someone else's /etc/system settings. Beyond propagating inappropriate settings, this can cause real problems.

For example, at least one Sun Cluster customer has

set halt_on_panic=1

in /etc/system. This setting means the system will not automatically reboot after a panic.

There may be a good reason for this setting. Perhaps this is a test cluster. Perhaps they have a 24x7x365 managed datacenter and this is their policy.

I do not recommend this as standard operating procedure in most cases, because a panic then requires human intervention to boot the system. Fortunately, panics are not very common. In general, if you are trying to build a highly available system, you want to reduce the number of scenarios where human intervention is required.
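One way to catch a copied setting like this is to list every active "set" line in /etc/system and ask why each one is there. The following is only a minimal sketch in Python, not a Sun-provided tool. It assumes comment lines begin with "*" or "#" and that tunables use the "set [module:]variable=value" form; the function names active_settings and report are made up for illustration.

```python
import re

# Matches lines of the form "set variable=value" or "set module:variable=value".
# Assumption: the operator is one of '=', '|' or '&'.
SET_LINE = re.compile(r"^set\s+(?:(\w+):)?(\w+)\s*([=|&])\s*(\S+)")


def active_settings(text):
    """Return (module, variable, operator, value) for each active set line."""
    found = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: comment lines start with '*' or '#'; skip them and blanks.
        if not line or line[0] in "*#":
            continue
        match = SET_LINE.match(line)
        if match:
            found.append(match.groups())
    return found


def report(path="/etc/system"):
    """Print each active setting so a human can review why it is there."""
    with open(path) as handle:
        for module, name, op, value in active_settings(handle.read()):
            prefix = f"{module}:" if module else ""
            print(f"{prefix}{name} {op} {value}")
```

Running report() on a host prints one line per active setting. Anything nobody on the team can explain is a candidate "/etc/system virus" and deserves a closer look before it is copied anywhere else.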

Comments:

You should also have a look at http://au.sun.com/news/onsun/2003-10/tech_tips_print.html which I wrote last year for the Australian Customer magazine.

Posted by Alan Hargreaves on June 17, 2004 at 06:29 PM PDT #

Richard, you make a very good point. Availability is a continuum, with clustering at one end of the continuum. There are many things customers can do to make a single system highly available.

Posted by Mark Harrison on June 18, 2004 at 02:39 AM PDT #
