help & trueness
Juergen Schleich's Weblog
20091206 Sunday December 06, 2009
Sun Cluster 3.2 11/09 Update3 Patches

Sun Cluster 3.2 11/09 Update3 has been released.

The package versions of Sun Cluster 3.2 11/09 Update3 are the same for the core framework and the agents as in Sun Cluster 3.2, Sun Cluster 3.2 2/08 Update1 and Sun Cluster 3.2 1/09 Update2. It is therefore possible to bring an existing Sun Cluster 3.2, Sun Cluster 3.2 2/08 Update1 or Sun Cluster 3.2 1/09 Update2 installation to the Update3 level by patching.

The package versions of Sun Cluster Geographic Edition 3.2 11/09 Update3 are NOT the same as in Sun Cluster Geographic Edition 3.2. However, Geographic Edition 3.2 can be upgraded without interrupting the service. See the documentation for details.

The following patches (at the listed revisions) are included/updated in Sun Cluster 3.2 11/09 Update3. If these patches are installed on a Sun Cluster 3.2, Sun Cluster 3.2 2/08 Update1 or Sun Cluster 3.2 1/09 Update2 release, the framework and agent features are identical to those of Sun Cluster 3.2 11/09 Update3. It is always necessary to read the "Special Install Instructions of the patch", but I have added a note (abbreviated SIIOTP) to those patches where reading them is especially important.

Included/updated patch revisions of Sun Cluster 3.2 11/09 Update3 for Solaris 10 05/09 Update7 or higher
126106-38 Sun Cluster 3.2: CORE patch for Solaris 10 Note: Please read SIIOTP
125992-05 Sun Cluster 3.2: SC Checks patch for Solaris 10
126017-03 Sun Cluster 3.2: HA-DNS Patch for Solaris 10
126032-09 Sun Cluster 3.2: Ha-MYSQL Patch for Solaris 10 Note: Please read SIIOTP
126035-06 Sun Cluster 3.2: HA-NFS Patch for Solaris 10
126044-06 Sun Cluster 3.2: HA-PostgreSQL Patch for Solaris 10 Note: Please read SIIOTP
126047-12 Sun Cluster 3.2: Ha-Oracle patch for Solaris 10 Note: Please read SIIOTP
126050-04 Sun Cluster 3.2: HA-Oracle E-business suite Patch for Solaris 10 (-04 not yet on SunSolve)
126059-05 Sun Cluster 3.2: HA-SAPDB Patch for Solaris 10
126071-02 Sun Cluster 3.2: HA-Tomcat Patch for Solaris 10
126080-04 Sun Cluster 3.2: HA-Sun Java Systems App Server Patch for Solaris 10
126083-04 Sun Cluster 3.2: HA-Sun Java Message Queue Patch for Solaris 10 Note: Please read SIIOTP
126095-06 Sun Cluster 3.2: Localization patch for Solaris 9 sparc and Solaris 10 sparc
128556-04 Sun Cluster 3.2: Man Pages Patch for Solaris 9 and Solaris 10, sparc
137931-02 Sun Cluster 3.2: Sun Cluster 3.2: HA-Informix patch for Solaris 10


Included/updated patch revisions of Sun Cluster 3.2 11/09 Update3 for Solaris 10 x86 05/09 Update7 or higher
126107-38 Sun Cluster 3.2: CORE patch for Solaris 10_x86 Note: Please read SIIOTP
125993-05 Sun Cluster 3.2: Sun Cluster 3.2: SC Checks patch for Solaris 10_x86
126018-05 Sun Cluster 3.2: HA-DNS Patch for Solaris 10_x86
126033-10 Sun Cluster 3.2: Ha-MYSQL Patch for Solaris 10_x86 Note: Please read SIIOTP
126036-07 Sun Cluster 3.2: HA-NFS Patch for Solaris 10_x86
126045-07 Sun Cluster 3.2: HA-PostgreSQL Patch for Solaris 10_x86 Note: Please read SIIOTP
126048-12 Sun Cluster 3.2: Ha-Oracle patch for Solaris 10_x86 Note: Please read SIIOTP
126060-06 Sun Cluster 3.2: HA-SAPDB Patch for Solaris 10_x86
126072-02 Sun Cluster 3.2: HA-Tomcat Patch for Solaris 10_x86
126081-05 Sun Cluster 3.2: HA-Sun Java Systems App Server Patch for Solaris 10_x86
126084-06 Sun Cluster 3.2: HA-Sun Java Message Queue Patch for Solaris 10_x86 Note: Please read SIIOTP
126096-06 Sun Cluster 3.2: Localization patch for Solaris 10 amd64
128557-04 Sun Cluster 3.2: Man Pages Patch for Solaris 10_x86
137932-02 Sun Cluster 3.2:Sun Cluster 3.2: HA-Informix patch for Solaris 10_x86


Included/updated patch revisions of Sun Cluster 3.2 11/09 Update3 for Solaris 9 5/09 or higher
126105-38 Sun Cluster 3.2: CORE patch for Solaris 9 Note: Please read SIIOTP
125991-05 Sun Cluster 3.2: Sun Cluster 3.2: SC Checks patch for Solaris 9
126016-03 Sun Cluster 3.2: HA-DNS Patch for Solaris 9
126031-09 Sun Cluster 3.2: Ha-MYSQL Patch for Solaris 9 Note: Please read SIIOTP
126034-06 Sun Cluster 3.2: HA-NFS Patch for Solaris 9
126043-06 Sun Cluster 3.2: HA-PostgreSQL Patch for Solaris 9 Note: Please read SIIOTP
126046-12 Sun Cluster 3.2: HA-Oracle patch for Solaris 9 Note: Please read SIIOTP
126049-04 Sun Cluster 3.2: HA-Oracle E-business suite Patch for Solaris 9 (-04 not yet on SunSolve)
126058-05 Sun Cluster 3.2: HA-SAPDB Patch for Solaris 9
126070-02 Sun Cluster 3.2: HA-Tomcat Patch for Solaris 9
126079-04 Sun Cluster 3.2: HA-Sun Java Systems App Server Patch for Solaris 9
126082-04 Sun Cluster 3.2: HA-Sun Java Message Queue Patch for Solaris 9 Note: Please read SIIOTP
126095-06 Sun Cluster 3.2: Localization patch for Solaris 9 sparc and Solaris 10 sparc
128556-04 Sun Cluster 3.2: Man Pages Patch for Solaris 9 and Solaris 10, sparc


The quorum server is an alternative to the traditional quorum disk. It runs on a host outside of the Sun Cluster and is accessed through the public network. Therefore the quorum server host can have a different architecture than the cluster nodes.

Included/updated patch revisions in Sun Cluster 3.2 11/09 Update3 for quorum server feature:
127404-03 Sun Cluster 3.2: Quorum Server Patch for Solaris 9
127405-04 Sun Cluster 3.2: Quorum Server Patch for Solaris 10
127406-04 Sun Cluster 3.2: Quorum Server Patch for Solaris 10_x86
127408-02 Sun Cluster 3.2: Quorum Man Pages Patch for Solaris 9 and Solaris 10, sparc
127409-02 Sun Cluster 3.2: Quorum Man Pages Patch for Solaris 10_x86
Please be aware of the following note, which is in the Special Install Instructions of Sun Cluster 3.2 core patch -38 and higher:
NOTE 17: Quorum server patch 127406-04 (or greater) needs to be installed on quorum server host first, before installing 126107-37 (or greater) Core Patch on cluster nodes.


If some patches must be applied when the node is in noncluster mode, you can apply them in a rolling fashion, one node at a time, unless a patch's instructions require that you shut down the entire cluster. Follow procedures in How to Apply a Rebooting Patch (Node) in Sun Cluster System Administration Guide for Solaris OS to prepare the node and boot it into noncluster mode. For ease of installation, consider applying all patches at once to a node that you place in noncluster mode.


posted by jschleich Dec 06 2009, 08:39:46 PM CET

20091205 Saturday December 05, 2009
That gives a deep insight


Do you know The Celestine Insights?
I have seen the movie The Celestine Prophecy and was quite surprised. This movie is really different from other "normal" movies. It is very exciting and offers a lot of information about bio-energy. The insights are deep and inspire us to study them. This is not a new movie; if you have already seen it - GREAT!!!


posted by jschleich Dec 05 2009, 12:10:17 PM CET

20091204 Friday December 04, 2009
the beauty of fungi


I came across some fungi when I was walking through the forest...
This is a wood and tree fungus; the white was much more beautiful than in the picture.

posted by jschleich Dec 04 2009, 11:00:11 PM CET

20091030 Friday October 30, 2009
Kernel patch 141444-09 or 141445-09 with Sun Cluster 3.2

As stated in my last blog the following kernel patches are included in Solaris 10 10/09 Update8.
141444-09 SunOS 5.10: kernel patch or
141445-09 SunOS 5.10_x86: kernel patch

Update 4.Dec.2009:
Support of Solaris 10 10/09 Update8 with Sun Cluster 3.2 1/09 Update2 is now announced. The recommendation is to use the 126106-39 (sparc) / 126107-39 (x86) with Solaris 10 10/09 Update8. Note: The -39 Sun Cluster core patch is a feature patch because the -38 Sun Cluster core patch is part of Sun Cluster 3.2 11/09 Update3 which is already released.
For new installations/upgrades with Solaris 10 10/09 Update8 use:
* Sun Cluster 3.2 11/09 Update3 with Sun Cluster core patch -39 (fixes problem 1)
* Link-based IPMP (workaround for problem 2). The patches 142900-02/142901-02, which fix the issue, are coming soon.
* Add "set nautopush=64" to /etc/system (workaround for problem 3)

For patch updates to 141444-09/141445-09 use:
* Sun Cluster core patch -39 (fixes problem 1)
* Link-based IPMP if possible; otherwise wait for the fix (workaround for problem 2). The patches 142900-02/142901-02, which fix the issue, are coming soon.
* Add "set nautopush=64" to /etc/system (workaround for problem 3)


It is time to point out that there are some issues with these kernel patches in combination with Sun Cluster 3.2:

1.) The patch breaks the zpool cachefile feature if using SUNW.HAStoragePlus

a.) If the kernel patch 141444-09 (sparc) / 141445-09 (x86) is installed on a Sun Cluster 3.2 system where the Sun Cluster core patch 126106-33 (sparc) / 126107-33 (x86) is already installed then hastorageplus_prenet_start will fail with the following error message:
...
Oct 26 17:51:45 nodeA SC[,SUNW.HAStoragePlus:6,rg1,rs1,hastorageplus_prenet_start]: Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 26 17:51:53 nodeA SC[,SUNW.HAStoragePlus:6,rg1,rs1,hastorageplus_prenet_start]: Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 26 17:51:54 nodeA zfs: [ID 427000 kern.warning] WARNING: pool 'zpool1' could not be loaded as it was last accessed by another system (host: nodeB hostid: 0x8516ced4). See: http://www.sun.com/msg/ZFS-8000-EY
...


b.) If the kernel patch 141444-09 (sparc) / 141445-09 (x86) is installed on a Sun Cluster 3.2 system where the Sun Cluster core patch 126106-35 (sparc) / 126107-35 (x86) is already installed then hastorageplus_prenet_start will work but the zpool cachefile feature of SUNW.HAStoragePlus is disabled. Without the zpool cachefile feature the time of zpool import increases because the import will scan all available disks. The messages look like:
...
Oct 30 15:37:45 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 148650 daemon.notice] Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:45 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 148650 daemon.notice] Started searching for devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 547433 daemon.notice] Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 547433 daemon.notice] Completed searching the devices in '/dev/dsk' to find the importable pools.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 792255 daemon.warning] Failed to update the cachefile contents in /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile to CCR table zpool1.cachefile for pool zpool1 : file /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile open failed: No such file or directory.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 792255 daemon.warning] Failed to update the cachefile contents in /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile to CCR table zpool1.cachefile for pool zpool1 : file /var/cluster/run/HAStoragePlus/zfs/zpool1.cachefile open failed: No such file or directory.
Oct 30 15:37:49 nodeA SC[,SUNW.HAStoragePlus:8,nfs-rg,zpool1-rs,hastorageplus_validate]: [ID 205754 daemon.info] All specified device services validated successfully.
...


If the ZFS cachefile feature is not required AND the above kernel patches are installed, problem a.) is resolved by installing Sun Cluster core patch 126106-35 (sparc) / 126107-35 (x86).
Solution for a) and b):
126106-39 Sun Cluster 3.2: CORE patch for Solaris 10
126107-39 Sun Cluster 3.2: CORE patch for Solaris 10_x86

Sun Alert 272669: A Solaris Kernel Change Stops Sun Cluster Using "zpool.cachefiles" to Import zpools Resulting in ZFS pool Import Performance Degradation or Failure to Import the zpools
This is reported in Bug 6895580
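
To check a saved copy of the messages file for the two symptoms described above, something like the following could be used. This is only a minimal sketch: the marker strings are copied from the log excerpts in this entry, while the function name and the way the lines are obtained are my own assumptions.

```python
# Marker substrings copied from the log excerpts above.
IMPORT_FAILED = "could not be loaded as it was last accessed by another system"
CACHEFILE_DISABLED = "Failed to update the cachefile contents"


def classify_hasp_messages(lines):
    """Return the set of symptoms ('a', 'b') found in the given log lines."""
    symptoms = set()
    for line in lines:
        if IMPORT_FAILED in line:
            symptoms.add("a")  # zpool import failed (problem a)
        if "SUNW.HAStoragePlus" in line and CACHEFILE_DISABLED in line:
            symptoms.add("b")  # cachefile feature disabled (problem b)
    return symptoms
```

The function only looks at text that is handed to it, for example the lines of a copied /var/adm/messages file. An empty result does not prove that a system is unaffected.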

2.) The patch breaks probe-based IPMP if more than one interface is in the same IPMP group

After installing the already mentioned kernel patch
141444-09 SunOS 5.10: kernel patch or
141445-09 SunOS 5.10_x86: kernel patch
the probe-based IPMP feature is broken if the system uses more than one interface in the same IPMP group. This means that all Solaris 10 systems which use more than one interface in the same probe-based IPMP group are affected!

After installing this kernel patch the following errors will be sent to the system console after a reboot:
...
nodeA console login: Oct 26 19:34:41 in.mpathd[210]: NIC failure detected on bge0 of group ipmp0
Oct 26 19:34:41 in.mpathd[210]: Successfully failed over from NIC bge0 to NIC e1000g0
...

Workarounds:
a) Use link-based IPMP instead of probe-based IPMP
b) Use only one interface in the same IPMP group if using probe-based IPMP
See the blog "Tips to configure IPMP with Sun Cluster 3.x" for more details if you want to change the configuration.
c) Do not install the kernel patch listed above. Note: A fix is already in progress and can be requested via a service request. I will update this blog when the general fix is available.

Sun Alert 271519: Solaris 10 Kernel Patches 141444-09 and 141445-09 May Cause Interface Failure in IP Multipathing (IPMP)
This is reported in Bug 6888928

3.) When applying the patch Sun Cluster can hang on reboot

After installing the already mentioned kernel patch
141444-09 SunOS 5.10: kernel patch or
141511-05 SunOS 5.10_x86: ehci, ohci, uhci patch
the Sun Cluster nodes can hang during boot because they have exhausted the default number of autopush structures. When the clhbsndr module is loaded, it causes a lot more autopushes to occur than would otherwise happen on a non-clustered system. By default, only nautopush=32 of these structures are allocated.

Workarounds:
a) Do not use the mentioned kernel patch with Sun Cluster
b) Boot in non-cluster-mode and add the following to /etc/system
set nautopush=64
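
To verify that the setting is really present, the content of /etc/system can be checked with a few lines of Python. This is a minimal sketch with assumptions of my own: the function names are made up, only decimal values are recognized, and lines starting with '*' are treated as comments.

```python
import re

# Matches an active "set nautopush=<n>" line (decimal values only).
_NAUTOPUSH = re.compile(r"^\s*set\s+nautopush\s*=\s*(\d+)\s*$")


def nautopush_value(system_text):
    """Return the last nautopush value set in the /etc/system text, or None."""
    value = None
    for line in system_text.splitlines():
        if line.lstrip().startswith("*"):
            continue  # comment line in /etc/system
        match = _NAUTOPUSH.match(line)
        if match:
            value = int(match.group(1))
    return value


def needs_workaround(system_text, minimum=64):
    """True if nautopush is unset or lower than the workaround value."""
    value = nautopush_value(system_text)
    return value is None or value < minimum
```

Typical use would be needs_workaround(open("/etc/system").read()) on each node. The check only reads the file; the new value still takes effect only after a reboot.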

Sun Alert 273610: Solaris autopush(1M) Changes (with patches 141444-09/141511-04) May Cause Sun Cluster 3.1 and 3.2 Nodes to Hang During Boot
This is reported in Bug 6879232


posted by jschleich Oct 30 2009, 03:57:51 PM CET

20091019 Monday October 19, 2009
Solaris 10 kernel patches
This is a short overview of Solaris 10 kernel patches. The table shows which kernel patch revision is included in each Solaris 10 update release, along with the patch dependencies. Installing the kernel patch of a Solaris 10 update release is not the same as upgrading to that update release. Sometimes it is advisable to upgrade from one Solaris 10 update release to a higher one. There is no general rule, but when it is necessary to skip several Solaris 10 update releases, an upgrade is advisable. The list is sorted from newest to oldest...
sparc patch  x86 patch   Release / status           Dependency (sparc / x86)
142900-xx    142901-xx   newest patchid             requires 141444-09 / 141445-09
141444-09    141445-09   Solaris 10 10/09 Update8   requires 139555-08 / 139556-08
141414-10    141415-10   highest release            obsoleted by 141444-09 / 141445-09
139555-08    139556-08   Solaris 10 05/09 Update7   requires 137137-09 / 137138-09
138888-08    138889-08   highest release            obsoleted by 139555-08 / 139556-08
137137-09    137138-09   Solaris 10 10/08 Update6   requires 127127-11 / 127128-11
137111-08    137112-08   highest release            obsoleted by 137137-09 / 137138-09
127127-11    127128-11   Solaris 10 05/08 Update5   requires 120011-14 / 120012-14
127111-11    127112-11   highest release            obsoleted by 127127-11 / 127128-11
120011-14    120012-14   Solaris 10 08/07 Update4   requires 118833-36 / 118855-36
125100-10    125101-10   highest release            obsoleted by 120011-14 / 120012-14
118833-36    118855-36   highest release            this is a must have for Solaris 10 11/06 Update3
118833-33    118855-33   Solaris 10 11/06 Update3
118833-17    118855-14   Solaris 10 06/06 Update2
118822-30    118844-30   highest release            obsoleted by 118833-36 / 118855-36
118822-25    118844-26   Solaris 10 01/06 Update1
118822-10    118844-11   Solaris 10 03/05 HW1
-            -           Solaris 10

The required and obsoleting patches are titled "SunOS 5.10: kernel patch" (sparc) and "SunOS 5.10_x86: kernel patch" (x86); 120011-14 and 120012-14 are titled "Kernel Update patch".
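
The "requires" entries of the sparc column form a simple chain, which can be followed with a small lookup table. The sketch below only encodes the dependencies listed in this overview; the function name is my own and the table is not a complete patch dependency database.

```python
# "requires" relations for the sparc column, as listed in the overview above.
REQUIRES_SPARC = {
    "142900": "141444-09",
    "141444": "139555-08",
    "139555": "137137-09",
    "137137": "127127-11",
    "127127": "120011-14",
    "120011": "118833-36",
}


def prerequisite_chain(patch_id):
    """Follow the 'requires' links from a patch ID down to the oldest kernel patch."""
    chain = []
    base = patch_id.split("-")[0]
    while base in REQUIRES_SPARC:
        required = REQUIRES_SPARC[base]
        chain.append(required)
        base = required.split("-")[0]
    return chain
```

For example, prerequisite_chain("141444-09") walks from the Update8 kernel patch back to 118833-36. The x86 column could be handled the same way with a second table.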



posted by jschleich Oct 19 2009, 12:33:58 PM CEST

20090924 Thursday September 24, 2009
Configuration steps to create a zone cluster

This is a short overview on how to configure a zone cluster. It is highly recommended to use Solaris 10 5/09 update7 with patch baseline July 2009 (or higher) and Sun Cluster 3.2 1/09 with Sun Cluster 3.2 core patch revision -33 or higher. The name of the zone cluster must be unique throughout the global Sun Cluster and must be configured on a global Sun Cluster. Please read the requirements for zone cluster in Sun Cluster Software Installation Guide


A. Configure the zone cluster into the global cluster

B. Add resource groups and resources to zone cluster

Example output:


Appendix: To delete a zone cluster do:
# clzc halt zc1
# clzc uninstall zc1
# clzc delete zc1
Note:
Zone cluster uninstall can only be done if all resource groups are removed in the zone cluster. The command 'clrg delete -F +' can be used in zone cluster to delete the resource groups recursively.


posted by jschleich Sep 24 2009, 05:38:57 PM CEST

20090907 Monday September 07, 2009
Entries in infrastructure file if using tagged VLAN for cluster interconnect

In some cases it's necessary to add a tagged VLAN id to the cluster interconnect. This example shows how the cluster interconnect configuration differs with and without a tagged VLAN id. The interface e1000g2 has a "normal" setup (no VLAN id) and the interface e1000g1 got a VLAN id of 2. The Ethernet switch in use must be configured with the tagged VLAN id before the cluster interconnect can be configured. Use "clsetup" to assign a VLAN id to the cluster interconnect.

Entries for "normal" cluster interconnect interface in /etc/cluster/ccr/global/infrastructure - no tagged VLAN:
cluster.nodes.1.adapters.1.name e1000g2
cluster.nodes.1.adapters.1.properties.device_name e1000g
cluster.nodes.1.adapters.1.properties.device_instance 2
cluster.nodes.1.adapters.1.properties.transport_type dlpi
cluster.nodes.1.adapters.1.properties.lazy_free 1
cluster.nodes.1.adapters.1.properties.dlpi_heartbeat_timeout 10000
cluster.nodes.1.adapters.1.properties.dlpi_heartbeat_quantum 1000
cluster.nodes.1.adapters.1.properties.nw_bandwidth 80
cluster.nodes.1.adapters.1.properties.bandwidth 70
cluster.nodes.1.adapters.1.properties.ip_address 172.16.1.129
cluster.nodes.1.adapters.1.properties.netmask 255.255.255.128
cluster.nodes.1.adapters.1.state enabled
cluster.nodes.1.adapters.1.ports.1.name 0
cluster.nodes.1.adapters.1.ports.1.state enabled


Entries for cluster interconnect interface in /etc/cluster/ccr/global/infrastructure - with tagged VLAN:
cluster.nodes.1.adapters.2.name e1000g2001

cluster.nodes.1.adapters.2.properties.device_name e1000g
cluster.nodes.1.adapters.2.properties.device_instance 1
cluster.nodes.1.adapters.2.properties.vlan_id 2
cluster.nodes.1.adapters.2.properties.transport_type dlpi
cluster.nodes.1.adapters.2.properties.lazy_free 1
cluster.nodes.1.adapters.2.properties.dlpi_heartbeat_timeout 10000
cluster.nodes.1.adapters.2.properties.dlpi_heartbeat_quantum 1000
cluster.nodes.1.adapters.2.properties.nw_bandwidth 80
cluster.nodes.1.adapters.2.properties.bandwidth 70
cluster.nodes.1.adapters.2.properties.ip_address 172.16.2.1
cluster.nodes.1.adapters.2.properties.netmask 255.255.255.128
cluster.nodes.1.adapters.2.state enabled
cluster.nodes.1.adapters.2.ports.1.name 0
cluster.nodes.1.adapters.2.ports.1.state enabled

The name of the tagged VLAN interface combines the VLAN id with the network interface in use: the number after the driver name is VLAN id * 1000 + driver instance. In this example, e1000g2001, the 2 after e1000g is the VLAN id and the 1 at the end is the instance of the e1000g driver. Normally this would be the e1000g1 interface, but with the VLAN id it becomes the interface e1000g2001.
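
The naming rule from this example (VLAN id 2 and instance 1 give e1000g2001, i.e. VLAN id * 1000 + instance) can be written down as a small helper. This is just a sketch; the function names are my own, and the split function assumes that the driver name itself does not end in a digit.

```python
import re


def vlan_interface_name(driver, instance, vlan_id):
    """Build the tagged VLAN interface name: driver + (vlan_id * 1000 + instance)."""
    if not 0 <= instance < 1000:
        raise ValueError("instance must be in the range 0..999")
    return f"{driver}{vlan_id * 1000 + instance}"


def split_vlan_interface(name):
    """Split a name such as 'e1000g2001' into (driver, instance, vlan_id)."""
    match = re.fullmatch(r"(.*\D)(\d+)", name)
    if match is None:
        raise ValueError(f"not an interface name: {name!r}")
    driver, number = match.group(1), int(match.group(2))
    return driver, number % 1000, number // 1000
```

An interface without a VLAN tag, such as e1000g2, comes back with a VLAN id of 0.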

The ifconfig -a output for the above configuration is:
# ifconfig -a

lo0: flags=20010008c9 mtu 8232 index 1
       inet 127.0.0.1 netmask ff000000
e1000g0: flags=9000843 mtu 1500 index 2
      inet 10.16.65.63 netmask fffff800 broadcast 10.16.55.255
      groupname sc_ipmp0
      ether 0:14:4f:20:6a:18
e1000g2: flags=201008843 mtu 1500 index 4
      inet 172.16.1.129 netmask ffffff80 broadcast 172.16.1.255
      ether 0:14:4f:20:6a:1a
e1000g2001: flags=201008843 mtu 1500 index 3
      inet 172.16.2.1 netmask ffffff80 broadcast 172.16.2.127
      ether 0:14:4f:20:6a:19

clprivnet0: flags=1009843 mtu 1500 index 5
      inet 172.16.4.1 netmask fffffe00 broadcast 172.16.5.255
      ether 0:0:0:0:0:1
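
The hexadecimal netmasks in the ifconfig output correspond to the dotted netmasks in the infrastructure file. A minimal sketch with Python's standard ipaddress module shows the conversion; the function name is my own.

```python
import ipaddress


def describe_interface(address, hex_netmask):
    """Return (dotted netmask, prefix length, broadcast) for ifconfig-style values."""
    mask = ipaddress.IPv4Address(int(hex_netmask, 16))
    interface = ipaddress.IPv4Interface(f"{address}/{mask}")
    network = interface.network
    return str(mask), network.prefixlen, str(network.broadcast_address)
```

For the tagged VLAN interface above, describe_interface("172.16.2.1", "ffffff80") yields the netmask 255.255.255.128 that is stored in the infrastructure file.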


posted by jschleich Sep 07 2009, 04:59:17 PM CEST

20090723 Thursday July 23, 2009
Sun Cluster 3.x command line overview

I put together a quick reference guide for Sun Cluster 3.x. The guide includes the "old" command line, which is used for Sun Cluster 3.0, 3.1 and 3.2, and the new object-based command line of Sun Cluster 3.2. Please do not expect the whole command line on these two pages. It is meant as a reminder of the most frequently used commands within Sun Cluster 3.x. I added the pictures to this blog, and the PDF file is also available for download.





Further reference guides are available:
Sun Cluster 3.2 Quick Reference Guide
German Sun Cluster 3.2 Quick Reference Guide
Sun Cluster 3.1 command line cheat sheet


posted by jschleich Jul 23 2009, 04:30:21 PM CEST

20090722 Wednesday July 22, 2009
Our Home The Earth


Have you ever thought about our real home? It seems it's time to do so. I came across the homeproject. This is a great movie/documentary about our lovely earth and it's really worth taking a look. It was released in cinemas and is also already available on DVD.



A short trailer is available on YouTube.
The whole movie in german is also available on YouTube in fifteen parts.
The whole movie is back on YouTube in various languages.



posted by jschleich Jul 22 2009, 09:13:52 PM CEST

20090617 Wednesday June 17, 2009
Ready for Sun Cluster 3.2 1/09 Update2?

Now it's time to install/upgrade to Sun Cluster 3.2 1/09 Update2. The major bugs of Sun Cluster 3.2 1/09 Update2 are fixed in
126106-33 or higher Sun Cluster 3.2: CORE patch for Solaris 10
126107-33 or higher Sun Cluster 3.2: CORE patch for Solaris 10_x86
126105-33 or higher Sun Cluster 3.2: CORE patch for Solaris 9

This means the core patch should be applied immediately after the installation of the Sun Cluster 3.2 1/09 Update2 software. The installation approach in short:

  • Install Sun Cluster 3.2 1/09 Update2 with java enterprise installer

  • Install the necessary Sun Cluster 3.2 core patch as mentioned above

  • Configure Sun Cluster 3.2 with scinstall

  • Further details available in Sun Cluster Software Installation Guide for Solaris OS on docs.sun.com.
    Also Installation services are available.


    posted by jschleich Jun 17 2009, 05:56:29 PM CEST

    20090508 Friday May 08, 2009
    Administration of zpool devices in Sun Cluster 3.2 environment


    Configure zpools carefully in Sun Cluster 3.2, because it's possible to use the same physical device in different zpools on different nodes at the same time. This means the zpool command does NOT check whether the physical device is already in use by another zpool on another node. For example, if node1 has an active zpool with device c3t3d0, it's still possible to create a new zpool with c3t3d0 on another node (assumption: c3t3d0 is the same shared device on all cluster nodes).

    Output of testing...


    If problems occurred due to administration mistakes then the following errors have been seen:

    NODE1# zpool import tank
    cannot import 'tank': I/O error

    NODE2# zpool import tankothernode
    cannot import 'tankothernode': one or more devices is currently unavailable

    NODE2# zpool import tankothernode
    cannot import 'tankothernode': no such pool available

    NODE1# zpool import tank
    cannot import 'tank': pool may be in use from other system, it was last accessed by NODE2 (hostid: 0x83083465) on Fri May 8 13:34:41 2009
    use '-f' to import anyway
    NODE1# zpool import -f tank
    cannot import 'tank': one or more devices is currently unavailable


    Furthermore, the zpool command also uses the disk without any warning if it is used by a Solaris Volume Manager diskset or a Symantec (Veritas) Volume Manager diskgroup.

    Summary for Sun Cluster environment:
    ALWAYS MANUALLY CHECK THAT THE DEVICE TO BE USED FOR THE ZPOOL IS FREE!!!
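
Part of this manual check can be supported by a small script once the pool layout of every node has been written down. The sketch below is only an illustration under assumptions of my own: the input structure (node, pool, device list) is hypothetical and must be collected separately, and device names are assumed to be identical on all nodes. It does not cover disks used by Solaris Volume Manager or Veritas Volume Manager.

```python
from collections import defaultdict


def find_shared_devices(pools_by_node):
    """Map each device used by more than one (node, pool) pair to those pairs.

    pools_by_node has the form {node: {pool: [device, ...]}}; this structure
    is an assumption of the sketch and has to be filled in by hand.
    """
    users = defaultdict(list)
    for node, pools in pools_by_node.items():
        for pool, devices in pools.items():
            for device in devices:
                users[device].append((node, pool))
    return {dev: sorted(pairs) for dev, pairs in users.items() if len(pairs) > 1}
```

Any device that shows up in the result is used by two pools at once and needs a closer look before anything is imported or created.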


    This is addressed in bug 6783988.


    posted by jschleich May 08 2009, 03:44:23 PM CEST

    20090504 Monday May 04, 2009
    cluster configuration repository can get corrupted on installation Sun Cluster 3.2 1/09 Update2


    The issue only occurs if Sun Cluster 3.2 1/09 Update2 is installed with a non-default netmask for the cluster interconnect.

    Seen problems if system is affected:
    Errors with:
    * did devices
    * quorum device
    * 'scstat -i' can look like:
    -- IPMP Groups --
                 Node Name           Group    Status    Adapter   Status
                 ---------           -----    ------    -------   ------
    scrconf: RPC: Authentication error; why = Client credential too weak
    scrconf: Failed to get zone information for s4u-4800f-domc-muc07 - unexpected error.
    scrconf: RPC: Authentication error; why = Client credential too weak
    scrconf: Failed to get zone information for s4u-4800f-doma-muc07 - unexpected error.
    scrconf: RPC: Authentication error; why = Client credential too weak
    scrconf: Failed to get zone information for s4u-4800e-domc-muc07 - unexpected error.
    scrconf: RPC: Authentication error; why = Client credential too weak
    scrconf: Failed to get zone information for s4u-4800e-doma-muc07 - unexpected error.
    IPMP Group: s4u-4800f-domc-muc07 sc_ipmp0 Online    qfe0      Online
    IPMP Group: s4u-4800f-doma-muc07 sc_ipmp0 Online    qfe0      Online
    IPMP Group: s4u-4800e-domc-muc07 sc_ipmp0 Online    qfe0      Online
    IPMP Group: s4u-4800e-doma-muc07 sc_ipmp0 Online    qfe0      Online


    How does the problem occur?
    After the Sun Cluster 3.2 1/09 Update2 product has been installed with the Java installer, the scinstall command must be run. If "Custom" installation is chosen instead of "Typical" installation, the default netmask of the cluster interconnect can be changed. The following questions come up during the installation procedure if the default netmask question is answered with 'no'.

    Example scinstall:
           Is it okay to accept the default netmask (yes/no) [yes]? no
           Maximum number of nodes anticipated for future growth [64]? 4
           Maximum number of private networks anticipated for future growth [10]?
           Maximum number of virtual clusters expected [12]? 0
           What netmask do you want to use [255.255.255.128]?
    To prevent the issue, answer the virtual clusters question with '1', or with a higher value if future growth requires it.
    Do NOT answer the virtual clusters question with '0'!


    Example of the whole scinstall log when the CCR corruption occurs:

    In the /etc/cluster/ccr/global/infrastructure file the error shows up as an empty entry for cluster.properties.private_netmask. Furthermore, some other lines do not reflect the netmask values chosen within scinstall.
    Wrong infrastructure file:
    cluster.state enabled
    cluster.properties.cluster_id 0x49F82635
    cluster.properties.installmode disabled
    cluster.properties.private_net_number 172.16.0.0
    cluster.properties.cluster_netmask 255.255.248.0
    cluster.properties.private_netmask
    cluster.properties.private_subnet_netmask 255.255.255.248
    cluster.properties.private_user_net_number 172.16.4.0
    cluster.properties.private_user_netmask 255.255.254.0

    cluster.properties.private_maxnodes 6
    cluster.properties.private_maxprivnets 10
    cluster.properties.zoneclusters 0
    cluster.properties.auth_joinlist_type sys

    If answering the virtual cluster question with value '1' then the correct netmask entries are:
    cluster.properties.cluster_id 0x49F82635
    cluster.properties.installmode disabled
    cluster.properties.private_net_number 172.16.0.0
    cluster.properties.cluster_netmask 255.255.255.128
    cluster.properties.private_netmask 255.255.255.128
    cluster.properties.private_subnet_netmask 255.255.255.248
    cluster.properties.private_user_net_number 172.16.0.64
    cluster.properties.private_user_netmask 255.255.255.224

    cluster.properties.private_maxnodes 6
    cluster.properties.private_maxprivnets 10
    cluster.properties.zoneclusters 1
    cluster.properties.auth_joinlist_type sys
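
The empty private_netmask entry can also be detected with a short script. This is a minimal sketch, assuming the infrastructure file consists of whitespace-separated key/value lines as shown above; the function name is my own. It only finds empty entries and does not judge whether the other netmask values are correct.

```python
def empty_netmask_properties(infrastructure_text):
    """Return the netmask-related cluster.properties keys that have no value."""
    empty = []
    for line in infrastructure_text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # blank line
        key = parts[0]
        if key.startswith("cluster.properties.") and key.endswith("netmask"):
            if len(parts) == 1:
                empty.append(key)
    return empty
```

On a healthy file the function returns an empty list. The file is only read here; any change still has to follow the ccradm procedure of the workaround.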


    Workaround if the problem has already occurred:
    1.) Boot all nodes in non-cluster-mode with 'boot -x'
    2.) Change the wrong values of /etc/cluster/ccr/global/infrastructure on all nodes. See example above.
    3.) Write a new checksum for all infrastructure files on all nodes. Use -o (master file) on the node which is booting up first.
    s4u-4800e-doma-muc07 # /usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/global/infrastructure -o
    s4u-4800e-domc-muc07 # /usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/global/infrastructure
    s4u-4800f-doma-muc07 # /usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/global/infrastructure
    s4u-4800f-domc-muc07 # /usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/global/infrastructure
    4.) First reboot s4u-4800e-doma-muc07 (master infrastructure file) into cluster mode, then the other nodes.
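
For clusters with many nodes, the ccradm lines of step 3 can be generated instead of typed. The sketch below only builds the command strings shown above and does not execute anything; the function name is my own, and the first node in the list is assumed to be the one that boots into the cluster first.

```python
CCRADM = "/usr/cluster/lib/sc/ccradm"
INFRASTRUCTURE = "/etc/cluster/ccr/global/infrastructure"


def checksum_commands(nodes):
    """Return (node, command) pairs; only the first node gets the -o option."""
    commands = []
    for index, node in enumerate(nodes):
        command = f"{CCRADM} -i {INFRASTRUCTURE}"
        if index == 0:
            command += " -o"  # master file: this node must boot first
        commands.append((node, command))
    return commands
```

The resulting commands still have to be run by hand on each node while it is in non-cluster mode.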


    This is reported in bug 6825948.


    Update 17.Jun.2009:
    The -33 revision of the Sun Cluster core patch is the first released version which fixes this issue at installation time.
    126106-33 Sun Cluster 3.2: CORE patch for Solaris 10
    126107-33 Sun Cluster 3.2: CORE patch for Solaris 10_x86


    posted by jschleich May 04 2009, 10:30:02 AM CEST

    20090424 Friday April 24, 2009
    Upgrade to Sun Cluster 3.2 1/09 Update2 and SUNWscr preremove script

    There is a missing/old preremove script in Sun Cluster 3.2 2/08 Update1 which is equivalent to the patches
    126106-12 until -19 Sun Cluster 3.2: CORE patch for Solaris 10
    126107-12 until -19 Sun Cluster 3.2: CORE patch for Solaris 10_x86
    126105-12 until -19 Sun Cluster 3.2: CORE patch for Solaris 9

    This means the issue can occur when upgrading (using scinstall -u) from Sun Cluster 3.2 to Sun Cluster 3.2 Update1 or Update2.
    More details are available in "Missing preremove script in Sun Cluster 3.2 core patch revision 12 and higher".
    The issue is that if the mentioned Sun Cluster core patches are installed, the SUNWscr package cannot be removed during the upgrade to Sun Cluster 3.2 1/09 Update2.

    The problem looks like this:
    # ./scinstall -u update
    Starting upgrade of Sun Cluster framework software
    Saving current Sun Cluster configuration
    Do not boot this node into cluster mode until upgrade is complete.
    Renamed "/etc/cluster/ccr" to "/etc/cluster/ccr.upgrade".
    ** Removing Sun Cluster framework packages **
        ...
        Removing SUNWscrtlh..done
        Removing SUNWscr.....failed
        scinstall: Failed to remove "SUNWscr"
        Removing SUNWscscku..done
        ...
    scinstall: scinstall did NOT complete successfully!


    Workaround:
    Before the upgrade to Sun Cluster 3.2 Update1/Update2, install the following patch, which delivers a correct preremove script for Sun Cluster 3.2:
    140016 Sun Cluster 3.2: CORE patch for Solaris 9
    140017 Sun Cluster 3.2: CORE patch for Solaris 10
    140018 Sun Cluster 3.2: CORE patch for Solaris 10_x86

    If one of the following patches is already installed, the above patches are not necessary, because these patches also include a correct preremove script for package SUNWscr.
    126106-27 or higher Sun Cluster 3.2: CORE patch for Solaris 10
    126107-28 or higher Sun Cluster 3.2: CORE patch for Solaris 10_x86
    126105-26 or higher Sun Cluster 3.2: CORE patch for Solaris 9
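
    The revision ranges listed in this entry can be written down as a small lookup. This is only a sketch: the function name preremove_status is hypothetical, and revisions that the entry does not mention (for example -20 to -26 of 126106) are reported as "unlisted" rather than guessed.

```python
# Revision numbers copied from this entry; nothing else is assumed.
AFFECTED_REVISIONS = range(12, 20)  # "-12 until -19"
FIXED_FROM = {
    "126106": 27,  # Solaris 10
    "126107": 28,  # Solaris 10_x86
    "126105": 26,  # Solaris 9
}


def preremove_status(patch_id):
    """Classify a core patch id such as '126106-19'.

    Returns 'affected', 'fixed' or 'unlisted' (revision not mentioned here).
    """
    base, _, revision_text = patch_id.partition("-")
    if base not in FIXED_FROM or not revision_text.isdigit():
        raise ValueError(f"not a Sun Cluster 3.2 core patch id: {patch_id!r}")
    revision = int(revision_text)
    if revision >= FIXED_FROM[base]:
        return "fixed"
    if revision in AFFECTED_REVISIONS:
        return "affected"
    return "unlisted"


if __name__ == "__main__":
    for patch in ("126106-19", "126106-27", "126107-27"):
        print(patch, preremove_status(patch))
```

    For an "affected" result, the matching 140016/140017/140018 patch from the workaround above is the one to install before the upgrade.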

    This is reported in bugs 6676771 and 6747530 with further details.


    posted by jschleich Apr 24 2009, 09:32:35 AM CEST

    20090408 Wednesday April 08, 2009
    nested mounts may fail to mount in the correct order on Sun Cluster 3.2

    On Sun Cluster 3.2 it's possible that nested mounts are mounted in the wrong order. As a result, the data on these file systems becomes inaccessible to users.

    The issue happens if one of the following Sun Cluster core patches is active and nested mounts are managed with resource type SUNW.HAStoragePlus.
    126106-27 or -29 or -30 Sun Cluster 3.2: CORE patch for Solaris 10
    126107-28 or -30 or -31 Sun Cluster 3.2: CORE patch for Solaris 10_x86
    126105-26 or -28 or -29 Sun Cluster 3.2: CORE patch for Solaris 9

    The error can look like this.
    The correct output of df -k should be:
    /dev/vx/dsk/datadg/vol01 480751 1048 431628 1% /test
    /dev/vx/dsk/datadg/vol02 288639 1042 258734 1% /test/test2
    /dev/vx/dsk/datadg/vol03 577295 1041 518525 1% /test/test3

    The mount order is defined in the HAStoragePlus resource test-rs:
    # clrs show -v test-rs | grep FilesystemMountPoints
    FilesystemMountPoints: /test /test/test2 /test/test3

    But due to runtime problems the file systems get mounted in the wrong order, and the df -k output can look like:
    /dev/vx/dsk/datadg/vol02 480751 1048 431628 1% /test/test2
    /dev/vx/dsk/datadg/vol03 480751 1048 431628 1% /test/test3
    /dev/vx/dsk/datadg/vol01 480751 1048 431628 1% /test
    In this specific case, /test/test2 and /test/test3 were mounted first, followed by an overlay mount of /test. As a result, the data in /test/test2 and /test/test3 is not accessible and both paths show the same information as /test.
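
    A minimal sketch of how such an overlay can be detected: given the mount points in the order in which they were mounted, report every parent that was mounted after one of its children. The function names are hypothetical, and the input is a plain list of paths, not parsed df -k output.

```python
import posixpath


def is_nested_under(child, parent):
    """True if 'child' is a path strictly below 'parent'."""
    parent = posixpath.normpath(parent)
    child = posixpath.normpath(child)
    if parent == "/":
        return child != "/"
    return child.startswith(parent + "/")


def overlay_mounts(mount_order):
    """Return (parent, child) pairs where the parent was mounted after the child."""
    problems = []
    for index, later in enumerate(mount_order):
        for earlier in mount_order[:index]:
            if is_nested_under(earlier, later):
                problems.append((later, earlier))
    return problems


if __name__ == "__main__":
    good = ["/test", "/test/test2", "/test/test3"]
    bad = ["/test/test2", "/test/test3", "/test"]
    print(overlay_mounts(good))
    print(overlay_mounts(bad))
```

    An empty result means every parent was mounted before its nested mount points.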

    Workaround:
    It's possible to split the SUNW.HAStoragePlus resource. For the example above, change the resource test-rs and remove the FilesystemMountPoints /test/test2 and /test/test3. Then create a new resource test1-rs with these FilesystemMountPoints and add a resource dependency on test-rs.
    The commands to change this specific configuration are:
    # clrs set -p FilesystemMountPoints=/test test-rs
    # clrs create -g test-rg -t SUNW.HAStoragePlus -p FilesystemMountPoints=/test/test2,/test/test3 -p Resource_dependencies=test-rs -p AffinityOn=True test1-rs

    Due to this change, test1-rs starts after test-rs and the problem is solved.
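
    The split can be sketched for other mount point lists as well. The helper below is hypothetical and only builds the two clrs command lines in the form used above; it assumes a single level of nesting (if the nested mount points are themselves nested in each other, a further split would be needed) and it does not call clrs.

```python
def split_commands(mount_points, group, parent_rs, child_rs):
    """Build the 'clrs set' and 'clrs create' lines for a two-level split.

    Mount points not located below any other listed mount point stay in
    parent_rs; all others move to the new resource child_rs.
    """
    top_level = [
        point for point in mount_points
        if not any(
            point.startswith(other.rstrip("/") + "/")
            for other in mount_points if other != point
        )
    ]
    nested = [point for point in mount_points if point not in top_level]
    if not nested:
        return []  # nothing is nested, so no split is needed
    return [
        f"clrs set -p FilesystemMountPoints={','.join(top_level)} {parent_rs}",
        f"clrs create -g {group} -t SUNW.HAStoragePlus "
        f"-p FilesystemMountPoints={','.join(nested)} "
        f"-p Resource_dependencies={parent_rs} -p AffinityOn=True {child_rs}",
    ]


if __name__ == "__main__":
    points = ["/test", "/test/test2", "/test/test3"]
    for line in split_commands(points, "test-rg", "test-rs", "test1-rs"):
        print("# " + line)
```

    For the example configuration this prints the same two commands as shown in the workaround.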


    Details available in:
    Sun Alert 256368 Nested Mounts Managed by a SUNW.HAStoragePlus Resource may Fail to Mount in the Correct Order on Solaris Cluster 3.2


    Update 17.Jun.2009:
    The -33 revision of the Sun Cluster core patch is the first released version which fixes this issue.
    126106-33 Sun Cluster 3.2: CORE patch for Solaris 10
    126107-33 Sun Cluster 3.2: CORE patch for Solaris 10_x86


    posted by jschleich Apr 08 2009, 12:36:25 PM CEST

    20090309 Monday March 09, 2009
    memory leaks in "rgmd -z global" process

    A memory leak occurs in the "rgmd -z global" process on Sun Cluster 3.2 1/09 Update2. The global zone instance of the rgmd process leaks memory in most situations, such as running "scstat", "cluster show" and other basic commands. The problem is severe: the rgmd heap grows to a large size and crashes the Sun Cluster node.

    The issue only happens if one of the following Sun Cluster core patches is active.
    126106-27 or -29 or -30 Sun Cluster 3.2: CORE patch for Solaris 10
    126107-28 or -30 or -31 Sun Cluster 3.2: CORE patch for Solaris 10_x86
    Because these patches are also part of the Sun Cluster 3.2 1/09 Update2 release, the issue also occurs on freshly installed Sun Cluster 3.2 1/09 Update2 systems.

    The error can look as follows:
    Analyze the growth of the memory allocation with the following (or similar) tools:
    # prstat
    3942 root 61M 11M sleep 101 - 0:00:02 0.7% rgmd/41
    Some time later the increase of the memory allocation is visible:
    3942 root 61M 20M sleep 101 - 0:01:15 0.7% rgmd/41
    or
    # pmap -x <pid_of_rgmd-z_global> | grep heap
    00022000 47648 6992 6984 - rwx-- [ heap ]
    Some time later the increase of the memory allocation is visible:
    00022000 47648 15360 15352 - rwx-- [ heap ]
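
    To compare two such samples without reading the numbers by eye, the heap line can be parsed. This is a minimal sketch with hypothetical function names; it assumes the usual pmap -x column order (Address, Kbytes, RSS, Anon, Locked, Mode, Mapped File), so the third field is taken as the resident size in KB.

```python
def heap_rss_kb(pmap_heap_line):
    """Return the RSS column (KB) of a '[ heap ]' line from 'pmap -x'.

    Assumes the column order Address, Kbytes, RSS, Anon, Locked, Mode,
    Mapped File.
    """
    fields = pmap_heap_line.split()
    if "heap" not in fields or len(fields) < 3:
        raise ValueError(f"not a pmap -x heap line: {pmap_heap_line!r}")
    return int(fields[2])


def heap_growth_kb(earlier_line, later_line):
    """Growth of the resident heap between two samples, in KB."""
    return heap_rss_kb(later_line) - heap_rss_kb(earlier_line)


if __name__ == "__main__":
    first = "00022000 47648 6992 6984 - rwx-- [ heap ]"
    second = "00022000 47648 15360 15352 - rwx-- [ heap ]"
    print(heap_growth_kb(first, second), "KB")
```

    A value that keeps growing across samples while the cluster is otherwise idle matches the leak pattern described here.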

    When the memory is full the Sun Cluster node panics with the following message:
    Feb 25 07:59:23 node1 RGMD[1843]: [ID 381173 daemon.error] RGM: Could not allocate 1024 bytes; node is out of swap space; aborting node.
    ...
    Feb 25 08:10:05 node1 cl_dlpitrans: [ID 624622 kern.notice] Notifying cluster that this node is panicking
    Feb 25 08:10:05 node1 unix: [ID 836849 kern.notice]
    Feb 25 08:10:05 node1 ^Mpanic[cpu0]/thread=2a100047ca0:
    Feb 25 08:10:05 node1 unix: [ID 562397 kern.notice] Failfast: Aborting zone "global" (zone ID 0) because "globalrgmd" died 30 seconds ago.
    Feb 25 08:10:06 node1 unix: [ID 100000 kern.notice]
    ...

    Update 20.Mar.2009:
    Available now:
    Sun Alert 254908 Memory Leak in the "rgmd" Process of Solaris Cluster 3.2 may Cause a failfast Panic

    Update 17.Jun.2009:
    The -33 revision of the Sun Cluster core patch is the first released version which fixes this issue.
    126106-33 Sun Cluster 3.2: CORE patch for Solaris 10
    126107-33 Sun Cluster 3.2: CORE patch for Solaris 10_x86


    Workaround: Use the previous revision -19 to prevent the issue.
    126106-19 Sun Cluster 3.2: CORE patch for Solaris 10
    126107-19 Sun Cluster 3.2: CORE patch for Solaris 10_x86

    The issue is reported in bug 6808508 (description: scalable services coredump during the failover due to network failure). A fix is in progress. This blog will be updated when the fix is available.


    posted by jschleich Mar 09 2009, 05:26:31 PM CET