Availability Engineering
Sun Cluster Oasis
Thursday Jun 25, 2009
Single-node clusters, for disaster recovery and more...

At first sight, a single-node cluster may seem to be a pointless thing. After all, what sort of high availability can you get from one node? :)

That might be a valid point if HA alone were the only consideration, but there are quite a few other ways in which single-node clusters can be useful. Two of the most useful ones are:

Disaster Recovery with SCGE

SCGE allows two clusters, separated by enough distance that a disaster at one site will not affect the other, to be managed together. Several data replication products (AVS, SRDF, Oracle DataGuard, etc.) can be managed within this two-cluster partnership to ensure that the DR site has up-to-date information, ready to take over service.

Obviously this configuration requires two clusters, but if we assume that the DR site will be needed only in the (hopefully rare) instance of a disaster, and probably occasionally during maintenance of the primary, there is no need for it to be an exact copy of the primary site. In fact, it can be a single node. All that is required is that it be running Solaris Cluster software, i.e. it can be a Single-Node Cluster.

Carrying this idea further, it is also fully supported to have single-node clusters at both primary and secondary sites.

As I mentioned at the start, this won't give very much in the way of High Availability in the event of a local primary-site server failure, but you may not need that. Strange though it might seem at first glance, HA isn't a prerequisite for DR; whether you need both depends entirely on your business continuity needs (and that's a subject for a future blog entry).

No special tricks or configurations are required for this; it just works “out of the box”. With two single-node clusters and AVS (SNDR) replication between them, you have a fully-supported Disaster Recovery configuration, implemented with no special additional hardware. Larger sites with external storage arrays and replication work just as well with a single-node cluster as with a multi-node configuration.

Development

Another place where a single-node cluster can be really useful is when developing cluster-based software, especially cluster agents. With the support for Solaris Containers (aka zones) that was added in Solaris Cluster 3.2, this has become even easier.

Fully testing a cluster agent requires that you simulate failures, such as system crashes or disconnections, and ensure that the agent reacts correctly. This is also true when testing that a given application operates correctly in a cluster environment. It's not something that you'd normally want to do on your desktop. However, providing an extra pair of systems in a cluster as lab test equipment for each developer is costly, and takes up valuable lab space and energy.

The solution? A single-node cluster, with some zones configured. With Solaris Cluster 3.2 you can specify zones (in the format nodename:zonename) in the nodelist of an application resource group; see the clrg(1CL) manpage. The cluster software, running in the global zone, manages those applications just as if they were on separate physical nodes. You can request that the resource groups be switched between zones, or even crash or halt zones to test that automatic recovery is performed correctly. All without leaving your desk or rebooting your development system.
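To make the nodename:zonename notation concrete, here is a minimal Python sketch that splits a nodelist string into node and zone parts. It is not part of Solaris Cluster: the function name, the NodeEntry type and the example node and zone names are all hypothetical, and it assumes the nodelist is a comma-separated string in which an entry without a zone means the global zone.

```python
from typing import List, NamedTuple, Optional


class NodeEntry(NamedTuple):
    node: str
    zone: Optional[str]  # None is taken to mean the global zone (assumption)


def parse_nodelist(nodelist: str) -> List[NodeEntry]:
    """Split a comma-separated nodelist into (node, zone) entries.

    Hypothetical helper for illustration only; the cluster software does
    its own parsing and validation.
    """
    entries = []
    for item in nodelist.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and surrounding whitespace
        node, sep, zone = item.partition(":")
        if not node or (sep and not zone):
            raise ValueError(f"malformed nodelist entry: {item!r}")
        entries.append(NodeEntry(node, zone or None))
    return entries


if __name__ == "__main__":
    # Two zones on one physical machine, as on a single-node development cluster.
    for entry in parse_nodelist("devbox:zone1,devbox:zone2"):
        print(entry)
```

On a single-node cluster every entry shares the same node name and only the zone differs, which is what lets a resource group "fail over" without a second machine.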

And for my next trick...

I hope that's given a brief idea of what can be done today with a single node. What might the future hold? Well, people will jump on me if I promise anything, but I really like some of the ideas that the Open HA Cluster guys have been demonstrating. Take a look at Thorsten's whitepaper if you want to try clustering VirtualBox systems - on your laptop!

As always, join us at http://www.opensolaris.org/os/community/ha-clusters/ to discuss this or any other cluster topics.


Steve McKinty
SCGE Architect


Posted at 12:15PM Jun 25, 2009 in Other  |  Comments[0]

Monday Jun 15, 2009
Oracle on Solaris Cluster: Stuff you might have missed

If, like me, you only have a limited amount of time to browse the multitude of web sites, blogs, journals and news feeds out there, you may have missed some interesting Oracle collateral that my colleagues at Sun have produced. So to help reduce your information overload, I thought I'd highlight them in this post.

As Oracle 11g is growing in importance, let's start there. Both Solaris Cluster and Sun ISV engineering have been working with Oracle 11g now for some considerable time. We've invested a lot of resources in integration and stress testing it to make sure it works seamlessly with Solaris Cluster. The results of these efforts are captured in the "Sun Reference Architecture for Oracle 11g Grid" whitepaper. From a personal standpoint, I find the performance characterization of various networking and file system options the most interesting material as it really demonstrates the breadth of choice Solaris Cluster gives you: 1GbE, 10GbE, Infiniband, ASM, shared QFS, etc.

If you've not kept up with the progress in Solaris Cluster's support for virtualization, then you may not have seen Dr Ellard Roush's excellent "Zone Clusters — How to Deploy Virtual Clusters and Why" Blueprint. If you want to consolidate your Oracle RAC workloads to maximise the system utilisation, then this, together with the "Deploying Oracle Real Application Clusters (RAC) on Solaris Zone Clusters", co-authored with Gia-Khanh Nguyen, are papers you must read.

Following a similar theme, Alexandre Chartre, Daniel Dibbets, and Roman Ivanov have written "Running Oracle Real Application Clusters (RAC) on Sun Logical Domains". This Blueprint describes the best practice for setting up Oracle Clusterware on LDoms. Having this perform reliably is a requirement and precursor to gaining Oracle support for the same configuration running under Solaris Cluster, with all the well known benefits that Solaris Cluster brings.

Finally, if there was ever any doubt that SPARC/Solaris was the most scalable and performant platform for Oracle, the "Performance and Scalability Benchmark: Oracle Communications Billing and Revenue Management on Sun SPARC Enterprise T5220 and M8000 Servers Running the Solaris 10 OS" Blueprint describes how you can bring together the best of Sun technologies (ZFS, Solaris Containers and Dynamic System Domains) to provide a highly scalable Communications Billing and Revenue Management solution.

Hopefully, one or more of these will be of use or interest to you. If you have any questions, please feel free to contact us through the blog or via the Solaris Cluster forum. We're always happy to help.

Tim Read
Solaris Cluster Engineering

Posted at 01:13AM Jun 15, 2009 in Oracle  |  Comments[0]

Monday Jun 01, 2009
Announcing Open HA Cluster 2009.06

We are pleased to announce the release of High Availability Cluster software for OpenSolaris 2009.06! If you've been following along, this release is the fruit of project Colorado. Open HA Cluster 2009.06 is based on Solaris Cluster 3.2, including many of the features from the most recent update. Additionally, Open HA Cluster 2009.06 contains the following new features:

Taken together, these features contribute to “hardware minimization,” allowing you to form a cluster with fewer physical hardware requirements.

This release runs on both SPARC and x86/x64 systems and includes the following agents:

Open HA Cluster 2009.06 is distributed as IPS packages from the https://pkg.sun.com/opensolaris/ha-cluster repository. To gain access, accept the license agreement at https://pkg.sun.com to obtain a certificate and key. Follow the instructions given at registration to configure your system's access to the ha-cluster publisher.

To install the complete cluster, including agents, install the “ha-cluster-full” package. To install a minimal cluster, without agents and other optional components, install the “ha-cluster-minimal” package instead. You can then install the individual agents and other optional components.

Open HA Cluster 2009.06 is free to use, with production level support offerings available for two-node clusters. This release runs on OpenSolaris 2009.06 only.

For more information, see the documentation landing page and the OpenSolaris Availability page. If you don't have physical hardware available to create a cluster, try it out on VirtualBox! (PDF link).

Please direct your questions and comments to ha-clusters-discuss@opensolaris.org

The Colorado Team

Posted at 07:11AM Jun 01, 2009 in Open High Availability Cluster  |  Comments[0]

Tuesday May 26, 2009
SAP on Solaris Cluster

Solaris Cluster comes bundled with rich support for numerous software applications.

Follow the link to see a list of all the Solaris Cluster Agents available in the latest release of Solaris Cluster, SC 3.2 01/09.
For most of these applications the latest versions are supported. In this blog I specifically want to highlight the latest support
for the SAP NetWeaver stack and some key features provided by Solaris Cluster to make SAP highly available on
Solaris.

Solaris Cluster 3.2 HA SAP Web Application Server agents now support SAP 7.1 on S10 SPARC and x64. You will need
patch 126062-06 or later for S10 SPARC, or patch 126063-07 or later for S10 x64. This patch is required for the following
Resource Types (RTs): SUNW.sapenq, SUNW.saprepl, SUNW.sapscs and SUNW.sapwebas. The SAP agent RTs SUNW.sap_ci_v2
and SUNW.sap_as_v2 do not support SAP 7.1.
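The "or later" rule for patch revisions can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the list of installed patch IDs is assumed to have been collected by whatever means you prefer, and the check deliberately ignores cases where a patch has been obsoleted by one with a different base number.

```python
from typing import Iterable, Tuple


def parse_patch_id(patch_id: str) -> Tuple[str, int]:
    """Split an ID such as '126062-06' into its base number and revision."""
    base, sep, rev = patch_id.strip().partition("-")
    if not sep or not base.isdigit() or not rev.isdigit():
        raise ValueError(f"not a patch ID of the form NNNNNN-RR: {patch_id!r}")
    return base, int(rev)


def meets_patch_level(installed: Iterable[str], required: str) -> bool:
    """Return True if any installed patch has the same base number as
    `required` and an equal or higher revision."""
    req_base, req_rev = parse_patch_id(required)
    for patch_id in installed:
        base, rev = parse_patch_id(patch_id)
        if base == req_base and rev >= req_rev:
            return True
    return False


if __name__ == "__main__":
    # Example input only; these are not the patches on any real system.
    installed_patches = ["126062-07", "120011-14"]
    print(meets_patch_level(installed_patches, "126062-06"))
```

Comparing the revision as an integer rather than as a string avoids surprises if the two revisions are ever written with different zero padding.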

All SAP Agents are supported in the global zone and in zone nodes (SC 3.2 support for containers).

Solaris Cluster software can be used to improve the availability of SAP components running on Solaris OS. Solaris Cluster uses
redundant components to protect against any planned or unplanned downtime - eliminating any single point of failure in the entire
stack. Solaris Cluster provides HA agents for SAP CI (Central Instance), SAP Enqueue Server, Replica Server, Message Server,
Web Application Servers, SAP MaxDB and SAP LiveCache. The Agents support the following SAP installations: a) ABAP only,
b) JAVA only and c) ABAP and JAVA combined.

One of the strengths and key features of Solaris Cluster is the support for multiple flavors of dependencies and affinities
between applications. Refer to the blog by Marty Rattner where he explained this in detail.

You can always refer to the "Sun Cluster Data Services Planning and Administration Guide for Solaris OS" where this topic is
explained in depth with examples.


When making the SAP Enqueue Server and Replication Server highly available in Solaris Cluster, the dependencies and affinities
play a very important role. For High Availability, the Enqueue Server and the Replica Server must run on different nodes. If the
node running the Enqueue Server goes down then the Enqueue Server must be started on the node where Replica Server is running.
When the Enqueue Server starts, the replication table, stored on the replication server, is transferred to the standalone Enqueue Server
and the new lock table is created from it. After the Enqueue Server has started, the Replica Server must be failed over to another node
in the cluster to continue replication of the lock table.

This can be easily accomplished in Solaris Cluster by setting a weak positive affinity from the Enqueue Server to the Replica
Server and a strong negative affinity from the Replica Server to the Enqueue Server. The weak positive affinity ensures that the
Enqueue Server fails over to the node running the Replica Server. The strong negative affinity ensures that the Replica Server never
runs on the same node as the Enqueue Server. Check out the following diagram to understand this clearly.
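The failover sequence described above can also be sketched in code. The following Python model is a toy illustration of the two affinity rules, not how Solaris Cluster implements them; the function names and the node names A, B and C are hypothetical.

```python
from typing import Dict, List, Optional


def place_enqueue(live_nodes: List[str], replica_node: Optional[str]) -> Optional[str]:
    """Weak positive affinity: prefer the Replica Server's node, but fall
    back to any live node if that is not possible."""
    if replica_node in live_nodes:
        return replica_node
    return live_nodes[0] if live_nodes else None


def place_replica(live_nodes: List[str], enqueue_node: Optional[str]) -> Optional[str]:
    """Strong negative affinity: never share a node with the Enqueue Server;
    stay offline (None) if no other live node exists."""
    for node in live_nodes:
        if node != enqueue_node:
            return node
    return None


def fail_node(placement: Dict[str, Optional[str]], nodes: List[str],
              failed: str) -> Dict[str, Optional[str]]:
    """Return the new placement after the node `failed` goes down."""
    live = [n for n in nodes if n != failed]
    enq, rep = placement["enqueue"], placement["replica"]
    if enq == failed:
        enq = place_enqueue(live, rep)
    # Re-place the replica if it is offline, lost its node, or now collides
    # with the Enqueue Server.
    if rep is None or rep == failed or rep == enq:
        rep = place_replica(live, enq)
    return {"enqueue": enq, "replica": rep}


if __name__ == "__main__":
    before = {"enqueue": "A", "replica": "B"}
    print(fail_node(before, ["A", "B", "C"], "A"))
```

In this model, with three nodes the Enqueue Server moves to the Replica Server's node and the Replica Server is pushed to the third node. With only two nodes the Replica Server stays offline until the failed node returns, because the strong negative affinity is never relaxed.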

In addition to affinities, dependencies also play a very critical role: a dependency between an SAP resource and a Database resource
ensures that the Database is started before the SAP servers are started. Also, a resource dependency between the Enqueue resource
and the Replica resource ensures that the latter is started only after the Enqueue Server is online.
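A start order that respects these dependencies is a topological sort of the dependency graph. The short Python sketch below shows the idea using the standard library's graphlib module (Python 3.9 or later). The resource names are made up for illustration, and the cluster framework computes its own ordering; this only shows why the order comes out as database, then Enqueue Server, then Replica Server.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def start_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Return a start order in which every resource comes after the
    resources it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # Hypothetical resource names; each key depends on the resources in its set.
    dependencies = {
        "sap-enqueue-rs": {"database-rs"},
        "sap-replica-rs": {"sap-enqueue-rs"},
    }
    print(start_order(dependencies))
```

TopologicalSorter raises graphlib.CycleError if the dependencies form a loop, which matches the intuition that circular start dependencies cannot be satisfied.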

As you can see, Solaris Cluster provides several options to make SAP highly available on Solaris. In this blog I highlighted only a few.
Please refer to the Solaris Cluster HA for SAP agents administration guide for additional configuration examples.


Thanks
Prasad Dharmavaram
Solaris Cluster Engineering

Posted at 12:46PM May 26, 2009 in Agents  |  Comments[1]

Tuesday May 19, 2009
Open HA Cluster at CommunityOne West June 1-2

In addition to the Cluster Summit on May 31, Open HA Cluster will be well represented at the CommunityOne West conference at Moscone Center in San Francisco.

We'll have an Open HA Cluster demo running the whole week. Come visit us in the Sun Pavilion to see Open HA Cluster running on OpenSolaris.

I'll also be giving a talk, "High Availability with OpenSolaris", as part of the Deploying OpenSolaris in your DataCenter deep dive track on Tuesday, June 2. Contrary to the "official" CommunityOne information you might find elsewhere, this deep dive track is completely free. Just register with the "OSDDT" registration code. The other talks in this track, on ZFS and Zones, should be quite interesting as well.

You can see the entire lineup of the OpenSolaris presence at CommunityOne here, and even more details here. I hope to see you in San Francisco in a couple weeks!

Nick Solter
Tech lead, Open HA Cluster 2009.06

Posted at 08:36AM May 19, 2009 in Open High Availability Cluster  |  Comments[0]
