Thursday Sep 25, 2008
We are pleased to announce the availability of the latest Solaris Cluster Express (SCX)! You can download the software here.
There are some major milestones reached in this release: It incorporates
the first contribution from a community member! And coming from a
student, it is doubly delightful!! It also provides early access to
some of the new and exciting features that are being developed by the team.
What is new?
* This release runs on SXCE build 97. The versions of Sun Management
Center and other shared components have been upgraded to be compatible with
this Solaris Express Community Edition (SXCE) build.
* The fencing mechanisms have been enhanced with the introduction of optional fencing. This lets the administrator
change the fencing behavior either globally or for an
individual disk.
* This release also has a new feature called zone clusters. This
feature makes it possible to form a virtual cluster based on the zones
of a cluster. This is made possible by the introduction of a new brand
of zone called "cluster". Needless to say, most of the code is
available under a CDDL license like the rest of the software. This
feature is sure to make you reconsider your views about Open HA Cluster
and Solaris Cluster! Please refer to the clzonecluster(1CL) man page
for more details. You can find a cheat sheet for configuring a zone cluster here.
* This release introduces the use of a Loopback File Driver (lofi) device for the
global-devices name space. A dedicated partition for the
exclusive use of the global-devices name space (i.e., /globaldevices) is no
longer required.
* As usual, there are the mandatory bug fixes, which you can find in the change log.
This release is a major milestone in the Open Source journey. For the
list of all the exciting projects that the community is working on,
please visit the Open HA Cluster community. This release of Solaris Cluster Express (SCX)
will not work on the OpenSolaris binary distribution (OpenSolaris 05.08). For the planned move to the OpenSolaris binary distribution, visit Project Colorado.
Munish Ahuja
Madhan Kumar B
Jonathan Mellors
Venugopal N.S
Tuesday Sep 09, 2008
If you have ever asked yourself why Solaris Cluster Express is running on Solaris Express Community Edition and not yet on the OpenSolaris binary distribution, then you might be interested in project Colorado. This project is endorsed by the HA Clusters community group and has as its goal to provide a minimal and extensible binary distribution of Open HA Cluster that runs on the OpenSolaris binary distribution.
As always, the devil is in the details. The following table summarizes some of the reasons why this isn't just a "recompile and run" experience:
| Work Area | Solaris Express Community Edition | OpenSolaris Binary Distribution |
|---|---|---|
| Packaging System | System V packages | Image Packaging System (IPS) |
| Zones support | native and lx brand type | ipkg brand type |
| KSH version | KSH88, KSH93 | KSH93 |
| Web-based system management for applications | Sun Java Web Console (webconsole) | not provided |
| Encumbered Solaris code | CDE, Motif, ToolTalk | not contained by design |
| Supported Platforms | SPARC, i386/x64 | i386/x64 only (to date) |
| Installer | Supports network (JumpStart), text, and graphical installation | Supports graphical installation only (to date) |
| Preferred Compiler | Studio 11, soon Studio 12 | StudioExpress |
You can read more details about each point listed in the table here.
Besides solving the above challenges, we also want to offer some new possibilities within Colorado. You can read the details within the umbrella requirement specification. There are separate requirement specifications to outline specific details for the planned private-interconnect changes, cluster infrastructure changes involving the weaker membership, enhancements to make the proxy file system (PxFS) optional, and changes to use iSCSI with ZFS for non-shared storage configurations.
You can provide feedback on those documents to the ha-clusters-discuss mailing list. There is a review scheduled with the Cluster Architecture Review Committee on 18 September 2008, where you are invited to participate by phone if you are interested.
Thorsten Früauf
Solaris Cluster Engineering
Wednesday Jun 04, 2008
Solaris Cluster Express 6/08 is now available for download! You can download the DVD image here.
What is new in this release?
* This release runs on OpenSolaris Nevada build 86. The version of Sun Management Center is now 3.1.
* The HA agent for Solaris Containers is now enhanced to include support for Solaris 9 Branded Zones on the SPARC platform. This is very useful for customers who still need to run some applications on Solaris 9 while taking advantage of the new features of Solaris 10 and above.
* The HA agent for PostgreSQL Database is now enhanced to support WAL shipping. This feature greatly improves the deployment of PostgreSQL databases in enterprise environments.
* Support for Solaris Containers configured with exclusive IP is included in this release.
* The SCX Geographic Edition is enhanced to support Oracle Data Guard based replication.
* This release also contains the mandatory bug fixes and other minor enhancements not mentioned above.
Stay tuned for more milestones along the open source journey!
Munish Ahuja
Madhan Kumar B.
Jonathan Mellors
Arun Kurse
Venugopal N.S.
Friday May 30, 2008

Last week I presented Open HA Cluster at the Open Source Grid Cluster Conference in Oakland, California. The conference had three different tracks, dedicated to Globus (GlobusWorld), Grid Engine (Grid Engine Workshop), and Rocks (Rocks Cluster Workshop).
My presentation about making Sun Grid Engine highly available using Open HA Cluster (OHAC) was part of the Grid Engine Workshop.
I noticed that the term "cluster" was a bit overused at this conference, with different products and technologies using it in slightly different ways. So I started by clarifying that "HA Cluster" refers to the technology OHAC brings to the arena: high availability in spite of failures. A quick show of hands revealed that about 25% of the participants were aware of the concept of "HA Clusters" in general, with about 15% actually being aware of OHAC itself. Given that, I spent a larger portion of my talk on the concepts of single points of failure, redundancy, and failover, and on how OHAC recovers from system failures. Towards the end of my talk, I also covered using OHAC to make Sun Grid Engine highly available and the key advantages of the HA solution based on OHAC. These points and the slides are courtesy of Thorsten Frueauf. The key points about how OHAC helps improve the availability of Sun Grid Engine are noted in this blog entry.
The presentation did generate a couple of questions from the audience. One question was about how OHAC takes care of the MAC address change when it fails over an HA IP address from one node to another. I explained that OHAC uses gratuitous ARPs to update the ARP cache of any routers on the network, and that this works with all but a very few exceptions. Another question was about data recovery during disk/mirror failures and whether the end application needs to worry about it. I explained that this recovery is typically performed by a volume manager and the end application is blissfully unaware of it. The OHAC framework makes sure that the end application has the data available when and where (on the node where the app is) the application is started. Another question was about the speed of failover (how fast the recovery is upon various failures). I turned that question into an advantage by explaining how OHAC is tightly integrated with Solaris and thus can detect failures and recover from them quickly. I then invited folks to view the failover demo on my laptop the next day, in the "Grill the Gurus" portion of the conference.
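For readers who have not seen a gratuitous ARP before, here is a minimal Python sketch of what such a frame looks like on the wire. It is only an illustration of the general technique, not OHAC's actual implementation: the function name `build_gratuitous_arp`, the MAC address, and the IP address are all made up for the example, and the zeroed target hardware address is one common convention rather than something taken from OHAC.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`.

    Hypothetical helper for illustration only; it builds the bytes but
    does not send anything on the network.
    """
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC as source,
    # EtherType 0x0806 (ARP).
    eth_header = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # What makes it "gratuitous": sender IP and target IP are the same
    # address, so every listener learns the new MAC for that IP.
    # Target hardware address is zeroed here (an assumption, see above).
    arp_body = mac_bytes + ip_bytes + b"\x00" * 6 + ip_bytes

    return eth_header + arp_header + arp_body


# Example values only: the node taking over the HA IP address announces it.
frame = build_gratuitous_arp("00:14:4f:aa:bb:cc", "192.0.2.10")
print(len(frame), frame.hex())
```

Because the frame is broadcast and carries the same address in both the sender and target IP fields, routers and hosts that already cache the address replace the old node's MAC with the new one without having to ask for it.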
I was also somewhat curious about the audience mix: whether the larger share came from the academic community or the commercial community. A quick show of hands revealed that commercial users were well represented, in roughly the same numbers as academic/research users. After the talk, I spoke to a variety of people during the coffee/lunch breaks. Here are some folks I remember: a sysadmin at a European oil company interested in using Grid Engine to optimize/minimize application licenses for commercial software he uses for geological data analysis; an IT manager for a medical software startup based in San Francisco who was interested in open source software as a way to minimize costs; a deployment architect for an IT consultancy who was interested in geographical data replication and content-based routing of incoming jobs; a lab manager from an Ivy League university who wanted to figure out an easy way for his students to be effective at managing his compute lab environment; and an IT admin for a storage manufacturer who was interested in learning about techniques for efficient monitoring of workloads.
For the demo the next day, I had Sun Grid Engine configured as an HA server across two zones on my laptop. I was able to demo the very quick restart of the Grid Engine qmaster and scheduler daemons. People seemed interested to learn how that happens, which led me to explain how Solaris Contracts are used by the process monitoring implementation in OHAC, which leads to quick detection of and recovery from application failures. Most people were simply interested in chatting about the general concept of clusters and discussing their own "Grid and Cluster" scenarios.
If you are interested in the actual slides I used for the talk, you can check them out here. If you missed this conference, you will have an opportunity to learn more about Open HA Cluster and OpenSolaris at the upcoming LinuxTag conference in Berlin, Germany, from May 28th to May 31st, 2008.
The picture at the top was taken during a coffee break at the conference. Check out this link for other photos I took at the conference. Also, Deirdré Straughan made a video of my talk, complete with neat fading in and out of the presentation slides. Click in the embedded window below to watch the presentation in Flash.
If you'd like, you can download the video in iPod format and watch it on your video iPod. Beware that the file is rather big, though.
This conference was a nice opportunity for me to talk to a lot of people and make them aware of Open HA Cluster, and also to learn about what is going on in other open source communities such as Grid. I hope you found some of the things in this blog useful and interesting.
Regards,
Ashutosh Tripathi
Solaris Cluster Engineering
Thursday May 29, 2008
One year ago, we announced that we would open source the entire Solaris Cluster product suite. Today, we are delivering on that promise six months ahead of schedule by releasing over two million lines of source code for the Solaris Cluster framework!
Read the official press release and listen to a podcast with Meenakshi Kaul-Basu, Director of Availability Products at Sun.
This third, and final, source code release follows the initial open sourcing of the Solaris Cluster agents in June, 2007 and Solaris Cluster Geographic Edition in December, 2007. As with the previous releases, the source code is available under the Common Development and Distribution License (CDDL) under the auspices of the HA Clusters community group on OpenSolaris.org.
The open source version of Solaris Cluster is called Open High Availability Cluster. Although some encumbered parts of Solaris Cluster have not been open sourced, with this release, you can now build a fully functional HA Cluster purely from source.
In addition to the source code for the product itself, Open HA Cluster includes source for the Sun Cluster Automated Test Environment (SCATE), man pages, and globalization.
Consider getting involved in the HA Clusters community group:
Nick Solter, Open HA Cluster tech lead and HA Clusters community group facilitator