Kristien's Weblog
20050509 Monday, 9 May 2005
Amnesia

You forgot that I need ya
You must've caught amnesia
That's why you don't believe

(Black Eyed Peas)

I am not really a Black Eyed Peas fan (not my type of music) but I kind of like this song.

After having discussed the private interconnect some time ago, I felt it necessary to chat a bit about the Cluster Membership Monitor (CMM), but that is impossible without first discussing two classic problems in clustering theory: 'amnesia' and 'split brain'. Amnesia is for this week, and I hope to get to Split Brain next week, so that I can finish up the CMM discussion before I leave on holiday on the 28th of May (hurray!!!).

Let's imagine the following situation: you have a 2-node cluster. At 12pm you shut down one cluster node, nodea, for maintenance. Nodeb is still running. At 4pm you decide to change some settings (such as timeouts). Where will these changes go? The cluster has a central repository, called the Cluster Configuration Repository (CCR), which has a local copy on each node (in /etc/cluster/ccr, check it out). So, because nodea is down, the update only makes it into nodeb's copy of the database. Nodea is unaware of the change. If it joined the cluster now, it would receive the most recent copy of the CCR from nodeb.

Let us now say that at 6pm you shut down nodeb as well. Both nodes are down. If we now booted nodea, it would start up with an old copy of the database. The cluster would have 'forgotten' the changes that happened between 12pm and 6pm, as these changes are known only on nodeb, and that node is down. This is called amnesia.
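To make the timeline concrete, here is a small toy simulation in Python. It is only a sketch of the scenario above, not how Sun Cluster or the CCR actually works: the `Node` class, the `apply_change` and `join` helpers, and the `timeout` values are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A cluster node holding its own local copy of the configuration."""
    name: str
    up: bool = True
    config: dict = field(default_factory=dict)


def apply_change(nodes, key, value):
    """Write a setting to the local copy of every node that is currently up."""
    for node in nodes:
        if node.up:
            node.config[key] = value


def join(node, nodes):
    """Bring a node up; it copies the configuration from a running member, if any."""
    for peer in nodes:
        if peer is not node and peer.up:
            node.config = dict(peer.config)
            break
    node.up = True


# Hypothetical starting point: both nodes agree on a timeout of 10.
nodea = Node("nodea", config={"timeout": 10})
nodeb = Node("nodeb", config={"timeout": 10})
cluster = [nodea, nodeb]

nodea.up = False                       # 12pm: nodea goes down for maintenance
apply_change(cluster, "timeout", 30)   # 4pm: the change lands only on nodeb
nodeb.up = False                       # 6pm: nodeb goes down as well
join(nodea, cluster)                   # nodea boots first, with no peer to copy from

print(nodea.config["timeout"])         # 10: the 4pm change has been forgotten
print(nodeb.config["timeout"])         # 30: only the node that is down knows it
```

If nodea had joined while nodeb was still up, `join` would have copied nodeb's configuration and nothing would have been lost. The trouble only starts when the node with the stale copy is the first one to come back.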

I can already tell you that this is a situation that will never happen in Sun Cluster.

If we take a look at how amnesia prevention was historically done in different clustering solutions, there are several options:

1) Just DO NOT allow changes while some members are not part of the cluster
2) Store the cluster database on shared storage, always
3) Store the cluster database on local storage and on a shared location, and use the shared location to override the local copies when nodes have been down
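As an illustration of option 3, here is a rough Python sketch. The generation counter used to decide which copy is newer is my own assumption (the option only says the shared location overrides the local copies), and the function names and dictionary layout are hypothetical, not taken from any real clustering product.

```python
def change_setting(local, shared, key, value):
    """Apply a change on a running node and mirror it to the shared location."""
    local["settings"][key] = value
    local["generation"] += 1
    shared["settings"] = dict(local["settings"])
    shared["generation"] = local["generation"]


def boot(local, shared):
    """On boot, let a newer shared copy override the node's local copy."""
    if shared["generation"] > local["generation"]:
        local["settings"] = dict(local["settings"], **shared["settings"])
        local["generation"] = shared["generation"]
    return local


# Hypothetical starting point: all three copies are identical.
shared = {"generation": 1, "settings": {"timeout": 10}}
nodea_local = {"generation": 1, "settings": {"timeout": 10}}
nodeb_local = {"generation": 1, "settings": {"timeout": 10}}

change_setting(nodeb_local, shared, "timeout", 30)  # made while nodea is down
boot(nodea_local, shared)                            # later, nodea boots alone

print(nodea_local["settings"]["timeout"])            # 30: no amnesia
```

The price of this approach is that the shared location has to be reachable whenever a change is made and whenever a node boots.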

Sun Cluster 3.x uses a more elegant approach. In fact, it uses the same mechanism that it uses to prevent Split Brain: a majority algorithm. Before I can explain this, however, I first have to explain what Split Brain IS. That's for next week.

 


9 May 2005, 10:41:16 MEST
