
Purpose of the Voting Disk for #Oracle RAC

Posted by Uwe Hesse in TOI on September 20, 2013

The Voting Disk provides an additional communication path for the cluster nodes in case of problems with the Interconnect, which prevents Split-Brain scenarios. This is another topic from my recent course Oracle Grid Infrastructure 11g: Manage Clusterware and ASM that I’d like to share with the Oracle community.

Under normal circumstances, the cluster nodes communicate through the Interconnect. The cssd background processes exchange a network heartbeat that way, and things like Cache Fusion use the same path. The red lines in the picture symbolize the cssd network heartbeat. In addition, the cssd processes regularly write to the Voting Disk (also called the Voting File) and exchange a disk heartbeat over that path, which is shown by the blue lines in the picture. Each cssd makes an entry for its own node and for the other nodes it can reach over the network:

Without a Voting Disk, a network error would now lead to a Split-Brain problem. Suppose node1 has lost its network connection to the Interconnect. To prevent that from happening in the first place, redundant network cards have been recommended for a long time. By the way, we introduced HAIP in 11.2.0.2 to make that easier to implement, without the need for bonding. But here, node1 can no longer use the Interconnect. It can still access the Voting Disk, though. Nodes 2 and 3 still see each other's heartbeats but no longer the heartbeat of node1, which is indicated by the green Vs and red fs in the picture. The node with the network problem gets evicted by placing the Poison Pill for node1 into the Voting File. The cssd of node1 will now terminate itself, and node1 leaves the cluster:
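The eviction decision described above can be sketched as a small shell function. This is only an illustration under stated assumptions, not how cssd is implemented: the input format (one line per node, listing the nodes it can reach over the network) and the function name `evicted_nodes` are hypothetical, and the rule is simplified to "nodes that see fewer members than the largest group get evicted". Ties between equally sized groups are not modeled.

```shell
#!/bin/sh
# Hypothetical input, one line per node: "<node>: <nodes it can reach>".
# A node that reaches fewer nodes than the largest group is reported as evicted.
evicted_nodes() {
  awk -F'[: ]+' '
    NF >= 2 {
      reach[$1] = NF - 1                   # number of nodes this node can see
      if (reach[$1] > max) max = reach[$1]
    }
    END {
      for (node in reach)
        if (reach[node] < max) print node  # smaller view than the majority
    }
  ' | sort
}

# The scenario from the text: node1 has lost the Interconnect.
evicted_nodes <<'EOF'
node1: node1
node2: node2 node3
node3: node2 node3
EOF
# prints: node1
```

With all three nodes seeing each other, the function prints nothing, which corresponds to the healthy cluster in the first picture.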

The pictures in this posting are almost identical to what I draw on the whiteboard during the course. I hope you find them useful 🙂

Related posting: Voting Disk and OCR in 11gR2: Some changes

Grid Infrastructure, RAC


Who is the Master Node in my Oracle Cluster?

Posted by Uwe Hesse in TOI on September 17, 2013

I got this question during my current course Oracle Grid Infrastructure 11g: Manage Clusterware and ASM while we discussed the backup of the OCR. That backup is done by the OCR Master, which can be any node in the cluster. It is therefore recommended to set the backup location to a shared folder that is accessible from all cluster nodes. But back to the question: here is how to find the OCR Master of the Oracle Cluster:

[grid@host01 ~]$ cat /u01/app/11.2.0/grid/log/host01/crsd/crsd.log | grep -i 'ocr master'
2013-09-17 07:49:44.237: [  OCRMAS][3014282128]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number 1
2013-09-17 07:53:00.305: [  OCRMAS][3009534864]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number 1
2013-09-17 12:19:21.414: [  OCRMAS][3009604496]th_master: NEW OCR MASTER IS 2
[grid@host01 ~]$ olsnodes -n
host01  1
host02  2
host03  3

The path contains the GRID_HOME and the local hostname, so adjust both for your system. The CRSD (Cluster Ready Services Daemon) is the process that deals with the OCR, which is why I search through its log file.
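If you only want the most recent answer rather than the whole history, the two message formats seen in the output above can be reduced to a node number. The following is a minimal sketch: the function name `latest_ocr_master` is made up for this example, it reads log lines from standard input, and it assumes the messages look exactly like the ones shown here.

```shell
#!/bin/sh
# Print the node number from the last "OCR MASTER" message on stdin.
# Handles both formats seen in crsd.log above:
#   "... I AM THE NEW OCR MASTER at incar 1. Node Number 1"
#   "... NEW OCR MASTER IS 2"
latest_ocr_master() {
  grep -i 'ocr master' |
    sed -n \
      -e 's/.*NEW OCR MASTER IS \([0-9][0-9]*\).*/\1/p' \
      -e 's/.*I AM THE NEW OCR MASTER.*Node Number \([0-9][0-9]*\).*/\1/p' |
    tail -n 1
}

# Usage (path as in the example above; adjust GRID_HOME and hostname):
#   latest_ocr_master < /u01/app/11.2.0/grid/log/host01/crsd/crsd.log
```

For the log excerpt above this yields 2, which `olsnodes -n` maps to host02.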
The Master Node of the Cluster is the one that will become the only surviving node if the interconnect fails completely. It is also the node that passes the time to the other nodes via CTSSD (Cluster Time Synchronization Service Daemon) in the absence of NTP. I suppose it is always the same node as the OCR Master, but just to be on the safe side, you can check it this way:

[grid@host01 ~]$ cat /u01/app/11.2.0/grid/log/host01/cssd/ocssd.log | grep -i 'master node'
2013-09-17 07:48:53.541: [    CSSD][3004672912]clssgmCMReconfig: reconfiguration successful, incarnation 274866533 with 1 nodes, local node number 1, master node number 1
2013-09-17 07:49:47.427: [    CSSD][2990480272]clssgmCMReconfig: reconfiguration successful, incarnation 274866534 with 1 nodes, local node number 1, master node number 1
2013-09-17 07:52:27.595: [    CSSD][2989472656]clssgmCMReconfig: reconfiguration successful, incarnation 274866536 with 1 nodes, local node number 1, master node number 1
2013-09-17 07:59:20.783: [    CSSD][2989472656]clssgmCMReconfig: reconfiguration successful, incarnation 274866537 with 2 nodes, local node number 1, master node number 1
2013-09-17 11:34:59.157: [    CSSD][2989472656]clssgmCMReconfig: reconfiguration successful, incarnation 274866538 with 3 nodes, local node number 1, master node number 1
2013-09-17 12:18:48.885: [    CSSD][2992602000]clssgmCMReconfig: reconfiguration successful, incarnation 274866540 with 3 nodes, local node number 1, master node number 2
2013-09-17 12:22:52.660: [    CSSD][2992602000]clssgmCMReconfig: reconfiguration successful, incarnation 274866541 with 2 nodes, local node number 1, master node number 2
2013-09-17 12:23:32.836: [    CSSD][2992602000]clssgmCMReconfig: reconfiguration successful, incarnation 274866542 with 3 nodes, local node number 1, master node number 2
2013-09-17 12:26:29.474: [    CSSD][3016432528]clssgmCMReconfig: reconfiguration successful, incarnation 274866543 with 3 nodes, local node number 1, master node number 2
2013-09-17 12:28:42.960: [    CSSD][2987871120]clssgmCMReconfig: reconfiguration successful, incarnation 274866544 with 3 nodes, local node number 1, master node number 2

The CSSD (Cluster Synchronization Services Daemon) is the process that deals with the Voting File, which is used to determine which nodes must reboot and which nodes survive in case of a problem with the Interconnect. Therefore, I search through its log file to determine the Master of the Cluster.
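Since most of the reconfiguration lines above repeat the same master, it can help to print only the points where the master node number changes. Here is a rough sketch of that idea; the function name `master_changes` and the output layout are my own choices for illustration, and the pattern assumes the clssgmCMReconfig message format shown in the output above.

```shell
#!/bin/sh
# Read ocssd.log lines on stdin and print one line per change of the
# master node number: timestamp, incarnation, node count, master.
master_changes() {
  sed -n 's/^\([0-9-]* [0-9:.]*\): .*clssgmCMReconfig: reconfiguration successful, incarnation \([0-9]*\) with \([0-9]*\) nodes, local node number [0-9]*, master node number \([0-9]*\).*/\1 \2 \3 \4/p' |
    awk 'NR == 1 || $5 != prev {
           # $1 $2 = date and time, $3 = incarnation, $4 = nodes, $5 = master
           print $1, $2, "incarnation=" $3, "nodes=" $4, "master=" $5
         }
         { prev = $5 }'
}

# Usage (path as in the example above; adjust GRID_HOME and hostname):
#   master_changes < /u01/app/11.2.0/grid/log/host01/cssd/ocssd.log
```

Applied to the excerpt above, this leaves two lines: the first reconfiguration with master 1 and the one at 12:18:48.885 where the master becomes node 2.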

The question comes up quite often, so this little post will be handy to point to in the future. Hope you find it useful as well 🙂

Grid Infrastructure, RAC
