I am teaching a Real Application Clusters course this week in Vienna for the Vienna Airport IT. The staff at Vienna Airport IT is very experienced in Oracle Database administration – one of the students has even worked with version 5! Often, I am the "version eldest" with 7.3.4 in my classes, but not this time 🙂
I usually start that course with a description of the basic RAC architecture, and developing and explaining the following picture takes about one and a half hours:
Maybe you can recognize something from that picture. As a legend: The blue box in the center is the Shared Storage, on which the database files reside. Beneath it (black) are the voting disk and the OCR. Unlike the database files, they can't be on an ASM diskgroup. We have a 2-node cluster here: The black boxes are the nodes A and B. They are connected to the shared storage (black lines), and each node also has local storage (small blue boxes), where the OS, the clusterware and the database software are installed. The main clusterware processes on each node are cssd (using the voting disk) and crsd (using the OCR). An instance is running on each node (I refer to the ASM instances at a later stage in the course). We have the usual background processes like DBWR and LGWR (red) and the SGA known from single instance databases (red).
Additionally, there are background processes attached to the instance that are only seen in a RAC. The most important ones are LMON and LMS (green), which make up (under their "marketing names" GES and GCS) the Global Resource Directory. Each node has at least two network cards (NICs): One for the Private Interconnect (eth2, red), and one for the Public LAN (black, above).
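As an illustration, you could list the RAC-specific lock manager processes that are actually started on a running instance with a query like this (just a sketch; the exact set of LM* processes varies by version):

```sql
-- Show the started global enqueue/cache service processes (LMON, LMS*, LMD, ...)
SELECT name, description
  FROM v$bgprocess
 WHERE paddr <> '00'     -- only processes that are actually running
   AND name LIKE 'LM%';
```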
Often, as in our setup, there is a third network card to connect the node to a SAN. The IP address (or IP alias) of the Public NIC (eth0) is not used by clients to connect to the RAC. Instead, virtual IPs (VIPs) are used (green). This has the advantage that those VIPs are resources controlled by the clusterware, so in case of a node failure, clients don't have to wait for timeouts from the network layer. Instead, the clusterware enables an immediate connect-time failover – and even, if configured, transparent application failover (TAF) for existing sessions.
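As an illustration, a TAF-enabled connect descriptor in tnsnames.ora could look roughly like this (all alias, host and service names here are made up; your setup will differ):

```
MYRAC =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = nodea-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = nodeb-vip)(PORT = 1521))
      (LOAD_BALANCE = yes)
    )
    (CONNECT_DATA =
      (SERVICE_NAME = myservice)
      (FAILOVER_MODE =
        (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 20)(DELAY = 5)
      )
    )
  )
```

Note that both addresses point at the VIPs, not at the physical public IPs, so a failed node's VIP can be taken over by a surviving node.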
That is of course not all I say about it in the class, but maybe you get an impression 🙂
#1 by Surachart Opun on August 15, 2009 - 13:13
Nice picture from your whiteboard.
#2 by Aman.... on August 15, 2009 - 20:02
Good one Uwe 🙂
#3 by Anand on August 17, 2009 - 19:41
Great Pic 🙂
#4 by Rahul on October 20, 2009 - 13:08
Of course it is nice and interesting. Thanks a lot.
#5 by Anand on February 24, 2010 - 14:40
I am working on a 2-node RAC database. LOG_ARCHIVE_MAX_PROCESSES was set to 30, which was high. So I changed the value to 10 using 'ALTER SYSTEM SET LOG_ARCHIVE_MAX_PROCESSES=10;'. As far as I know, SCOPE=BOTH and SID='*' are the defaults, so I didn't mention them in my command. Then I checked the alert log, and 20 archive processes were stopped. Up to this point things are normal. Then we had a server reboot, and after the database OPEN completed, I can see the command 'ALTER SYSTEM SET LOG_ARCHIVE_MAX_PROCESSES=30 SCOPE=BOTH SID='*';' getting fired, increasing the archive processes back to 30 again. I have checked whether any STARTUP trigger was written that causes it, but couldn't find any. Where else should I look?
#6 by Uwe Hesse on February 27, 2010 - 13:59
Off the top of my head, I have no clue, and unfortunately I have limited time right now to investigate. Have you checked the log files of the clusterware? Maybe crsd has done it. But that is just a guess.
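One thing that could be checked (just a sketch, not a definitive diagnosis) is whether the value 30 is recorded in the spfile per instance, and whether any startup trigger exists that might set it:

```sql
-- Is the parameter stored in the spfile, and for which SID?
SELECT sid, name, value
  FROM v$spparameter
 WHERE name = 'log_archive_max_processes';

-- Are there any database startup triggers at all?
SELECT owner, trigger_name, status
  FROM dba_triggers
 WHERE triggering_event LIKE 'STARTUP%';
```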
#7 by Anand on March 10, 2010 - 06:37
Can we trace the session, and if so, at what level?
#8 by Uwe Hesse on March 11, 2010 - 20:33
The clusterware automatically creates trace files:
CRS logs are in $ORA_CRS_HOME/log/hostname/crsd/. The crsd.log file is archived every 10 MB (crsd.l01, crsd.l02, …).
Specific logs are in $ORA_CRS_HOME/log/hostname/racg and in $ORACLE_HOME/log/hostname/racg.
Oracle Clusterware alerts can be found in the alert.log in the $ORA_CRS_HOME/log/ directory.