These I consider the most important points about Exadata Patching:
Where is the most recent information?
MOS Note 888828.1 is your first read whenever you think about Exadata Patching
What is to patch with which utility?
Expect quarterly bundle patches for the storage servers and the compute nodes. The other components (Infiniband switches, Cisco Ethernet Switch, PDUs) are patched less frequently and are therefore not shown in the picture.
The storage servers have their software image (which includes Firmware, OS and Exadata Software) replaced completely with the new one using patchmgr. The compute nodes get OS (and Firmware) updates with dbnodeupdate.sh, a tool that accesses an Exadata yum repository. Bundle patches for the Grid Infrastructure and for the Database Software are applied with opatch.
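As a quick reference, the component-to-utility assignment described above can be written down as a small lookup table. This is only a sketch of the paragraph's content: the dictionary keys and the helper name `tool_for` are my own labels for illustration, not Oracle terminology.

```python
# Which utility patches which component, as stated in the paragraph above.
# The keys and the function name are illustrative labels, not Oracle names.
PATCH_TOOLS = {
    "storage server": "patchmgr",
    "compute node OS and firmware": "dbnodeupdate.sh",
    "Grid Infrastructure": "opatch",
    "Database Software": "opatch",
}


def tool_for(component: str) -> str:
    """Return the utility the text names for the given component."""
    return PATCH_TOOLS[component]


if __name__ == "__main__":
    for component, tool in PATCH_TOOLS.items():
        print(f"{component}: {tool}")
```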
Rolling or non-rolling?
This is the sensitive part! Technically, you can always apply the patches for the storage servers and the patches for compute node OS and Grid Infrastructure in a rolling fashion, taking down only one server at a time. The RAC databases running on the Database Machine will be available during the patching. Should you do that?
Let’s focus on the storage servers first: Rolling patches are recommended only if you have ASM diskgroups with high redundancy or if you have a standby site to fail over to if needed. In other words: If you have a quarter rack without a standby site, don’t use rolling patches! That is because the DBFS_DG diskgroup that contains the voting disks cannot have high redundancy in a quarter rack with just three storage servers.
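The rule of thumb for the storage servers can be expressed as a tiny decision function. This is a minimal sketch of the recommendation as stated above, not an Oracle tool. The function and parameter names are hypothetical.

```python
def rolling_storage_patch_recommended(high_redundancy: bool, has_standby: bool) -> bool:
    """Rule of thumb from the text: rolling storage server patching is
    recommended only with high redundancy diskgroups or a standby site.
    Names are hypothetical; this only encodes the stated recommendation."""
    return high_redundancy or has_standby


if __name__ == "__main__":
    # Quarter rack without standby: DBFS_DG cannot have high redundancy.
    print(rolling_storage_patch_recommended(high_redundancy=False, has_standby=False))
    # Same quarter rack, but with a standby site to fail over to.
    print(rolling_storage_patch_recommended(high_redundancy=False, has_standby=True))
```

The first call prints False (do not patch rolling), the second prints True.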
Okay, so you have a half rack or bigger. Expect one storage server patch to take about two hours. With the rolling method, the servers are patched one after another, so that adds up to 14 hours of patching time for seven storage servers. Make sure that management is aware of that before they decide on the strategy.
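The arithmetic behind that estimate is simple enough to put into a helper. This is a rough sketch that assumes the two hours per storage server mentioned above and strictly sequential patching; the function name and the default value are illustrative, and real patch times will vary.

```python
def rolling_patch_hours(storage_servers: int, hours_per_server: float = 2.0) -> float:
    """Estimate total rolling patch time: servers are patched one at a time,
    so the per-server durations simply add up. The 2 hour default is the
    rough figure from the text, not a guaranteed value."""
    if storage_servers < 0 or hours_per_server < 0:
        raise ValueError("counts and durations must not be negative")
    return storage_servers * hours_per_server


if __name__ == "__main__":
    print(rolling_patch_hours(3))  # quarter rack with three storage servers -> 6.0
    print(rolling_patch_hours(7))  # seven storage servers -> 14.0
```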
Now to the compute nodes: If the patch is RAC rolling applicable, you can do that regardless of the ASM diskgroup redundancy. If a compute node gets damaged during the rolling upgrade, no data loss will happen. On a quarter rack without a standby site, however, you put availability at risk because there are only two compute nodes, and one could fail while the other is down for patching.
Why you will want to have a Data Guard Standby Site
Apart from the obvious reason for Data Guard (Disaster Recovery), there are several benefits associated with the patching strategy:
You can afford to do rolling patches with ASM diskgroups using normal redundancy and with RAC clusters that have only two nodes.
You can apply the patches on the standby site first and test them there, using the snapshot standby database functionality (and using Database Replay if you licensed Real Application Testing)
A patch set can be applied on the standby first, and the downtime for end users can be reduced to the time it takes to do a switchover
A release upgrade can be done with a (Transient) Logical Standby, again reducing the downtime to the time it takes to do a switchover
I suppose this will be my last posting in 2014, so Happy Holidays and a Happy New Year to all of you :-)
The Oracle circus went to Liverpool this year for the annual conference of the UK Oracle User Group and it was a fantastic event! Top speakers and a very knowledgeable audience too, I was really impressed by the quality we experienced. Together with my friends and colleagues Iloon and Joel, I was waving the flag for Oracle University again – and it was really fun to do so :-)
One little obstacle was that I did many presentations and roundtables myself. So there was less time for me to listen to the high quality talks of the other speakers…
Joel and I hosted three roundtables:
About Exadata, where we had amongst others Dan Norris (Member of the Platform Integration MAA Team, Oracle) and Jason Arneil (Solutions Architect, e-DBA) contributing
About Grid Infrastructure & RAC, where Ian Cookson (Product Manager Clusterware, Oracle) took many questions from the audience. We could also have had Markus Michalewicz if only I had told him the day before during the party night – I’m still embarrassed about that.
About Data Guard, where Larry Carpenter (Master Product Manager Data Guard and Maximum Availability Architecture, Oracle) took all the questions as usual. AND he hit me for the article about the Active Data Guard underscore parameter, so I think I will remove it…
Iloon delivered her presentation about Apex for a DBA audience, which was very much appreciated and attracted a big crowd again, same as in Nürnberg before.
Joel had two talks on Sunday already: Managing Sequences in a RAC Environment (This is actually a more complex topic than you may think!) and Oracle Automatic Parallel Execution (Obviously complex stuff)
I did two presentations as well: The Data Guard Broker – Why it is recommended and Data Guard 12c New Features in Action
Both times, the UKOUG was so kind as to give me very large rooms, and I can say that they didn’t look empty although I faced tough competition from other interesting talks. This is from the first presentation:
A big THANK YOU goes out to all the friendly people of UKOUG who made this possible and kept the great event that Tech14 was running! And also to the bright bunch of Oracle colleagues and Oracle techies (speakers and attendees included) who kept me good company there: You guys are the best! Looking forward to seeing you at the next conference :-)
Together with Craig Shallahamer, Lucas Jellema, Mark Rittman, Martin Bach, Pete Finnigan and Tim Fox, I will be presenting in Dubai. My topic is Minimizing Downtime with Rolling Upgrade using Data Guard
Click on the picture for details, please!
Hope to see you there :-)