A task that you will most likely encounter while administering Exadata is replacing a damaged hard disk on one of the storage servers. Fortunately, this is quite easy, because the system does almost everything itself 🙂
In particular, the original celldisks and griddisks are rebuilt automatically on the Cell Layer. On the Database Layer, the related ASM disks are rebuilt automatically as well, and because the diskgroups use (at least) normal redundancy, the availability of the databases relying on them is not affected. The task is briefly described in this MOS Note.
As soon as the MS (Management Server) background process on the Cell notices the hard disk failure, it raises an alert, which is also published to Grid Control if that is configured. Thanks to Pro-Active Disk Quarantine, the affected ASM disks, griddisks and celldisk are dropped immediately, and ASM rebalancing is triggered. You, as the responsible admin, notice the alert and either order a replacement disk or take a spare disk, pull out the damaged disk and plug the new one into the Cell. The Cell can stay online throughout, because the hard disks are hot-pluggable.
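If you want to double-check which disk the alert refers to before walking to the rack, the idea can be sketched in a few lines of Python. This is only an illustration under assumptions: it expects a plain two-column listing of disk name and status, such as CellCLI's LIST PHYSICALDISK command can produce when restricted to the name and status attributes. The function name, the sample text and the status wording are made up for this sketch, and the actual status strings depend on the Exadata software version.

```python
from __future__ import annotations


def disks_needing_attention(listing: str) -> dict[str, str]:
    """Map disk name -> status for every disk whose status is not 'normal'.

    Hypothetical helper: expects one disk per line, name first, then the
    status, which may consist of several words.
    """
    flagged = {}
    for line in listing.splitlines():
        parts = line.split(None, 1)  # name, then the rest of the line as status
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, status = parts[0], parts[1].strip()
        if status.lower() != "normal":
            flagged[name] = status
    return flagged


# Illustrative sample text, NOT captured from a real cell.
SAMPLE = """\
 20:0    normal
 20:1    normal
 20:2    critical
"""

if __name__ == "__main__":
    print(disks_needing_attention(SAMPLE))
```

Anything the function returns is a candidate for replacement; an empty result means every listed disk reports normal.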
Typically, no further administrative work is needed. Easy, isn’t it? Mr. Sengonul from Turkcell (the leading mobile communications provider in Turkey and one of our Exadata reference customers) has published the logfiles from such an incident in this posting. Thank you for that, and also for your fine presentation about the Exadata migration!