Posts

Exadata: Disk controller was hung. Cell was power cycled

Just another manic magic Monday. I’ve moved my blog from https://insanedba.blogspot.com to https://dincosman.com. Please update your bookmarks and follow/subscribe at the new address for all the latest updates and content. A more up-to-date version of this post may be available there. After a great weekend, we came to the office and performed our daily health checks, as we do every Monday. One of the storage servers (cells) of our Exadata X2-2 X4270 M2 had lost 11 of its 34 ASM disks. We were lucky: all databases were still up despite the losses. Let's examine what happened to our cell server. When I checked the mailbox, I saw an alert mail from the problematic cell stating "Disk controller was hung. Cell was power cycled." It looks like the cell disk controller was not performing well (maybe a bug or a peak moment) and forced the server to reboot. But reboots do not normally end in disk losses. I started by checking the...

Bizarre tables: starting with MD*. Let's drop some.

What are these strange tables starting with MD*? Can I drop them? I will answer first: yes, you can drop some of them if your Oracle DB version is 12cR2 or later. But which ones? There were over 3000 tables with names starting with 'MD' in one of our production databases. I knew that those tables were related to spatial indexes, but that was a huge number. So I took a deep dive into spatial indexes. For each spatial index created, a table named like "MDRT_#" is also created. There is a one-to-one relationship (except the partitioned o...

Node Eviction after Applying Release Update 19.13

Be CAREFUL before applying a Release Update! After applying Release Update 19.13 to our standby site cluster, which consists of two physical machines (not Oracle engineered systems), we started to experience node evictions and reboot problems. We immediately started to search for the root cause and followed these steps to make the nodes stable again before applying RU 19.13 to our production sites.
* We chose a sample incident that reflects the problem. On Feb 22, at 13:35, the host machine (bltdb02) rebooted.
* We looked at the lastgasp log files to find out why the node was rebooted. Grid cssdagent or cssdmonitor can record node reboots here...