Exadata: why a half-rack is the “recommended minimum size”

Lots of shops first dipped their toes in the Exadata water with a quarter-rack.

(For those who are new to the Exadata party and don’t know of a world without elastic configurations, a quarter-rack is a machine with two compute nodes and three storage cells.)

If you are / were one of those customers, you’ll probably have winced at the difference between the “raw” storage capacity and the “usable” storage capacity when you got to play with it for the first time.

While you could choose to configure your DATA and RECO diskgroups with HIGH redundancy in ASM, did you notice that you couldn’t do the same with the DBFS_DG / SYSTEM_DG?

“A slight HA disadvantage of an Oracle Exadata Database Machine X3-2 quarter or eighth rack is that there are insufficient Exadata cells for the voting disks to reside in any high redundancy disk group which can be worked around by expanding with 2 more Exadata cells. Voting disks require 5 failure groups or 5 Exadata cells; this is one of the main reasons why an Exadata half rack is the recommended minimum size.”

Basically, you need at least 5 storage cells for each Exadata environment if you want to have true “high availability” with your Exadata machine.
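To see why the number of cells matters, here is a minimal Python sketch of the majority rule for voting disks. It assumes one voting disk per storage cell and that the cluster needs a strict majority of voting disks to stay up; the function names are made up for illustration and are not part of any Oracle tool.

```python
def votes_needed(voting_disks: int) -> int:
    """Strict majority of voting disks that must remain reachable."""
    return voting_disks // 2 + 1


def cell_losses_tolerated(voting_disks: int) -> int:
    """Cells that can fail while a majority of voting disks survives.

    Assumption: exactly one voting disk lives on each storage cell.
    """
    return voting_disks - votes_needed(voting_disks)


if __name__ == "__main__":
    # 3 voting disks = normal redundancy, 5 = high redundancy
    for disks in (3, 5):
        print(f"{disks} voting disks: need {votes_needed(disks)} online, "
              f"can lose {cell_losses_tolerated(disks)} cell(s)")
```

Under those assumptions, three voting disks survive the loss of one cell, while five survive the loss of two, which is the extra protection a high redundancy disk group is meant to give.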

While quarter-rack machines have 3 storage cells, half-rack machines have 7 or 8 storage cells, depending on the model.

Let’s say that you have the model with 8 storage cells: if you split a half-rack machine equally, you’ll have two quarter-rack machines with 4 storage cells each, so you would need one more storage cell per machine to provide HA for the DBFS_DG / SYSTEM_DG diskgroup.
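The same arithmetic as a rough Python sketch. The 3 and 5 cell figures are the ones quoted in this post; the dictionary and function names are hypothetical, purely for illustration.

```python
# Storage cells (failure groups) needed to hold the voting disks,
# per ASM redundancy level, as stated in this post.
REQUIRED_CELLS = {"normal": 3, "high": 5}


def extra_cells_needed(cells: int, redundancy: str = "high") -> int:
    """Return how many more storage cells a machine needs before its
    voting disks can live in a disk group of the given redundancy."""
    return max(0, REQUIRED_CELLS[redundancy] - cells)


if __name__ == "__main__":
    machines = [
        ("quarter rack", 3),
        ("each half of a split 8-cell half rack", 8 // 2),
        ("half rack (7 cells)", 7),
        ("half rack (8 cells)", 8),
    ]
    for label, cells in machines:
        print(f"{label}: {cells} cells, "
              f"{extra_cells_needed(cells)} more needed for high redundancy")
```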

For some reason, this nugget escaped my attention until recently. Even more reason to have a standby Exadata machine at your DR site …

9 thoughts on “Exadata: why a half-rack is the “recommended minimum size””

Oracle is very conservative with their hardware. Truly speaking, the chances of losing 3 whole cells are quite low, but realistically this is what the 5 failure groups are guarding against. I have never seen that happen… and if it did I would be worried about much more than just my vote and OCR 🙂

This does not mean that a 1/4 rack can’t be used for HA. You can still do rolling patching with a 1/4 rack, and still get HA in most cases. It just means that there is the possibility that the Voting/OCR may have to be moved if a cell is down, and you lose the disk containing those files.

High Redundancy (triple mirroring) works with 3 cells and protects against double disk failure, failure of disk and flash card, and double cell failure.

Some notes:

However, the vote device cannot be stored in a high redundancy diskgroup with 3 cells because Oracle needs 5 copies of the voting disk. It can be stored in a normal redundancy diskgroup that has 3 cells (usually DBFS disk group).

Because of the use of high redundancy on vote disks, if you lose 2 cells, Exadata will halt because there is no longer a quorum (2 out of 3) of vote disks available. However, you can force a restart and continue, and you won’t lose any data. In contrast, if you have normal redundancy and lose two cells, you will halt and lose all your data (normal redundancy does not survive double disk failure).

[…] have introduced high availability quorum disks for the quarter-rack and eighth-rack machines. I blogged about this before as I thought it had the potential to be a real “gotcha” if you were expecting to run […]