A: ndb_mgmd, ndbd , mysqld
B: ndb_mgmd, ndbd , mysqld
Now you have a couple of options on how to handle potential network partitioning/split brain which could occur if the computer A and B lose contact with each other. Let's see the implications of these options.
Option 1: ArbitrationRank=1, no NIC bonding
- If ndb_mgmd on A is elected as the arbitrator and host A crashes, then data node on host B will also die (cannot reach arbitrator).
- If ndb_mgmd on A is elected as the arbitrator and host B lose contact with host A, then the data node on B will shut down since it can't reach the arbitrator. For further discussions, call this situation X.
- If you are in situation X and if you restart data node on B and you have not fixed the link), then it will start isolated (after some timeouts have kicked in, StartPartionedTimeout and StartPartialTimeout). Now you are in situation Y.
- If you had an application writing data (and assume this application crashed when the connection from B -> A was lost) then the following can happen if you are in situation Y
- If you do "select count(*) from t1" on mysqld host A, then it can differ from a "select count(*) from t1" on mysqld on host B, because the data node on B never got the message to flush the redo buffer to the redo log.
Now you have ended up in a very bad situation. Kill data node on host B. Fix network, restart data node with --initial. - However, this could be combined with STONITH to kill off one of the data nodes.
- no arbitrators are used - no protection from split brain.
- if host A lose contact with host B, then the data nodes will be split
- Application can write to A and B, but e.g A will _never_ see the changes from B, and vice versa
- The data nodes will drift apart and be out of sync (inconsistent). Very bad.
- When the link comes up again the data nodes A and B will not reconnect to each other.
- System administrator must kill one of the data nodes and restart it with --initial.
Option 3: ArbitrationRank=0, NIC bonding + redundant switch.
- no arbitrators are used - no protection from split brain.
- Same problems as Option 2) if a) both network links are broken or b) both switches crashes. However, the likelihood of this happening is very small.
- External arbitrator is used: You, a sys admin, a dba, or a process.
- When the data nodes lose contact with each other, they are read-only for ArbitrationDelay before unlocking (and becoming writable).
- This means that you have ArbitrationDelay (configurable) time to decide which data node that should live and who should die.
- The data node that you killed can then be restarted with "ndbd", i.e, without --initial.
- "You" in this case can be an external program, sysadmin etc..
- If you use an external program must run on a third computer.
- If you use asynchronous replication between SITE_A and SITE_B then it could be possible to have the management server there with ArbitrationRank=1, and then potentially local management servers on host A and B that only has configuration data (i.e, ArbitrationRank=0 for those).
- This must be tested to see if the link between SITE_A and SITE_B is "good enough". The link should also be reliable.
- This is the best and recommended approach (and best if even have a computer D that is like C for redundancy matters).
- Run the management server on host C. Setup would look like:
A: ndbd + mysqld
B: ndbd + mysqld
C: ndb_mgmd (AribtrationRank=1) + mysqld
What is the "best" option:
- Option 6) - get a third computer - simple
- Option 3), Option 4) - "you do it, but be careful"
- Option 5) - must be tested
- Option 1) (atleast the cluster will crash avoiding inconsistencies)
- Option 2)
Of course redundant NICs (bonding) and redundant switches can be used in all of the options above, even it is not explicitly said so for some of them.