Monday, January 21, 2013

Galera Recovery - rsync or xtrabackup

Is there any difference using rsync or xtrabackup when performing SSTs?

Yes. The main difference between the two methods is that rsync blocks the Donor: the Donor is read-only for the duration of the SST, so it cannot accept writes.
Thus, in a three-node cluster where one node fails, two nodes are effectively out of service while the failed node recovers.
With xtrabackup this is not a problem, as the Donor keeps serving reads and writes during the SST.

Is there any difference in recovery times comparing the two methods?

To test this we set up a three-node Galera cluster (based on the Codership build) and populated it with 22GB of data using sysbench. We use the standard Galera wsrep options.

The servers were created on Rackspace UK (all in the same availability zone), run Ubuntu 12.04, and have 2GB of RAM each. No changes were made to the data during the recovery or between the runs.

17:24:49 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 12586)
17:55:26 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 12586)

Recovery Time (rsync): ~31 minutes

To test xtrabackup, we installed the necessary tools and changed the my.cnf files following the procedure outlined here.

8:35:45 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 12586)
9:06:04 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 12586)

Recovery Time (xtrabackup): ~30 minutes
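For anyone who wants to double-check the recovery times, here is a minimal Python sketch that computes the elapsed time between the JOINER and SYNCED log lines of each run. The helper name `elapsed` is made up for illustration, and the sketch assumes the first pair of timestamps belongs to the rsync run and the second to the xtrabackup run, as the order of the post suggests.

```python
from datetime import datetime, timedelta

def elapsed(start: str, end: str) -> timedelta:
    """Return the time between two HH:MM:SS log timestamps (hypothetical helper)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    # The log lines carry no date, so assume a negative difference
    # means the run crossed midnight and wrap it by one day.
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta

# Timestamps copied from the log lines above.
rsync_run = elapsed("17:24:49", "17:55:26")      # first run (assumed rsync)
xtrabackup_run = elapsed("8:35:45", "9:06:04")   # second run (assumed xtrabackup)

print("rsync:     ", rsync_run)
print("xtrabackup:", xtrabackup_run)
```

The two runs come out within about 20 seconds of each other, which is well inside what you would expect from run-to-run noise on shared cloud instances.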

So for this data set and setup, the choice between rsync and xtrabackup makes no difference to recovery time. The benefit of having a read/write Donor, however, is an appealing case for using xtrabackup as the SST method. Over a WAN the results may differ; we will look into this in a later post.

On a side note, I actually expected rsync to be faster than xtrabackup (in theory it should be). The result may be due to the limited bandwidth between the Rackspace servers (it peaks at about 15MB/s on the instances used here), so in a private data center it may be different. However, having a read/write Donor is something I appreciate a lot. Thanks, Massimo, for bringing it up.
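A rough back-of-the-envelope check of the bandwidth theory, as a Python sketch. It assumes the full 22GB was transferred in each SST, that 1GB = 1024MB, and that the whole JOINER-to-SYNCED interval was spent copying data; none of these were measured, so treat the numbers as an estimate only. The function name `throughput_mb_s` is made up for illustration.

```python
DATA_MB = 22 * 1024   # 22GB data set, assuming 1GB = 1024MB
PEAK_MB_S = 15        # observed peak between the Rackspace instances

def throughput_mb_s(seconds: int) -> float:
    """Average transfer rate if all of DATA_MB moved in `seconds` (hypothetical helper)."""
    return DATA_MB / seconds

# Durations taken from the log timestamps of the two runs, in seconds.
runs = {
    "first run": 30 * 60 + 37,    # 17:24:49 -> 17:55:26
    "second run": 30 * 60 + 19,   # 8:35:45 -> 9:06:04
}

for name, seconds in runs.items():
    rate = throughput_mb_s(seconds)
    print(f"{name}: {rate:.1f} MB/s ({rate / PEAK_MB_S:.0%} of peak)")
```

Under these assumptions both runs average a bit over 12MB/s, not far below the 15MB/s peak, which is consistent with the network rather than the SST method being the bottleneck.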