Mydbops
Scaling database Operations
☰ Menu
How to Switch Replica Master of a non-GTID
Slave in Percona Cluster ?
Waseem Akram galera, InnoDB, troubleshooting January 31, 2019February 1, 2019
4 Minutes
Introduction –
Recently i worked on a production issue for one of our client under support .They have a
architecture of a three node Galera cluster (https://mydbops.wordpress.com/2016/06/14/galera-
cluster-a-walk-through-presentation/) with one asynchronous slave .
Node1 – 172.10.2.11
Node2 – 172.10.2.12
Node3 – 172.10.2.13
Replica – 172.10.2.14
Architecture –
The slave(replica) was configured with node3 as replica master. Unfortunately the node 3 was
crashed with an OOM killer ,also server has a low gcache size, so when i am trying to start the
node 3 , it went to SST . Here the data size was around 2.6 TB , in general for completion of whole
SST and joining the node back to cluster will take around approximately 12 hours.
As i told earlier, the replication slave was under node3 and all reporting applications were
pointed to async slave only .So, I can’t wait upto 12 hours as it will affect my entire client reporting
environment.
To overcome this scenario, i had planned to switch my async slave under node 2 ( 172.10.2.12 ) . By
this blog post, i am going to explain the steps how i was able to achieve this .
Step 1:
Stop the slave server for getting persistent log file and position in async slave.
mysql> STOP SLAVE SQL_THREAD;
Query OK, 0 rows affected (0.11 sec)
From the command SHOW SLAVE STATUS\G get the Relay_Log_File and
Read_Master_Log_Pos.
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Reconnecting after a failed master event read
Master_Host: 172.10.2.13
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.003099
Read_Master_Log_Pos: 677232126
Relay_Log_File: replica-relay-bin.009093
Relay_Log_Pos: 677232323
Relay_Master_Log_File: mysql-bin.003099
Here Relay_Log_File is replica-relay-bin.009093 and Read_Master_Log_Pos is 677232126
Step 2:
Decode the respective Relay_Log_File and get the Xid
(https://dev.mysql.com/doc/internals/en/xid-event.html) value using the position of
Read_Master_Log_Pos .
mysqlbinlog --no-defaults --base64-output=decode-rows -vv /data/mysql/re
plica-relay-bin.009093 > /home/mydbops/replica-relay-bin.009093.txt
[mydbops@replica ~]$ less replica-relay-bin.009093.txt | grep "677232126
"#181202 3:27:13 server id 12 end_log_pos 677232126 CRC32 0xc818fec0
Xid = 572228464
Extended view of replica-relay-bin.009093
# at 677232001
#181202 3:27:13 server id 12 end_log_pos 677231872 CRC32 0x19bffe41
Query thread_id=7239194 exec_time=1 error_code=0
SET TIMESTAMP=1543701433/*!*/;
SET @@session.foreign_key_checks=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,
@@session.collation_server=8/*!*/;
BEGIN
/*!*/;
# at 677232069
#181202 3:27:13 server id 12 end_log_pos 677231983 CRC32 0x8c9b2a51
Rows_query
# insert into mydbops.wsrep_paused(tag,counter) values ( NAME_CONST('nod
e',2),@new_value)
# at 677232180
#181202 3:27:13 server id 12 end_log_pos 677232046 CRC32 0xa0e15b87
Table_map: `mydbops`.`wsrep_paused` mapped to number 297
# at 677232243
#181202 3:27:13 server id 12 end_log_pos 677232095 CRC32 0xe5c9c37e
Write_rows: table id 297 flags: STMT_END_F
### INSERT INTO `mydbops`.`wsrep_paused`
### SET
### @1=1517733 /* INT meta=0 nullable=0 is_null=0 */
### @2=2 /* TINYINT meta=0 nullable=0 is_null=0 */
### @3=7555.01 /* FLOAT meta=4 nullable=1 is_null=0 */
### @4=1543701433 /* TIMESTAMP(0) meta=0 nullable=1 is_null=0 */
# at 677232292
#181202 3:27:13 server id 12 end_log_pos 677232126 CRC32 0xc818fec0
Xid=572228464
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
Step 3:
Using Xid number get end_log_pos from node 2 to which we are going configure replication.
[mydbops@db02 ~]$ sudo mysqlbinlog /data/mysql/mysql-bin.002279 --base6
4-output=DECODE-ROWS --verbose | grep "Xid = 572228464"#181202 3:27:13
server id 12 end_log_pos 815896364 CRC32 0x519db5da Xid = 572228464
from Xid ‘572228464’ we got the binlog position ‘815896364‘ in mysql-bin.002279
Step 4:
Pointing the replication to node 2 by running change master in slave server.
mysql> reset slave;
Query OK, 0 rows affected (0.11 sec)
mysql> reset slave all;
Query OK, 0 rows affected (0.01 sec)
mysql> change master to master_host='172.20.4.12',master_port=3306,maste
r_log_file='mysql-bin.002279',master_log_pos=815896364;
Query OK, 0 rows affected (0.02 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: 172.20.4.12
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.002280
Read_Master_Log_Pos: 100575837
Relay_Log_File: replica-relay-bin.000002
Relay_Log_Pos: 317
Relay_Master_Log_File: mysql-bin.002279
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
After switch over current master for replica is node 2 (172.20.4.12):
Conclusion:
From the four simple steps i was able to get back the slave and make it production ready,
which reduces the impact of downtime of a switch with out GTID.
Tagged:
Galera Cluster,
MySQL
Published by Waseem Akram
Waseem Akram is a Oracle certified MySQL DBA. Working in MyDBOPS IT Solutions on MySQL and
related technologies to ensures database performance. Handling multi client tasks round the clock and also
working with a team of experts at MyDBOPS. View all posts by Waseem Akram
WordPress.com.