0% found this document useful (0 votes)
78 views5 pages

How To Switch Replica Master of A Non-GTID Slave in Percona Cluster ? - Mydbops

The document discusses how to switch the replica master of a non-GTID MySQL slave in a Percona Cluster from a failed node to another active node. It describes stopping replication on the slave, identifying the replication position using SHOW SLAVE STATUS, decoding the relay log file to get the XID, finding the matching position on the new master, and running CHANGE MASTER TO to point the slave to the new master node and restart replication. This process allows switching the slave's replication source without downtime to avoid impacting reporting applications during the failed node's recovery.

Uploaded by

Manosh Malai
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
78 views5 pages

How To Switch Replica Master of A Non-GTID Slave in Percona Cluster ? - Mydbops

The document discusses how to switch the replica master of a non-GTID MySQL slave in a Percona Cluster from a failed node to another active node. It describes stopping replication on the slave, identifying the replication position using SHOW SLAVE STATUS, decoding the relay log file to get the XID, finding the matching position on the new master, and running CHANGE MASTER TO to point the slave to the new master node and restart replication. This process allows switching the slave's replication source without downtime to avoid impacting reporting applications during the failed node's recovery.

Uploaded by

Manosh Malai
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 5

Mydbops

Scaling database Operations

☰ Menu

How to Switch Replica Master of a non-GTID


Slave in Percona Cluster ?
Waseem Akram galera, InnoDB, troubleshooting January 31, 2019February 1, 2019
4 Minutes
Introduction –

Recently i worked on a production issue for one of our client under support .They have a
architecture of a three node Galera cluster (https://mydbops.wordpress.com/2016/06/14/galera-
cluster-a-walk-through-presentation/) with one asynchronous slave .

Node1 – 172.10.2.11
Node2 – 172.10.2.12
Node3 – 172.10.2.13
Replica – 172.10.2.14

Architecture –

The slave(replica) was configured with node3 as replica master. Unfortunately the node 3 was
crashed with an OOM killer ,also server has a low gcache size, so when i am trying to start the
node 3 , it went to SST . Here the data size was around 2.6 TB , in general for completion of whole
SST and joining the node back to cluster will take around approximately 12 hours.

As i told earlier, the replication slave was under node3 and all reporting applications were
pointed to async slave only .So, I can’t wait upto 12 hours as it will affect my entire client reporting
environment.

To overcome this scenario, i had planned to switch my async slave under node 2 ( 172.10.2.12 ) . By
this blog post, i am going to explain the steps how i was able to achieve this .

Step 1:

Stop the slave server for getting persistent log file and position in async slave.
mysql> STOP SLAVE SQL_THREAD;
Query OK, 0 rows affected (0.11 sec)

From the command SHOW SLAVE STATUS\G get the Relay_Log_File and
Read_Master_Log_Pos.

mysql> show slave status\G


*************************** 1. row ***************************
Slave_IO_State: Reconnecting after a failed master event read
Master_Host: 172.10.2.13
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.003099
Read_Master_Log_Pos: 677232126
Relay_Log_File: replica-relay-bin.009093
Relay_Log_Pos: 677232323
Relay_Master_Log_File: mysql-bin.003099

Here Relay_Log_File is replica-relay-bin.009093 and Read_Master_Log_Pos is 677232126

Step 2:

Decode the respective Relay_Log_File and get the Xid


(https://dev.mysql.com/doc/internals/en/xid-event.html) value using the position of
Read_Master_Log_Pos .

mysqlbinlog --no-defaults --base64-output=decode-rows -vv /data/mysql/re


plica-relay-bin.009093 > /home/mydbops/replica-relay-bin.009093.txt

[mydbops@replica ~]$ less replica-relay-bin.009093.txt | grep "677232126


"#181202 3:27:13 server id 12 end_log_pos 677232126 CRC32 0xc818fec0
Xid = 572228464

Extended view of replica-relay-bin.009093


# at 677232001
#181202 3:27:13 server id 12 end_log_pos 677231872 CRC32 0x19bffe41
Query thread_id=7239194 exec_time=1 error_code=0
SET TIMESTAMP=1543701433/*!*/;
SET @@session.foreign_key_checks=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,
@@session.collation_server=8/*!*/;
BEGIN
/*!*/;
# at 677232069
#181202 3:27:13 server id 12 end_log_pos 677231983 CRC32 0x8c9b2a51
Rows_query
# insert into mydbops.wsrep_paused(tag,counter) values ( NAME_CONST('nod
e',2),@new_value)
# at 677232180
#181202 3:27:13 server id 12 end_log_pos 677232046 CRC32 0xa0e15b87
Table_map: `mydbops`.`wsrep_paused` mapped to number 297
# at 677232243
#181202 3:27:13 server id 12 end_log_pos 677232095 CRC32 0xe5c9c37e
Write_rows: table id 297 flags: STMT_END_F
### INSERT INTO `mydbops`.`wsrep_paused`
### SET
### @1=1517733 /* INT meta=0 nullable=0 is_null=0 */
### @2=2 /* TINYINT meta=0 nullable=0 is_null=0 */
### @3=7555.01 /* FLOAT meta=4 nullable=1 is_null=0 */
### @4=1543701433 /* TIMESTAMP(0) meta=0 nullable=1 is_null=0 */
# at 677232292
#181202 3:27:13 server id 12 end_log_pos 677232126 CRC32 0xc818fec0
Xid=572228464
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;

Step 3:

Using Xid number get end_log_pos from node 2 to which we are going configure replication.

[mydbops@db02 ~]$ sudo mysqlbinlog /data/mysql/mysql-bin.002279 --base6


4-output=DECODE-ROWS --verbose | grep "Xid = 572228464"#181202 3:27:13
server id 12 end_log_pos 815896364 CRC32 0x519db5da Xid = 572228464

from Xid ‘572228464’ we got the binlog position ‘815896364‘ in mysql-bin.002279


Step 4:

Pointing the replication to node 2 by running change master in slave server.

mysql> reset slave;


Query OK, 0 rows affected (0.11 sec)

mysql> reset slave all;


Query OK, 0 rows affected (0.01 sec)

mysql> change master to master_host='172.20.4.12',master_port=3306,maste


r_log_file='mysql-bin.002279',master_log_pos=815896364;
Query OK, 0 rows affected (0.02 sec)

mysql> start slave;


Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G


*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: 172.20.4.12
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.002280
Read_Master_Log_Pos: 100575837
Relay_Log_File: replica-relay-bin.000002
Relay_Log_Pos: 317
Relay_Master_Log_File: mysql-bin.002279
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:

After switch over current master for replica is node 2 (172.20.4.12):

Conclusion:

From the four simple steps i was able to get back the slave and make it production ready,
which reduces the impact of downtime of a switch with out GTID.
Tagged:
Galera Cluster,
MySQL

Published by Waseem Akram

Waseem Akram is a Oracle certified MySQL DBA. Working in MyDBOPS IT Solutions on MySQL and
related technologies to ensures database performance. Handling multi client tasks round the clock and also
working with a team of experts at MyDBOPS. View all posts by Waseem Akram

WordPress.com.

You might also like