Confirm replication is healthy
Check replication status on the replica. Do not proceed if lag is non-zero or either replication thread is not running; fix replication first.
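If this gate is scripted rather than checked by eye, the three status fields can be tested mechanically. The following is a minimal sketch, not part of the original procedure: `replica_healthy` is a hypothetical helper that reads `SHOW REPLICA STATUS\G` output on standard input and succeeds only when both threads report `Yes` and the lag is exactly `0`.

```shell
# Hypothetical helper (not part of the runbook): reads the output of
# SHOW REPLICA STATUS\G on stdin and exits 0 only when both replication
# threads are running and Seconds_Behind_Source is exactly 0.
replica_healthy() {
  awk -F': ' '
    { gsub(/^[ \t]+/, "", $1) }              # strip the left padding mysql adds
    $1 == "Replica_IO_Running"    { io  = $2 }
    $1 == "Replica_SQL_Running"   { sql = $2 }
    $1 == "Seconds_Behind_Source" { lag = $2 }
    END { exit !(io == "Yes" && sql == "Yes" && lag == "0") }
  '
}

# Usage sketch (connection options are an assumption):
#   mysql -h mysql-prod-replica -e 'SHOW REPLICA STATUS\G' | replica_healthy \
#     || echo "replication is not healthy; stop here"
```

A `NULL` lag, which MySQL reports when the SQL thread is not running, fails the check as well because it is compared as a string against `0`.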
SHOW REPLICA STATUS\G
-- Replica_IO_Running: Yes
-- Replica_SQL_Running: Yes
-- Seconds_Behind_Source: 0
Freeze writes on the primary
This starts the write outage. Move quickly through steps 3–5 (catch-up, promotion, and traffic redirect).
SET GLOBAL super_read_only = 1;
SET GLOBAL read_only = 1;
Wait for replica to fully catch up
This should be near-instant because writes are frozen. If the replica has not caught up within a few seconds, roll back using the Rollback section below.
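`Seconds_Behind_Source` can read 0 while the replica's receiver thread is still fetching events, so a GTID comparison is a stricter way to confirm catch-up. The following is a minimal sketch under stated assumptions: `wait_for_catchup` is a hypothetical helper, GTID replication is enabled (as `SOURCE_AUTO_POSITION = 1` elsewhere in this runbook implies), and the `mysql` client can reach both hosts with credentials from an option file.

```shell
# Hypothetical helper (not part of the runbook). Assumes GTID-based replication
# and that the mysql client picks up credentials from an option file such as
# ~/.my.cnf.
wait_for_catchup() {
  primary_host=$1
  replica_host=$2
  timeout_s=${3:-10}

  # Everything the frozen primary has executed, as a single-line GTID set.
  gtids=$(mysql -h "$primary_host" -N -B \
    -e "SELECT REPLACE(@@GLOBAL.gtid_executed, '\n', '')") || return 2
  [ -n "$gtids" ] || return 2

  # WAIT_FOR_EXECUTED_GTID_SET returns 0 once the replica has applied the set,
  # or 1 if the timeout expires first.
  result=$(mysql -h "$replica_host" -N -B \
    -e "SELECT WAIT_FOR_EXECUTED_GTID_SET('$gtids', $timeout_s)") || return 2
  [ "$result" = "0" ]
}

# Usage sketch:
#   wait_for_catchup mysql-prod-primary mysql-prod-replica 10 \
#     || echo "replica did not catch up in time; roll back"
```

The function returns 0 only when the replica has applied every transaction the primary reports, 1 on timeout, and 2 if either query fails or the primary reports an empty GTID set.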
SHOW REPLICA STATUS\G
-- Seconds_Behind_Source: 0
Promote the replica
STOP REPLICA;
RESET REPLICA ALL;
SET GLOBAL read_only = 0;
SET GLOBAL super_read_only = 0;
Redirect traffic
# Look up the pod IP of the replica's proxy.
kubectl get pods -n tailscale \
-l tailscale.com/parent-resource=mysql-replica \
-o jsonpath='{.items[0].status.podIP}{"\n"}'
# Point the mysql EndpointSlice at that IP (substitute it for <REPLICA_PROXY_IP>).
kubectl patch endpointslice mysql --type=json \
-p '[{"op":"replace","path":"/endpoints/0/addresses/0","value":"<REPLICA_PROXY_IP>"}]'
Write outage ends here. Apps reconnect within a few seconds.
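Copying the IP by hand into the patch is the easiest place in this step to slip, since an empty lookup result would still produce a syntactically valid patch. As a hedged sketch, the hypothetical helper `build_endpoint_patch` below builds the same JSON Patch body and refuses input that is empty or contains anything other than digits and dots. It assumes an IPv4 pod IP and is not part of the original procedure.

```shell
# Hypothetical helper (not part of the runbook): builds the JSON Patch body for
# the kubectl patch command, rejecting empty input and anything containing
# characters other than digits and dots (IPv4 is assumed).
build_endpoint_patch() {
  ip=$1
  case "$ip" in
    ''|*[!0-9.]*) echo "invalid proxy IP: '$ip'" >&2; return 1 ;;
  esac
  printf '[{"op":"replace","path":"/endpoints/0/addresses/0","value":"%s"}]\n' "$ip"
}

# Usage sketch:
#   ip=$(kubectl get pods -n tailscale \
#     -l tailscale.com/parent-resource=mysql-replica \
#     -o jsonpath='{.items[0].status.podIP}')
#   patch=$(build_endpoint_patch "$ip") &&
#     kubectl patch endpointslice mysql --type=json -p "$patch"
```

The character check is deliberately loose; it guards against an empty or placeholder value, not against every malformed address.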
Verify
kubectl run verify --rm -it --image=mysql:8.0 --restart=Never -- \
mysql -h mysql.default.svc.cluster.local -u <user> -p<pass> \
-e "SELECT @@server_id, @@read_only"
-- read_only should be 0, server_id should match the replica
Do your maintenance
mysql-prod-primary is now idle. Reboot, patch, or do whatever you need.
Rejoin the primary as a replica
Once the machine is back up, connect to mysql-prod-primary and configure it as a replica.
STOP REPLICA;
RESET REPLICA ALL;
SET GLOBAL read_only = 1;
SET GLOBAL super_read_only = 1;
CHANGE REPLICATION SOURCE TO
SOURCE_HOST = 'mysql-prod-replica',
SOURCE_USER = 'repl',
SOURCE_PASSWORD = '<REPL_PASSWORD>',
SOURCE_AUTO_POSITION = 1,
GET_SOURCE_PUBLIC_KEY = 1;
START REPLICA;
SHOW REPLICA STATUS\G
-- Replica_IO_Running: Yes
-- Replica_SQL_Running: Yes
-- Seconds_Behind_Source: 0
Rollback
Roll back (before step 5 only)
If anything goes wrong before the EndpointSlice is patched, traffic is still going to the old primary. Undo the freeze on mysql-prod-primary:
SET GLOBAL super_read_only = 0;
SET GLOBAL read_only = 0;
If you already promoted the replica (step 4), reconfigure it as a replica again:
STOP REPLICA;
RESET REPLICA ALL;
SET GLOBAL read_only = 1;
SET GLOBAL super_read_only = 1;
CHANGE REPLICATION SOURCE TO
SOURCE_HOST = 'mysql-prod-primary',
SOURCE_USER = 'repl',
SOURCE_PASSWORD = '<REPL_PASSWORD>',
SOURCE_AUTO_POSITION = 1,
GET_SOURCE_PUBLIC_KEY = 1;
START REPLICA;
There is no rollback after step 5. Use the Restore Topology runbook instead.