mysql_backups_using_replication

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Next revision
Previous revision
mysql_backups_using_replication [12.11.2025 23:14] – created Pascal Sutermysql_backups_using_replication [13.11.2025 09:20] (current) – [Backup and Monitoring] Pascal Suter
Line 3: Line 3:
  
 in my case i had a docker-compose.yml which contains the php web app and a mysql server service. i first copied the service in the ''docker-compose.yml'' file to a ''mysql-replica'' service, so that the respective sections look like this:  in my case i had a docker-compose.yml which contains the php web app and a mysql server service. i first copied the service in the ''docker-compose.yml'' file to a ''mysql-replica'' service, so that the respective sections look like this: 
-<code>+<code yaml>
   mysql:   mysql:
 #    image: mysql:8.0 #    image: mysql:8.0
Line 40: Line 40:
  
 for the primary server, make sure the config contains the following settings:  for the primary server, make sure the config contains the following settings: 
-<code>+<code conf>
 [mysqld] [mysqld]
 server-id = 1 server-id = 1
Line 58: Line 58:
  
 and on the replica the settings should look like this:  and on the replica the settings should look like this: 
-<code>+<code conf>
 [mysqld] [mysqld]
 server-id = 2 server-id = 2
Line 64: Line 64:
 relay_log = relay-bin relay_log = relay-bin
  
-keep binlog on replica too (useful for cascading replication, backups, etc.) +disable binary logging as we don't need this on the replica 
-log_bin = mysql-bin +skip-log-bin
-binlog_format = ROW+
  
 +# GTID functionality is needed for replication
 gtid_mode = ON gtid_mode = ON
 enforce_gtid_consistency = ON enforce_gtid_consistency = ON
-log_slave_updates = ON 
  
 # make it read-only for safety # make it read-only for safety
Line 77: Line 76:
  
 # don't auto-start replication until we finish setup # don't auto-start replication until we finish setup
 +# then comment this out
 skip_slave_start = ON skip_slave_start = ON
 </code> </code>
Line 124: Line 124:
 </code> </code>
  
-you can also see the the executed and retreived Gtid dataset, they should update as the master is writing data to the db. +you can also see the the executed and retrieved Gtid dataset, they should update as the master is writing data to the db. 
  
 +now remove the ''skip_slave_start = ON'' line or comment it out in the ''conf/mysql-replica/replica.cnf'' file and restart the container, then re-check if it is still syncing. 
 +===== Backup and Monitoring =====
 +now lets create a backup script. this script will first check if the replication is still running and updated before it shuts down the replica container and creates a tar.gz file of the database data directory. on success, it will notify a uptime kuma push monitor. 
 +
 +for this to work we first need to create a monitoring user **on the master database** which will then auto sync to the read-only replica database as well. 
 +
 +<code>
 +docker-compose exec mysql mysql -uroot -psecretrootpassword
 +
 +CREATE USER 'monitor'@'%' IDENTIFIED BY 'monitor-password';
 +GRANT REPLICATION CLIENT ON *.* TO 'monitor'@'%';
 +FLUSH PRIVILEGES;
 +</code>
 +
 +to test if it is working, run the following query **on the replica mysql server** this time: 
 +  docker-compose exec mysql-replica mysql -umonitor -pmonitor-password -e "SHOW SLAVE STATUS\G"
 +this should show the same output we saw previously when setting up the replication. 
 +
 +and here is the script that does it all. it only sends a heartbeat update to the uptime kuma push monitor when the backup is successful AND the replica server was replicating prior to stopping it and backing it up. obviously you need to adjust the user parameters at the begining of the script: 
 +<code bash>
 +#!/usr/bin/env bash
 +set -euo pipefail
 +
 +########################################
 +# Settings
 +########################################
 +
 +# Directory where your docker-compose.yml lives
 +COMPOSE_DIR="/opt/buchmann-prod"   # <-- CHANGE THIS
 +
 +# Name of the replica service in docker-compose
 +REPLICA_SERVICE="mysql-replica"
 +
 +# Where the replica's data directory is on the host
 +REPLICA_DATA_DIR="${COMPOSE_DIR}/data/mysql-replica"
 +
 +# Where to store backups
 +BACKUP_DIR="/backup"
 +
 +# How many backups to keep
 +MAX_BACKUPS=24
 +
 +# Uptime Kuma entity ID
 +ENTITY_ID="2N88TDy5YeCX7n3cJD5Wok9DNZhdRlw6"       # <-- CHANGE THIS
 +
 +# Monitoring DB user (created on the PRIMARY, replicated to replica)
 +MONITOR_USER="monitor"
 +MONITOR_PASSWORD="jookeg4ahr0eidienaiGhe8nieGo1u"   # <-- CHANGE THIS or move to env
 +
 +# State file to track last Executed_Gtid_Set
 +STATE_DIR="${BACKUP_DIR}/.state"
 +LAST_GTID_FILE="${STATE_DIR}/last_gtid"
 +
 +########################################
 +# Globals
 +########################################
 +
 +START_TS=$(date +%s)
 +REPL_WARN=0   # 0=OK, 1=replication issue detected
 +
 +mkdir -p "${BACKUP_DIR}" "${STATE_DIR}"
 +
 +########################################
 +# Functions
 +########################################
 +
 +check_replication() {
 +  echo "[$(date)] Checking replication status on ${REPLICA_SERVICE}..."
 +
 +  # Get SHOW SLAVE STATUS output
 +  local status
 +  status=$(docker-compose -f "${COMPOSE_DIR}/docker-compose.yml" exec -T "${REPLICA_SERVICE}" \
 +    mysql -u"${MONITOR_USER}" -p"${MONITOR_PASSWORD}" -e "SHOW SLAVE STATUS\G" 2>/dev/null || true)
 +
 +  if [[ -z "${status}" ]]; then
 +    echo "[$(date)] WARNING: SHOW SLAVE STATUS returned nothing. Replication may not be configured."
 +    REPL_WARN=1
 +    return
 +  fi
 +
 +  # Parse key fields
 +  local io_running sql_running seconds_behind executed_gtid
 +
 +  io_running=$(printf '%s\n' "${status}" | awk -F: '/Slave_IO_Running/ {gsub(/^[ \t]+/, "", $2); print $2}')
 +  sql_running=$(printf '%s\n' "${status}" | awk -F: '/Slave_SQL_Running:/ {gsub(/^[ \t]+/, "", $2); print $2}')
 +  seconds_behind=$(printf '%s\n' "${status}" | awk -F: '/Seconds_Behind_Master/ {gsub(/^[ \t]+/, "", $2); print $2}')
 +  executed_gtid=$(
 +  printf '%s\n' "${status}" | \
 +    sed -n 's/^ *Executed_Gtid_Set:[[:space:]]*//p'
 +  )
 +  echo "[$(date)] Slave_IO_Running=${io_running}, Slave_SQL_Running=${sql_running}, Seconds_Behind_Master=${seconds_behind}"
 +
 +  # Basic health check
 +  if [[ "${io_running}" != "Yes" || "${sql_running}" != "Yes" ]]; then
 +    echo "[$(date)] WARNING: Replication threads not running correctly."
 +    REPL_WARN=1
 +  fi
 +
 +  # Check if replication advanced since last backup (via Executed_Gtid_Set)
 +  if [[ -n "${executed_gtid}" ]]; then
 +    if [[ -f "${LAST_GTID_FILE}" ]]; then
 +      local prev_gtid
 +      prev_gtid=$(<"${LAST_GTID_FILE}")
 +      if [[ "${prev_gtid}" == "${executed_gtid}" ]]; then
 +        echo "[$(date)] WARNING: Executed_Gtid_Set unchanged since last backup – replication may be stuck."
 +        REPL_WARN=1
 +      else
 +        echo "[$(date)] Replication has advanced since last backup."
 +      fi
 +    else
 +      echo "[$(date)] No previous GTID file – this is probably the first backup."
 +    fi
 +    # Update GTID state for next run
 +    printf '%s\n' "${executed_gtid}" > "${LAST_GTID_FILE}"
 +  else
 +    echo "[$(date)] WARNING: Executed_Gtid_Set is empty."
 +    REPL_WARN=1
 +  fi
 +}
 +
 +send_uptime_ping() {
 +  local end_ts
 +  end_ts=$(date +%s)
 +  local walltime=$((end_ts - START_TS))
 +
 +  local status="up"
 +  local msg="OK"
 +
 +  msg="${msg// /_}"
 +
 +  local url="https://uptime.dalco.ch/api/push/${ENTITY_ID}?status=${status}&msg=${msg}&ping=${walltime}"
 +
 +  echo "[$(date)] Sending uptime ping: ${url}"
 +  curl -fsS "${url}" >/dev/null 2>&1 || echo "[$(date)] WARNING: failed to send uptime ping"
 +}
 +
 +on_exit() {
 +  local exit_code="$1"
 +  echo "[$(date)] Starting ${REPLICA_SERVICE} container again..."
 +  docker-compose -f "${COMPOSE_DIR}/docker-compose.yml" start "${REPLICA_SERVICE}" || \
 +    echo "[$(date)] WARNING: failed to start ${REPLICA_SERVICE}"
 +
 +  # Only send ping if:
 +  #  - script exited successfully (exit_code == 0)
 +  #  - replication checks reported no issues (REPL_WARN == 0)
 +  if [[ "${exit_code}" -eq 0 && "${REPL_WARN}" -eq 0 ]]; then
 +    send_uptime_ping
 +  else
 +    echo "[$(date)] Not sending uptime ping (exit_code=${exit_code}, REPL_WARN=${REPL_WARN})."
 +  fi
 +}
 +
 +trap 'on_exit $?' EXIT
 +
 +########################################
 +# Main
 +########################################
 +
 +echo "[$(date)] Starting MySQL replica backup..."
 +
 +cd "${COMPOSE_DIR}"
 +
 +# 1) Check replication health *before* stopping the replica
 +check_replication
 +
 +# 2) Stop replica container (no password needed here)
 +echo "[$(date)] Stopping ${REPLICA_SERVICE} container..."
 +docker-compose stop "${REPLICA_SERVICE}"
 +
 +# 3) Create tar.gz backup
 +TIMESTAMP="$(date +%Y%m%d-%H%M%S)"
 +BACKUP_FILE="${BACKUP_DIR}/mysql-replica-${TIMESTAMP}.tar.gz"
 +
 +echo "[$(date)] Creating tar.gz from ${REPLICA_DATA_DIR} -> ${BACKUP_FILE}..."
 +tar -czf "${BACKUP_FILE}" -C "${REPLICA_DATA_DIR}" .
 +
 +echo "[$(date)] Backup archive created: ${BACKUP_FILE}"
 +
 +# 4) Rotate old backups
 +echo "[$(date)] Rotating old backups (keeping last ${MAX_BACKUPS})..."
 +mapfile -t BACKUPS < <(ls -1t "${BACKUP_DIR}"/mysql-replica-*.tar.gz 2>/dev/null || true)
 +
 +if (( ${#BACKUPS[@]} > MAX_BACKUPS )); then
 +  for f in "${BACKUPS[@]:MAX_BACKUPS}"; do
 +    echo "[$(date)] Deleting old backup: ${f}"
 +    rm -f -- "${f}"
 +  done
 +fi
 +
 +echo "[$(date)] Backup finished successfully."
 +</code>
 +**NOTE** you may want to use gz instead of bz2 in case it takes too long
 +
 +here is the cron job: 
 +  0 */2 * * * /opt/dbbackup.sh >> /var/log/dbbackup.log 2>&1
 +add the dbbackup.log to your logrotate config
 +
 +===== replica check script =====
 +here is a script to quickly check if the replica is working and up to date.. useful when messing around with the config for example or for monitoring purposes other than through the bakcup script: 
 +
 +<code bash checkReplica.sh>
 +#!/usr/bin/env bash
 +set -euo pipefail
 +
 +########################################
 +# Settings
 +########################################
 +
 +# Directory where your docker-compose.yml lives
 +COMPOSE_DIR="/path/to/your/project"         # <-- CHANGE THIS
 +
 +# Name of the replica service in docker-compose
 +REPLICA_SERVICE="mysql-replica"
 +
 +# Monitoring DB user (created on PRIMARY, replicated to replica)
 +MONITOR_USER="monitor"
 +MONITOR_PASSWORD="monitor-password"        # <-- CHANGE THIS
 +
 +########################################
 +# Colors
 +########################################
 +
 +if [ -t 1 ]; then
 +  RED="$(printf '\e[31m')"
 +  GREEN="$(printf '\e[32m')"
 +  YELLOW="$(printf '\e[33m')"
 +  BOLD="$(printf '\e[1m')"
 +  RESET="$(printf '\e[0m')"
 +else
 +  RED=""; GREEN=""; YELLOW=""; BOLD=""; RESET=""
 +fi
 +
 +########################################
 +# Helper
 +########################################
 +
 +# Extract "Value" from lines like:
 +# "  Some_Field: Value"
 +get_field() {
 +  local key="$1"
 +  printf '%s\n' "$STATUS" \
 +    | sed -n "s/^[[:space:]]*${key}:[[:space:]]*//p" \
 +    | head -n1
 +}
 +
 +########################################
 +# Main
 +########################################
 +
 +cd "${COMPOSE_DIR}"
 +
 +# Try SHOW REPLICA STATUS first (MySQL 8+), fall back to SHOW SLAVE STATUS (older)
 +STATUS=$(docker-compose -f "${COMPOSE_DIR}/docker-compose.yml" exec -T "${REPLICA_SERVICE}" \
 +  mysql -u"${MONITOR_USER}" -p"${MONITOR_PASSWORD}" -e "SHOW REPLICA STATUS\G" 2>/dev/null || true)
 +
 +if [[ -z "${STATUS}" ]]; then
 +  STATUS=$(docker-compose -f "${COMPOSE_DIR}/docker-compose.yml" exec -T "${REPLICA_SERVICE}" \
 +    mysql -u"${MONITOR_USER}" -p"${MONITOR_PASSWORD}" -e "SHOW SLAVE STATUS\G" 2>/dev/null || true)
 +fi
 +
 +if [[ -z "${STATUS}" ]]; then
 +  echo "${RED}${BOLD}ERROR:${RESET} No replication status found."
 +  echo "  - Is ${REPLICA_SERVICE} configured as a replica?"
 +  echo "  - Is MySQL running in that container?"
 +  exit 1
 +fi
 +
 +# IO / SQL thread status (new names first, fall back to old)
 +IO_Running=$(get_field "Replica_IO_Running")
 +if [[ -z "${IO_Running}" ]]; then
 +  IO_Running=$(get_field "Slave_IO_Running")
 +fi
 +
 +SQL_Running=$(get_field "Replica_SQL_Running")
 +if [[ -z "${SQL_Running}" ]]; then
 +  SQL_Running=$(get_field "Slave_SQL_Running")
 +fi
 +
 +# Seconds behind (new name first, fall back)
 +Seconds_Behind=$(get_field "Seconds_Behind_Source")
 +if [[ -z "${Seconds_Behind}" ]]; then
 +  Seconds_Behind=$(get_field "Seconds_Behind_Master")
 +fi
 +
 +# GTID sets – allow indent before the key
 +Executed_Gtid_Set=$(
 +  printf '%s\n' "${STATUS}" \
 +    | sed -n 's/^[[:space:]]*Executed_Gtid_Set:[[:space:]]*//p' \
 +    | head -n1
 +)
 +Retrieved_Gtid_Set=$(
 +  printf '%s\n' "${STATUS}" \
 +    | sed -n 's/^[[:space:]]*Retrieved_Gtid_Set:[[:space:]]*//p' \
 +    | head -n1
 +)
 +
 +# Errors (names are the same on old/new)
 +Last_IO_Error=$(get_field "Last_IO_Error")
 +Last_SQL_Error=$(get_field "Last_SQL_Error")
 +
 +########################################
 +# Determine overall status
 +########################################
 +
 +overall_color="${GREEN}"
 +overall_text="OK"
 +
 +if [[ "${IO_Running}" != "Yes" || "${SQL_Running}" != "Yes" ]]; then
 +  overall_color="${RED}"
 +  overall_text="ERROR"
 +elif [[ -z "${Seconds_Behind}" || "${Seconds_Behind}" == "NULL" ]]; then
 +  overall_color="${YELLOW}"
 +  overall_text="UNKNOWN_DELAY"
 +elif [[ "${Seconds_Behind}" =~ ^[0-9]+$ && "${Seconds_Behind}" -gt 300 ]]; then
 +  # More than 5 minutes behind -> warn
 +  overall_color="${YELLOW}"
 +  overall_text="LAGGING"
 +fi
 +
 +########################################
 +# Output
 +########################################
 +
 +echo
 +echo "${BOLD}Replica status for service '${REPLICA_SERVICE}':${RESET}"
 +echo "  Overall: ${overall_color}${overall_text}${RESET}"
 +echo
 +
 +echo "  IO thread:   ${IO_Running:-<unknown>}"
 +echo "  SQL thread:  ${SQL_Running:-<unknown>}"
 +
 +if [[ -z "${Seconds_Behind}" || "${Seconds_Behind}" == "NULL" ]]; then
 +  echo "  Delay:       (unknown / NULL)"
 +else
 +  echo "  Delay:       ${Seconds_Behind} seconds behind primary"
 +fi
 +
 +echo
 +echo "  Retrieved GTID set:"
 +echo "    ${Retrieved_Gtid_Set:-<none>}"
 +echo "  Executed GTID set:"
 +echo "    ${Executed_Gtid_Set:-<none>}"
 +echo
 +
 +if [[ -n "${Last_IO_Error}" && "${Last_IO_Error}" != " " ]]; then
 +  echo "${RED}  Last IO error:${RESET}"
 +  echo "    ${Last_IO_Error}"
 +  echo
 +fi
 +
 +if [[ -n "${Last_SQL_Error}" && "${Last_SQL_Error}" != " " ]]; then
 +  echo "${RED}  Last SQL error:${RESET}"
 +  echo "    ${Last_SQL_Error}"
 +  echo
 +fi
 +</code>
  
  • mysql_backups_using_replication.1762985647.txt.gz
  • Last modified: 12.11.2025 23:14
  • by Pascal Suter