How to Fix MySQL Replication Error 1032 (HA_ERR_KEY_NOT_FOUND) on Slave Server



When adding a new slave to an existing MySQL replication setup, you might encounter Error 1032 with the message "Can't find record in 'tablename'". This occurs when:

  • The slave attempts to apply an UPDATE or DELETE operation
  • The matching row doesn't exist in the slave's table
  • There's data inconsistency between master and slave

The error typically appears due to:

1. Missing rows on slave that exist on master
2. Different primary/unique keys between servers
3. Schema drift (different column definitions)
4. Direct writes to slave database (violating replication rules)

While you can skip errors temporarily, this isn't a permanent solution:

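-- Blunt approach: skips one event group, whatever the error (that change is lost on the slave)
STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

-- Narrower approach: IDEMPOTENT mode suppresses only key-not-found (1032) and duplicate-key (1062) errors
STOP SLAVE;
SET GLOBAL slave_exec_mode='IDEMPOTENT';
START SLAVE;
-- Switch back once the slave has caught up:
-- SET GLOBAL slave_exec_mode='STRICT';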

Method 1: Resync affected tables

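-- On master (keep this session open; the lock is released when it closes):
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;
-- Record the File and Position values from the output
-- In a second terminal (shell, not the mysql client), dump the affected table:
mysqldump -u root -p dbname email_events > email_events.sql
-- Back in the first session:
UNLOCK TABLES;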

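-- On slave:
STOP SLAVE;
-- Replay up to the coordinates recorded on the master, so the slave stops
-- exactly where the dump was taken (errors on this table before that point
-- may still need to be skipped)
START SLAVE UNTIL MASTER_LOG_FILE='mysqld-bin.000410', MASTER_LOG_POS=368808733;
-- Once Slave_SQL_Running shows No, import the dump from the shell
-- (by default the dump file already contains DROP TABLE IF EXISTS)
mysql -u root -p dbname < email_events.sql
START SLAVE;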
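
The binlog file name and position in the START SLAVE UNTIL statement have to match what SHOW MASTER STATUS reported when the dump was taken, and copying them by hand is easy to get wrong. A minimal sketch that builds the statement from the `SHOW MASTER STATUS\G` text follows; `build_until_sql` is a hypothetical helper name, not a MySQL tool, and it assumes the usual `File:` and `Position:` lines of the vertical output format.

```shell
#!/bin/sh
# build_until_sql (hypothetical helper): reads `SHOW MASTER STATUS\G` text on
# stdin and prints a START SLAVE UNTIL statement for those coordinates.
# Exits non-zero if either the File or the Position line is missing.
build_until_sql() {
    awk -v q="'" '
        $1 == "File:"     { file = $2 }
        $1 == "Position:" { pos = $2 }
        END {
            if (file == "" || pos == "") exit 1
            printf "START SLAVE UNTIL MASTER_LOG_FILE=%s%s%s, MASTER_LOG_POS=%s;\n", q, file, q, pos
        }
    '
}
```

It could be fed directly from the master, for example with `mysql -h master_host -u root -p -e 'SHOW MASTER STATUS\G' | build_until_sql`, where master_host is a placeholder.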

Method 2: Use pt-table-sync for live synchronization

pt-table-sync --sync-to-master h=slave_host,u=root,p=password \
--databases dbname --tables email_events --print
# Verify output first, then run with --execute

To reduce the chance of the error recurring:

  • Enable read_only (and super_read_only where available) on slaves
  • Monitor table checksums between master and slave
  • Run pt-table-checksum regularly
  • Consider GTID-based replication

For critical systems where consistency is paramount, consider:

  • MySQL Group Replication
  • Galera Cluster
  • ProxySQL with failover (it routes traffic away from a broken replica but does not itself keep data consistent)

In the output of SHOW SLAVE STATUS, Error 1032 (HA_ERR_KEY_NOT_FOUND) looks like this:

Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table xxx.email_events; 
Can't find record in 'email_events', Error_code: 1032; 
handler error HA_ERR_KEY_NOT_FOUND;
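
When several tables are affected, it helps to pull just the error number and the table name out of that output. The sketch below does so with awk; `slave_error_summary` is a hypothetical helper name, and it assumes the message contains the phrase "on table db.table" as in the sample above.

```shell
#!/bin/sh
# slave_error_summary (hypothetical helper): reads `SHOW SLAVE STATUS\G` text
# on stdin and prints the SQL thread error number and the affected table.
slave_error_summary() {
    awk '
        $1 == "Last_SQL_Errno:" { errno = $2 }
        $1 == "Last_SQL_Error:" {
            # The table name follows the 9-character phrase "on table "
            if (match($0, /on table [^; ]+/)) {
                table = substr($0, RSTART + 9, RLENGTH - 9)
            }
        }
        END { printf "errno=%s table=%s\n", errno, table }
    '
}
```

A possible invocation on the slave is `mysql -u root -p -e 'SHOW SLAVE STATUS\G' | slave_error_summary`.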

This error typically occurs when:

  • The slave server is missing rows that exist on the master
  • Data became inconsistent during initial sync
  • Schema changes weren't properly replicated
  • There were manual modifications on the slave

You can temporarily skip the failing event with:

STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

This isn't a permanent solution. The proper fix involves:

1. First verify data consistency:

pt-table-checksum --replicate=percona.checksums \
--create-replicate-table \
--empty-replicate-table \
--databases=your_db \
h=master_host,u=user,p=password

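2. Then print the statements that would repair the differences (replace --print with --execute once you have reviewed them):

pt-table-sync --replicate=percona.checksums \
--sync-to-master h=slave_host,u=user,p=password \
--print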

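3. If the slave was rebuilt from a fresh dump, point it at the binary log coordinates recorded for that dump. Note that CHANGE MASTER TO does not skip errors; it only sets where replication resumes:

STOP SLAVE;
CHANGE MASTER TO
MASTER_HOST='master_host',
MASTER_USER='repl_user',
MASTER_PASSWORD='password',
MASTER_LOG_FILE='mysql-bin.000410',
MASTER_LOG_POS=368808733,
MASTER_AUTO_POSITION=0;
START SLAVE;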
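
After pt-table-checksum has run in step 1, the percona.checksums table on the slave holds both the slave's values (this_cnt, this_crc) and the master's (master_cnt, master_crc) for each chunk. A small sketch for listing the tables whose chunks differ is shown below; `tables_with_diffs` is a hypothetical helper name, and it assumes tab-separated input with the columns in the order given in the comment.

```shell
#!/bin/sh
# tables_with_diffs (hypothetical helper): expects tab-separated rows on stdin:
#   db  tbl  this_cnt  master_cnt  this_crc  master_crc
# Prints each db.tbl once if any of its chunks differs from the master.
tables_with_diffs() {
    awk -F '\t' '
        # Appending "" forces string comparison, so hex CRCs such as 1e3
        # are never compared as numbers
        ($3 "") != ($4 "") || ($5 "") != ($6 "") {
            key = $1 "." $2
            if (!(key in seen)) { seen[key] = 1; print key }
        }
    '
}
```

One way to feed it, run against the slave: `mysql -N -B -h slave_host -u user -p -e "SELECT db, tbl, this_cnt, master_cnt, this_crc, master_crc FROM percona.checksums" | tables_with_diffs`.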

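A replication error handler script can automate the skip, although every skipped event leaves the slave further out of sync:

#!/bin/bash
# Credentials are better kept in ~/.my.cnf or a login path than on the command line.

# Extract only the numeric value of Last_SQL_Errno
ERRNO=$(mysql -u root -p"password" -e "SHOW SLAVE STATUS\G" | awk '$1 == "Last_SQL_Errno:" {print $2}')

if [[ "$ERRNO" == "1032" ]]; then
    mysql -u root -p"password" -e "STOP SLAVE; SET GLOBAL sql_slave_skip_counter=1; START SLAVE;"
    echo "$(date) - Skipped error 1032" >> /var/log/mysql_replication.log
fi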
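
Run unattended, the handler script above will keep skipping for as long as the error recurs. One possible safeguard is a cap on the number of automatic skips, after which the script stops and leaves the slave for a manual resync. The sketch below is only an illustration: `may_skip`, `SKIP_STATE` and `MAX_SKIPS` are hypothetical names, and the counter file location is an assumption.

```shell
#!/bin/sh
# SKIP_STATE (counter file) and MAX_SKIPS (limit) are illustrative settings.
SKIP_STATE="${SKIP_STATE:-/tmp/repl_skip_count}"
MAX_SKIPS="${MAX_SKIPS:-5}"

# may_skip (hypothetical helper): returns 0 and increments the counter while
# the number of recorded skips is below MAX_SKIPS; returns 1 otherwise.
may_skip() {
    count=0
    if [ -f "$SKIP_STATE" ]; then
        count=$(cat "$SKIP_STATE")
    fi
    if [ "$count" -ge "$MAX_SKIPS" ]; then
        echo "skip limit ($MAX_SKIPS) reached; resync the table instead" >&2
        return 1
    fi
    echo $((count + 1)) > "$SKIP_STATE"
}
```

In the handler, the skip statement would then run only when `may_skip` succeeds, and the counter file would be removed after the table has been resynced.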

Longer term:

  • Use GTID-based replication instead of file/position
  • Run checksum verification regularly
  • Avoid manual data changes on slaves
  • Consider using MySQL Group Replication for better consistency