How to Fix MySQL Replication Error 1396 (CREATE USER Failed) When Slave_SQL_Running Stops


When MySQL replication stops with Slave_SQL_Running: No and error 1396 on a CREATE USER statement, the master and slave usually disagree about which accounts exist. The slave cannot apply the replicated statement, so the SQL thread halts. The error shows up in the slave status output:

SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Last_Errno: 1396
               Last_Error: Error 'Operation CREATE USER failed for 'user'@'ip'' on query.
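
When this check has to be scripted, the vertical status output can be parsed into a dictionary. The following is a minimal Python sketch, assuming the output of SHOW SLAVE STATUS\G has already been captured as text; parse_status and is_create_user_conflict are hypothetical helper names, not part of any MySQL tooling.

```python
import re


def parse_status(text):
    """Parse vertical (\\G style) status output into a dict of field -> value."""
    status = {}
    for line in text.splitlines():
        # Field lines look like "   Last_Errno: 1396"; row separators start with '*'
        # and therefore do not match.
        match = re.match(r"^\s*([A-Za-z_0-9]+):\s?(.*)$", line)
        if match:
            status[match.group(1)] = match.group(2).strip()
    return status


def is_create_user_conflict(status):
    """Return True when the status dict shows error 1396 on a CREATE USER statement."""
    return (
        status.get("Last_Errno") == "1396"
        and "CREATE USER" in status.get("Last_Error", "")
    )
```

All values are kept as strings, so a numeric field such as Last_Errno is compared against "1396" rather than the integer.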

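The error usually has one of these causes:

  • The account already exists on the slave, for example because it was created there directly. CREATE USER then fails whether or not the credentials match
  • The grant tables on master and slave have drifted apart, for example after rows were deleted from mysql.user by hand without a FLUSH PRIVILEGES, leaving a stale account entry behind
  • Less commonly, the account under which replicated statements are applied lacks the privileges to manage users. The replication user's own REPLICATION SLAVE privilege only covers connecting to the master and reading its binary log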

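The quickest way to get replication running again is to skip the failing event on the slave. This counter is intended for position-based replication; GTID-based setups generally need a different approach: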

STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

For a lasting fix, create accounts on the master only and let replication carry them to the slave:

-- On master:
CREATE USER 'repl_user'@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';

-- On slave:
STOP SLAVE;
CHANGE MASTER TO MASTER_USER='repl_user', MASTER_PASSWORD='password';
START SLAVE;

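When the account already exists on the slave, drop it there so the replicated CREATE USER can apply:

-- On slave:
STOP SLAVE;
-- Remove the conflicting account so the replicated statement can recreate it
DROP USER IF EXISTS 'problem_user'@'ip';
-- On restart, the SQL thread retries the CREATE USER event that failed
START SLAVE;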

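To suppress the error in configuration instead, add these to the slave's my.cnf. Both settings hide account drift rather than repair it, and slave-skip-errors only takes effect after a server restart: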

[mysqld]
slave-skip-errors = 1396
replicate-wild-ignore-table=mysql.%

Monitor replication status with:

SHOW SLAVE STATUS\G
SELECT * FROM performance_schema.replication_applier_status_by_worker;

For complex cases, examine the binary log:

SHOW BINLOG EVENTS IN 'mysql-bin.00000X' FROM [position] LIMIT 10;
mysqlbinlog --start-position=[pos] /var/lib/mysql/mysql-bin.00000X

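Looking more closely at the causes, error 1396 on a replicated CREATE USER usually comes down to one of the following:

  • The user already exists on the slave, often with different credentials than on the master
  • The privilege tables on the two servers are out of sync
  • A replication filter may have caused an earlier account statement, such as a DROP USER, to be skipped on the slave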

For immediate recovery, with the understanding that this is only a temporary measure:

STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

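Warning: Skipping is safe only if later events in the binary log do not depend on this user creation. A following GRANT or ALTER USER for the same account can fail in turn, and the slave keeps whatever version of the account it already had.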
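
The skip counter above targets position-based replication. On a GTID-based setup (an assumption, since the article does not say which mode is in use), the usual equivalent is to commit an empty transaction under the failing transaction's GTID. The Python sketch below only builds the statement sequence; empty_transaction_statements is a hypothetical helper, and the GTID passed to it must be the one reported for the failed transaction on your own slave.

```python
import re

# A GTID has the form "<source server UUID>:<transaction number>".
GTID_RE = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:\d+$"
)


def empty_transaction_statements(gtid):
    """Return the SQL statements that replace one transaction with an empty one."""
    if not GTID_RE.match(gtid):
        raise ValueError(f"not a single GTID of the form uuid:number: {gtid!r}")
    return [
        "STOP SLAVE;",
        f"SET GTID_NEXT = '{gtid}';",    # claim the failing transaction's GTID
        "BEGIN;",
        "COMMIT;",                       # commit nothing under that GTID
        "SET GTID_NEXT = 'AUTOMATIC';",  # return to normal GTID assignment
        "START SLAVE;",
    ]
```

The same warning applies as for the skip counter: the CREATE USER is never applied, so anything that depends on the account can still fail later.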

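To fix this properly without skipping events:

-- On master: generate a SHOW GRANTS statement for each matching account
SELECT CONCAT('SHOW GRANTS FOR ''',user,'''@''',host,''';')
FROM mysql.user WHERE user='problem_user'\G

-- On slave: remove the conflicting account, then resume replication
DROP USER IF EXISTS 'problem_user'@'ip';
START SLAVE;
-- Once the slave has caught up, compare its grants with the master's
-- and apply any that are still missing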
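
Comparing the grants by eye gets tedious with more than a few accounts. Here is a minimal Python sketch of the comparison step, assuming the SHOW GRANTS output from each server has already been collected into lists of strings; diff_grants is a hypothetical helper name.

```python
def diff_grants(master_grants, slave_grants):
    """Compare two lists of GRANT statements, ignoring order and trailing semicolons."""

    def normalize(grants):
        # Strip whitespace and a trailing ';' so formatting differences do not count.
        return {g.strip().rstrip(";").strip() for g in grants if g.strip()}

    master = normalize(master_grants)
    slave = normalize(slave_grants)
    return {
        "missing_on_slave": sorted(master - slave),
        "extra_on_slave": sorted(slave - master),
    }
```

Entries under missing_on_slave are the statements to re-apply on the slave. Entries under extra_on_slave point to privileges that were granted on the slave directly and deserve a closer look.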

Best practices for user management in replication:

-- Create users with IF NOT EXISTS so an existing account produces a warning instead of an error:
CREATE USER IF NOT EXISTS 'user'@'ip' IDENTIFIED BY 'password';

-- Or state the plugin and password hash explicitly so both servers store identical credentials:
CREATE USER 'user'@'ip' IDENTIFIED WITH mysql_native_password AS '*hash';

-- From the shell, include the mysql schema in your consistency checks:
pt-table-checksum --replicate=percona.checksums --databases=mysql

If you're using replication filters, make sure they don't exclude the mysql database at the database level:

replicate-ignore-db=mysql  # This would cause our issue

If the mysql schema has to be filtered, a table-level pattern is more precise:

replicate-wild-ignore-table=mysql.%  # More precise filtering
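
A quick way to audit an existing configuration is to scan the option file for the settings discussed in this article. The Python sketch below assumes the my.cnf content is available as text and only looks at three options; find_risky_options is a hypothetical helper, and it does not follow !include directives.

```python
def find_risky_options(cnf_text):
    """Return (line_number, option, value) for options that affect account replication."""
    findings = []
    for number, raw in enumerate(cnf_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("[") or line.startswith(";"):
            continue  # skip blanks, section headers, and ';' comments
        key, _, value = line.partition("=")
        # Option names may be written with '-' or '_'; normalize to '-'.
        key = key.strip().replace("_", "-").lower()
        value = value.strip()
        if key == "slave-skip-errors":
            findings.append((number, key, value))
        elif key == "replicate-ignore-db" and value == "mysql":
            findings.append((number, key, value))
        elif key == "replicate-wild-ignore-table" and value.startswith("mysql."):
            findings.append((number, key, value))
    return findings
```

A finding is not necessarily a mistake. It marks a line where account statements can be skipped or errors suppressed, so the choice should be deliberate.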

Set up alerts for these key metrics:

Seconds_Behind_Master > threshold
Slave_SQL_Running = No
Last_Errno != 0
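
These three conditions can be evaluated in one small function. The Python sketch below assumes the slave status is available as a dictionary of strings keyed by field name; evaluate_alerts and the 60-second default threshold are illustrative choices, not recommendations from any monitoring tool.

```python
def evaluate_alerts(status, max_lag_seconds=60):
    """Return a list of alert messages for a slave status dict (empty list if healthy)."""
    alerts = []
    if status.get("Slave_SQL_Running") != "Yes":
        alerts.append("Slave_SQL_Running is not Yes")
    errno = str(status.get("Last_Errno", "0"))
    if errno != "0":
        alerts.append(f"Last_Errno = {errno}")
    lag = status.get("Seconds_Behind_Master")
    if lag is None or lag == "NULL":
        # Lag is reported as NULL while the SQL thread is stopped.
        alerts.append("Seconds_Behind_Master is NULL")
    elif int(lag) > max_lag_seconds:
        alerts.append(f"Seconds_Behind_Master = {lag} exceeds {max_lag_seconds}")
    return alerts
```

Treating a NULL lag as an alert matters here: with the SQL thread stopped by error 1396, a plain "lag greater than threshold" rule would never fire.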

Consider tools such as Percona PMM for monitoring and Orchestrator for topology management and automated failover.