MySQL Replication: Choosing Between ROW, STATEMENT, and MIXED Binary Log Formats


When setting up replication in MySQL 5.1.49, the choice of binary log format affects replication behavior, data integrity, and performance. The default in this version is STATEMENT, so check what the server currently uses before comparing the formats:

-- Check current binlog format
SHOW VARIABLES LIKE 'binlog_format';

ROW format logs actual row changes rather than SQL statements. This is particularly crucial when:

  • Using non-deterministic functions like UUID() or RAND()
  • Executing complex triggers or stored procedures
  • Needing precise data recovery

-- Unsafe under STATEMENT format: UUID() returns different values on master and slave
UPDATE large_table SET col = UUID() WHERE id BETWEEN 1 AND 10000;

The trade-off is increased binary log size, especially for bulk operations affecting many rows.

STATEMENT format logs the SQL text itself and replays it on the slave. The logs are more compact, but a statement whose result depends on anything other than the table data can diverge:

-- Potential replication inconsistency in STATEMENT mode:
-- LIMIT without ORDER BY may pick different rows on master and slave
UPDATE accounts SET balance = balance * 1.05 LIMIT 100;

This format can suit simple workloads in which every query is deterministic. For most other production workloads it is the riskier choice.
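If STATEMENT format has to stay, some unsafe statements can be rewritten so that they replay identically. The following is a minimal sketch rather than a complete recipe. The `audit_log` and `api_tokens` tables and their columns are hypothetical, and the server may still warn about `LIMIT` even when `ORDER BY` makes the result deterministic.

```sql
-- Hypothetical tables and columns, for illustration only.

-- ORDER BY on a unique key fixes which rows LIMIT selects,
-- so master and slave remove the same 1000 rows.
DELETE FROM audit_log
WHERE created_at < '2010-01-01'
ORDER BY id
LIMIT 1000;

-- Evaluate UUID() once on the master and pass it through a user variable.
-- The variable's value is written to the binary log ahead of the
-- statement that uses it, so the slave inserts the same token.
SET @new_token = UUID();
INSERT INTO api_tokens (user_id, token) VALUES (42, @new_token);
```

The user-variable approach only helps when one value per statement is enough. A multi-row update that needs a distinct UUID per row still calls for ROW or MIXED.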

MIXED mode automatically switches between ROW and STATEMENT based on query characteristics:

  • Uses STATEMENT for deterministic queries (simple INSERT/UPDATE)
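  • Switches to ROW for statements the server classifies as unsafe, such as those calling UUID()

-- Configuration example (requires the SUPER privilege)
SET GLOBAL binlog_format = 'MIXED';
-- binlog_row_image is not available in 5.1.49; it was introduced in MySQL 5.6.2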

A typical master configuration for MySQL 5.1.49 replication:

# my.cnf configuration
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
expire_logs_days = 7
sync_binlog = 1

Consider ROW format as your default choice unless you have specific legacy requirements. The data safety benefits typically outweigh the storage overhead.

If you encounter replication drift:

-- Verify replication consistency
CHECKSUM TABLE important_table;
-- Compare master/slave checksums
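
A slightly fuller version of that check might look like the sketch below. It assumes writes to `important_table` are paused while the checksums are taken; otherwise the two values can differ simply because the slave is a few events behind.

```sql
-- On the slave: confirm replication is healthy before comparing data.
-- Look for Slave_IO_Running = Yes, Slave_SQL_Running = Yes,
-- Seconds_Behind_Master = 0 and an empty Last_Error.
SHOW SLAVE STATUS;

-- On the master, then on the slave: EXTENDED forces a full row-by-row
-- checksum instead of relying on a stored live checksum.
CHECKSUM TABLE important_table EXTENDED;

-- The Checksum column should be identical on both servers.
```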

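For large transactions in ROW format, monitor disk space. On MySQL 5.6.2 and later (not on 5.1.49, where the variable does not exist), row events can be made smaller by logging only the columns needed to identify and apply each change:

-- Requires MySQL 5.6.2+; fails with an unknown system variable error on 5.1.49
SET GLOBAL binlog_row_image = 'MINIMAL';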
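
On 5.1.49 itself, disk usage is usually managed by watching the log files and purging old ones. The statements below are a sketch, and the three-day cutoff is an arbitrary assumption. Before purging, check on every slave which master log file it is still reading, because purging a file a slave still needs breaks replication.

```sql
-- List the binary log files and their sizes in bytes.
SHOW BINARY LOGS;

-- Check the automatic expiry setting (0 means logs are never purged).
SHOW VARIABLES LIKE 'expire_logs_days';

-- On each slave first: note Relay_Master_Log_File in the output.
-- SHOW SLAVE STATUS;

-- Remove binary logs older than three days (cutoff is an example value).
PURGE BINARY LOGS BEFORE DATE_SUB(NOW(), INTERVAL 3 DAY);
```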

When setting up MySQL replication (specifically in version 5.1.49), the binlog format choice significantly impacts performance, data consistency, and storage requirements. Let's examine the three formats:



STATEMENT format logs the actual SQL statements that modify data. It is best for:

  • Minimal storage usage
  • Simple queries with deterministic results
  • Cases where the same statement produces identical results on master and slave

-- Example of what gets logged
UPDATE users SET status='active' WHERE last_login > '2023-01-01';

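Drawback: Statements that call unsafe functions such as UUID() or SYSDATE() can produce different results on the slave. NOW() is safe, because the binary log records the timestamp at which the statement ran.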

ROW format logs the changed rows rather than the SQL statements. Advantages include:

  • Precise data replication
  • Handles non-deterministic functions correctly
  • Better for stored procedures and triggers

-- Illustrative mysqlbinlog output for a row event (row data is base64-encoded)
# at 12345
#230101 10:00:00 server id 1  end_log_pos 98765
Table_map: test.users mapped to number 14
Update_rows: table id 14 flags: STMT_END_F
BINLOG '
BASE64ENCODEDROWDATAHERE
'/*!*/;

Tradeoff: Larger binary logs and potentially slower replication for bulk operations.
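
To see which kind of event a statement actually produced, the binary log can also be inspected from a client session. This is a sketch, and the file name `mysql-bin.000001` is an assumption; use the name reported by `SHOW MASTER STATUS`.

```sql
-- Find the binary log file currently being written.
SHOW MASTER STATUS;

-- List the first events in that file. Under ROW format the Event_type
-- column shows Table_map plus Write_rows, Update_rows or Delete_rows;
-- under STATEMENT format it shows Query events containing the SQL text.
SHOW BINLOG EVENTS IN 'mysql-bin.000001' LIMIT 20;
```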

MIXED format is a hybrid approach that logs statements by default but switches to row-based logging when a statement is unsafe:

  • Automatic switching for non-deterministic statements
  • Balance between storage efficiency and reliability
  • Good for most general-purpose replication scenarios

# Configuration example in my.cnf
[mysqld]
binlog_format = MIXED
log-bin = mysql-bin
server-id = 1

For MySQL 5.1.49 replication setups:

  1. Choose ROW if:
    • You use stored procedures or triggers
    • You need the slave's data to match the master's exactly
    • You run non-deterministic queries
  2. Choose MIXED if:
    • You want a balance of performance and reliability
    • Your statements are mostly deterministic, with few edge cases
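
When only a handful of statements need row-based logging, the format can also be overridden for a single session instead of server-wide. A minimal sketch, assuming the account has the SUPER privilege (required for this in 5.1) and using a hypothetical `sessions` table:

```sql
-- Log this session's changes as row events, whatever the global setting is.
SET SESSION binlog_format = 'ROW';

-- Hypothetical statement that would be unsafe if logged as SQL text.
UPDATE sessions SET token = UUID() WHERE expires_at < NOW();

-- DEFAULT resets the session value to the current global value.
SET SESSION binlog_format = DEFAULT;
```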

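-- How to switch formats dynamically (requires the SUPER privilege).
-- The new global value applies only to sessions that connect afterwards.
SET GLOBAL binlog_format = 'ROW';
-- Rotate to a new binary log file so the change starts in a fresh file
FLUSH LOGS;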

Keep in mind:

  • ROW format requires careful monitoring of disk space
  • STATEMENT format may break with certain SQL modes
  • Test thoroughly before changing formats in production
  • Consider upgrading from 5.1 for better replication features
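
After a dynamic format change, it is worth confirming what the server and the current connection actually use. A short sketch; note that a session opened before the change keeps reporting its old session value until it reconnects.

```sql
-- Compare the server-wide setting with this connection's setting.
SELECT @@GLOBAL.binlog_format  AS global_format,
       @@SESSION.binlog_format AS session_format;

-- Confirm that binary logging is active and see the current log file.
SHOW MASTER STATUS;
```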