MySQL Replication: Choosing Between ROW, STATEMENT, and MIXED Binary Log Formats



When setting up MySQL replication in version 5.1.49, the binlog format choice fundamentally affects replication behavior, data integrity, and performance. Let's examine each format's technical characteristics:

-- Check current binlog format
SHOW VARIABLES LIKE 'binlog_format';

ROW format logs actual row changes rather than SQL statements. This is particularly crucial when:

  • Using non-deterministic functions like UUID() or RAND()
  • Executing complex triggers or stored procedures
  • Needing precise data recovery
-- Example where ROW format is essential
UPDATE large_table SET col = UUID() WHERE id BETWEEN 1 AND 10000;

The trade-off is increased binary log size, especially for bulk operations affecting many rows.
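One way to keep an eye on that growth is to list the binary logs and their sizes around a bulk operation. This is a minimal sketch using standard statements; the file names you see depend on your own log-bin setting.

```sql
-- List all binary log files with their sizes in bytes
SHOW BINARY LOGS;

-- Show the file and position currently being written to
SHOW MASTER STATUS;
```

On an otherwise idle server, comparing the Position value before and after the bulk statement gives a rough measure of how many bytes it added, provided the log did not rotate in between.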

STATEMENT format replicates the actual SQL commands. While more compact, it's prone to edge cases:

-- Potential replication inconsistency in STATEMENT mode:
-- LIMIT without ORDER BY may select different rows on the master and the slave
UPDATE accounts SET balance = balance * 1.05 LIMIT 10;

This format may be suitable for simple workloads with deterministic queries, but it is generally not recommended for production.
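If most of a workload is safe for STATEMENT logging and only a few statements are not, one option is to switch the format for just the session that runs them. The sketch below reuses the large_table example from earlier and assumes the session has the SUPER privilege, which 5.1 requires for changing binlog_format.

```sql
-- Assumption: this session has the SUPER privilege
SET SESSION binlog_format = 'ROW';

-- The non-deterministic statement is now logged as row changes
UPDATE large_table SET col = UUID() WHERE id BETWEEN 1 AND 10000;

-- Reset this session to the current global format
SET SESSION binlog_format = DEFAULT;
```

The format cannot be changed from inside a stored function or trigger, so the switch has to happen in the calling session before the statement runs.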

MIXED mode automatically switches between ROW and STATEMENT based on query characteristics:

  • Uses STATEMENT for deterministic queries (simple INSERT/UPDATE)
  • Switches to ROW for non-deterministic operations
-- Configuration example
SET GLOBAL binlog_format = 'MIXED';
-- Note: binlog_row_image does not exist in 5.1.49 (it was added in MySQL 5.6.2),
-- so it cannot be set on this version.

For MySQL 5.1.49 replication:

# my.cnf configuration
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
expire_logs_days = 7
sync_binlog = 1
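
After restarting the server with settings like these, it can be worth confirming that they actually took effect. A minimal check, assuming you are connected to the server whose my.cnf was edited:

```sql
-- Show the effective values of the replication-related settings
SHOW VARIABLES
WHERE Variable_name IN
  ('server_id', 'log_bin', 'binlog_format', 'expire_logs_days', 'sync_binlog');
```

If log_bin shows OFF, the server did not pick up the log-bin option, which usually means the wrong configuration file was edited or the server was not restarted.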

Consider ROW format as your default choice unless you have specific legacy requirements. The data safety benefits typically outweigh the storage overhead.

If you encounter replication drift:

-- Verify replication consistency
CHECKSUM TABLE important_table;
-- Compare master/slave checksums
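
A checksum comparison is only meaningful once the slave has applied everything the master has written and the table is not being modified during the check. The sketch below assumes the mysql command-line client (for the \G terminator) and reuses the important_table name from the example.

```sql
-- Run on the slave: confirm both replication threads are running and caught up
SHOW SLAVE STATUS\G
-- Expect: Slave_IO_Running: Yes, Slave_SQL_Running: Yes, Seconds_Behind_Master: 0

-- Run on both servers while writes to the table are paused, then compare
CHECKSUM TABLE important_table EXTENDED;
```

EXTENDED forces a full row-by-row scan, so it can be slow on large tables. The values are only comparable when both servers store the table the same way, so treat a mismatch between different server versions with caution.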

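For large transactions in ROW format, monitor disk space. The binlog_row_image variable, which can shrink row events, only exists from MySQL 5.6.2 onward, so the following applies only after an upgrade:

-- MySQL 5.6.2 and later only; not available in 5.1.49
SET GLOBAL binlog_row_image = 'MINIMAL';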

When setting up MySQL replication (specifically in version 5.1.49), the binlog format choice significantly impacts performance, data consistency, and storage requirements. Let's examine the three formats:



STATEMENT format logs the actual SQL statements that modify data. Best for:

  • Minimal storage usage
  • Simple queries with deterministic results
  • Cases where the same statement produces identical results on master and slave

-- Example of what gets logged
UPDATE users SET status='active' WHERE last_login > '2023-01-01';

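Drawback: Non-deterministic functions (like UUID() or SYSDATE()) may cause inconsistencies. NOW() is safe here, because the statement's timestamp is written to the binary log and reused on the slave.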

ROW format logs changed rows rather than SQL statements. Advantages include:

  • Precise data replication
  • Handles non-deterministic functions correctly
  • Better for stored procedures and triggers

-- Example row change representation
# at 12345
#230101 10:00:00 server id 1  end_log_pos 98765
Table_map: test.users mapped to number 14
Update_rows: table id 14 flags: STMT_END_F
BINLOG '
BASE64ENCODEDROWDATAHERE
'/*!*/;

Tradeoff: Larger binary logs and potentially slower replication for bulk operations.

MIXED format is a hybrid approach that uses statement-based logging by default but switches to row-based logging when needed:

  • Automatic switching for non-deterministic statements
  • Balance between storage efficiency and reliability
  • Good for most general-purpose replication scenarios

# Configuration example in my.cnf
[mysqld]
binlog_format = MIXED
log-bin = mysql-bin
server-id = 1
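
To see which format MIXED mode actually chose for a given statement, the binary log events can be inspected directly. This is a sketch; 'mysql-bin.000001' is a placeholder for whatever file name SHOW MASTER STATUS reports on your server.

```sql
-- Find the binary log file currently being written
SHOW MASTER STATUS;

-- Inspect events in that file (replace the placeholder file name)
SHOW BINLOG EVENTS IN 'mysql-bin.000001' LIMIT 20;
```

Statements logged in statement form appear with an Event_type of Query and the SQL text in the Info column. Changes logged in row form appear as a Table_map event followed by Write_rows, Update_rows, or Delete_rows events.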

For MySQL 5.1.49 replication setups:

  1. Choose ROW if:
    • You use stored procedures/triggers
    • Need 100% data consistency
    • Have non-deterministic queries
  2. Choose MIXED if:
    • You want a balance of performance and reliability
    • Have mostly deterministic statements with few edge cases

-- How to switch formats dynamically (requires the SUPER privilege)
-- SET GLOBAL only affects connections opened after the change
SET GLOBAL binlog_format = 'ROW';
FLUSH LOGS;

Additional considerations:

  • ROW format requires careful monitoring of disk space
  • Statement-based may break with certain SQL modes
  • Test thoroughly before changing formats in production
  • Consider upgrading from 5.1 for better replication features
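
After a dynamic format switch, it can help to confirm what the server and the current connection are actually using, since the two can differ. A minimal check:

```sql
-- Compare the server-wide default with this connection's setting
SELECT @@global.binlog_format  AS global_format,
       @@session.binlog_format AS session_format;
```

A connection opened before the SET GLOBAL keeps its old session value until it reconnects, so long-lived application connections may need to be recycled before the new format applies to everything they write.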