When dealing with a 300M+ row InnoDB table experiencing concurrent inserts and long-running SELECT queries, we're seeing transaction isolation characteristics in action. The processlist shows multiple transactions stuck in COMMIT state while one large query (ID 40) executes:
SELECT * FROM large_table
WHERE (large_table.hotspot_id = 3000064)
ORDER BY discovered_at
LIMIT 799000, 1000
Contrary to MyISAM, InnoDB implements MVCC (Multi-Version Concurrency Control). However, several factors can still cause blocking:
- Transaction Isolation Level: REPEATABLE READ (InnoDB default) creates read views that must be maintained
- Undo Log Retention: Long-running queries prevent purge operations
- Index Scans: The ORDER BY operation forces a filesort with large offsets
For your specific case with hotspot_id filtering, consider these optimizations:
-- Create optimized composite index
ALTER TABLE large_table
ADD INDEX idx_hotspot_discovered (hotspot_id, discovered_at);
-- Rewrite the query to use indexed seek
SELECT * FROM large_table
FORCE INDEX (idx_hotspot_discovered)
WHERE hotspot_id = 3000064
ORDER BY discovered_at
LIMIT 1000 OFFSET 799000;
For a write-heavy table with analytical queries:
- Read Replicas: Offload SELECT queries to replicas
- Partitioning: Range-partition by discovered_at
- Queue Processing: Buffer inserts using Redis/Kafka
While MyISAM seems tempting for its simpler locking model, consider:
Factor | InnoDB | MyISAM |
---|---|---|
Concurrent Inserts | Possible with row-level locks | Table locks during SELECT |
Crash Safety | ACID compliant | Prone to corruption |
Full-Text Search | MySQL 5.6+ supported | Historically better |
Key my.cnf adjustments for high-volume inserts:
[mysqld]
innodb_buffer_pool_size = 12G # 70-80% of available RAM
innodb_io_capacity = 2000 # For SSD storage
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_purge_threads = 4 # Helps with undo log cleanup
innodb_flush_neighbors = 0 # Disable for SSDs
innodb_flush_log_at_trx_commit = 2 # Trade durability for speed
For pagination queries, consider keyset pagination instead:
-- First page
SELECT * FROM large_table
WHERE hotspot_id = 3000064
ORDER BY discovered_at
LIMIT 1000;
-- Subsequent pages (using last seen value)
SELECT * FROM large_table
WHERE hotspot_id = 3000064
AND discovered_at > '2023-01-01 12:00:00'
ORDER BY discovered_at
LIMIT 1000;
When working with large InnoDB tables (300M+ rows in this case), we expect SELECT queries to coexist peacefully with INSERT operations due to InnoDB's MVCC architecture. However, the processlist shows multiple COMMIT operations stuck waiting while a large SELECT executes:
| 40 | user | localhost | dbname | Query | 33 | Sorting result |
SELECT * FROM large_table WHERE hotspot_id = 3000064 ORDER BY discovered_at LIMIT 799000, 1000
The InnoDB status reveals several critical indicators:
SEMAPHORES:
RW-shared spins 2119697, OS waits 1037149
Modified db pages 1439
Buffer pool hit rate 1000 / 1000
Immediate Workarounds
For the specific query shown, implement pagination optimization:
-- Instead of:
SELECT * FROM large_table WHERE hotspot_id = 3000064 ORDER BY discovered_at LIMIT 799000, 1000
-- Use keyset pagination:
SELECT * FROM large_table
WHERE hotspot_id = 3000064 AND discovered_at > '2023-01-01 00:00:00'
ORDER BY discovered_at LIMIT 1000
Architectural Changes
Consider these schema modifications for the large table:
-- Add covering index
ALTER TABLE large_table ADD INDEX idx_cover (hotspot_id, discovered_at, device_address);
-- Partition by hotspot_id
ALTER TABLE large_table PARTITION BY RANGE (hotspot_id) (
PARTITION p0 VALUES LESS THAN (1000000),
PARTITION p1 VALUES LESS THAN (2000000),
PARTITION p2 VALUES LESS THAN (3000000),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
Critical my.cnf adjustments for heavy-write workloads:
[mysqld]
innodb_buffer_pool_size = 12G # 50-70% of available RAM
innodb_io_capacity = 2000 # For SSD storage
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_flush_neighbors = 0 # Disable for SSD
innodb_purge_threads = 4
innodb_adaptive_hash_index = OFF # For large tables
While MyISAM seems tempting, consider these tradeoffs:
-- MyISAM creates table-level locks during:
-- SELECT + INSERT scenarios
-- Heavy read workloads with concurrent writes
-- TokuDB alternative (if available):
ALTER TABLE large_table ENGINE=TokuDB;
SET GLOBAL tokudb_cache_size=4G;
Essential queries for diagnosing similar issues:
-- Check for lock waits
SELECT * FROM performance_schema.events_waits_current
WHERE EVENT_NAME LIKE '%innodb%lock%';
-- Monitor InnoDB history list length
SHOW ENGINE INNODB STATUS\G