When working with PostgreSQL databases (especially older versions like 8.2.3), managing table performance after large-scale deletions is crucial. The scenario describes:
- Logging tables with millions of rows
- Monthly purges of data older than 30 days
- Current practice of running REINDEX after deletions
- Concern about whether VACUUM operations should be included
For optimal performance after mass deletions, you need to consider three operations:
-- Basic maintenance commands
REINDEX TABLE logging_table;
VACUUM VERBOSE ANALYZE logging_table; -- 8.2 syntax; the parenthesized option list needs 9.0+
VACUUM FULL logging_table; -- Use with caution
In PostgreSQL 8.2.3, autovacuum exists but is disabled by default, so manual VACUUM is critical because:
- Deletes mark rows as "dead" but don't reclaim space
- Indexes retain pointers to dead tuples
- Table statistics become inaccurate without ANALYZE
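How much you depend on manual VACUUM comes down to how the server is configured. As a quick check, assuming a stock 8.2 installation where nothing has been changed in postgresql.conf, the following shows the relevant settings:

```sql
-- Is the autovacuum daemon enabled? (off by default on 8.2)
SHOW autovacuum;

-- On 8.2, autovacuum only works when both of these are on
SHOW stats_start_collector;
SHOW stats_row_level;
```

If any of these report off, nothing is cleaning up dead rows in the background, and the manual commands in this answer are the only maintenance the table gets.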
While REINDEX helps, it's not a complete solution:
-- REINDEX only addresses index bloat
REINDEX TABLE large_log_table; -- Blocks writes to the table while it runs
-- Compare to:
VACUUM VERBOSE ANALYZE large_log_table; -- Runs alongside normal reads and writes
For your logging table scenario:
- Delete old records
- Run VACUUM ANALYZE
- Periodically REINDEX (weekly/monthly)
Here's a complete maintenance script for logging tables:
-- Delete old records
DELETE FROM application_logs WHERE log_date < NOW() - INTERVAL '30 days';
-- Follow with VACUUM ANALYZE
VACUUM VERBOSE ANALYZE application_logs;
-- Monthly reindex (schedule during low traffic)
REINDEX TABLE application_logs;
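-- One-time cleanup only (takes an exclusive lock and is slow on large tables):
VACUUM FULL VERBOSE ANALYZE application_logs; -- Locks table; on 8.2, REINDEX again afterwards because VACUUM FULL can bloat indexes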
In PostgreSQL 8.2.3:
- VACUUM ANALYZE updates statistics for query planner
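- Plain VACUUM is non-blocking (preferred for production); it marks dead space as reusable but rarely shrinks the file
- VACUUM FULL compacts the table and returns space to the operating system (requires exclusive lock); in 8.2 it moves rows within the file rather than rewriting it, which is slow and can bloat indexes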
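Check delete activity and vacuum history with the query below (n_dead_tup was added in 8.3; on 8.2, use n_tup_del, which requires stats_row_level = on, plus the output of VACUUM VERBOSE):
SELECT relname, n_tup_del, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'application_logs';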
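Another rough bloat signal that works on old releases is the page count stored in pg_class, which VACUUM and ANALYZE refresh. A sketch, assuming the table is named application_logs as in the script above:

```sql
-- Page and row estimates for the table (relkind 'r') and its indexes (relkind 'i').
-- relpages and reltuples are only as fresh as the last VACUUM or ANALYZE.
SELECT c.relname,
       c.relkind,
       c.relpages,
       c.reltuples,
       pg_size_pretty(pg_relation_size(c.oid)) AS on_disk
FROM pg_class c
WHERE c.oid = 'application_logs'::regclass
   OR c.oid IN (SELECT indexrelid
                FROM pg_index
                WHERE indrelid = 'application_logs'::regclass);
```

If relpages stays high while reltuples drops sharply after a purge, the relation is probably carrying a lot of reusable or wasted space. For an index, that is a hint that REINDEX may be worth scheduling.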
When dealing with logging tables where we regularly purge millions of rows (like 30-day retention policies), proper maintenance becomes critical. Unlike modern PostgreSQL versions, where autovacuum runs by default, version 8.2.3 ships with autovacuum disabled, so it requires manual intervention unless you enable it.
When you run DELETE FROM logs WHERE created_at < NOW() - INTERVAL '30 days':
1. Dead tuples accumulate (on 8.2, VACUUM VERBOSE reports them; the n_dead_tup column in pg_stat_user_tables only exists from 8.3 onward)
2. Indexes maintain pointers to deleted rows
3. Table statistics become outdated
While REINDEX cleans up orphaned index entries, it's often overkill:
-- Index size before the rebuild (pg_indexes_size needs 9.0+, so measure the index itself)
SELECT pg_size_pretty(pg_relation_size('logs_idx'));
-- Full reindex (blocks writes to the table)
REINDEX TABLE logs;
-- Lower-impact alternative (PostgreSQL 8.2 lacks REINDEX CONCURRENTLY)
CREATE INDEX CONCURRENTLY logs_new_idx ON logs(created_at);
DROP INDEX logs_idx;
ALTER INDEX logs_new_idx RENAME TO logs_idx;
For your use case, I recommend:
-- Basic vacuum (space reclamation)
VACUUM VERBOSE logs;
-- With statistics update (helps query planner); 8.2 requires VERBOSE before ANALYZE
VACUUM VERBOSE ANALYZE logs;
-- Aggressive vacuum (for one-time cleanup)
VACUUM FULL ANALYZE logs;
For a logging table called app_logs:
-- 1. Delete old records
BEGIN;
DELETE FROM app_logs WHERE created_at < NOW() - INTERVAL '30 days';
COMMIT;
-- 2. Vacuum with statistics (must run outside a transaction block)
VACUUM ANALYZE app_logs;
-- 3. Reindex when the indexes have grown noticeably.
-- DO blocks (9.0+) and n_dead_tup (8.3+) are not available on 8.2,
-- so check the index size and let the maintenance script decide.
SELECT pg_size_pretty(pg_total_relation_size('app_logs') - pg_relation_size('app_logs')) AS index_and_toast_size;
REINDEX TABLE app_logs;
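On 8.2 the conditional step can also be packaged as a PL/pgSQL function, since anonymous DO blocks only arrived in 9.0. The following is a minimal sketch, not a tested production routine: the function name reindex_if_large and the byte threshold are hypothetical, it assumes the plpgsql language has been installed in the database, and it uses the combined size of indexes and TOAST data as a stand-in for index bloat.

```sql
-- Assumption: plpgsql is installed (on 8.2: CREATE LANGUAGE plpgsql;)
-- Hypothetical helper: rebuilds all indexes on a table once everything
-- other than the heap (indexes plus TOAST) exceeds a size threshold.
CREATE OR REPLACE FUNCTION reindex_if_large(tbl text, max_index_bytes bigint)
RETURNS boolean AS $$
DECLARE
    index_bytes bigint;
BEGIN
    -- Total relation size minus heap size = indexes plus TOAST
    index_bytes := pg_total_relation_size(tbl) - pg_relation_size(tbl);

    IF index_bytes > max_index_bytes THEN
        -- quote_ident expects an unqualified table name here
        EXECUTE 'REINDEX TABLE ' || quote_ident(tbl);
        RETURN true;
    END IF;

    RETURN false;
END;
$$ LANGUAGE plpgsql;

-- Example call: reindex app_logs if indexes plus TOAST exceed about 1 GB
SELECT reindex_if_large('app_logs', 1073741824);
```

VACUUM cannot be wrapped the same way, because it refuses to run inside a function or transaction block, so it stays a separate statement in the maintenance script.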
Create a monitoring view:
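-- n_live_tup/n_dead_tup (8.3+) and pg_indexes_size (9.0+) are unavailable on 8.2,
-- so this version uses insert/delete counters and size arithmetic instead
CREATE VIEW table_metrics AS
SELECT relname,
       n_tup_ins,
       n_tup_del,
       pg_size_pretty(pg_relation_size(relid)) AS size,
       pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS idx_size, -- indexes plus TOAST
       last_vacuum,
       last_autovacuum,
       last_analyze
FROM pg_stat_user_tables;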
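Once the view exists, one way to use it is to list tables that have not been vacuumed recently. The 30-day window below is only an illustration chosen to match the retention policy:

```sql
-- Tables with no manual VACUUM on record, or none in the last 30 days
SELECT relname, last_vacuum, last_analyze
FROM table_metrics
WHERE last_vacuum IS NULL
   OR last_vacuum < NOW() - INTERVAL '30 days'
ORDER BY relname;
```

A NULL last_vacuum means no manual VACUUM has been recorded since the statistics were last reset, which is not necessarily the same as never.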