When implementing automated log cleanup, several technical factors should guide your retention period decision:
- Debugging needs: Most production issues surface within 7-30 days
- Compliance requirements: Certain industries mandate 90-180 day retention
- Storage constraints: Available disk space vs. log growth rate
- Forensic value: Security incidents may require longer retention
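The storage factor can be turned into a rough upper bound on retention. The sketch below is only an illustration: the function names, the 20% headroom default, and the example numbers are assumptions, not measured values.

```python
def max_retention_days(disk_budget_mb, daily_growth_mb, headroom=0.2):
    """Return how many whole days of logs fit in the disk budget.

    headroom is the fraction of the budget kept free for bursts
    (0.2 is an assumed default, tune it for your workload).
    """
    if daily_growth_mb <= 0:
        raise ValueError("daily_growth_mb must be positive")
    usable_mb = disk_budget_mb * (1 - headroom)
    return int(usable_mb // daily_growth_mb)


def feasible_retention(desired_days, disk_budget_mb, daily_growth_mb):
    """Cap the desired retention at what the storage budget allows."""
    return min(desired_days, max_retention_days(disk_budget_mb, daily_growth_mb))


if __name__ == "__main__":
    # Example: 10 GB budget, logs growing by about 200 MB per day
    print(feasible_retention(90, 10240, 200))
```

If the capped value falls below what debugging or compliance needs, that is a sign to archive older logs elsewhere rather than shorten retention.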
These factors translate into a few commonly used presets (values in days):
// Common retention presets
const RETENTION_PRESETS = {
  DEBUG: 7,        // Short-term troubleshooting
  STANDARD: 30,    // General applications
  COMPLIANCE: 90,  // Regulatory requirements
  SECURITY: 180    // Incident investigation
};
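Here's a Node.js implementation that deletes files based on their modification time:
const fs = require('fs');
const path = require('path');
// Delete regular files in logDir last modified more than maxAgeDays ago
function cleanupLogs(logDir, maxAgeDays) {
  const cutoff = Date.now() - (maxAgeDays * 86400000); // 86400000 ms per day
  fs.readdir(logDir, (err, files) => {
    if (err) return console.error(`Failed to read ${logDir}:`, err);
    files.forEach(file => {
      const filePath = path.join(logDir, file);
      fs.stat(filePath, (err, stats) => {
        if (err) return console.error(`Failed to stat ${filePath}:`, err);
        // Skip subdirectories and anything still inside the retention window
        if (stats.isFile() && stats.mtimeMs < cutoff) {
          fs.unlink(filePath, err => {
            if (err) console.error(`Failed to delete ${filePath}:`, err);
          });
        }
      });
    });
  });
}
// Run daily cleanup for 30-day retention (the first run happens after 24 hours)
setInterval(() => cleanupLogs('/var/log/myapp', 30), 86400000);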
For space-constrained environments:
function rotateBySize(logDir, maxSizeMB) {
  // Stat each entry once, keep regular files only, and sort newest first
  const files = fs.readdirSync(logDir)
    .map(file => {
      const stats = fs.statSync(path.join(logDir, file));
      return { name: file, size: stats.size, time: stats.mtimeMs, isFile: stats.isFile() };
    })
    .filter(file => file.isFile)
    .sort((a, b) => b.time - a.time);
  let totalSize = files.reduce((sum, file) => sum + file.size, 0);
  const maxBytes = maxSizeMB * 1024 * 1024;
  // Delete from the oldest end until the directory fits the size budget
  for (let i = files.length - 1; i >= 0 && totalSize > maxBytes; i--) {
    fs.unlinkSync(path.join(logDir, files[i].name));
    totalSize -= files[i].size;
  }
}
Combine time and size constraints for robust management:
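const cleanupConfig = {
  maxAgeDays: 30,
  maxDirSizeMB: 1024,
  preserveLatest: 5 // Always keep the N newest files
};
function hybridCleanup(logDir, config) {
  const cutoff = Date.now() - config.maxAgeDays * 86400000;
  const files = fs.readdirSync(logDir)
    .map(name => ({ name, stats: fs.statSync(path.join(logDir, name)) }))
    .filter(f => f.stats.isFile())
    .sort((a, b) => b.stats.mtimeMs - a.stats.mtimeMs); // Newest first
  let totalSize = files.reduce((sum, f) => sum + f.stats.size, 0);
  for (let i = files.length - 1; i >= config.preserveLatest; i--) { // Oldest first
    if (files[i].stats.mtimeMs >= cutoff && totalSize <= config.maxDirSizeMB * 1024 * 1024) break;
    fs.unlinkSync(path.join(logDir, files[i].name));
    totalSize -= files[i].stats.size;
  }
}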
When implementing automated log cleanup functionality, there's no universal "perfect" retention period - it depends on your specific requirements. Let's examine the key considerations:
From production environment experience, these retention periods are frequently used:
- 7-30 days: For high-volume debug logs in development environments
- 30-90 days: For standard application logs in production
- 180-365 days: For security/audit logs with compliance requirements
Here are three common strategies with code examples:
Time-Based Deletion (Python)
import os
import time

def clean_logs(log_dir, days_to_keep):
    """Delete regular files in log_dir not modified within days_to_keep days."""
    cutoff = time.time() - (days_to_keep * 86400)  # 86400 seconds per day
    for f in os.listdir(log_dir):
        filepath = os.path.join(log_dir, f)
        if os.path.isfile(filepath) and os.stat(filepath).st_mtime < cutoff:
            os.remove(filepath)
Size-Based Rotation (Bash)
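#!/bin/bash
LOG_DIR="/var/log/myapp"
MAX_SIZE="100M"
# Removes each individual .log file larger than MAX_SIZE (not a directory-wide quota)
find "$LOG_DIR" -type f -name "*.log" -size +"$MAX_SIZE" -exec rm -- {} \;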
Hybrid Approach (Java)
import java.io.IOException;
import java.nio.file.*;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

public class LogCleaner {
    // Deletes .log files that are older than maxDays or larger than maxSizeMB
    public static void clean(String dirPath, int maxDays, long maxSizeMB)
            throws IOException {
        Path dir = Paths.get(dirPath);
        long cutoff = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(maxDays);
        long maxBytes = maxSizeMB * 1024 * 1024;
        // Files.list keeps the directory open, so close the stream when done
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(Files::isRegularFile)
                .filter(p -> p.toString().endsWith(".log"))
                .filter(p -> {
                    try {
                        long fileTime = Files.getLastModifiedTime(p).toMillis();
                        return fileTime < cutoff || Files.size(p) > maxBytes;
                    } catch (IOException e) {
                        return false; // Skip files that cannot be inspected
                    }
                })
                .forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        System.err.println("Failed to delete " + p + ": " + e.getMessage());
                    }
                });
        }
    }
}
Consider these aspects when choosing your retention policy:
- Disk space constraints: More critical for embedded systems
- Debugging needs: Longer retention helps troubleshoot intermittent issues
- Compliance requirements: Financial/healthcare apps often need 6+ months
- Log volume: High-frequency logs may need shorter retention
For enterprise systems, consider:
- Gradual expiration (e.g., keep 90 days of debug but 365 days of errors)
- Cloud storage for archival (S3/GCS with lifecycle policies)
- Log aggregation systems (ELK stack) for long-term analysis
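Gradual expiration can be sketched as a per-pattern retention table. This is a minimal illustration, not a standard tool: it assumes the log level is encoded in the file name (for example app.debug.log and app.error.log), and the rule table, the 30-day fallback, and the function names are all hypothetical.

```python
import fnmatch
import os
import time

# Hypothetical tiers: the first matching pattern wins; values are days to keep.
RETENTION_RULES = [
    ("*.error.log", 365),
    ("*.debug.log", 90),
]
DEFAULT_RETENTION_DAYS = 30  # Assumed fallback for files matching no rule


def retention_days_for(filename, rules=RETENTION_RULES, default=DEFAULT_RETENTION_DAYS):
    """Return the retention period (in days) for a file name."""
    for pattern, days in rules:
        if fnmatch.fnmatch(filename, pattern):
            return days
    return default


def tiered_cleanup(log_dir, now=None, dry_run=False):
    """Delete files older than their tier's retention; return the affected paths."""
    now = time.time() if now is None else now
    expired = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue  # Leave subdirectories alone
        age_days = (now - os.stat(path).st_mtime) / 86400
        if age_days > retention_days_for(name):
            expired.append(path)
            if not dry_run:
                os.remove(path)
    return expired
```

Calling tiered_cleanup(log_dir, dry_run=True) first returns the files that would be removed without deleting anything, which is a cheap way to check the rules before enabling them.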