Optimizing Puppet Report Storage: Safe Deletion of Processed Reports in PuppetDB/Dashboard Environments


Puppet generates detailed YAML reports in /var/lib/puppet/reports by default. The processed data is stored separately: PuppetDB keeps it in PostgreSQL, while Puppet Dashboard keeps it in MySQL (typically under /var/lib/mysql). This dual storage can consume significant disk space when you manage many nodes.

Once reports have been processed by PuppetDB (version 2.3+) or Puppet Dashboard, the original YAML files can be safely deleted. With PuppetDB, the processed data is stored in these database tables:

puppetdb.reports
puppetdb.resource_events
puppetdb.catalogs

For Puppet Enterprise setups, configure report pruning in puppetdb.conf:

[database]
# TTLs take a duration suffix (d, h, m, s); gc-interval is a plain number of minutes
node-ttl = 7d
report-ttl = 7d
gc-interval = 60

For open-source Puppet, create a cron job:

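#!/bin/bash
# Save as /etc/cron.daily/puppet-report-cleanup and make it executable.
find /var/lib/puppet/reports -type f -name '*.yaml' -mtime +3 -delete
# Optional, Puppet Dashboard (MySQL) only: reclaim table space after pruning.
# "dashboard" is a placeholder for your Dashboard database name; credentials
# are read from ~/.my.cnf instead of being passed on the command line.
mysql dashboard -e "OPTIMIZE TABLE reports"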
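
Before enabling a job that deletes files, it can help to preview what it would remove. The following is a minimal sketch under the assumption that the cleanup uses the same find expression as above; preview_prune is a hypothetical helper name, not part of Puppet.

```shell
#!/bin/bash
# Dry-run helper: list the reports a cleanup would delete, without deleting.
# preview_prune is a hypothetical name used only for this illustration.
preview_prune() {
  local dir="$1" days="$2"
  # Same match expression as the cleanup job, with -print in place of -delete.
  find "$dir" -type f -name '*.yaml' -mtime +"$days" -print
}
```

For example, preview_prune /var/lib/puppet/reports 3 | wc -l shows how many files the daily job would remove.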

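For Puppet Dashboard, old reports are normally pruned with the reports:prune rake task (for example, rake RAILS_ENV=production reports:prune upto=1 unit=mo). The following keys in /etc/puppet-dashboard/settings.yml are also suggested for this purpose; confirm that your Dashboard version recognizes them before relying on them: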

daily_report_prune_age: 3
daily_report_prune_keep: 1000

General recommendations:

  • Implement a retention policy (typically 7-30 days for reports)
  • Monitor PuppetDB's PostgreSQL database growth with SELECT pg_size_pretty(pg_database_size('puppetdb'));
  • Consider compressing reports you need to keep for archival purposes
  • For large deployments, evaluate external storage solutions like S3
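
The compression option above could look like the following sketch, which gzips old reports in place instead of deleting them. compress_old_reports is a hypothetical helper name, and the age threshold is an assumption to adapt to your retention policy.

```shell
#!/bin/bash
# Compress reports older than a given number of days instead of deleting them.
# compress_old_reports is a hypothetical name used only for this illustration.
compress_old_reports() {
  local dir="$1" days="$2"
  # gzip replaces each matched file with <name>.yaml.gz in the same directory.
  find "$dir" -type f -name '*.yaml' -mtime +"$days" -exec gzip -9 {} +
}
```

Note that a cleanup matching '*.yaml' no longer sees the compressed files, so archived reports need their own, longer-lived pruning rule for '*.yaml.gz'.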

If you encounter missing report data after cleanup:

# List the reports PuppetDB holds for a node
curl -X GET http://localhost:8080/pdb/query/v4/reports --data-urlencode 'query=["=", "certname", "node01.example.com"]'

# Check Dashboard processing logs
grep "Processing report" /var/log/puppet-dashboard/production.log
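
To compare the database against what is still on disk, a per-node count of the remaining YAML files can be useful. This sketch assumes the default layout of the store report processor, with one subdirectory per certname under the reports directory; count_reports_per_node is a hypothetical helper name.

```shell
#!/bin/bash
# Print "<certname> <number of YAML reports>" for each node directory.
# count_reports_per_node is a hypothetical name used only for this illustration.
count_reports_per_node() {
  local dir="$1" node
  for node in "$dir"/*/; do
    # Skip the unexpanded glob when the directory has no subdirectories.
    [ -d "$node" ] || continue
    printf '%s %s\n' "$(basename "$node")" \
      "$(find "$node" -type f -name '*.yaml' | wc -l | tr -d ' ')"
  done
}
```

Running count_reports_per_node /var/lib/puppet/reports prints one line per node, which can be checked against the PuppetDB query above.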

Puppet generates reports in YAML format stored in /var/lib/puppet/reports, while processed data gets stored in either PuppetDB (PostgreSQL) or Puppet Dashboard (MySQL). The MySQL database path (/var/lib/mysql) typically grows proportionally with report volume.

After PuppetDB or Puppet Dashboard has processed a report, the database is the authoritative store for historical report data, so the original YAML file can be safely deleted.

# Safe cleanup command (run as root)
find /var/lib/puppet/reports -type f -name '*.yaml' -mtime +7 -delete

For production environments, implement these strategies:

  • Automated cleanup: Set up cron jobs for periodic deletion
  • Database maintenance: Regularly optimize tables storing reports
  • Storage monitoring: Implement alerting for disk usage thresholds
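
As an illustration of the storage monitoring point, here is a minimal threshold check built on df. It is a sketch, not a replacement for a monitoring system; check_disk_usage is a hypothetical helper name, and the threshold is whatever percentage you choose.

```shell
#!/bin/bash
# Warn when the filesystem holding a path reaches a usage threshold (percent).
# check_disk_usage is a hypothetical name used only for this illustration.
check_disk_usage() {
  local path="$1" threshold="$2" used
  # df -P prints POSIX-format output; field 5 of line 2 is the percentage used.
  used=$(df -P "$path" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
  if [ "$used" -ge "$threshold" ]; then
    echo "WARNING: $path is at ${used}% (threshold ${threshold}%)" >&2
    return 1
  fi
  return 0
}
```

Called from cron as check_disk_usage /var/lib/puppet/reports 80, the warning goes to stderr, which cron mails to the job owner when mail delivery is configured.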

Create a custom Puppet class that installs a cleanup script and schedules it with cron:

# Puppet manifest example for report cleanup
class puppet::report_cleanup {
  file { '/usr/local/bin/clean_puppet_reports':
    ensure  => file,
    mode    => '0755',
    content => @(EOT)
      #!/bin/bash
      # Delete reports older than 7 days
      find /var/lib/puppet/reports -type f -name '*.yaml' -mtime +7 -delete
      | EOT
  }

  cron { 'puppet-report-cleanup':
    command => '/usr/local/bin/clean_puppet_reports',
    user    => 'root',
    hour    => 2,
    minute  => 0,
  }
}

For PuppetDB installations, configure report TTL in puppetdb.conf:

# Sample PuppetDB configuration
database {
    report-ttl = 14d
    node-ttl = 30d
}

For MySQL-based Puppet Dashboard, implement regular maintenance:

# MySQL optimization commands
OPTIMIZE TABLE resource_statuses;
OPTIMIZE TABLE reports;