When managing infrastructure with Puppet, there are scenarios where you need to programmatically access the compiled catalog data. The most common use case is extracting package lists with their specified versions across your node inventory. This enables version tracking, dependency analysis, and compliance reporting.
Here are three practical approaches to retrieve package data from Puppet:
1. Direct Catalog Compilation
Use Puppet's face API to compile and inspect catalogs programmatically:
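require 'puppet'
require 'puppet/face'

# Load puppet.conf settings before using a face from a standalone script.
Puppet.initialize_settings

# Returns one hash per Package resource declared in the node's catalog.
def get_packages(node)
  catalog = Puppet::Face[:catalog, '0.0.1'].find(node)
  catalog.resources.select { |r| r.type == 'Package' }.map do |p|
    {
      name: p.title,
      version: p[:ensure],   # nil when the manifest does not set ensure
      provider: p[:provider]
    }
  end
end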
2. Using PuppetDB Queries
For environments with PuppetDB, you can query the package resources directly:
# PQL query to get all packages across nodes
# (8080 is PuppetDB's default plain-HTTP port; 8081 serves HTTPS and requires client certificates)
curl -X POST http://puppetdb:8080/pdb/query/v4 \
  -H "Content-Type: application/json" \
  -d '{
    "query": "resources[certname, title, parameters] { type = \"Package\" }"
  }'
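The query returns a JSON array of resource objects. The following is a minimal Ruby sketch of reshaping such a response into a per-node map, assuming each element carries the `certname`, `title`, and `parameters` keys requested above. The `SAMPLE_RESPONSE` data and the `packages_by_node` helper are hypothetical and exist only for illustration.

```ruby
require 'json'

# Hypothetical sample of a query result; real output depends on your catalogs.
SAMPLE_RESPONSE = <<~RESPONSE
  [
    {"certname": "web01.example.com", "title": "nginx", "parameters": {"ensure": "1.24.0"}},
    {"certname": "web01.example.com", "title": "curl", "parameters": {"ensure": "installed"}},
    {"certname": "db01.example.com", "title": "postgresql", "parameters": {}}
  ]
RESPONSE

# Reshape the flat resource list into { certname => { package => ensure } }.
# Assumption: a resource without an explicit ensure is recorded as 'unspecified'.
def packages_by_node(json_text)
  JSON.parse(json_text).each_with_object({}) do |res, by_node|
    ensure_value = res.fetch('parameters', {}).fetch('ensure', 'unspecified')
    (by_node[res['certname']] ||= {})[res['title']] = ensure_value
  end
end

puts JSON.pretty_generate(packages_by_node(SAMPLE_RESPONSE))
```

The resulting shape (node, then package, then version) matches what a per-node report generator would iterate over.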
3. Catalog Export via Tasks
Create a custom task to export catalog data to your preferred storage:
# tasks/export_packages.json (task metadata is named after the task; metadata.json is the module's own metadata file)
{
  "description": "Export package information from catalogs",
  "input_method": "stdin",
  "parameters": {
    "output_dir": {
      "description": "Directory to save JSON exports",
      "type": "String"
    }
  }
}
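# tasks/export_packages.rb
require 'json'
require 'puppet'

def task(output_dir:)
  Puppet.initialize_settings   # load puppet.conf so Puppet[:certname] resolves
  node = Puppet[:certname]
  catalog = Puppet::Resource::Catalog.indirection.find(node)
  # Map each package title to its declared ensure value.
  packages = catalog.resources.select { |r| r.type == 'Package' }.each_with_object({}) do |p, h|
    h[p.title] = p[:ensure]
  end
  File.write(File.join(output_dir, "#{node}_packages.json"), packages.to_json)
  { status: 'success', exported: packages.size }
end

# With input_method "stdin", parameters arrive as a JSON object on standard input.
params = JSON.parse($stdin.read)
puts task(output_dir: params.fetch('output_dir')).to_json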
For environments using Hiera or complex conditional logic, consider these additional steps:
- Include class parameters in your export to understand package inclusion triggers
- Capture resource dependencies using catalog.relationship_graph
- Account for dynamic version specifications (e.g., ensure => latest resolves to a different version over time, whereas a pinned version string does not)
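To make the last point concrete, here is a small sketch that sorts `ensure` values into categories so a report can flag packages whose version is not fixed. The category names and the `classify_ensure` helper are hypothetical. The keyword lists reflect the state values the Package type accepts, and anything else is assumed to be a version string.

```ruby
# ensure keywords that install a package without fixing its version.
FLOATING_STATES = %w[present installed latest].freeze
# ensure keywords that remove a package.
REMOVAL_STATES = %w[absent purged].freeze
# Other state keywords that are not version strings.
OTHER_STATES = %w[held disabled].freeze

# Classify an ensure value (string, symbol, or nil) for reporting purposes.
def classify_ensure(value)
  v = value.to_s
  return :unspecified if v.empty?
  return :removed if REMOVAL_STATES.include?(v)
  return :floating if FLOATING_STATES.include?(v)
  return :other if OTHER_STATES.include?(v)

  :pinned # assumption: any remaining value is an explicit version string
end

{ 'nginx' => '1.24.0', 'curl' => 'latest', 'telnet' => 'absent' }.each do |name, value|
  puts "#{name}: #{classify_ensure(value)}"
end
```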
Once collected, package data can be processed for various outputs:
# Generate a markdown report
# package_data is expected to look like { node => { package_name => version } }
def generate_report(package_data)
  report = ["# Package Version Report", "| Node | Package | Version |", "|------|---------|---------|"]
  package_data.each do |node, packages|
    packages.each do |name, version|
      report << "| #{node} | #{name} | #{version} |"
    end
  end
  report.join("\n")
end
Consider adding catalog analysis as a pipeline step to track package version changes between deployments. This helps detect version drift and potential conflicts early in the deployment process.
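One way such a pipeline step might compare two exports is sketched below. It assumes each export is a flat hash of package name to `ensure` value, as produced by the export task. The `diff_packages` helper and the sample data are hypothetical.

```ruby
require 'json'

# Compare two { package => ensure } snapshots and report what was added,
# removed, or changed between them.
def diff_packages(before, after)
  changes = { added: {}, removed: {}, changed: {} }
  (before.keys | after.keys).sort.each do |name|
    if !before.key?(name)
      changes[:added][name] = after[name]
    elsif !after.key?(name)
      changes[:removed][name] = before[name]
    elsif before[name] != after[name]
      changes[:changed][name] = { from: before[name], to: after[name] }
    end
  end
  changes
end

# Hypothetical snapshots from two consecutive deployments.
previous = { 'nginx' => '1.22.1', 'curl' => 'installed', 'telnet' => 'present' }
current  = { 'nginx' => '1.24.0', 'curl' => 'installed', 'htop' => 'latest' }

puts JSON.pretty_generate(diff_packages(previous, current))
```

A pipeline could fail the build, or merely post a comment, whenever the `changed` or `removed` sections are non-empty.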
When implementing catalog exports:
- Restrict access to sensitive catalog data
- Sanitize outputs containing node-specific parameters
- Implement proper authentication for PuppetDB queries
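As a rough illustration of the sanitizing step, the sketch below redacts values whose keys look sensitive before an export is written out. The key pattern, the `[REDACTED]` placeholder, and the `sanitize` helper are assumptions made for this example; a real deployment would need a list tailored to its own parameters.

```ruby
# Hypothetical pattern for parameter names that should never be exported.
SENSITIVE_KEY = /password|secret|token|key/i

# Return a deep copy of value with sensitive hash entries redacted.
# The input structure is left unmodified.
def sanitize(value)
  case value
  when Hash
    value.each_with_object({}) do |(k, v), out|
      out[k] = k.to_s.match?(SENSITIVE_KEY) ? '[REDACTED]' : sanitize(v)
    end
  when Array
    value.map { |v| sanitize(v) }
  else
    value
  end
end

# Hypothetical resource parameters containing a credential.
params = {
  'ensure' => '2.1.0',
  'source' => 'https://repo.example.com/agent.rpm',
  'install_options' => [{ '--api-token' => 'abc123' }]
}

p sanitize(params)
```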
When managing infrastructure with Puppet, you'll often need to extract specific resource data like package versions from compiled catalogs. This information becomes crucial for:
- Auditing package versions across nodes
- Generating compliance reports
- Troubleshooting dependency issues
- Maintaining version consistency
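The simplest approach is to query resources directly on a node. Note that this reports the packages actually installed on that node, not the versions declared in its catalog: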
# Get all installed packages
puppet resource package
# Filter for specific packages
puppet resource package | grep -E 'nginx|httpd'
# Structured (YAML) output
puppet resource package --to_yaml
For centralized data collection, PuppetDB provides powerful query capabilities:
# Query PuppetDB for package resources
curl -X GET \
  --tlsv1 \
  --cacert /etc/puppetlabs/puppet/ssl/certs/ca.pem \
  --cert /etc/puppetlabs/puppet/ssl/certs/$(hostname -f).pem \
  --key /etc/puppetlabs/puppet/ssl/private_keys/$(hostname -f).pem \
  'https://puppetdb:8081/pdb/query/v4/resources/Package'
Create a custom report processor to extract package data:
# In /etc/puppetlabs/puppet/puppet.conf
# (on Puppet 7 and later this section is named [server])
[master]
reports = store,package_versions
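# Custom report processor
# (/etc/puppetlabs/code/environments/production/modules/puppet_metrics/lib/puppet/reports/package_versions.rb)
require 'json'
require 'puppet'

Puppet::Reports.register_report(:package_versions) do
  def process
    packages = resource_statuses.values.select do |resource|
      resource.resource_type == 'Package'
    end.map do |package|
      {
        name: package.title,
        # Events exist only for resources that changed in this run, so this is nil otherwise.
        version: package.events.last&.desired_value,
        node: host
      }
    end

    # The target directory must already exist. The host is part of the file name
    # so that reports from different nodes in the same second do not overwrite each other.
    File.write("/opt/puppetlabs/package_versions/#{host}-#{Time.now.to_i}.json", packages.to_json)
  end
end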
Preview package changes before they're applied:
puppet agent -t --noop --evaltrace \
  | grep -A 1 'Package' \
  | grep -E 'current_value|ensure'
Combine these methods to create comprehensive reports. For example, this Ruby script aggregates data:
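require 'puppetdb'
require 'json'

# Over HTTPS the puppetdb gem typically also needs a :pem option
# (key, cert, ca_file); it is omitted here.
client = PuppetDB::Client.new(server: 'https://puppetdb:8081')

# request takes a single query, so both conditions are combined with :and.
response = client.request(
  'resources',
  [:and,
   [:'=', 'type', 'Package'],
   [:'=', 'exported', false]]
)

# Group by package title and collect every node that declares the package.
# The recorded version is the one from the first node seen.
package_data = response.data.each_with_object({}) do |resource, hash|
  entry = hash[resource['title']] ||= { version: resource['parameters']['ensure'], nodes: [] }
  entry[:nodes] << resource['certname']
end

File.write('package_report.json', JSON.pretty_generate(package_data))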
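Because different nodes can declare different versions of the same package, an aggregate keyed only by title can hide drift. The sketch below groups nodes by declared version and keeps only packages with more than one distinct version. It assumes resource hashes shaped like PuppetDB's `resources` results (`certname`, `title`, `parameters`). The `find_drift` helper and the sample data are hypothetical.

```ruby
require 'json'

# Build { package => { ensure_value => [certnames] } } and keep only the
# packages that are declared with more than one distinct ensure value.
def find_drift(resources)
  versions = {}
  resources.each do |res|
    by_version = (versions[res['title']] ||= {})
    (by_version[res.dig('parameters', 'ensure')] ||= []) << res['certname']
  end
  versions.select { |_title, by_version| by_version.size > 1 }
end

# Hypothetical resources as they might come back from PuppetDB.
resources = [
  { 'certname' => 'web01', 'title' => 'nginx', 'parameters' => { 'ensure' => '1.24.0' } },
  { 'certname' => 'web02', 'title' => 'nginx', 'parameters' => { 'ensure' => '1.22.1' } },
  { 'certname' => 'web01', 'title' => 'curl',  'parameters' => { 'ensure' => 'installed' } },
  { 'certname' => 'web02', 'title' => 'curl',  'parameters' => { 'ensure' => 'installed' } }
]

puts JSON.pretty_generate(find_drift(resources))
```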