StatsD and CollectD serve fundamentally different purposes in the monitoring ecosystem:
- CollectD is a metrics collection daemon focused on system-level statistics (CPU, memory, disk I/O)
- StatsD is a metrics aggregation service designed for application-level instrumentation
In production environments, they often work together:
# Typical data flow:
[Application] --custom metrics--> [StatsD]
[Server] --system metrics--> [CollectD]
[CollectD] --optional forwarding--> [StatsD]
[StatsD] --> [TimeSeries DB like Graphite]
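To make the first arrow in that flow concrete, here is a minimal sketch of the text line an application sends to StatsD, assuming the commonly documented `<name>:<value>|<type>[|@<sampleRate>]` format. The helper name `formatStatsdLine` and the metric names are made up for illustration; a real client library does this for you.

```javascript
// Hypothetical helper: builds one StatsD datagram line.
// Assumed format: <name>:<value>|<type>[|@<sampleRate>]
function formatStatsdLine(name, value, type, sampleRate) {
  // c = counter, g = gauge, ms = timer, s = set
  const allowed = ['c', 'g', 'ms', 's'];
  if (!allowed.includes(type)) {
    throw new Error(`unsupported StatsD type: ${type}`);
  }
  let line = `${name}:${value}|${type}`;
  // The sample rate is only appended when the client is actually sampling.
  if (sampleRate !== undefined && sampleRate < 1) {
    line += `|@${sampleRate}`;
  }
  return line;
}

console.log(formatStatsdLine('app.user.login', 1, 'c'));
console.log(formatStatsdLine('app.api.latency', 320, 'ms', 0.1));
```

Each such line travels as a UDP datagram, which is why instrumentation stays cheap for the application: nothing blocks if the daemon is down.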
Pattern 1: CollectD as Source for StatsD
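# Note: write_graphite emits Graphite plaintext lines ("path value timestamp"),
# not StatsD lines ("name:value|type"). The listener on this port must accept
# Graphite plaintext; a stock StatsD daemon will not parse it.
LoadPlugin write_graphite
<Plugin write_graphite>
  <Node "statsd">
    Host "statsd.example.com"
    Port "8125"
    Protocol "udp"
    Prefix "collectd."
  </Node>
</Plugin>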
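One thing to keep in mind with this pattern: write_graphite speaks the Graphite plaintext format (`path value timestamp`), while StatsD expects `name:value|type`. If a translation step is needed, it could look roughly like the sketch below. The function `graphiteToStatsdGauge` and the sample path are assumptions for illustration, not part of either tool.

```javascript
// Hypothetical relay step: translate one Graphite plaintext line
// ("<path> <value> <timestamp>") into a StatsD gauge line ("<path>:<value>|g").
function graphiteToStatsdGauge(line) {
  const parts = line.trim().split(/\s+/);
  if (parts.length !== 3) {
    throw new Error(`not a Graphite plaintext line: ${line}`);
  }
  const [path, rawValue] = parts;
  const value = Number(rawValue);
  // collectd can emit "nan" for missing values; reject those here.
  if (!Number.isFinite(value)) {
    throw new Error(`non-numeric value: ${rawValue}`);
  }
  // StatsD stamps metrics at flush time, so the timestamp is dropped.
  return `${path}:${value}|g`;
}

console.log(graphiteToStatsdGauge('collectd.web01.memory.memory-free 1048576 1700000000'));
```

Treating everything as a gauge is a simplification: StatsD reads a gauge value with a leading sign as a delta, so negative readings would need extra handling.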
Pattern 2: StatsD as Aggregation Layer
// Node.js application sending metrics
const StatsD = require('node-statsd');
const client = new StatsD({
  host: 'statsd.example.com',
  prefix: 'app.'
});
// Counts one login event; StatsD sums these per flush interval
client.increment('user.login');
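What "aggregation layer" means in practice can be sketched as follows: within one flush window, counters are summed, gauges keep their last value, and timers are reduced to summary statistics. This is a simplified model that ignores sample rates and sets, not the StatsD implementation itself; the function name `aggregate` and the metric names are illustrative.

```javascript
// Simplified model of one StatsD flush window (sample rates and sets ignored).
function aggregate(lines) {
  const counters = {};
  const gauges = {};
  const timers = {};
  for (const line of lines) {
    const match = /^([^:]+):([^|]+)\|(c|g|ms)$/.exec(line);
    if (!match) continue; // skip malformed or unsupported lines
    const [, name, rawValue, type] = match;
    const value = Number(rawValue);
    if (!Number.isFinite(value)) continue;
    if (type === 'c') {
      counters[name] = (counters[name] || 0) + value; // counters add up
    } else if (type === 'g') {
      gauges[name] = value; // gauges keep the most recent value
    } else {
      (timers[name] = timers[name] || []).push(value); // timers are collected
    }
  }
  // Reduce each timer's samples to a few summary statistics.
  const timerStats = {};
  for (const [name, values] of Object.entries(timers)) {
    const sum = values.reduce((a, b) => a + b, 0);
    timerStats[name] = {
      count: values.length,
      mean: sum / values.length,
      upper: Math.max(...values),
    };
  }
  return { counters, gauges, timers: timerStats };
}

console.log(aggregate([
  'app.user.login:1|c',
  'app.user.login:1|c',
  'app.queue.depth:7|g',
  'app.queue.depth:4|g',
  'app.api.latency:100|ms',
  'app.api.latency:300|ms',
]));
```

The backend therefore sees one data point per metric per flush, however many events the application emitted in that window.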
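Aspect | CollectD | StatsD |
---|---|---|
Collection frequency | Polls at a fixed interval (10 s by default) | Event-driven; aggregates are flushed every 10 s by default |
Protocol | UDP binary protocol (network plugin); TCP by default for write_graphite | UDP by default |
Resource usage | Low (C implementation) | Medium (Node.js) |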
The hybrid approach makes sense when you need:
- System-level monitoring via CollectD (disk space, network traffic)
- Application business metrics via StatsD (API calls, user actions)
- Unified visualization in dashboards
Example Graphite query (usable in a Grafana panel) showing both sources:
aliasByNode(
  group(
    stats_counts.app.*.login,
    collectd.*.memory.memory-free
  ),
  2
)
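The collectd half of such a query depends on how write_graphite names its series. As far as I know, the default layout is `<prefix><host>.<plugin>[-<plugin_instance>].<type>[-<type_instance>]`, with dots in the host name escaped to underscores. The sketch below encodes that assumption; `collectdGraphitePath` is a hypothetical helper, and it leaves out the data-source suffix that multi-value types append.

```javascript
// Hypothetical helper modelling the assumed write_graphite naming scheme.
function collectdGraphitePath({ prefix = '', host, plugin, pluginInstance, type, typeInstance }) {
  // Dots in the host name would otherwise create extra path segments.
  const escapedHost = host.replace(/\./g, '_');
  const pluginPart = pluginInstance ? `${plugin}-${pluginInstance}` : plugin;
  const typePart = typeInstance ? `${type}-${typeInstance}` : type;
  return `${prefix}${escapedHost}.${pluginPart}.${typePart}`;
}

console.log(collectdGraphitePath({
  prefix: 'collectd.',
  host: 'web01.example.com',
  plugin: 'memory',
  type: 'memory',
  typeInstance: 'free',
}));
```

Checking the actual series names in the Graphite tree before writing dashboard queries is still the safer route, since options in the write_graphite block can change this layout.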
StatsD and CollectD serve complementary roles in the monitoring ecosystem. While both handle metrics collection, they operate at different layers:
- CollectD: Primarily a data collection agent that gathers system-level metrics (CPU, memory, disk I/O) directly from hosts
- StatsD: Functions as a metrics aggregation service that receives, processes, and forwards metrics from various sources
# CollectD Architecture
Host -> CollectD (collection) -> Time-series DB (e.g., Graphite, InfluxDB)
# StatsD Architecture
Application -> StatsD (aggregation) -> Backend (e.g., Graphite, or Prometheus through statsd_exporter)
They can work together in these common deployment scenarios:
- Direct Collection:
    # CollectD config to send metrics to StatsD
    # (write_graphite emits Graphite plaintext, so the receiver must accept that format)
    LoadPlugin write_graphite
    <Plugin write_graphite>
      <Node "statsd">
        Host "statsd.example.com"
        Port "8125"
        Protocol "udp"
      </Node>
    </Plugin>
- Sidecar Pattern:
    # Docker compose example
    services:
      collectd:
        image: collectd:latest
        volumes:
          - ./collectd.conf:/etc/collectd/collectd.conf
      statsd:
        image: statsd/statsd
        ports:
          - "8125:8125/udp"
Use Case | CollectD | StatsD |
---|---|---|
System metrics | ✓ Best choice | ✗ Limited capability |
Custom application metrics | ✗ Not ideal | ✓ Perfect fit |
High-resolution collection | ✓ Native support | ✗ Aggregation-focused |
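Here's how to configure CollectD to select and tag system metrics in a filter chain before sending them to StatsD:
LoadPlugin match_regex
LoadPlugin target_set
<Chain "PreCache">
  <Rule "filter_system_metrics">
    <Match "regex">
      # Group the alternation so ^ and $ apply to every branch
      Plugin "^(cpu|memory|disk)$"
    </Match>
    <Target "set">
      # Tags matching values; this rule alone does not drop anything
      MetaData "target_type" "statsd"
    </Target>
  </Rule>
</Chain>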
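In a Plugin pattern like the one above, it matters where `^` and `$` sit relative to the alternation: without parentheses, each anchor binds only to its nearest branch. The quick check below uses JavaScript regexes purely to illustrate the precedence rule (collectd itself uses POSIX regular expressions, as far as I know, which treat alternation the same way here). The plugin names in the list are just sample inputs.

```javascript
// Returns the names that a given pattern matches.
function filterPlugins(pattern, names) {
  return names.filter((name) => pattern.test(name));
}

const plugins = ['cpu', 'memory', 'disk', 'cpufreq', 'vmemory', 'md_disk'];

// Ungrouped: means "^cpu" OR "memory" OR "disk$", so partial names slip through.
const ungrouped = filterPlugins(/^cpu|memory|disk$/, plugins);

// Grouped: the whole name must be exactly one of the three alternatives.
const grouped = filterPlugins(/^(cpu|memory|disk)$/, plugins);

console.log(ungrouped); // all six names match
console.log(grouped);   // only cpu, memory, disk
```

The grouped form is usually what is intended when a rule should apply to an exact set of plugins.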