OpenTSDB uses HBase as its backend storage, allowing theoretically unlimited retention of high-precision metrics. The relevant opentsdb.conf settings look like this:
tsd.core.auto_create_metrics = true
tsd.storage.hbase.data_table = tsdb
tsd.storage.hbase.uid_table = tsdb-uid
Graphite uses Whisper files with fixed-size databases. A sample retention schema in storage-schemas.conf:
[servers]
pattern = ^servers\.
retentions = 10s:6h,1m:7d,10m:5y
OpenTSDB supports sub-second metric collection natively. A sample put command for 500ms resolution:
put sys.cpu.user 1356998400500 42 host=webserver01 cpu=0
Graphite's minimum interval is typically 1 second, though some forks have added sub-second support. The resolution itself comes from storage-schemas.conf retentions (an example follows the snippet below); carbon.conf only throttles how fast carbon-cache creates and updates Whisper files:
[carbon]
MAX_UPDATES_PER_SECOND = 1000
MAX_CREATES_PER_MINUTE = 1000
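For reference, a hypothetical high-resolution rule in storage-schemas.conf (the section name and pattern here are illustrative) would look like:
[high_res_latency]
pattern = ^app\.latency\.
retentions = 1s:1h,10s:1d,1m:30d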
OpenTSDB excels at high-cardinality metrics. Example query that fans out across many unique time series via tag wildcards:
{
  "start": "1h-ago",
  "queries": [
    {
      "metric": "app.requests",
      "aggregator": "sum",
      "tags": {
        "host": "*",
        "region": "us-west-*"
      }
    }
  ]
}
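To run it, POST that body to a TSD's /api/query endpoint; the host, port, and query.json filename below are placeholders for a local instance:
curl -X POST -H "Content-Type: application/json" -d @query.json http://localhost:4242/api/query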
Graphite's query language (the Render API) handles rollups efficiently:
/render?target=summarize(app.requests.count,"1hour","sum")&from=-7d
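When you want the rolled-up numbers rather than a rendered graph, the same endpoint can return JSON. A minimal sketch using Python's requests library, with graphite.example.com standing in for your Graphite web host:
import requests  # third-party HTTP client, assumed installed

# Ask the render API for JSON data points instead of a PNG
resp = requests.get(
    "http://graphite.example.com/render",
    params={
        "target": 'summarize(app.requests.count,"1hour","sum")',
        "from": "-7d",
        "format": "json",
    },
)
for series in resp.json():
    print(series["target"], "->", len(series["datapoints"]), "hourly points")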
Modern alternatives worth considering:
# Prometheus config example
scrape_configs:
  - job_name: 'node'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9100']
InfluxDB (which stores data in its TSM engine) write and query example via the influx CLI:
INSERT cpu,host=server01 value=0.64
SELECT mean("value") FROM "cpu" WHERE time > now() - 1h GROUP BY time(5m)
OpenTSDB requires HBase expertise for cluster management. Graphite's carbon-relay supports consistent hashing:
[relay]
RELAY_METHOD = consistent-hashing
DESTINATIONS = 127.0.0.1:2004:a, 127.0.0.1:2104:b
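Conceptually, consistent hashing places each destination at many points on a ring and routes each metric name to the next point clockwise, so adding or removing a destination only remaps a small fraction of metrics. A simplified Python sketch of the idea, not carbon-relay's exact implementation:
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring illustrating carbon-relay style routing."""

    def __init__(self, destinations, replicas=100):
        self.ring = []  # (position, destination) pairs, kept sorted
        for dest in destinations:
            for i in range(replicas):
                digest = hashlib.md5(f"{dest}:{i}".encode()).hexdigest()
                self.ring.append((int(digest[:4], 16), dest))
        self.ring.sort()

    def get_destination(self, metric_name):
        # Hash the metric name and walk clockwise to the nearest ring entry
        pos = int(hashlib.md5(metric_name.encode()).hexdigest()[:4], 16)
        idx = bisect.bisect_left(self.ring, (pos,)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["127.0.0.1:2004:a", "127.0.0.1:2104:b"])
print(ring.get_destination("servers.web01.cpu.user"))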
For deployments under 1 million metrics/day, Graphite is often simpler. Beyond that scale, OpenTSDB's distributed architecture shines.
The most fundamental architectural difference lies in how these systems handle data retention:
# Graphite's storage-schemas.conf example
[default_1min_for_1day]
pattern = .*
retentions = 60s:1d

# OpenTSDB stores raw points in HBase with no retention-based downsampling
tsd.storage.hbase.data_table = tsdb
tsd.storage.hbase.meta_table = tsdb-meta
Graphite employs fixed-size Whisper databases that require predefined retention policies, while OpenTSDB leverages HBase's scalable storage with no built-in downsampling or expiry. This means:
- Graphite: Storage size is predictable but requires upfront configuration (see the sizing sketch below)
- OpenTSDB: Storage scales dynamically with metrics volume
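That predictability is easy to quantify: Whisper stores a fixed number of points per archive at roughly 12 bytes each (plus a small header, ignored here), so per-metric disk usage is known before a single data point arrives. A back-of-envelope sketch in Python using the [servers] retention from earlier:
# Estimate Whisper file size for a retention string like "10s:6h,1m:7d,10m:5y".
# Each stored point is ~12 bytes (4-byte timestamp + 8-byte double); header ignored.
POINT_SIZE = 12
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 31536000}

def seconds(token):
    return int(token[:-1]) * UNITS[token[-1]]

def whisper_size(retentions):
    total_points = 0
    for archive in retentions.split(","):
        precision, duration = archive.split(":")
        total_points += seconds(duration) // seconds(precision)
    return total_points * POINT_SIZE

# Roughly 3 MB per metric, regardless of how many points are actually written
print(whisper_size("10s:6h,1m:7d,10m:5y") / 1024 / 1024, "MB")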
Regarding time resolution capabilities:
# Graphite: the interval itself comes from storage-schemas.conf retentions (1 second is the practical floor);
# carbon.conf merely rate-limits Whisper creates and updates:
MAX_CREATES_PER_MINUTE = 10000
# OpenTSDB: millisecond precision is native; typical opentsdb.conf settings:
tsd.core.auto_create_metrics = true
tsd.storage.fix_duplicates = true
While Graphite can technically handle 1s resolution, practical deployments often use 10s-60s intervals due to performance considerations. OpenTSDB natively handles millisecond precision out of the box.
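To make the millisecond point concrete, a data point can be written over the HTTP API with a 13-digit timestamp; localhost and the default 4242 port are assumptions for a local TSD:
curl -X POST -H "Content-Type: application/json" http://localhost:4242/api/put -d '{
  "metric": "sys.cpu.user",
  "timestamp": 1356998400500,
  "value": 42,
  "tags": {"host": "webserver01", "cpu": "0"}
}'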
The query paradigms differ significantly:
# Graphite's function chaining
aliasByNode(summarize(production.server*.cpu.load, "1h", "avg"), 2)
# OpenTSDB's metric+tag queries
{
  "start": 1672531200,
  "queries": [{
    "metric": "sys.cpu.nice",
    "tags": {
      "host": "web*",
      "dc": "lga"
    },
    "aggregator": "avg",
    "downsample": "1h-avg"
  }]
}
The two systems also scale in different ways:
- Graphite: Relies on carbon-relay + carbon-aggregator patterns
  # aggregation-rules.conf: roll per-core series into one every 60 seconds
  cpu.core.all (60) = avg cpu.core.*
- OpenTSDB: Leverages HBase regions for horizontal scaling
  tsd.storage.enable_compactions = true
  tsd.storage.flush_interval = 1000
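If a single hot metric overwhelms one HBase region, OpenTSDB 2.2+ can also salt row keys to spread writes across regions; note that these values must be fixed before any data is written (the values below are illustrative):
tsd.storage.salt.width = 1
tsd.storage.salt.buckets = 20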
For modern deployments, consider these alternatives:
# Prometheus configuration example
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
# InfluxDB line protocol example
cpu,host=server01 value=0.64 1434055562000000000
Other notable systems include VictoriaMetrics, TimescaleDB, and M3DB, each with distinct tradeoffs in consistency models and query capabilities.