When working with AWS RDS PostgreSQL on gp3 volumes, many developers assume the 125 MiB/s baseline throughput applies uniformly. However, the reality involves a burst bucket mechanism that can significantly impact performance:
```bash
# Sample CloudWatch query for EBS metrics
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name EBSByteBalance% \
    --dimensions Name=DBInstanceIdentifier,Value=your-db-instance \
    --start-time $(date -d "3 days ago" +%Y-%m-%dT%H:%M:%S) \
    --end-time $(date -d "now" +%Y-%m-%dT%H:%M:%S) \
    --period 3600 \
    --statistics Average
```
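If you save the output as JSON (add `--output json` if your CLI default differs; `balance.json` below is just an example name), jq can surface the worst dips at a glance:

```bash
# List the five lowest hourly averages, i.e. the periods closest to credit exhaustion
# (assumes jq is installed and the query output was redirected to balance.json)
jq -r '.Datapoints | sort_by(.Average) | .[:5][] | "\(.Timestamp)  \(.Average)%"' balance.json
```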
gp3 volumes operate with a token bucket algorithm (a back-of-the-envelope drain estimate follows the list):
- Initial burst balance: 1,024,000,000 credits (equal to 3,000 MB/s for 5 minutes)
- Accumulation rate: 3,000 MB/s per TB of volume size (600 MB/s for 200GB)
- Minimum baseline: 125 MB/s regardless of volume size
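To get a feel for how quickly the bucket can empty under sustained load, here is a minimal arithmetic sketch. All three inputs are assumptions, not measurements: the bucket size just restates the "3,000 MB/s for 5 minutes" figure above, and the workload and refill rates are placeholders you should replace with your own numbers.

```bash
#!/usr/bin/env bash
# Rough drain-time estimate for a throughput credit bucket (all values are assumptions)
BUCKET_MB=900000       # assumed full bucket: 3,000 MB/s sustained for 300 s
WORKLOAD_MBPS=200      # placeholder: sustained throughput your workload pushes
REFILL_MBPS=125        # placeholder: rate at which credits accrue for your volume

NET_DRAIN=$(( WORKLOAD_MBPS - REFILL_MBPS ))
if (( NET_DRAIN <= 0 )); then
    echo "Workload is at or below the refill rate; the bucket never empties."
else
    echo "Bucket empties after roughly $(( BUCKET_MB / NET_DRAIN / 60 )) minutes of sustained load."
fi
```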
When you see consistent EBSByteBalance% depletion, start by identifying the heaviest I/O consumers, then apply one of the fixes below:
```sql
-- PostgreSQL query to identify high-I/O operations
-- (in PostgreSQL 13+ the column is total_exec_time rather than total_time)
SELECT query, calls, total_time, rows,
       shared_blks_hit, shared_blks_read
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;
```
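Note that this query only returns rows when the pg_stat_statements extension is active. On RDS that means having it in `shared_preload_libraries` (a static parameter, so a reboot is needed if you have to change it) and creating the extension in the target database. A minimal sketch; the parameter-group name, endpoint, and database name are placeholders:

```bash
# Enable pg_stat_statements (parameter-group name, host, and database are examples)
aws rds modify-db-parameter-group \
    --db-parameter-group-name your-postgres-params \
    --parameters "ParameterName=shared_preload_libraries,ParameterValue=pg_stat_statements,ApplyMethod=pending-reboot"

psql -h your-db-instance.abc123.us-east-1.rds.amazonaws.com -U postgres -d yourdb \
    -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"
```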
- Volume Scaling: Increase allocated storage to 400 GB or more for a higher gp3 baseline (12,000 IOPS / 500 MiB/s on RDS); check your current configuration first, as shown below
- Provisioned IOPS: Use the `--iops` parameter during an instance modification
- Workload Distribution: Implement read replicas for analytic queries
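Before changing anything, it is worth confirming what the instance currently has provisioned. A quick check (field names assume a reasonably recent AWS CLI; StorageThroughput is only reported for gp3 volumes):

```bash
# Show current storage type, size, IOPS, and gp3 throughput for the instance
aws rds describe-db-instances \
    --db-instance-identifier your-db-instance \
    --query 'DBInstances[0].{Type:StorageType,SizeGB:AllocatedStorage,IOPS:Iops,ThroughputMiBps:StorageThroughput}' \
    --output table
```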
Create a CloudWatch alarm on the credit balance:
```bash
aws cloudwatch put-metric-alarm \
    --alarm-name "RDS-EBS-Credit-Low" \
    --metric-name EBSByteBalance% \
    --namespace AWS/RDS \
    --statistic Average \
    --dimensions Name=DBInstanceIdentifier,Value=your-db-instance \
    --threshold 20 \
    --comparison-operator LessThanThreshold \
    --evaluation-periods 3 \
    --period 300 \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:MyAlarmNotification
```
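The `--alarm-actions` ARN is a placeholder; if the topic does not exist yet, a minimal SNS setup looks like this (the email address is an example):

```bash
# Create the topic referenced by the alarm and subscribe an address to it
aws sns create-topic --name MyAlarmNotification
aws sns subscribe \
    --topic-arn arn:aws:sns:us-east-1:123456789012:MyAlarmNotification \
    --protocol email \
    --notification-endpoint ops@example.com
```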
When working with AWS RDS gp3 volumes, it's crucial to understand how the burst credit system operates. While gp3 volumes below 400GB do provide a baseline throughput of 125MiB/s, this is only available when you have sufficient burst credits in your EBSByteBalance%.
```bash
# Sample CloudWatch query to check credit balance
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name EBSByteBalance% \
    --dimensions Name=DBInstanceIdentifier,Value=your-db-instance \
    --start-time $(date -u +"%Y-%m-%dT%H:%M:%SZ" --date="-3 days") \
    --end-time $(date -u +"%Y-%m-%dT%H:%M:%SZ") \
    --period 3600 \
    --statistics Average \
    --output json
```
The key misunderstanding here is that the 125MiB/s baseline isn't free: it still consumes credits when used. The baseline is simply the maximum throughput you can achieve while consuming credits at the standard rate. Your instance appears to be:
- Consistently operating at 5-7 MiB/s (above the rate at which credits are replenished)
- Experiencing spikes that rapidly deplete credits
- Not getting sufficient idle time to replenish credits
Here are three approaches to stabilize your RDS performance:
```bash
# Option 1: Increase volume size to cross the 400GB threshold for a higher baseline
aws rds modify-db-instance \
    --db-instance-identifier your-db-instance \
    --allocated-storage 400 \
    --apply-immediately

# Option 2: Provision additional IOPS (costs extra; 6000 is an example value)
aws rds modify-db-instance \
    --db-instance-identifier your-db-instance \
    --iops 6000 \
    --apply-immediately

# Option 3: Implement a read replica for load distribution
# (use the source instance's ARN and --source-region for a cross-region replica)
aws rds create-db-instance-read-replica \
    --db-instance-identifier replica-instance \
    --source-db-instance-identifier your-db-instance
```
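Whichever option you pick, storage modifications are applied online but can leave the volume optimizing for a while, so verify the change has actually taken effect before judging the results:

```bash
# Check instance status and any modifications still pending
aws rds describe-db-instances \
    --db-instance-identifier your-db-instance \
    --query 'DBInstances[0].{Status:DBInstanceStatus,Pending:PendingModifiedValues}'
```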
Set up proactive monitoring to prevent future incidents:
```bash
# CloudWatch alarm for credit balance
aws cloudwatch put-metric-alarm \
    --alarm-name RDS-Credit-Balance-Low \
    --alarm-description "EBS burst credits below 20%" \
    --metric-name EBSByteBalance% \
    --namespace AWS/RDS \
    --statistic Average \
    --period 300 \
    --threshold 20 \
    --comparison-operator LessThanThreshold \
    --dimensions Name=DBInstanceIdentifier,Value=your-db-instance \
    --evaluation-periods 2 \
    --alarm-actions arn:aws:sns:us-west-2:123456789012:MyTopic
```
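After creating it, a quick sanity check confirms the alarm exists and shows whether it already has data to evaluate:

```bash
# Verify the alarm and inspect its current state
aws cloudwatch describe-alarms \
    --alarm-names RDS-Credit-Balance-Low \
    --query 'MetricAlarms[0].{State:StateValue,Reason:StateReason}'
```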
Beyond storage configuration, consider these database-level optimizations:
- Review slow queries with `pg_stat_statements`
- Adjust `work_mem` and `maintenance_work_mem` parameters (a parameter-group sketch follows below)
- Implement connection pooling to reduce overhead
- Schedule heavy operations during off-peak hours
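For the memory parameters, changes on RDS go through the DB parameter group rather than postgresql.conf. A minimal sketch; the group name is a placeholder and the values (64 MB and 512 MB, expressed in kB) are illustrative, not recommendations:

```bash
# Raise work_mem and maintenance_work_mem via the parameter group
# (values are in kB: 65536 kB = 64 MB, 524288 kB = 512 MB; tune for your workload and RAM)
aws rds modify-db-parameter-group \
    --db-parameter-group-name your-postgres-params \
    --parameters "ParameterName=work_mem,ParameterValue=65536,ApplyMethod=immediate" \
                 "ParameterName=maintenance_work_mem,ParameterValue=524288,ApplyMethod=immediate"
```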