Capacity planning requires analyzing three key metrics:
- Concurrent Users: Peak simultaneous active connections
- Request Rate: Transactions per second (TPS) or requests per minute (RPM)
- Resource Utilization: CPU, memory, disk I/O, and network throughput
For a typical LAMP stack application, you can simulate realistic load with Locust:

```python
# Locust load test: simulates users browsing and posting
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    # Each simulated user waits 1-5 seconds between tasks
    wait_time = between(1, 5)

    @task(3)  # weighted 3:1 in favor of reads
    def browse_posts(self):
        self.client.get("/posts")

    @task(1)
    def create_post(self):
        self.client.post("/create", json={"title": "Test", "content": "..."})
```
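Run it with `locust -f locustfile.py --host https://your-app.example.com` (the host is a placeholder) and ramp users up in the web UI, or headless with `--users` and `--spawn-rate`, until response times degrade; the knee of that curve is your practical capacity.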
A first-order throughput estimate:

```
Max Users = (Available RAM - OS Overhead) / (Memory per User Process)

Example: 16 GB server, 2 GB OS overhead, 50 MB per process:
(16 - 2) * 1024 / 50 ≈ 286 concurrent users
```
For MySQL performance tuning:

```ini
# MySQL configuration for high concurrency
[mysqld]
innodb_buffer_pool_size = 12G   # 70-80% of total RAM on a dedicated DB host
innodb_io_capacity      = 2000  # raise for SSD-backed storage
max_connections         = 500   # size to your connection pool, not raw user count
thread_cache_size       = 100
table_open_cache        = 4000
# Note: query_cache_size was removed in MySQL 8.0; set it to 0 only on 5.7 and earlier
```
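Whether `innodb_buffer_pool_size` is large enough shows up in the buffer pool hit ratio. A minimal sketch for checking it, assuming `mysql-connector-python` is installed and using hypothetical credentials:

```python
# Check the InnoDB buffer pool hit ratio; values below ~0.99 under load
# usually mean the buffer pool is too small for the working set.
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(
    host="localhost", user="monitor", password="secret"  # hypothetical credentials
)
cur = conn.cursor()
cur.execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'")
status = dict(cur.fetchall())

disk_reads = int(status["Innodb_buffer_pool_reads"])            # reads that hit disk
requests_ = int(status["Innodb_buffer_pool_read_requests"])     # logical reads
print(f"Buffer pool hit ratio: {1 - disk_reads / requests_:.4f}")
conn.close()
```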
Bandwidth requirements vary widely by application type:

| Application Type | Per-User Bandwidth | Aggregate @ 10,000 Users |
|---|---|---|
| Text-heavy website | 50 KB/page (~1 page view/user/hour) | ~500 MB/hour peak |
| Media streaming | 2 Mbps sustained | 20 Gbps sustained |
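The streaming row is simple arithmetic; a quick sketch to verify the aggregate figure:

```python
# Aggregate bandwidth for N concurrent streaming users
def aggregate_gbps(per_user_mbps: float, users: int) -> float:
    return per_user_mbps * users / 1000  # Mbps -> Gbps

print(aggregate_gbps(2, 10_000))  # 20.0 Gbps, matching the table
```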
Auto-scaling configuration for AWS:

```hcl
# Terraform: auto-scaling group for the web tier
resource "aws_autoscaling_group" "web" {
  desired_capacity  = 4
  max_size          = 20
  min_size          = 2
  target_group_arns = [aws_lb_target_group.web.arn]
  # vpc_zone_identifier (subnets) omitted for brevity

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}

# Scaling policies are separate resources, not blocks inside the ASG
resource "aws_autoscaling_policy" "scale_out" {
  name                   = "web-scale-out"
  autoscaling_group_name = aws_autoscaling_group.web.name
  policy_type            = "SimpleScaling"  # metric_aggregation_type applies only to step scaling
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = 2
  cooldown               = 300
}
```
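A simple scaling policy like this only acts when a CloudWatch alarm fires, so in practice you would pair it with an `aws_cloudwatch_metric_alarm` whose `alarm_actions` reference the policy's ARN.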
Essential Prometheus queries for capacity analysis:

```promql
# CPU saturation (percent busy across all cores)
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory pressure (percent of RAM in use)
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100

# Disk I/O throughput (bytes/s, read + write)
rate(node_disk_read_bytes_total[5m]) + rate(node_disk_written_bytes_total[5m])
```
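These queries can also be pulled programmatically for trend analysis through Prometheus's HTTP API. A minimal sketch, assuming Prometheus is reachable at localhost:9090:

```python
# Query Prometheus's HTTP API for current CPU saturation per instance
import requests

PROM = "http://localhost:9090"  # assumed Prometheus address
query = '100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'

resp = requests.get(f"{PROM}/api/v1/query", params={"query": query})
resp.raise_for_status()
for result in resp.json()["data"]["result"]:
    instance = result["metric"]["instance"]
    _, value = result["value"]  # [timestamp, value-as-string]
    print(f"{instance}: {float(value):.1f}% CPU busy")
```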
Beyond the raw numbers, capacity planning involves tradeoffs between:
- Concurrent user load vs. system resources
- Peak traffic patterns vs. baseline requirements
- Hardware costs vs. performance SLAs
Before sizing servers, establish these baseline measurements:

```python
# Sample Python script to collect basic host metrics with psutil
import time

import psutil

def collect_metrics(interval=5):
    while True:
        cpu = psutil.cpu_percent(interval=1)   # percent over a 1-second window
        mem = psutil.virtual_memory().percent
        disk = psutil.disk_io_counters()       # cumulative since boot
        net = psutil.net_io_counters()         # cumulative since boot
        print(f"CPU: {cpu}% | MEM: {mem}%")
        print(f"Disk R/W: {disk.read_bytes / 1024:.0f}KB / {disk.write_bytes / 1024:.0f}KB")
        print(f"Network TX/RX: {net.bytes_sent / 1024:.0f}KB / {net.bytes_recv / 1024:.0f}KB")
        # Disk and network counters are cumulative; diff successive samples for rates
        time.sleep(interval)

if __name__ == "__main__":
    collect_metrics()
```
A theoretical maximum for concurrent users, bounded by whichever resource is exhausted first:

```
Max Users ≈ min( RAM / Memory per User,
                 CPU Cores / CPU Utilization per User )
            / Safety Factor (2-5)
```
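As a worked example, here is a small sketch of that calculation with illustrative numbers (16 GB RAM, 50 MB per user, 8 cores, 2% of a core per user, safety factor 3):

```python
# Theoretical max concurrent users: the binding constraint wins
def max_users(ram_mb, mem_per_user_mb, cores, cpu_per_user, safety_factor=3):
    ram_bound = ram_mb / mem_per_user_mb
    cpu_bound = cores / cpu_per_user
    return min(ram_bound, cpu_bound) / safety_factor

# RAM-bound here: 16384/50 ≈ 328 beats 8/0.02 = 400; 328/3 ≈ 109
print(int(max_users(16 * 1024, 50, 8, 0.02)))  # 109
```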
The same auto-scaling group can also be defined through the AWS CLI; here the scaling policy uses target tracking on average CPU rather than fixed-step adjustments. `create-auto-scaling-group` takes the group definition and `put-scaling-policy` attaches the policy:

```json
{
  "AutoScalingGroupName": "web-tier-asg",
  "LaunchTemplate": {
    "LaunchTemplateName": "web-server-template",
    "Version": "$Latest"
  },
  "MinSize": 2,
  "MaxSize": 10,
  "DesiredCapacity": 4
}
```

```json
{
  "AutoScalingGroupName": "web-tier-asg",
  "PolicyName": "cpu-target-tracking",
  "PolicyType": "TargetTrackingScaling",
  "TargetTrackingConfiguration": {
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 60.0
  }
}
```
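Saved as JSON files, these feed directly into the CLI, e.g. `aws autoscaling create-auto-scaling-group --cli-input-json file://asg.json` followed by `aws autoscaling put-scaling-policy --cli-input-json file://policy.json` (file names are illustrative).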
Indicative capacity by instance size and response-time target:

| Server Type | Users @ 1s Response | Users @ 3s Response |
|---|---|---|
| 2 vCPU / 4 GB | 1,200 | 3,800 |
| 4 vCPU / 8 GB | 2,500 | 7,500 |
Prometheus configuration snippets for alerting on capacity thresholds:

```yaml
# prometheus.yml
rule_files:
  - 'alert.rules'

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'localhost:9093'
```

```yaml
# alert.rules
groups:
  - name: capacity-alerts
    rules:
      - alert: HighCPU
        expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 10m
```
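Validate the rule file with `promtool check rules alert.rules` before deploying, then reload Prometheus (SIGHUP, or a POST to `/-/reload` when `--web.enable-lifecycle` is set) to pick up the changes.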