In server cluster architecture, both keepalive and heartbeat serve as health-check mechanisms but operate at different layers and with distinct purposes:
// Keepalive example (TCP level): enable SO_KEEPALIVE on a connected socket
int enableKeepalive = 1;
if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &enableKeepalive, sizeof(enableKeepalive)) < 0) {
    perror("setsockopt(SO_KEEPALIVE)");
}
// Heartbeat example (application level): periodically announce liveness to a peer
void send_heartbeat(int peer_sock) {
    while (running) {
        if (send(peer_sock, "HEARTBEAT", 9, 0) < 0) {
            break;  /* peer unreachable; let the failure detector take over */
        }
        sleep(HEARTBEAT_INTERVAL);
    }
}
Keepalive operates at the transport layer (TCP):
- OS-level implementation, tunable per socket (see the sketch below)
- Detects physical connection failures
- Minimal network overhead
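On Linux, the kernel defaults can also be overridden per socket with the TCP_KEEP* options. A minimal sketch, assuming sock is an already-created TCP socket:
#include <netinet/tcp.h>  /* TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT (Linux-specific) */

int idle = 60, intvl = 10, cnt = 3;
setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));    /* first probe after 60 s idle */
setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)); /* re-probe every 10 s */
setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt));       /* declare dead after 3 missed probes */
With these values, worst-case detection drops from hours to roughly 60 + 3 × 10 = 90 seconds.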
Heartbeat works at the application layer:
- Customizable message format
- Detects application-level failures (a process that is alive but hung)
- Supports complex failure detection logic (see the detector sketch below)
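The receiving side usually runs a failure detector that tracks when the last heartbeat arrived. A minimal sketch, where last_heartbeat_time() and mark_node_down() are hypothetical helpers:
#include <time.h>

/* Declare a peer dead after HEARTBEAT_TIMEOUT seconds of silence. */
void monitor_peer(int node_id) {
    while (running) {
        if (time(NULL) - last_heartbeat_time(node_id) > HEARTBEAT_TIMEOUT) {
            mark_node_down(node_id);  /* hypothetical: triggers failover */
            break;
        }
        sleep(1);
    }
}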
For Linux HA clusters using keepalived:
vrrp_script chk_nginx {
    script "pidof nginx"   # health check: exits non-zero when nginx is not running
    interval 2             # run every 2 seconds
    weight 50              # add 50 to the node's priority while the check succeeds
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1           # VRRP advertisement interval in seconds
    authentication {
        auth_type PASS
        auth_pass 12345
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        chk_nginx
    }
}
With weight 50, a node running nginx advertises priority 150; if nginx dies on the MASTER, its effective priority falls back to 100 and a BACKUP node with a higher effective priority claims the virtual IP.
| Metric | Keepalive | Heartbeat |
|---|---|---|
| Detection Time | ~2 hours (Linux default) | Seconds |
| Configurability | Limited (kernel/socket options) | Fully customizable |
| Network Overhead | Minimal | Depends on implementation |
Many production systems combine both approaches:
// Combined TCP keepalive + application heartbeat
void connection_watchdog(int sock) {
    configure_tcp_keepalive(sock);  /* per-socket TCP_KEEP* tuning, as above */
    start_heartbeat_thread(sock);   /* sends periodic application-level HEARTBEATs */
    while (active) {
        if (!check_tcp_state(sock) || !received_heartbeat()) {
            initiate_failover();
            break;
        }
        sleep(CHECK_INTERVAL);      /* avoid busy-waiting between checks */
    }
}
The two checks fail independently: keepalive catches dead hosts and broken links, while the heartbeat catches processes that are up but no longer making progress.
Key metrics to monitor when implementing either solution:
- Network bandwidth consumption
- CPU utilization during failure detection
- Failover time consistency
- False positive rates (see the instrumentation sketch below)
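The last two are easiest to track by timestamping state transitions. A hypothetical sketch, assuming a test harness that kills a node calls on_failure_injected():
#include <stdio.h>
#include <time.h>

static struct timespec t_failure;  /* set when a failure is injected during testing */

void on_failure_injected(void) {
    clock_gettime(CLOCK_MONOTONIC, &t_failure);
}

/* Called when the detector fires; peer_still_alive marks a false positive. */
void on_failover_triggered(int peer_still_alive) {
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    double latency = (now.tv_sec - t_failure.tv_sec)
                   + (now.tv_nsec - t_failure.tv_nsec) / 1e9;
    printf("detection latency: %.3f s%s\n", latency,
           peer_still_alive ? " (false positive)" : "");
}
Collecting these latencies over many induced failures exposes both failover time consistency and the false positive rate.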
The slow default detection quoted in the table comes straight from the Linux kernel's TCP keepalive settings:
# Default TCP keepalive timing on Linux (sysctl):
# wait 7200 s idle before the first probe, then send up to 9 probes 75 s apart
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
With these defaults, a silently dead peer goes unnoticed for 7200 + 9 × 75 = 7875 seconds (about 2.2 hours), which is why production clusters either lower these values or layer an application heartbeat on top.
For web server clusters:
# HAProxy keepalive configuration example
backend webservers
mode http
option httpchk
server web1 10.0.0.1:80 check inter 2000 rise 2 fall 3
server web2 10.0.0.2:80 check backup
Heartbeat typically consumes more network and CPU resources than keepalive, but in exchange it detects failures in seconds rather than hours and leaves message format, timing, and failure logic under application control. A minimal implementation pairs each heartbeat with an acknowledgement:
// Pseudocode for a basic heartbeat algorithm
while (true) {
    send_heartbeat();
    if (!receive_ack_within(timeout)) {
        trigger_failover();
        break;
    }
    sleep(interval);
}
Combining both techniques in Kubernetes:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    livenessProbe:        # application-level check, heartbeat-style
      httpGet:
        path: /           # stock nginx serves no /healthz; probe the root instead
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 3
    readinessProbe:       # transport-level check, keepalive-style
      tcpSocket:
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
When the livenessProbe fails, the kubelet restarts the container; when the readinessProbe fails, the pod is removed from Service endpoints until it recovers.