HAProxy Connection Timing Diagram: Complete Breakdown of Tq, Tw, Tc, Tr, and Tt Metrics


When debugging performance issues in HAProxy, the timing metrics provide crucial visibility into where latency occurs. Let's examine each metric in the context of a full TCP connection lifecycle:


    Client                      HAProxy                      Server
       |                           |                           |
       |-- SYN ------------------->|                           |
       |<-- SYN-ACK ---------------|                           |
       |-- ACK ------------------->|                           |
       |-- HTTP REQUEST ---------->| (Tq: time to receive the complete request)
       |                           | (Tw: time spent queued, if any)
       |                           |-- SYN ------------------->|
       |                           |<-- SYN-ACK ---------------| (Tc: backend connect time)
       |                           |-- ACK ------------------->|
       |                           |-- REQUEST --------------->|
       |                           |                           | (server processing)
       |                           |<-- RESPONSE --------------| (Tr: request sent to complete response headers)
       |<-- RESPONSE --------------|                           |
       |-- FIN ------------------->|                           |
       |                           |-- FIN ------------------->|
       |                           |<-- ACK -------------------|
       |<-- ACK -------------------|                           |

To record these timers for every request, use a log format that includes them all:


# Sample log format showing all timers
log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta/%Tt %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"

Consider this real-world timing sequence:


1. Tq: 50ms (Client uploads a 10KB JSON payload)
2. Tw: 0ms (No queueing)
3. Tc: 150ms (New TLS handshake with backend)
4. Tr: 300ms (Server processing time)
5. Tt: 550ms (Total session duration; the 50ms beyond Tq+Tw+Tc+Tr is spent transferring the response back to the client)
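
With the standard HTTP log output, that sequence lands directly in the timer field of the log line. A hypothetical entry (addresses, names, and the request path are invented) carrying exactly these timers:


192.168.1.50:40312 [12/May/2023:15:40:02.317] web~ web/srv1 50/0/150/300/550 200 2350 - - ---- 1/1/0/0/0 0/0 "POST /api/upload HTTP/1.1"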

Key patterns to watch for:

  • High Tw indicates saturation: servers have hit their maxconn limit and requests are queueing
  • Spiking Tc suggests network or DNS issues
  • Long Tr points to slow application processing on the backend
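
To spot these patterns in an existing log file, you can mine the timer field directly. A rough sketch (assumes a standard syslog prefix so the Tq/Tw/Tc/Tr/Tt group is field 10, and /var/log/haproxy.log as the file; adjust both for your setup):


# Print the 20 requests with the highest total time (Tt, the 5th timer)
awk '{ split($10, t, "/"); print t[5], $0 }' /var/log/haproxy.log | sort -rn | head -20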

Optimize timeouts based on your timing patterns:


defaults
    timeout connect 5s        # caps Tc (time to connect to a backend server)
    timeout queue 30s         # caps Tw (time a request may wait for a server slot)
    timeout client 60s        # client-side inactivity timeout
    timeout server 60s        # server-side inactivity timeout (effectively bounds Tr)
    timeout http-request 5s   # caps Tq (time to receive the complete request)
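
After changing timeouts, validate the configuration before reloading; the standard check (the config path is an assumption):


haproxy -c -f /etc/haproxy/haproxy.cfg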

So far we have looked at the timers at a glance. Now let's break down each timing component in more detail, see how to read them in real log entries, and cover the configuration knobs that influence each one.

Here are the core timing metrics you'll encounter in HAProxy logs and stats:

# Common log format with timings
log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"

Let's examine Tq, Tw, and Tc in more detail (Tr and Tt show up again in the log analysis below):

Tq: Request Receiving Time

This measures how long HAProxy takes to receive the complete client request (the %Tq timer; the %TR variant used in the log format above counts from the first byte of the request). A slow Tq might indicate:

  • Client network issues
  • Large request bodies
  • Slow client processing
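
The main guardrail against slow clients is timeout http-request, which caps how long HAProxy waits for the complete request; clients that exceed it typically receive a 408. A minimal sketch (the frontend name is illustrative; web_servers is the backend configured later in this article):

# Bound Tq at the frontend
frontend fe_web
    bind :80
    timeout http-request 10s   # abort requests that take longer than 10s to arrive in full
    default_backend web_servers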

Tw: Queue Waiting Time

Time the request spent waiting in a server or backend queue before a connection slot became available. A high Tw usually means your servers have reached their maxconn limit. You can confirm this from the stats socket:

# Check per-proxy/per-server queue counters
echo "show stat" | socat stdio /var/run/haproxy.sock | cut -d, -f1-4
# Columns: pxname, svname, qcur (current queue size), qmax (max queue size reached)

Tc: Connection Establishment Time

Time to establish TCP connection with backend servers. High Tc could mean:

  • DNS resolution delays
  • Network latency to backends
  • Server connection backlog
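
To separate plain network latency from a server that is slow to accept connections, time a raw TCP connect from the HAProxy host; a quick check (10.0.0.1:80 is the srv1 address used in the backend example later in this article):

# time_connect covers only the TCP handshake; time_total includes the full HTTP exchange
curl -o /dev/null -s -w 'tcp_connect: %{time_connect}s  total: %{time_total}s\n' http://10.0.0.1:80/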

Here's a compact view of the same flow, without the timer annotations:

Client             HAProxy              Server
  |------SYN--------->|                    |
  |<-----SYN/ACK------|                    |
  |------ACK--------->|                    |
  |----HTTP Req------>|                    |
  |                   |-------SYN--------->|
  |                   |<-----SYN/ACK-------|
  |                   |-------ACK--------->|
  |                   |-------Req--------->|
  |                   |<------Res----------|
  |<---Response-------|                    |
  |------FIN--------->|                    |
  |<-----FIN----------|                    |

Let's look at a real-world scenario and how to interpret the timings:

Case 1: Slow Backend Response

# Sample log entry showing high Tr
192.168.1.100:54321 [12/May/2023:15:32:45.123] web~ web/srv1 10/0/2/4500/4512 200 1450 - - ---- 1/1/0/0/0 0/0 "GET /api/data HTTP/1.1"
# Breakdown:
# Tq: 10ms (request receive time)
# Tw: 0ms (no queue wait)
# Tc: 2ms (fast connection)
# Tr: 4500ms (slow backend response)
# Tt: 4512ms (total time)
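
To see how often this happens, filter on the Tr position of the timer field. A sketch against lines in the stripped-down form above, where the timers are field 5 (with a syslog prefix the field number shifts; the file name is illustrative):

# Print requests whose server response time (Tr, the 4th timer) exceeded 1000ms
awk '{ split($5, t, "/"); if (t[4] + 0 > 1000) print }' haproxy-access.log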

Here are some configuration tweaks to improve timing metrics:

# Keep Tw bounded: raise connection limits and fail fast instead of queueing indefinitely
defaults
    maxconn 5000
    timeout queue 5s      # requests queued longer than 5s are rejected with a 503

# Improve Tc by reusing idle server-side connections instead of opening a new one per request
backend web_servers
    server srv1 10.0.0.1:80 check maxconn 100
    server srv2 10.0.0.2:80 check maxconn 100
    http-reuse safe
    timeout connect 2s

Set up monitoring for these critical timing metrics:

# Prometheus: average backend queue time above 100ms
# (haproxy_backend_queue_time_average_seconds comes from HAProxy's built-in
#  Prometheus exporter and averages qtime over the last 1024 requests)
haproxy_backend_queue_time_average_seconds > 0.1

# Grafana dashboard example
- Panel: HAProxy Timing Breakdown
  - Tq: histogram of request receive times
  - Tw: graph of queue wait times (with current queue depth as a gauge)
  - Tc: line graph of connection times
  - Tr: heatmap of backend response times
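
For alerting rather than dashboards, the same expression can be wrapped in a Prometheus rule. A sketch (rule name, threshold, and the proxy label are assumptions based on HAProxy's built-in exporter):

# prometheus-rules.yml (hypothetical file)
groups:
  - name: haproxy-timing
    rules:
      - alert: HAProxyHighQueueTime
        expr: haproxy_backend_queue_time_average_seconds > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Backend {{ $labels.proxy }} average queue time above 100ms"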