Implementing Health Checks for ECS Tasks Without ELB: Zero-Downtime Deployment for Spring Boot Containers

When deploying Spring Boot applications on ECS without Elastic Load Balancing, implementing proper health checks becomes crucial for achieving zero-downtime deployments. Many developers encounter unexpected behavior when the health check status remains stuck in "UNKNOWN" state despite seemingly correct configuration.

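Testing with placeholder commands such as exit 0 and exit 1 is a common first step, but it says nothing about whether the application is serving requests. A container health check is simply a command run inside the container: exit code 0 means healthy, any other exit code means unhealthy. For a meaningful result, the command should probe a real endpoint. An "UNKNOWN" status means ECS has no health result for the container, which typically indicates one of three scenarios:

1. The health check has not produced a result yet, for example while the first checks are still running
2. The task definition revision that is actually running has no healthCheck on the container
3. The ECS container agent on the instance is too old to report container health (version 1.17.0 or later is required)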

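For Spring Boot applications, use the Actuator health endpoint, which becomes available once spring-boot-starter-actuator is added to the dependencies. The following health check configuration for the task definition assumes the application listens on port 8080 and that curl is installed in the image: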

"healthCheck": {
    "command": [
        "CMD-SHELL",
        "curl -f http://localhost:8080/actuator/health || exit 1"
    ],
    "interval": 30,
    "retries": 3,
    "startPeriod": 60,
    "timeout": 5
}

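interval: 30 seconds between checks
retries: 3 consecutive failures mark the container as unhealthy
startPeriod: 60-second grace period for app initialization, during which failed checks don't count toward retries (adjust based on app startup time)
timeout: a check that runs longer than 5 seconds counts as a failure, which prevents hanging checks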
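To make the interaction between these four values concrete, here is a minimal Python sketch that checks them against the ranges documented for ECS and estimates how long a container that stops responding could take to be marked unhealthy. The function names and the estimate formula are assumptions for illustration, not part of any AWS tooling, and the estimate is only a rough upper bound.

```python
# Ranges documented for the ECS healthCheck fields (seconds, except retries).
LIMITS = {
    "interval": (5, 300),
    "timeout": (2, 60),
    "retries": (1, 10),
    "startPeriod": (0, 300),
}


def validate(health_check):
    """Return a list of problems; an empty list means every field present is in range."""
    problems = []
    for field, (low, high) in LIMITS.items():
        if field in health_check and not low <= health_check[field] <= high:
            problems.append(f"{field}={health_check[field]} is outside {low}-{high}")
    return problems


def worst_case_detection_seconds(health_check):
    """Rough estimate: each failing check may wait a full interval and then
    run until it times out, and it takes `retries` consecutive failures."""
    per_check = health_check["interval"] + health_check["timeout"]
    return health_check["retries"] * per_check


check = {"interval": 30, "retries": 3, "startPeriod": 60, "timeout": 5}
print(validate(check))                      # → []
print(worst_case_detection_seconds(check))  # → 105
```

With the values above, an application that hangs after startup could go undetected for well over a minute, which is worth keeping in mind when choosing interval and retries.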

If health checks still don't work:

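Verify the Actuator health endpoint is exposed on the expected port and path (check application.properties)
Test the health endpoint manually inside the container, using the same command as the health check
Check ECS agent logs for health check execution errors
Confirm that curl is installed in the image and that the application listens on localhost:8080, since the check runs inside the container itself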

For custom health check logic, you might use:

"healthCheck": {
    "command": [
        "CMD-SHELL",
        "if [ $(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/health) -eq 200 ]; then exit 0; else exit 1; fi"
    ],
    "interval": 20,
    "retries": 5,
    "startPeriod": 90,
    "timeout": 3
}
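The same kind of custom logic can be prototyped outside the task definition before it is written as a shell one-liner. The sketch below is a stdlib-only Python version that treats the endpoint as healthy only when it returns HTTP 200 and a JSON body whose status field is "UP", which is the shape Actuator's health endpoint normally returns. The function names are hypothetical, and it is meant for probing from a workstation or test harness; it would only work as the container's own check if the image happened to include Python.

```python
import http.client
import json
import urllib.error
import urllib.request


def is_healthy(status_code, body):
    """Healthy means HTTP 200 plus a JSON object with "status": "UP"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


def probe(url, timeout=5.0):
    """Return 0 when the endpoint is healthy and 1 otherwise,
    mirroring the exit codes a health check command must produce."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            healthy = is_healthy(response.status, response.read())
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP 4xx/5xx, or a malformed URL.
        return 1
    return 0 if healthy else 1
```

Calling probe("http://localhost:8080/actuator/health") against a locally running copy of the application shows what the container check would conclude; the URL is the one assumed throughout this article.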

When deploying Spring Boot applications in Docker containers on AWS ECS without Elastic Load Balancing (ELB), many developers encounter unexpected behavior with task health checks. The core issue manifests when:

Health check commands like ["CMD-SHELL","exit 0"] don't change the UNKNOWN status
Service updates can't properly roll out due to undetermined health states
Documentation gaps leave developers troubleshooting in the dark

The ECS health check system behaves differently when ELB isn't involved. Without ELB health checks, ECS relies solely on the container health check in the task definition. A placeholder command like the one below is not enough, because it always exits with 0 regardless of whether the application is up:

"healthCheck": {
  "command": ["CMD-SHELL","exit 0"],
  "interval": 30,
  "timeout": 5,
  "retries": 3
}

For Spring Boot applications, the command should check an actual endpoint rather than return a fixed exit code. Here's the working configuration:

"healthCheck": {
  "command": [
    "CMD-SHELL",
    "curl -f http://localhost:8080/actuator/health || exit 1"
  ],
  "interval": 30,
  "timeout": 5,
  "retries": 3,
  "startPeriod": 60
}

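startPeriod: Gives your application time to start (critical for Spring Boot)
curl -f: Returns a non-zero exit code when the server responds with an HTTP error (status 400 or above)
Port: The URL must use the port the application listens on inside the container (8080 here), and the port mapping should expose that same port

A complete task definition using this health check: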

{
  "family": "spring-boot-app",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "app-container",
      "image": "your-ecr-repo/spring-boot-app:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "hostPort": 8080
        }
      ],
      "healthCheck": {
        "command": [
          "CMD-SHELL",
          "curl -f http://localhost:8080/actuator/health || exit 1"
        ],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
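Before registering a task definition like the one above, it can help to check it for the structural mistakes that leave health status undetermined. The following Python sketch is a hypothetical linter, not an AWS tool: it assumes the task definition has been loaded into a dict (for example with json.load) and flags containers that have no healthCheck or whose command does not start with CMD or CMD-SHELL.

```python
def lint_health_checks(task_definition):
    """Return human-readable findings for each container's healthCheck."""
    findings = []
    for container in task_definition.get("containerDefinitions", []):
        name = container.get("name", "<unnamed>")
        check = container.get("healthCheck")
        if check is None:
            # Without a healthCheck in the task definition, ECS has nothing to report.
            findings.append(f"{name}: no healthCheck defined")
            continue
        command = check.get("command") or []
        if not command or command[0] not in ("CMD", "CMD-SHELL"):
            findings.append(f"{name}: command should start with CMD or CMD-SHELL")
        elif len(command) < 2:
            findings.append(f"{name}: command has no program to run")
    return findings


task_definition = {
    "containerDefinitions": [
        {
            "name": "app-container",
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "curl -f http://localhost:8080/actuator/health || exit 1",
                ]
            },
        }
    ]
}
print(lint_health_checks(task_definition))  # → []
```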

After deployment:

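Check the task health status in the ECS console
View stopped tasks to see whether health checks failed
Examine the application's CloudWatch logs for startup errors; the output of the health check command itself is recorded by Docker and, on EC2 container instances, can be viewed with docker inspect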
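The health status shown in the console is also available from the DescribeTasks API, for example through aws ecs describe-tasks with the --cluster and --tasks options. As a sketch, the Python function below summarizes the healthStatus fields from that JSON output once it has been loaded into a dict. The function name is hypothetical and the sample data is made up for illustration.

```python
def summarize_health(describe_tasks_output):
    """Map each task ARN to its overall and per-container health status."""
    summary = {}
    for task in describe_tasks_output.get("tasks", []):
        containers = {
            container.get("name", "?"): container.get("healthStatus", "UNKNOWN")
            for container in task.get("containers", [])
        }
        summary[task.get("taskArn", "?")] = {
            "task": task.get("healthStatus", "UNKNOWN"),
            "containers": containers,
        }
    return summary


# Made-up sample in the shape of a DescribeTasks response.
sample = {
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:123456789012:task/demo/abc",
            "healthStatus": "HEALTHY",
            "containers": [{"name": "app-container", "healthStatus": "HEALTHY"}],
        }
    ]
}
for arn, status in summarize_health(sample).items():
    print(arn, status["task"], status["containers"])
```

A task whose containers all stay at UNKNOWN well after startPeriod has passed points back to the task definition or the agent rather than to the application.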

Common issues include:

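Insufficient startPeriod for Spring Boot initialization
Missing Actuator dependency, or a health endpoint moved to another path or port by management settings in application.properties
The application listening on a different port than the one the check targets on localhost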
