ECS Cluster Shows No Instances Despite Visible EC2 Instances: Troubleshooting Guide


2 views

When setting up an ECS cluster in the Sydney region, you might encounter a situation where:

  • Two EC2 instances appear in the EC2 dashboard
  • The ECS cluster shows zero registered container instances
  • No errors are immediately visible in the AWS console

First, check these critical configuration points:

# Check ECS agent status on EC2 instances
ssh -i your-key.pem ec2-user@instance-ip
sudo systemctl status ecs

# Expected output should show 'active (running)'
# If not running, try:
sudo systemctl start ecs

The most frequent issues causing this discrepancy:

1. Missing IAM Instance Profile

EC2 instances need proper permissions to register with ECS:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:CreateCluster",
        "ecs:DeregisterContainerInstance",
        "ecs:DiscoverPollEndpoint",
        "ecs:Poll",
        "ecs:RegisterContainerInstance",
        "ecs:StartTelemetrySession",
        "ecs:Submit*"
      ],
      "Resource": "*"
    }
  ]
}

2. Incorrect Cluster Name Configuration

Verify the ECS agent configuration file:

# On the EC2 instance:
cat /etc/ecs/ecs.config

# Should contain:
ECS_CLUSTER=your-cluster-name
ECS_BACKEND_HOST=

Several AWS services can help diagnose the issue:

CloudWatch Logs

aws logs get-log-events \
  --log-group-name /ecs/ecs-agent \
  --log-stream-name ecs-agent-log-stream \
  --region ap-southeast-2

ECS API Verification

aws ecs list-container-instances \
  --cluster your-cluster-name \
  --region ap-southeast-2

This bash script automates common recovery steps:

#!/bin/bash
INSTANCE_IDS=$(aws ec2 describe-instances \
  --filters "Name=tag:aws:ecs:cluster-name,Values=your-cluster-name" \
  --query "Reservations[].Instances[].InstanceId" \
  --output text \
  --region ap-southeast-2)

for id in $INSTANCE_IDS; do
  echo "Processing instance $id"
  aws ssm send-command \
    --instance-ids $id \
    --document-name "AWS-RunShellScript" \
    --parameters 'commands=["sudo systemctl restart ecs"]' \
    --region ap-southeast-2
done
  • Always use CloudFormation or Terraform for reproducible deployments
  • Implement instance health checks in your Auto Scaling Group
  • Set up CloudWatch alarms for ECS agent metrics

When setting up an ECS cluster in ap-southeast-2 (Sydney) region with two EC2 instances configured with 60GB disks, the instances appear in EC2 dashboard but remain invisible in ECS cluster view. This creates an operational paradox where infrastructure exists but isn't recognized by the orchestration layer.

First, confirm the ECS agent status on affected instances:


# SSH into the EC2 instance and check agent status
sudo systemctl status ecs

# View agent logs for errors
cat /var/log/ecs/ecs-agent.log | grep -i error

1. IAM Role Misconfiguration
The EC2 instances need AmazonEC2ContainerServiceforEC2Role policy attached. Verify with:


aws iam list-instance-profiles --query 'InstanceProfiles[?contains(InstanceProfileName, ecsInstanceRole)]'

2. Cluster Name Mismatch
The ECS agent config file (/etc/ecs/ecs.config) must specify the correct cluster:


ECS_CLUSTER=your-cluster-name
ECS_BACKEND_HOST=

For VPC-related issues, check these critical endpoints are reachable:


# Test ECS service endpoint connectivity
curl -v https://ecs.ap-southeast-2.amazonaws.com/

# Verify DNS resolution
dig ecs.ap-southeast-2.amazonaws.com

This bash script automates common remediation steps:


#!/bin/bash
# Stop ECS service
sudo systemctl stop ecs

# Cleanup agent state
sudo rm -rf /var/lib/ecs/data/*

# Refresh configuration
echo "ECS_CLUSTER=${YOUR_CLUSTER_NAME}" | sudo tee /etc/ecs/ecs.config
echo "ECS_ENABLE_TASK_IAM_ROLE=true" | sudo tee -a /etc/ecs/ecs.config

# Restart service
sudo systemctl start ecs

Create these CloudWatch alarms for proactive detection:


aws cloudwatch put-metric-alarm \
    --alarm-name "ECS-Agent-Down" \
    --metric-name "AgentConnected" \
    --namespace "AWS/ECS" \
    --statistic "Minimum" \
    --period 60 \
    --threshold 1 \
    --comparison-operator "LessThanThreshold" \
    --evaluation-periods 3 \
    --alarm-actions "arn:aws:sns:ap-southeast-2:123456789012:MyNotificationTopic"