When integrating with third-party APIs that require IP whitelisting, managing egress for dynamic GKE workloads becomes problematic. Traditional approaches like self-managed NAT gateway instances introduce a single point of failure (SPOF) and maintenance overhead. The ideal solution should:
- Provide a static external IP for all outbound traffic
- Scale automatically with GKE cluster changes
- Maintain high availability
- Require minimal configuration maintenance
Google Cloud's managed NAT service, Cloud NAT, solves this elegantly without standalone NAT instances to maintain. Here's the architecture breakdown:
+---------------+
|   GKE Nodes   |
| (no public IP)|
+-------┬-------+
        │
        ↓
+---------------+
|   Cloud NAT   |
|  (Static IP)  |
+-------┬-------+
        │
        ↓
+---------------+
|   Internet    |
+---------------+
1. Reserve a Static IP
gcloud compute addresses create api-gateway-ip \
--region=us-central1 \
--network-tier=PREMIUM
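The third-party provider needs the literal address for its whitelist, not the resource name; you can read it back once the reservation exists:
# Print the reserved external IP so it can be sent to the API provider for whitelisting
gcloud compute addresses describe api-gateway-ip \
--region=us-central1 \
--format="value(address)"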
2. Configure Cloud Router
gcloud compute routers create nat-router \
--network=default \
--region=us-central1 \
--asn=64512
3. Set Up Cloud NAT
gcloud compute routers nats create api-nat-config \
--router=nat-router \
--region=us-central1 \
--nat-all-subnet-ip-ranges \
--nat-external-ip-pool="api-gateway-ip"
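To confirm the NAT picked up the reserved address rather than auto-allocating one:
# Inspect the NAT configuration; natIps should reference api-gateway-ip
gcloud compute routers nats describe api-nat-config \
--router=nat-router \
--region=us-central1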
For GKE clusters, ensure your node pools run without external IPs (private nodes), so their egress is forced through Cloud NAT:
gcloud container node-pools create workload-pool \
--cluster=my-cluster \
--region=us-central1 \
--enable-private-nodes \
--machine-type=n2-standard-4
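If the cluster itself still needs to be created (or runs a GKE version without per-node-pool private nodes), the same effect comes from a private cluster. A minimal sketch, assuming the default VPC; the cluster name and control-plane CIDR here are placeholders:
# Private cluster: nodes get no external IPs, so all egress goes through Cloud NAT
gcloud container clusters create my-cluster \
--region=us-central1 \
--enable-ip-alias \
--enable-private-nodes \
--master-ipv4-cidr=172.16.0.0/28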
To confirm all outbound traffic uses the static IP:
# Start a throwaway pod with an interactive shell
kubectl run --rm -it curl-test --image=curlimages/curl --restart=Never -- sh
# Inside the pod's shell, print the public egress IP; it should match the reserved address
curl ifconfig.me
For production environments, consider these enhancements:
- Configure NAT logging for audit trails (see the example after this list)
- Set a higher minimum ports per VM to avoid SNAT port exhaustion
- Remember that Cloud NAT is regional: provision a router and NAT configuration in each region where workloads run
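Turning on logging for the NAT config from step 3 is a single update (the log filter value here is illustrative):
# Enable Cloud NAT logging; entries land in Cloud Logging for audit trails
gcloud compute routers nats update api-nat-config \
--router=nat-router \
--region=us-central1 \
--enable-logging \
--log-filter=ALL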
If egress isn't behaving as expected, rule out these common issues (both can be verified from the CLI, as shown below):
- Symptom: outbound connections failing. Check that firewall rules allow egress from the GKE nodes to 0.0.0.0/0.
- Symptom: the observed IP doesn't match the reserved one. Check that no other NAT gateways or instance-level routes exist in the VPC.
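A quick sketch, assuming the default network and the router created above:
# List egress firewall rules that could be blocking outbound traffic
gcloud compute firewall-rules list --filter="direction=EGRESS"
# Show router and NAT status, including the external IPs currently in use
gcloud compute routers get-status nat-router --region=us-central1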
With private nodes, GKE egress already flows through Cloud NAT without any in-cluster changes. If you additionally want to guarantee that workloads can only reach the whitelisted API, a standard Kubernetes NetworkPolicy can lock down pod egress (this requires network policy enforcement on the cluster, for example GKE Dataplane V2):
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: force-nat-egress
  namespace: default
spec:
  podSelector: {}            # empty selector = every pod in the namespace
  policyTypes:
    - Egress
  egress:
    # Allow DNS lookups so the API hostname can still be resolved
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Allow traffic only to the whitelisted third-party API; all other egress is denied
    - to:
        - ipBlock:
            cidr: THIRD_PARTY_API_IP/32
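Applying and sanity-checking the policy might look like this (assuming it is saved as force-nat-egress.yaml):
# Apply the egress restriction to the default namespace
kubectl apply -f force-nat-egress.yaml
# From a test pod, only the whitelisted API should be reachable; other destinations will time out
kubectl run --rm -i egress-test --image=curlimages/curl --restart=Never -- \
curl -s --max-time 5 https://THIRD_PARTY_API_IP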
Cloud NAT is a distributed, software-defined service with no gateway instances to fail over, so it is highly available within its region by design. For broader resilience:
- Provision a Cloud Router and NAT configuration in every region where workloads run; a NAT config only serves its own region
- Reserve (and whitelist) a static IP per region, since NAT IPs are regional
- Monitor NAT metrics such as allocated ports and dropped connections through Cloud Monitoring
Balance performance and cost by tuning SNAT port allocation (the values here are illustrative; size them for your workload's concurrent connections per node):
# Control how many ports each VM gets; with dynamic allocation, usage grows from the minimum only as needed
gcloud compute routers nats update api-nat-config \
--router=nat-router \
--region=us-central1 \
--enable-dynamic-port-allocation \
--min-ports-per-vm=32 \
--max-ports-per-vm=64
When debugging NAT connectivity problems:
# Check active NAT mappings (Cloud NAT is fully managed; there is no gateway instance to SSH into)
gcloud compute routers get-nat-mapping-info nat-router --region=us-central1
# Verify route propagation
gcloud compute routes list --filter="network=default"
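If NAT logging was enabled earlier, dropped and translated connections can also be pulled straight from Cloud Logging:
# Read recent Cloud NAT log entries (resource type nat_gateway)
gcloud logging read 'resource.type="nat_gateway"' --limit=10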