How to Maintain Stable Outbound IPs for GKE Pods While Preserving Inbound Connectivity


Workloads on Google Kubernetes Engine (GKE) that need both public inbound connectivity and a consistent outbound IP for whitelisting pull in opposite directions. Public nodes send egress traffic from their own ephemeral external IPs, which change whenever the cluster autoscales or upgrades, while forcing egress through a fixed address tends to disturb inbound traffic.

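The conventional NAT gateway instance (as documented in Google's guide) creates asymmetric routing when applied to nodes that serve NodePort services, because the tagged route captures reply traffic as well as new outbound connections:


# This breaks inbound NodePort traffic: replies from tagged nodes are sent
# to the NAT instance instead of straight back to the client.
# nat-gateway, gke-node, and the zone are placeholder names.
gcloud compute routes create nat-route \
    --network=default \
    --destination-range=0.0.0.0/0 \
    --next-hop-instance=nat-gateway \
    --next-hop-instance-zone=us-central1-a \
    --priority=800 \
    --tags=gke-node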

Option 1: Proxy VM with a Static IP

Deploy a dedicated proxy (NAT) VM with a reserved static IP that handles only outbound traffic, and steer egress to it with a tagged route:


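# Reserve a static external IP for the proxy VM
gcloud compute addresses create outbound-proxy-ip \
    --region=us-central1

# Route egress from instances tagged pod-egress through the proxy VM.
# Assumes outbound-proxy exists in us-central1-a, was created with
# --can-ip-forward and the reserved address, and does not carry the tag itself.
# Priority 800 outranks the default internet route (priority 1000).
gcloud compute routes create pod-egress \
    --network=default \
    --priority=800 \
    --destination-range=0.0.0.0/0 \
    --next-hop-instance=outbound-proxy \
    --next-hop-instance-zone=us-central1-a \
    --tags=pod-egress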
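
To see why a tagged route changes where node traffic goes, the following is a minimal Python sketch of route selection. It is a simplified model, not Google's implementation: it only applies the tag filter, then the most specific destination, then the lowest priority value, and it assumes the pod-egress route is given a priority value below the default route's 1000 (800 here). The `Route` class and `pick_route` function are hypothetical names introduced for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    dest: str                       # destination CIDR
    priority: int                   # lower value wins
    next_hop: str
    tags: frozenset = frozenset()   # empty means the route applies to every instance


def pick_route(routes, instance_tags, dest_ip):
    """Simplified model: tag filter, longest prefix, then lowest priority value."""
    ip = ipaddress.ip_address(dest_ip)
    candidates = [
        r for r in routes
        if ip in ipaddress.ip_network(r.dest)
        and (not r.tags or r.tags & set(instance_tags))
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.dest).prefixlen, r.priority),
    )


routes = [
    Route("default-route", "0.0.0.0/0", 1000, "default-internet-gateway"),
    Route("pod-egress", "0.0.0.0/0", 800, "outbound-proxy", frozenset({"pod-egress"})),
]

# A tagged GKE node sends internet-bound packets to the proxy VM.
node_hop = pick_route(routes, {"pod-egress"}, "203.0.113.10").next_hop
# The untagged proxy VM itself still uses the default internet gateway.
proxy_hop = pick_route(routes, set(), "203.0.113.10").next_hop
print(node_hop)
print(proxy_hop)
```

The same model explains the asymmetric-routing problem: a reply from a tagged node to an external NodePort client is also an internet-bound packet, so it follows the tagged route rather than returning directly.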

Option 2: GKE with Internal Load Balancer

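Expose the workload through an internal load balancer and send egress through NAT. An internal load balancer is reachable only from inside the VPC and networks connected to it, so public clients still need a separate external entry point: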


apiVersion: v1
kind: Service
metadata:
  name: ilb-service
  annotations:
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376

While Google publishes IP ranges by region, these are too broad for security whitelisting. Instead, consider:

  • Using dedicated egress gateways
  • Implementing service mesh with controlled egress
  • Regional external HTTP(S) load balancer with fixed IP for the inbound side

# Cloud NAT requires a regional address; a global address cannot be used as a NAT IP.
resource "google_compute_address" "egress_ip" {
  name   = "gke-egress-ip"
  region = "us-central1"
}

resource "google_compute_router" "egress_router" {
  name    = "gke-egress-router"
  region  = "us-central1"
  network = google_compute_network.main.self_link
  bgp {
    asn = 64514
  }
}

resource "google_compute_router_nat" "egress_nat" {
  name                               = "gke-egress-nat"
  router                             = google_compute_router.egress_router.name
  region                             = google_compute_router.egress_router.region
  nat_ip_allocate_option             = "MANUAL_ONLY"
  nat_ips                            = [google_compute_address.egress_ip.self_link]
  # LIST_OF_SUBNETWORKS would require explicit subnetwork blocks
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}

When running workloads on GKE that need both public inbound connectivity and consistent outbound IPs for API whitelisting, we face a fundamental infrastructure conflict. The default GKE architecture uses ephemeral external IPs for nodes, which creates problems when:

  • Nodes auto-scale up/down
  • Cluster upgrades occur
  • Third-party APIs require IP-based authentication

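The standard Cloud NAT approach (as documented in Google's solution) does not fit this requirement directly:


# The Cloud Router below is harmless on its own; the conflict appears once
# a NAT configuration is attached and the nodes depend on it for egress
gcloud compute routers create nat-router \
    --network=default \
    --region=us-central1

Cloud NAT only translates traffic for nodes that have no external IP, and such nodes cannot be reached directly from the internet, so NodePort services become unreachable from external clients. A self-managed NAT instance applied at the node level has a different problem: replies to inbound NodePort connections are also routed through the gateway, which breaks those connections.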

We need to implement traffic routing that:

  1. Preserves direct public access to NodePort services
  2. Channels all outbound traffic through static IPs

Option 1: Regional External IP Whitelisting

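Google Cloud publishes its external IP ranges by region (https://www.gstatic.com/ipranges/cloud.json). As a coarse fallback you could whitelist a region's ranges, bearing in mind that they are shared by every Google Cloud customer in that region. For example (illustrative values; verify against the published list):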


us-central1: 34.66.0.0/15, 34.67.0.0/16
us-east1: 34.74.0.0/15, 34.75.0.0/16
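
To get a feel for how broad such ranges are, here is a small Python sketch using only the standard `ipaddress` module. It takes the example ranges above at face value (they are not checked against Google's published list), collapses overlapping entries, and counts the addresses a whitelist entry would admit. The `summarize` helper is a hypothetical name.

```python
import ipaddress

# Example ranges copied from the text above, not verified against the published list.
listed = {
    "us-central1": ["34.66.0.0/15", "34.67.0.0/16"],
    "us-east1": ["34.74.0.0/15", "34.75.0.0/16"],
}


def summarize(cidrs):
    """Collapse overlapping CIDRs and count the addresses they cover."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    collapsed = list(ipaddress.collapse_addresses(nets))
    total = sum(n.num_addresses for n in collapsed)
    return collapsed, total


for region, cidrs in listed.items():
    collapsed, total = summarize(cidrs)
    print(region, [str(n) for n in collapsed], total)
```

In each example pair the /16 already lies inside the /15, so the second entry adds nothing, and a single /15 admits 131,072 addresses that could belong to any tenant in the region.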

Option 2: Proxy-Based Solution with Static IP

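Run a forward proxy (Squid here) in the cluster and expose it on a reserved static external IP. The load balancer address is only the proxy's inbound address: connections the proxy opens still leave with the external IP of the node it runs on, so the proxy pod must be pinned to a node whose address is itself stable for the outbound IP to stay fixed: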


apiVersion: apps/v1
kind: Deployment
metadata:
  name: outbound-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: proxy
  template:
    metadata:
      labels:
        app: proxy
    spec:
      containers:
      - name: squid
        image: sameersbn/squid:3.5.27
        ports:
        - containerPort: 3128   # Squid's default listening port
---
apiVersion: v1
kind: Service
metadata:
  name: proxy-service
  # External is the default for type: LoadBalancer; no annotation is needed
spec:
  loadBalancerIP: "34.102.123.45" # Your reserved static IP
  ports:
  - port: 3128
    targetPort: 3128
  selector:
    app: proxy
  type: LoadBalancer
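
Application pods then have to be told to use the proxy. The following Python sketch shows one way a client could do that with the standard library. It assumes the Service above lives in the `default` namespace, so its in-cluster DNS name is `proxy-service.default.svc.cluster.local`; `build_proxy_opener` is a hypothetical helper and `api.example.com` a placeholder.

```python
import urllib.request

# Assumed in-cluster address of the proxy Service (namespace "default", port 3128).
PROXY_URL = "http://proxy-service.default.svc.cluster.local:3128"


def build_proxy_opener(proxy_url=PROXY_URL):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler), handler


opener, handler = build_proxy_opener()

# Inside the cluster, a request would then be made like this:
# opener.open("https://api.example.com/", timeout=10)
```

Many HTTP clients also honor the `HTTP_PROXY` and `HTTPS_PROXY` environment variables, which can be set in the pod spec instead of changing application code.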

Option 3: Cloud NAT with a Reserved Static IP

Attach a reserved static IP to a Cloud NAT gateway so that node and pod egress leaves from a known address:


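# Reserve static IP
gcloud compute addresses create gke-outbound-ip --region=us-central1

# Create a Cloud NAT gateway that egresses only from the reserved IP.
# --nat-external-ip-pool and --auto-allocate-nat-external-ips are mutually
# exclusive, so only the former is passed.
gcloud compute routers nats create gke-nat-config \
    --router=nat-router \
    --region=us-central1 \
    --nat-all-subnet-ip-ranges \
    --nat-external-ip-pool=gke-outbound-ip \
    --enable-logging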

Whichever option you choose:

  • Always test failover scenarios for proxy-based solutions
  • Monitor for SNAT port exhaustion in high-traffic clusters
  • Consider VPC Service Controls for additional security
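
For the SNAT port exhaustion point, a rough budget can be worked out ahead of time. The Python sketch below assumes Cloud NAT's documented figure of 64,512 usable source ports per NAT IP per protocol and static port allocation, where every node reserves a fixed number of ports that all of its pods share. The node counts and per-node port figures are made-up inputs, and `nat_ips_needed` is a hypothetical helper.

```python
import math

# Usable source ports per NAT IP and protocol: 65,536 minus the first 1,024.
PORTS_PER_NAT_IP = 64_512


def nat_ips_needed(node_count, ports_per_node):
    """NAT IPs required when each node reserves ports_per_node ports."""
    return math.ceil(node_count * ports_per_node / PORTS_PER_NAT_IP)


# Nodes one NAT IP can serve at the default minimum of 64 ports per VM.
print(PORTS_PER_NAT_IP // 64)

# Hypothetical cluster: 50 nodes, each raised to 4,096 ports for busy pods.
print(nat_ips_needed(50, 4096))
```

Every additional NAT IP is another address the third party has to whitelist, so raising the per-node port allocation trades whitelist size against connection headroom.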

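Each approach trades operational complexity against business requirements. For most production scenarios, Option 3 (Cloud NAT with a reserved static IP) offers the best combination of reliability and maintainability, provided inbound traffic enters through a load balancer rather than through node external IPs.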