How to Fix Mosquitto MQTT Broker CrashLoopBackOff in Kubernetes with ChirpStack Integration



When deploying the Mosquitto MQTT broker in Kubernetes as part of a ChirpStack LoRaWAN stack, the broker pod frequently enters a CrashLoopBackOff state while the other components (application-server, network-server, gateway-bridge) run normally. The core symptom shows up in the application-server logs as MQTT connection timeouts:

time="2020-12-10T15:01:41Z" level=error msg="integration/mqtt: connecting to broker error, will retry in 2s: Network Error : dial tcp 10.244.146.236:1883: i/o timeout"

The root cause typically stems from permission and volume mounting issues combined with security context constraints. Here's a corrected deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: chirpstack-mosquitto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chirpstack-mosquitto
  template:
    metadata:
      labels:
        app: chirpstack-mosquitto
    spec:
      containers:
      - name: mosquitto
        image: eclipse-mosquitto:2.0.15
        ports:
        - containerPort: 1883
        volumeMounts:
        - name: config-volume
          mountPath: /mosquitto/config/mosquitto.conf
          subPath: mosquitto.conf
        - name: auth-volume
          mountPath: /mosquitto/config/password_file
          subPath: password_file.txt
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
      volumes:
      - name: config-volume
        configMap:
          name: mosquitto-config
      - name: auth-volume
        configMap:
          name: mosquitto-password

The original deployment attempts to run Mosquitto as non-root (UID 1000) but mounts volumes that this user cannot write to. Either:

  • Run as root (remove the securityContext), or
  • Set the ownership with an initContainer, as in the snippet below (see the note after it for a simpler fsGroup alternative):
initContainers:
- name: volume-permissions
  image: busybox:1.28
  command: ["sh", "-c", "chown -R 1000:1000 /mosquitto"]
  volumeMounts:
  - name: mosquitto-data
    mountPath: /mosquitto/data
  - name: mosquitto-log
    mountPath: /mosquitto/log
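
Note that mosquitto-data and mosquitto-log are illustrative volume names and must also be declared under volumes in the pod spec (an emptyDir is enough for testing, a PersistentVolumeClaim for real persistence). A simpler alternative to the chown, assuming your cluster's policies allow it, is to let Kubernetes fix group ownership itself via fsGroup in the pod-level securityContext, roughly like this:

securityContext:
  runAsUser: 1000     # match the non-root UID the original deployment used
  runAsGroup: 1000
  fsGroup: 1000       # mounted volumes are group-owned by this GID automatically
volumes:
- name: mosquitto-data
  emptyDir: {}        # swap for a persistentVolumeClaim to keep data across restarts
- name: mosquitto-log
  emptyDir: {}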

The ConfigMaps need these exact paths and formatting:

apiVersion: v1
kind: ConfigMap
metadata:
  name: mosquitto-config
data:
  mosquitto.conf: |
    listener 1883
    allow_anonymous false
    password_file /mosquitto/config/password_file
    persistence true
    persistence_location /mosquitto/data/
    log_dest stdout
    log_type all

After applying these changes:

  1. Check pod logs: kubectl logs -f chirpstack-mosquitto-[pod-id]
  2. Test connectivity from another pod (the eclipse-mosquitto image includes the mosquitto_sub client; adjust the -h argument to your broker Service name — the Service manifest later in this article names it mosquitto):
kubectl run -i --tty --rm debug --image=eclipse-mosquitto:2.0.15 --restart=Never -- mosquitto_sub -h chirpstack-mosquitto -t test -u "app-server" -P "app-server"

For a more robust setup, also consider the following (a probe example follows this list):

  • Use a StatefulSet instead of a Deployment for persistent storage
  • Implement TLS encryption
  • Consider using a Helm chart for deployment
  • Set proper resource limits and liveness/readiness probes
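
As a sketch of that last point, a simple TCP check against port 1883 is usually sufficient for Mosquitto; the thresholds below are illustrative rather than tuned values, and go under the mosquitto container in the Deployment:

        livenessProbe:
          tcpSocket:
            port: 1883
          initialDelaySeconds: 10
          periodSeconds: 20
        readinessProbe:
          tcpSocket:
            port: 1883
          initialDelaySeconds: 5
          periodSeconds: 10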

Following these configurations should resolve the CrashLoopBackOff issue and establish proper MQTT connectivity between ChirpStack components.


To understand why the broker was crash-looping in the first place, it helps to walk through the original manifests and the problems with each.

The original setup consisted of three key Kubernetes resources:

1. Mosquitto Configuration (ConfigMap)

apiVersion: v1
kind: ConfigMap
metadata:
  name: mosquitto-config
data:
  mosquitto.conf: |    
    persistence true
    persistence_location /mosquitto/data/
    log_dest stdout
    listener 1883
    protocol mqtt
    allow_anonymous false
    password_file /.config/mosquitto/auth/password_file.txt
    require_certificate false

2. Password File (ConfigMap)

apiVersion: v1
kind: ConfigMap
metadata:
  name: mosquitto-password
data:
  password_file.txt: |
    admin:admin
    user:user
    app-server:app-server
    net-server:net-server
    gateway-bridge:gateway-bridge
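
One caveat with this ConfigMap, independent of the CrashLoopBackOff itself: Mosquitto expects password_file entries to be hashed (the format produced by mosquitto_passwd), so plain-text lines like admin:admin will not authenticate. A typical fix is to hash the file before loading it into the ConfigMap, for example:

# Convert the plain-text entries to hashed ones in place
# (run wherever the mosquitto tools are available, e.g. in a temporary eclipse-mosquitto container)
mosquitto_passwd -U password_file.txt

# Recreate the ConfigMap from the hashed file
kubectl create configmap mosquitto-password --from-file=password_file.txt \
  --dry-run=client -o yaml | kubectl apply -f -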

3. Deployment Issues

The key problems in the current deployment are:

  • Incorrect volume mounts for configuration files
  • Permission issues with the non-root user
  • Missing service discovery configuration for ChirpStack components

Here's the fixed Deployment configuration that resolves the CrashLoopBackOff issue (essentially the corrected manifest from the top of this article, with both ConfigMaps mounted under /mosquitto/config via subPath):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: chirpstack-mosquitto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chirpstack-mosquitto
  template:
    metadata:
      labels:
        app: chirpstack-mosquitto
    spec:
      containers:
      - name: mosquitto
        image: eclipse-mosquitto:2.0.15
        ports:
        - containerPort: 1883
        volumeMounts:
        - name: config-volume
          mountPath: /mosquitto/config/mosquitto.conf
          subPath: mosquitto.conf
        - name: password-volume
          mountPath: /mosquitto/config/password_file
          subPath: password_file.txt
      volumes:
      - name: config-volume
        configMap:
          name: mosquitto-config
      - name: password-volume
        configMap:
          name: mosquitto-password

The Service should be updated so the broker is reachable at a stable DNS name:

apiVersion: v1
kind: Service
metadata:
  name: mosquitto
spec:
  type: ClusterIP
  ports:
    - name: mqtt 
      port: 1883
      targetPort: 1883
  selector:
    app: chirpstack-mosquitto
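
With the Service in place, each ChirpStack component only needs to reference the broker by that DNS name in its MQTT integration settings. As an illustration (the exact file layout depends on how the ChirpStack v3 components are deployed; the keys below follow chirpstack-application-server.toml), the application server's MQTT integration would look roughly like:

[application_server.integration.mqtt]
server="tcp://mosquitto:1883"
username="app-server"
password="app-server"

The network server and gateway bridge have equivalent MQTT sections that should point at the same host and port.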

After applying these changes:

  1. Check pod status: kubectl get pods
  2. View Mosquitto logs: kubectl logs -f [mosquitto-pod-name]
  3. Test connectivity from another pod. The ChirpStack images do not necessarily ship the Mosquitto client tools, so if mosquitto_sub is missing there, use a temporary eclipse-mosquitto pod as shown earlier (a matching publish command to complete the round trip follows the list below):
    kubectl exec -it [app-server-pod] -- sh
    mosquitto_sub -h mosquitto -p 1883 -u app-server -P app-server -t test

A few additional points:

  • Ensure all ChirpStack components use the correct service name (mosquitto) for the MQTT connection
  • Verify the password_file permissions inside the container
  • Consider adding readiness/liveness probes to the deployment (see the probe example earlier in this article)
  • For production environments, implement TLS encryption
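
To complete the round trip while the subscriber from step 3 is running, publish to the same topic from another temporary pod; if the message arrives, both authentication and DNS resolution through the mosquitto Service are working:

kubectl run -i --tty --rm mqtt-pub --image=eclipse-mosquitto:2.0.15 --restart=Never -- mosquitto_pub -h mosquitto -p 1883 -u app-server -P app-server -t test -m "hello from kubernetes"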