MQTT Connection Fails on Google Container OS


For my setup, I'm working with a third-party VerneMQ MQTT broker hosted in AWS. I have been given username/password credentials to connect over secure MQTT (port 8883) using a specific clientId. My goal (though irrelevant to the issue at hand) is merely to subscribe to a topic and forward its messages to Google Pub/Sub.

I wrote a simple Node.js program to make that connection, and it works beautifully when run locally through ts-node:

import { connect } from 'mqtt';

const client = connect(`mqtts://${process.env.MQTT_HOST}`, {
    port: parseInt(process.env.MQTT_PORT, 10),
    clientId: process.env.MQTT_CLIENT_ID,
    username: process.env.MQTT_USERNAME,
    password: process.env.MQTT_PASSWORD,
    rejectUnauthorized: false,
});

client.on('error', handleError);

client.on('connect', (p) => {
    console.log('connect', JSON.stringify(p));
    client.subscribe({ [mqttTopic]: { qos: 0 } });
});
client.on('message', (topic, msg) => onMessageReceived(msg));

I then proceeded to Dockerize it:

FROM node:lts-alpine
RUN apk update
WORKDIR /app
COPY . .
RUN npm i
EXPOSE 8883
CMD ["npm", "start"]

and that runs perfectly fine locally through docker run.

The trouble started when I loaded the image into Google's Compute Engine using their "Deploy a container image to this VM instance" option that uses a container-optimized OS image. When I checked the logs, the code was trying to reach out with a connect packet, but would always immediately close.
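To see more than "connect packet sent, then closed", the client's other lifecycle and packet events could be logged as well. The following is only a minimal sketch: the event names are the ones MQTT.js documents for its client, while `attachDiagnostics` and its `log` parameter are hypothetical names introduced here. It accepts any `EventEmitter`, so it does not depend on a particular MQTT.js version.

```typescript
import { EventEmitter } from 'events';

// Lifecycle events documented for the MQTT.js client.
const LIFECYCLE_EVENTS = [
  'connect', 'reconnect', 'close', 'disconnect', 'offline', 'end', 'error',
] as const;

// Hypothetical helper (not part of MQTT.js): logs each lifecycle event and
// the command name of every packet the client sends or receives.
function attachDiagnostics(
  client: EventEmitter,
  log: (line: string) => void = console.log,
): void {
  for (const name of LIFECYCLE_EVENTS) {
    client.on(name, (arg?: unknown) => {
      // Only errors carry a detail worth printing here.
      const detail = arg instanceof Error ? `${arg.name}: ${arg.message}` : '';
      log(`[mqtt] ${name} ${detail}`.trim());
    });
  }

  for (const name of ['packetsend', 'packetreceive'] as const) {
    client.on(name, (packet: { cmd?: string; returnCode?: number; reasonCode?: number }) => {
      // A CONNACK carries returnCode (MQTT 3.1.1) or reasonCode (MQTT 5).
      const code = packet.returnCode ?? packet.reasonCode;
      log(code === undefined
        ? `[mqtt] ${name} ${packet.cmd}`
        : `[mqtt] ${name} ${packet.cmd} code=${code}`);
    });
  }
}
```

Calling `attachDiagnostics(client)` right after `connect(...)` would show whether a CONNACK ever arrives before the close. In MQTT 3.1.1 a non-zero CONNACK return code means the broker refused the session (for example, 4 is a bad username or password and 5 is not authorized), whereas a close with no CONNACK at all points at something below the MQTT layer.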

I thought this might be an issue with how I did the deployment, so to verify, I spun up a standard Debian VM, and upon installing Docker and running my image just like I did it locally, it worked just fine! So it's not that Docker is failing remotely.

I considered that perhaps the deployment through Compute Engine was just weird; I had chosen it because it was simpler than standing up a Kubernetes cluster when I only needed a single image. Given my issues, I went ahead and spent the time to stand everything up in GKE. The logs showed exactly the same messages as when the image was deployed through Compute Engine. Here's the YAML:

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-mqtt
  labels:
    app: test-mqtt
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-mqtt
  serviceName: test-mqtt-service
  template:
    metadata:
      labels:
        app: test-mqtt
    spec:
      containers:
        - name: mqtt
          image: us-central1-docker.pkg.dev/{GCP_PROJECT}/docker/test-mqtt
          ports:
            - name: mqtt-ssl
              containerPort: 8883
---
apiVersion: v1
kind: Service
metadata:
  name: test-mqtt-service
  labels:
    app: test-mqtt
spec:
  ports:
    - name: mqtt-ssl
      port: 8883
  selector:
    app: test-mqtt
  type: LoadBalancer

After all this, I thought for sure this was a port issue, so I checked and double-checked the firewalls, both for the vNIC and internally (Google suggested the internal firewall might be the cause, but changing it made no difference). I can reach out over the port and run applications over the port, and the connection still fails even when I open all the ports to the world. To triple-check the ports, I changed the code to reach out to https://test.mosquitto.org and verified that I could still reach their server using port 8883. So it can't be the port.
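A reachable port does not by itself show that the TLS handshake with this particular broker completes from the failing environment. One way to separate the two is a bare TLS connection with no MQTT traffic at all. This is a rough sketch using Node's built-in `tls` and `net` modules; `probeTls` and `ProbeResult` are hypothetical names, and `rejectUnauthorized: false` simply mirrors the client options above.

```typescript
import * as net from 'net';
import * as tls from 'tls';

interface ProbeResult {
  ok: boolean;    // true if the TLS handshake completed
  detail: string; // negotiated protocol and cipher, or the failure reason
}

// Hypothetical helper: opens a TLS connection, reports how far it got,
// and closes the socket without sending any MQTT packets.
function probeTls(host: string, port: number, timeoutMs = 5000): Promise<ProbeResult> {
  return new Promise((resolve) => {
    const socket = tls.connect({
      host,
      port,
      // SNI must be a host name, so it is omitted for IP addresses.
      servername: net.isIP(host) ? undefined : host,
      // Mirrors the client options above: certificate verification is skipped.
      rejectUnauthorized: false,
    });

    const finish = (result: ProbeResult): void => {
      socket.destroy();
      resolve(result); // later calls are ignored once the promise has settled
    };

    socket.setTimeout(timeoutMs, () => finish({ ok: false, detail: 'timed out' }));
    socket.once('secureConnect', () =>
      finish({ ok: true, detail: `${socket.getProtocol()} ${socket.getCipher().name}` }));
    socket.on('error', (err: Error) => finish({ ok: false, detail: err.message }));
  });
}
```

Running something like `probeTls(process.env.MQTT_HOST!, 8883).then(console.log)` inside the container on both the Debian VM and the Container-Optimized OS VM would show whether the two environments diverge at the TCP/TLS level or only once MQTT packets are exchanged.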

I've come to the conclusion that some combination of OS (it worked in Debian with a manual deploy) and broker (it worked for the Mosquitto Test broker) is making this not work, but I feel like I've exhausted all possibilities.

What more can I check to make this work? I feel like it has to be something simple that I'm missing, but I've spent days on this to no avail.
