Sentinel StatefulSet fails to schedule: can't find persistent volumes to bind


Good afternoon

I really need some help getting a group of Sentinels up so that they can monitor and perform elections for my Redis pods, which are running without issue. At the bottom of this post I have included the Sentinel StatefulSet config, which spells out the volumes. The first Sentinel pod, sentinel-0, sits at Pending, while all three Redis instances are READY 1/1.

But the Sentinel pods don't get scheduled. When I apply the Sentinel StatefulSet, I get the following scheduling error:

Warning  FailedScheduling  5s  default-scheduler  0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) didn't find available persistent volumes to bind.
Warning  FailedScheduling  4s  default-scheduler  0/4 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 3 node(s) didn't find available persistent volumes to bind.
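
These events show up when describing the pending pod; sentinel-0 is the first Sentinel pod from the StatefulSet below:

# describe the pending Sentinel pod to see its scheduling events
kubectl describe pod sentinel-0
# or list recent events in the namespace and filter for scheduling failures
kubectl get events --sort-by=.metadata.creationTimestamp | grep FailedScheduling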

About my Kubernetes setup:

I am running a four-node bare-metal Kubernetes cluster: one master node and three worker nodes.

For storage, I am using a 'local-storage' StorageClass shared across the nodes. Currently I am using a single PersistentVolume configuration file which defines three volumes, one on each worker node. This works for the Redis StatefulSet, but not for Sentinel (Sentinel config at the bottom).
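
(For completeness, a minimal local-storage StorageClass along these lines looks like the sketch below; WaitForFirstConsumer is the usual volumeBindingMode for local volumes, and my actual definition may differ slightly.)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner  # local volumes are created manually, no dynamic provisioning
volumeBindingMode: WaitForFirstConsumer    # delay binding until a pod using the claim is scheduled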

See below the PersistentVolume config (all three volumes, ag1-pv-volume-node-0/1/2, are Bound):

kind: PersistentVolume
apiVersion: v1
metadata:
  name: ag1-pv-volume-node-0
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  local:
    path: "/var/opt/mssql"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - k8s-node-0
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: ag1-pv-volume-node-1
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  local:
    path: "/var/opt/mssql"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - k8s-node-1
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: ag1-pv-volume-node-2
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  local:
    path: "/var/opt/mssql"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - k8s-node-2

Note: the path "/var/opt/mssql" is the stateful data directory for the Redis cluster. The name is a misnomer and in no way reflects a SQL database (I just reused this directory from a walkthrough), and it works.

Presently all three Redis pods are successfully deployed from a functioning StatefulSet; see below for the Redis config (all working):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      initContainers:
      - name: config
        image: redis:6.0-alpine
        command: [ "sh", "-c" ]
        args:
          - |
            cp /tmp/redis/redis.conf /etc/redis/redis.conf
            
            echo "finding master..."
            MASTER_FDQN=`hostname  -f | sed -e 's/redis-[0-9]\./redis-0./'`
            if [ "$(redis-cli -h sentinel -p 5000 ping)" != "PONG" ]; then
              echo "master not found, defaulting to redis-0"

              if [ "$(hostname)" == "redis-0" ]; then
                echo "this is redis-0, not updating config..."
              else
                echo "updating redis.conf..."
                echo "slaveof $MASTER_FDQN 6379" >> /etc/redis/redis.conf
              fi
            else
              echo "sentinel found, finding master"
              MASTER="$(redis-cli -h sentinel -p 5000 sentinel get-master-addr-by-name mymaster | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}')"
              echo "master found : $MASTER, updating redis.conf"
              echo "slaveof $MASTER 6379" >> /etc/redis/redis.conf
            fi
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
        - name: config
          mountPath: /tmp/redis/
      containers:
      - name: redis
        image: redis:6.0-alpine
        command: ["redis-server"]
        args: ["/etc/redis/redis.conf"]
        ports:
        - containerPort: 6379
          name: redis
        volumeMounts:
        - name: data
          mountPath: /var/opt/mssql
        - name: redis-config
          mountPath: /etc/redis/
      volumes:
      - name: redis-config
        emptyDir: {}
      - name: config
        configMap:
          name: redis-config
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-storage"
      resources:
        requests:
          storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  clusterIP: None
  ports:
  - port: 6379
    targetPort: 6379
    name: redis
  selector:
    app: redis

The real issue, I believe, stems from how I've configured the Sentinel StatefulSet. The pods won't schedule, and the reported reason is that no available persistent volumes can be found to bind.
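
In case it helps with diagnosis, the pending claim can be inspected directly. StatefulSet PVCs are named <volumeClaimTemplate>-<statefulset>-<ordinal>, so the first Sentinel claim is data-sentinel-0:

# check which PVs exist and which claims they are bound to
kubectl get pv
# inspect the pending claim created for sentinel-0
kubectl describe pvc data-sentinel-0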

SENTINEL STATEFULSET CONFIG (the problem is here; I can't figure out how to set it up correctly with the volumes I made):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sentinel
spec:
  serviceName: sentinel
  replicas: 3
  selector:
    matchLabels:
      app: sentinel
  template:
    metadata:
      labels:
        app: sentinel
    spec:
      initContainers:
      - name: config
        image: redis:6.0-alpine
        command: [ "sh", "-c" ]
        args:
          - |
            REDIS_PASSWORD=a-very-complex-password-here
            nodes=redis-0.redis.redis.svc.cluster.local,redis-1.redis.redis.svc.cluster.local,redis-2.redis.redis.svc.cluster.local

            for i in ${nodes//,/ }
            do
                echo "finding master at $i"
                MASTER=$(redis-cli --no-auth-warning --raw -h $i -a $REDIS_PASSWORD info replication | awk '{print $1}' | grep master_host: | cut -d ":" -f2)
                if [ "$MASTER" == "" ]; then
                    echo "no master found"
                    MASTER=
                else
                    echo "found $MASTER"
                    break
                fi
            done
            echo "sentinel monitor mymaster $MASTER 6379 2" >> /tmp/master

            echo "port 5000
            $(cat /tmp/master)
            sentinel down-after-milliseconds mymaster 5000
            sentinel failover-timeout mymaster 60000
            sentinel parallel-syncs mymaster 1
            sentinel auth-pass mymaster $REDIS_PASSWORD
            " > /etc/redis/sentinel.conf
            cat /etc/redis/sentinel.conf
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
      containers:
      - name: sentinel
        image: redis:6.0-alpine
        command: ["redis-sentinel"]
        args: ["/etc/redis/sentinel.conf"]
        ports:
        - containerPort: 5000
          name: sentinel
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
        - name: data
          mountPath: /var/opt/mssql
      volumes:
      - name: redis-config
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-storage"
      resources:
        requests:
          storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: sentinel
spec:
  clusterIP: None
  ports:
  - port: 5000
    targetPort: 5000
    name: sentinel
  selector:
    app: sentinel

This is my first post here. I am a big fan of Stack Overflow!

1 Answer

Answer from Vasilii Angapov:

You may try to create three PVs using this template:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: ag1-pv-volume-node-0
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  claimRef:
    namespace: default
    name: data-redis-0
  local:
    path: "/var/opt/mssql"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - k8s-node-0

The important part here is the claimRef field, which ties the PV to the PVC created by the StatefulSet. The claim name has to follow a specific format.

Read more here: https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/preexisting-pd#using_a_preexisting_disk_in_a_statefulset
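
As a sketch of that format: StatefulSet claims are named <volumeClaimTemplate-name>-<statefulset-name>-<ordinal>. Applied to the Sentinel StatefulSet in the question (claim template data, StatefulSet sentinel), and assuming it runs in the default namespace, each of the three PVs would carry a claimRef like:

  claimRef:
    namespace: default      # adjust if the StatefulSet is deployed elsewhere
    name: data-sentinel-0   # data-sentinel-1 and data-sentinel-2 for the other two PVs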