I have a Kubernetes cluster in AWS with two node groups: one for Spot instances and the other for On-Demand instances. I have installed Vault and the CSI driver to manage the secrets.
When I create this deployment, everything works fine: the pods are created and run, and the secrets are there.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: vault-test
  name: vault-test
  namespace: development
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vault-test
  strategy: {}
  template:
    metadata:
      labels:
        app: vault-test
    spec:
      containers:
      - image: jweissig/app:0.0.1
        name: app
        envFrom:
        - secretRef:
            name: dev-secrets
        resources: {}
        volumeMounts:
        - name: secrets-store-inline
          mountPath: "/mnt/secrets"
          readOnly: true
      serviceAccountName: vault-sa
      volumes:
      - name: secrets-store-inline
        csi:
          driver: secrets-store.csi.k8s.io
          readOnly: true
          volumeAttributes:
            secretProviderClass: dev
status: {}
But when I add nodeAffinity and tolerations so that the pods are scheduled on the Spot machines, the pods stay in ContainerCreating status with the following error:
Warning FailedMount 10m (x716 over 24h) kubelet MountVolume.SetUp failed for volume "secrets-store-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod development/pod-name, err: error connecting to provider "vault": provider not found: provider "vault"
I created two applications to test the Vault behavior: one with no tolerations, just for testing, and the real one, with the tolerations and nodeAffinity. After a lot of tests I realized the problem depends on where the pods are scheduled, but I don't understand why it behaves that way.
The problem is the Vault CSI driver configuration: the DaemonSet is not running on all nodes because of the missing tolerations. I had to add the tolerations to the DaemonSet manifest so there is a Pod on every node, and this way all nodes know what vault is.
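For illustration, here is a rough sketch of what that change to the DaemonSet might look like. The DaemonSet name, the namespace, and the taint key/value/effect are assumptions, since the original manifests are not shown; use the taint that is actually set on your Spot node group. Only the relevant fields are included.

```yaml
# Fragment of the Vault CSI provider DaemonSet; only the relevant fields are shown.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: vault-csi-provider   # assumed name, check your installation
  namespace: vault           # assumed namespace
spec:
  template:
    spec:
      tolerations:
      # Hypothetical taint on the Spot node group; replace with your real one.
      - key: "lifecycle"
        operator: "Equal"
        value: "spot"
        effect: "NoSchedule"
```

A toleration with only operator: Exists (no key) would tolerate every taint, which is a simpler option if the provider should run on every node regardless of how the nodes are tainted. If the CSI driver's own DaemonSet is also missing from the Spot nodes, it would presumably need the same tolerations.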