Kafka Streams state store - what kind of store to use when running in Kubernetes

284 views Asked by At

Do I understand correctly that Kafka Streams state store is co-located with the KS application instance? For example if my KS application is running in a Kubernetes pod, the state store is located in the same pod? What state store storage is better to use in Kubernetes - RocksDB or in-memory? How can the type of the state store be configured in the application?

1

There are 1 answers

0
Kacper Roszczyna On BEST ANSWER

This depends on your use case - sometimes you can accept an in-memory store when you have a small topic. However in most cases you'll default to a persistent stores. To declare one you'd do:

streamsBuilder.addStateStore(
            Stores.keyValueStoreBuilder(
                    Stores.persistentKeyValueStore(
                            storeName
                    ),
                    keySerde,
                    valueSerde
            )
    );

If you wish for a inMemoryStore replace the third line with inMemoryKeyValueStore.

Running a KafkaStreams application in k8s has a few caveats. First of all just a pod is not enough. You'll need to run this as a stateful-set. In that case your pod will have a PersistentVolumeClaim mounted on your pod under a certain path. It's best to set your state.dir property to point at a subfolder of that path. That way when your pod shuts down the volume is retained and when the pod comes back on it will have all of its store present.