Why I'd want to have multiple replicas of my DB?
- Redundancy: I have > 1 replicas of my app code. Why? In case one node fails, another can fill its place when run behind a load balancer.
- Load: A load balancer can distribute traffic to multiple instances of the app.
- A/B testing. I can have one node serve one version of the app, and another serve a different one.
- Maintenance. I can bring down one instance for maintenance, and keep the other one up with 0 down-time.
So, I assume I'd want to do the same with the backing db if possible too.
I realize that many nosql dbs are better configured for multiple instances, but I am interested in relational dbs.
I've played with operators like this and this but have found problems with the docs, have not been able to get them up and running and found the community a bit lacking. Relying on this kind of thing in production makes me nervous. The Mysql operator has a note even, saying it's not for production use.
I see that native k8s statefulsets have scaling but these docs aren't specific to dbs at all. I assume the complication is that dbs need to write persistently to disk via a volume and that data has to be synced and routed somehow if you have more than one instance.
So, is this something that's non-trivial to do myself? Or, am I better off having a dev environment that uses a one-replica db image in the cluster in order to save on billing, and a prod environment that uses a fully managed db, something like this that takes care of the scaling/HA for me? Then I'd use kustomize to manage the yaml variances.
Edit:
I actually found a postgres operator that worked great. Followed the docs one time through and it all worked, and it's from postgres docs.
I have created this community wiki answer to summarize the topic and to make pertinent information more visible.
As Turing85 well mentioned in the comment:
Another solution may be to use a different operator as Aaron did:
See also this similar question.