I want to provide shared state for a Flask app which runs with multiple workers, i.e., multiple processes.
To quote this answer from a similar question on this topic:
> You can't use global variables to hold this sort of data. [...] Use a data source outside of Flask to hold global data. A database, memcached, or redis are all appropriate separate storage areas, depending on your needs.
(Source: Are global variables thread safe in flask? How do I share data between requests?)
My question is about that last part, the suggestions on how to provide the data "outside" of Flask. Currently, my web app is really small and I'd like to avoid requirements or dependencies on other programs. What options do I have if I don't want to run Redis or anything else in the background, but provide everything from within the web app's own Python code?
If your webserver's worker type is compatible with the `multiprocessing` module, you can use `multiprocessing.managers.BaseManager` to provide shared state for Python objects. A simple wrapper could look like this; you can assign your data to the `shared_dict` it returns to make it accessible across processes:
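The sketch below assumes all workers run on the same machine; the address `127.0.0.1`, port `35791`, and authkey `b"secret"` are illustrative placeholders you would choose yourself:

```python
from multiprocessing import Lock
from multiprocessing.managers import AcquirerProxy, BaseManager, DictProxy

def get_shared_state(host, port, key):
    shared_dict = {}
    shared_lock = Lock()
    manager = BaseManager((host, port), key)
    manager.register("get_dict", lambda: shared_dict, DictProxy)
    manager.register("get_lock", lambda: shared_lock, AcquirerProxy)
    try:
        # get_server() probes whether the address is still free; the first
        # process to get here starts the manager server as a child process.
        manager.get_server()
        manager.start()
    except OSError:  # address already in use: another worker owns the server
        manager.connect()
    return manager.get_dict(), manager.get_lock()

# Every process that calls this receives proxies to the same objects:
shared_dict, shared_lock = get_shared_state("127.0.0.1", 35791, b"secret")
shared_dict["number"] = 0
shared_dict["text"] = "Hello"
```

The first process that calls `get_shared_state` starts a small manager server in a child process; every later call finds the port taken and connects to that server instead, so all callers end up sharing the same dictionary and lock.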
However, you should be aware of the following caveats:

- Use `shared_lock` to protect against race conditions when overwriting values in `shared_dict`. (See the Flask example below.)
- If the `BaseManager` process dies, the shared state is gone.
- With this implementation of `BaseManager`, you cannot directly edit nested values in `shared_dict`. For example, `shared_dict["array"][1] = 0` has no effect. You have to edit a copy and then reassign it to the dictionary key.

Flask example:
The following Flask app uses a global variable to store a counter number:
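A minimal `server.py` for illustration (the route and variable names are just examples):

```python
from flask import Flask

app = Flask(__name__)

# One copy of this variable exists per worker process.
number = 0

@app.route("/")
def counter():
    global number
    number += 1
    return str(number)
```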
This works when using only one worker (`gunicorn -w 1 server:app`). When using multiple workers (`gunicorn -w 4 server:app`), it becomes apparent that `number` is not shared state but individual to each worker process.

Instead, with `shared_dict`, the app looks like this:
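A sketch, with the `get_shared_state` wrapper from above repeated so the example is self-contained; host, port, and authkey are again placeholders:

```python
from flask import Flask
from multiprocessing import Lock
from multiprocessing.managers import AcquirerProxy, BaseManager, DictProxy

# Wrapper as sketched above, repeated so this example is self-contained.
def get_shared_state(host, port, key):
    shared_dict = {}
    shared_lock = Lock()
    manager = BaseManager((host, port), key)
    manager.register("get_dict", lambda: shared_dict, DictProxy)
    manager.register("get_lock", lambda: shared_lock, AcquirerProxy)
    try:
        manager.get_server()
        manager.start()
    except OSError:  # address already in use: connect to the existing server
        manager.connect()
    return manager.get_dict(), manager.get_lock()

app = Flask(__name__)
shared_dict, shared_lock = get_shared_state("127.0.0.1", 35792, b"secret")
shared_dict.setdefault("number", 0)

@app.route("/")
def counter():
    # The lock prevents two workers from reading the same old value
    # and writing back the same incremented result.
    with shared_lock:
        shared_dict["number"] += 1
        return str(shared_dict["number"])
```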
This works with any number of workers, e.g. `gunicorn -w 4 server:app`.