I'm using an Apache + mod_wsgi + Django setup on Windows with Python 2.7. It seems that the only setup possible on Windows for mod_wsgi is a single process with many worker threads. Because of this, it seems that some work may be subject to the GIL. I've noticed that database requests don't appear to cause locks, but some Python processing does cause locking/slowdown.
For example:
If I'm processing a large XML file using lxml via soaplib, it causes massive slowdowns. Reading the documentation, it seems the solution to this is to use WSGIApplicationGroup %{GLOBAL}. Side note: does this even work on Windows?
If I'm doing a large, CPU-intensive list processing job in pure Python, it also seems to slow other requests.
I'm wondering if there is a general class of work that will cause django/python to lock until it's finished. And if so, what are some best practices to avoid these issues?
Setting WSGIApplicationGroup %{GLOBAL}, although advisable if using lxml, isn't going to solve performance issues. It protects against deadlocks and crashes due to lxml not being compatible with sub interpreters. That is a completely different issue.
As to performance, if you have many requests running CPU intensive, Python-only code, then there will be some GIL contention, but that will only slow the overall throughput and not block concurrent requests which are also doing CPU intensive work. This is because the Python interpreter causes a thread to implicitly yield control every certain number of Python byte code instructions so that other threads can run.
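To see this interleaving for yourself, a rough sketch along these lines may help. The function names (cpu_work, run_cpu_threads, intervals_overlap) are made up for illustration and are not part of mod_wsgi or Django. Each thread runs a pure-Python loop and records when it started and finished; if the recorded intervals overlap, the threads were taking turns rather than one blocking the other. Note that Python 2.7 switches after a number of byte code instructions, while newer Python versions switch on a time interval, so exact timings will differ.

```python
import threading
import time


def cpu_work(iterations):
    """Pure-Python CPU-bound loop; holds the GIL except at periodic switch points."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def run_cpu_threads(n_threads, iterations):
    """Run cpu_work in n_threads threads; return (start, end, total) per thread."""
    results = [None] * n_threads

    def worker(index):
        start = time.time()
        total = cpu_work(iterations)
        results[index] = (start, time.time(), total)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def intervals_overlap(a, b):
    """True if two (start, end, ...) intervals share any span of time."""
    return a[0] < b[1] and b[0] < a[1]


if __name__ == "__main__":
    first, second = run_cpu_threads(2, 2000000)
    print("threads ran concurrently:", intervals_overlap(first, second))
```

With enough iterations the two intervals would normally overlap, and the total wall time is roughly what running both loops back to back would take: that is the reduced throughput described above, not a lockout.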
The bigger problem is where you are using a module which has a C extension component and it is doing CPU intensive long running tasks and what it is doing means it has to operate on Python data structures and so cannot release the GIL to allow other threads to run. In other words, C code which takes a long time and doesn't release the GIL locks out other threads.
If you are seeing this sort of problem, because Windows doesn't allow multi process Apache, you would have to use some sort of backend task queuing system which can farm out the actual work to separate processes somehow. On UNIX systems you would use Celery, or Redis Queue for that. What your options on Windows are I have no idea.
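Short of a full task queue, one way to push work into a separate process is to launch a second Python interpreter for the heavy job. The following is only a minimal sketch under some assumptions: the worker (a sum of squares) is a hypothetical stand-in for the real CPU intensive task, run_in_separate_process is a made-up helper name, and python_exe must point at a real Python interpreter. Under an embedded setup such as mod_wsgi, sys.executable may not be the Python binary, so you would likely need to pass the path explicitly.

```python
import json
import subprocess
import sys

# Hypothetical worker program: reads a JSON list of numbers from stdin
# and writes the sum of their squares to stdout as JSON.
WORKER_SOURCE = (
    "import json, sys\n"
    "numbers = json.load(sys.stdin)\n"
    "json.dump(sum(n * n for n in numbers), sys.stdout)\n"
)


def run_in_separate_process(numbers, python_exe=sys.executable):
    """Run the worker in its own interpreter, which has its own GIL."""
    proc = subprocess.Popen(
        [python_exe, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # exchange text rather than bytes
    )
    # communicate() blocks on I/O, during which the calling thread
    # releases the GIL, so other request threads keep running.
    out, _ = proc.communicate(json.dumps(numbers))
    if proc.returncode != 0:
        raise RuntimeError("worker exited with code %d" % proc.returncode)
    return json.loads(out)


if __name__ == "__main__":
    print(run_in_separate_process([1, 2, 3]))
```

The request thread still waits for the result here, but the CPU work happens in another process and so does not compete for the web server's GIL. Starting an interpreter per request is expensive, so a long-lived worker fed from a queue would be the more robust design.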