What sort of Python work on Windows stops mod_wsgi/Apache's ability to deal with multiple requests?

I'm using an Apache + mod_wsgi + Django setup on Windows with Python 2.7. It seems that the only setup possible on Windows for mod_wsgi is a single process with many worker threads. Because of this, some work may be subject to the GIL. I've noticed that database requests don't appear to cause locks, but some Python processing does cause locking/slowdown.

For example:

If I'm processing a large xml file using lxml via soaplib, it causes massive slowdowns. Reading the documentation, it seems the solution to this is to use WSGIApplicationGroup %{GLOBAL}. Side note, does this even work in Windows?

If I'm doing a large, CPU-intensive list processing job in pure Python, it also seems to slow other requests.

I'm wondering if there is a general class of work that will cause django/python to lock until it's finished. And if so, what are some best practices to avoid these issues?

There are 2 answers

Graham Dumpleton (best answer)

Setting WSGIApplicationGroup %{GLOBAL}, although advisable if using lxml, isn't going to solve performance issues. It protects against deadlocks and crashes caused by lxml not being compatible with sub interpreters. That is a completely different issue.

As to performance, if you have many requests running CPU intensive, Python-only code, then there will be some GIL contention, but that will only slow overall throughput. It will not block concurrent requests which are also doing CPU intensive work. This is because the Python interpreter implicitly makes the running thread yield control every certain number of Python byte code instructions so that other threads can run.
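To see this for yourself, here is a minimal sketch (not taken from mod_wsgi; the function names `cpu_bound` and `run_threads` are made up for illustration). It starts one long and one short pure-Python job in separate threads at the same moment. Because the interpreter switches between threads periodically, the short job would normally be expected to finish well before the long one, although the exact timings depend on the machine and the Python version.

```python
import threading
import time


def cpu_bound(n):
    """Pure-Python CPU work: sum of the squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_threads(sizes):
    """Run one cpu_bound job per thread.

    Returns a list of (result, seconds_until_finished) tuples,
    in the same order as `sizes`.
    """
    out = [None] * len(sizes)
    start = time.time()

    def worker(idx, n):
        out[idx] = (cpu_bound(n), time.time() - start)

    threads = [threading.Thread(target=worker, args=(i, n))
               for i, n in enumerate(sizes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    # One long "request" and one short one, started together.
    long_job, short_job = run_threads([3000000, 30000])
    print("long  job finished after %.3fs" % long_job[1])
    print("short job finished after %.3fs" % short_job[1])
```

If the short job only finished once the long one was done, that would point to something holding the GIL for the whole duration rather than ordinary contention between Python threads.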

The bigger problem is when you use a module with a C extension component that performs CPU intensive, long running tasks and, because it has to operate on Python data structures, cannot release the GIL to allow other threads to run. In other words, C code which takes a long time and doesn't release the GIL locks out other threads.

If you are seeing this sort of problem then, because Windows doesn't allow multi process Apache, you would have to use some sort of backend task queuing system which can farm out the actual work to separate processes. On UNIX systems you would use Celery or Redis Queue for that. What your options are on Windows I have no idea.
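As a rough illustration of farming work out to a separate process, here is a sketch using only the standard library `subprocess` module. It is not a task queue and not a replacement for Celery or Redis Queue. The worker script and the name `run_in_separate_process` are hypothetical, and the `python` argument is an assumption: inside Apache, `sys.executable` may not point at the Python interpreter, so you may have to pass an explicit path.

```python
import json
import subprocess
import sys

# Hypothetical worker: a separate interpreter that reads a JSON list of
# numbers on stdin, does the CPU-heavy part, and writes the result as JSON.
WORKER_SOURCE = r"""
import json, sys
numbers = json.load(sys.stdin)
json.dump(sum(n * n for n in numbers), sys.stdout)
"""


def run_in_separate_process(numbers, python=sys.executable):
    """Run the heavy computation in a child Python process.

    The calling thread only waits on pipes, so it is not holding the
    GIL while the child works and other request threads can keep running.
    """
    proc = subprocess.Popen(
        [python, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    out, _ = proc.communicate(json.dumps(numbers).encode("utf-8"))
    if proc.returncode != 0:
        raise RuntimeError("worker exited with status %d" % proc.returncode)
    return json.loads(out.decode("utf-8"))


if __name__ == "__main__":
    print(run_in_separate_process([1, 2, 3]))
```

Starting a new interpreter for every request is expensive, so this only makes sense for jobs that run much longer than the process start-up time. A long-lived worker process fed from a queue avoids that cost.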

bruno desthuilliers

As to why you get "massive slowdowns" when doing some CPU or memory intensive computations, it doesn't necessarily have to do with the GIL. It might just be that your system cannot cope with the load. Scaling heavy processing is usually dealt with by using a multiple server setup (possibly with parallelisation).