Calling Python scripts from C++ threads (GIL)


I'm trying to call Python scripts from different C++ threads and have run into a problem.

main:

Py_Initialize();    
PyEval_InitThreads();
PyThreadState *mainThreadState = PyThreadState_Get();
PyEval_ReleaseLock();
PyInterpreterState *mainInterpreterState = mainThreadState->interp;
...
//creating threads with myThreadState per thread
    PyEval_AcquireLock();
    PyThreadState *myThreadState = PyThreadState_New(mainInterpreterState);
    PyEval_ReleaseLock();
//running threads
...
PyEval_RestoreThread(mainThreadState);
Py_Finalize();

run() function in thread object:

PyEval_AcquireLock();
PyThreadState_Swap(m_threadState);
...
script = "f = open('file_for_this_thread','w')\n"
         "print f\n"
         "f.write('111')\n"
         "print f.fileno()\n";
PyRun_SimpleString( script );
...
PyThreadState_Swap(NULL);
PyEval_ReleaseLock();

The first print shows the correct file info for each thread's file. But something is wrong: the second print shows the same value in different threads, and any output written after that point goes to one file instead of a separate file for each thread.
The file descriptors also become equal if I put time.sleep(1) in place of f.write. Nothing crashes.

I also tried using PyGILState_Ensure/PyGILState_Release, with the same effect.

main:

Py_Initialize();
PyEval_InitThreads();
PyThreadState*  mainThreadState = PyEval_SaveThread();
...
//creating and running threads
...
PyEval_RestoreThread(mainThreadState);
Py_Finalize();

locker:

// RAII helper: acquires the GIL on construction, releases it on destruction
class TPyScriptThreadLocker {
    PyGILState_STATE m_state;
public:
    TPyScriptThreadLocker(): m_state(PyGILState_Ensure()) {}
    ~TPyScriptThreadLocker() { PyGILState_Release(m_state); }
};

run() function in thread object:

TPyScriptThreadLocker lock;
...
script = "f = open('file_for_this_thread','w')\n"
         "print f.fileno()\n"
         "f.write('111')\n"
         "print f.fileno()\n";
PyRun_SimpleString( script );

I know that multithreading in Python is not a good idea in most cases, but right now I want to know what is wrong with this code.

Python 2.7
info from http://www.linuxjournal.com/article/3641?page=0,2

source: http://files.mail.ru/9D4TEF pastebin: http://pastebin.com/DfFT9KN3


There is 1 answer

jogojapan (best answer)

As analysed in my comments, the problem is caused by the fact that all threads in your code use the same instance of the Python interpreter, created and initialised here:

Py_Initialize();    

When the first thread runs the script defined here:

script = "f = open('file_for_this_thread','w')\n"   
         "print f.fileno()\n"
         "f.write('111')\n"                     
         "print f.fileno()\n"

this causes the Python interpreter to assign a global Python variable f. Shortly after that, another thread causes it to redefine the same global variable. This might not have happened at the time of the first print f.fileno(), but it apparently happens before the second one.
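
The effect can be reproduced without any C++ or real threads. The following is only a sketch, written for Python 3 (the question uses Python 2.7). It assumes that a plain dict, here called shared_globals, can stand in for the namespace that every PyRun_SimpleString call shares, and it uses strings in place of real file objects. The helper names are hypothetical, and the interleaving is simulated by hand rather than left to the scheduler.

```python
# Stand-in for the single namespace shared by all PyRun_SimpleString calls
# (an assumption made for this illustration).
shared_globals = {}


def run_first_half(thread_no):
    # Plays the role of "f = open(...)": binds the GLOBAL name f.
    # A string is used instead of a real file object.
    exec("f = 'file_for_thread_%d'" % thread_no, shared_globals)


def run_second_half():
    # Plays the role of the second print: looks up the GLOBAL name f.
    exec("seen = f", shared_globals)
    return shared_globals["seen"]


def simulate_interleaving():
    run_first_half(1)          # thread 1 assigns f
    run_first_half(2)          # thread 2 gets scheduled and rebinds f
    return run_second_half()   # thread 1 resumes and finds thread 2's f


print(simulate_interleaving())
```

Thread 1's second lookup finds the object that thread 2 stored, which matches the symptom described in the question.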

The solution is to ensure no global variables are shared across threads (or use a different instance of the Python interpreter in each thread, at great additional memory cost).

Since the only global Python variable in your code is currently f, it suffices to use a different name for f in every thread. As your code gets more complicated, it will be better to define a Python function and use f (and any other variables you need) as local variables:

PyRun_SimpleString(
   "def myfunc(thread_no):\n"
   "    f = open('file_for_thread_%d' % thread_no,'w')\n"
   "    print f.fileno()\n"
   "    f.write('111')\n"               
   "    print f.fileno()\n"
 );

The above would have to be applied only once and before any of the threads run.

In each thread, you would then simply do

QByteArray call = QString("myfunc(%1)\n").arg(current_thread_no).toUtf8();  // build the call as a C string
PyRun_SimpleString(call.constData());

I.e. the threads would only call the Python function, and f would become a local variable.
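
The same idea can be checked in pure Python. This is a minimal sketch for Python 3, not the code from the question: it uses Python's own threading module instead of C++ threads, and the directory parameter and the run_threads helper are hypothetical additions so that the files land in a temporary directory.

```python
import os
import tempfile
import threading


def myfunc(thread_no, directory):
    # f is a local variable, so every call (and thus every thread)
    # gets its own binding and cannot overwrite another thread's f.
    path = os.path.join(directory, "file_for_thread_%d" % thread_no)
    with open(path, "w") as f:
        f.write("thread %d" % thread_no)


def run_threads(count):
    # Start one thread per number, wait for all of them,
    # then read back what each thread wrote.
    with tempfile.TemporaryDirectory() as directory:
        threads = [
            threading.Thread(target=myfunc, args=(n, directory))
            for n in range(count)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        contents = {}
        for n in range(count):
            path = os.path.join(directory, "file_for_thread_%d" % n)
            with open(path) as f:
                contents[n] = f.read()
        return contents


print(run_threads(4))
```

Each file ends up containing only the text written by its own thread, because nothing about f is visible outside the call that created it.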