Python subprocess behaves differently when called from Django vs. unit test

Question

220 views Asked by Dan At 05 October 2017 at 01:58

My first time posting - please go easy on me. I could not come up with a succinct title that summarizes this issue. I seem to have a codec problem.

My Django-based website calls a subprocess (soffice) to convert uploaded documents to plain text files, so that the extracted text can then be processed. This worked beautifully for a time. On my local dev machine, the unit tests for file conversion still work perfectly, as does the complete Django app, end-to-end. On the production server, where it all used to work, the file conversion call no longer behaves the same way from within the Django app, although it does work properly when run from the test code. This change in behavior appears to be the result of running general server updates.

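import subprocess

# outDir, filePath and textFilePath are set earlier (not shown)
args = ['soffice',
        '--headless',
        '--convert-to',
        'txt:Text',
        '--outdir',
        outDir,
        filePath]

subprocess.call(args)

try:
    # No encoding is given, so open() uses the locale's preferred encoding
    with open(textFilePath, "r") as fo:
        docText = fo.read()
except (OSError, UnicodeDecodeError):
    print("Failed to read", textFilePath)
    docText = None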

I removed some of the error checking to simplify a bit.

When I run the file conversion code as part of the complete Django application on the production server, I can see that certain special characters, such as the symbol §, are turned into garbage. But if I run the same file conversion code on its own, outside of Django, on the same machine, those symbols are not corrupted. As mentioned, on my dev machine it works both standalone and within Django.

The one difference between the two machines is how I run Django. Locally, it runs under Django's runserver command. On the production machine, it runs under mod_wsgi with Apache. I don't see how Django or mod_wsgi could interfere with what soffice is doing in the subprocess, but it does appear that way. I have opened a Python shell on the problem server and run essentially the same code as above, getting clean text back, and running the unit tests on that server works too.
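One way to compare the two environments described above is to print the encoding-related settings of each process. The following is only a minimal diagnostic sketch: the function name `encoding_report` is hypothetical, and it assumes Python 3. It could be run once from the shell and once from inside the Django app (for example, by logging the result from a view) to see whether the values differ.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the settings that decide how text is encoded in this process."""
    return {
        # Encoding open() uses when no encoding= argument is given
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used for file names and command-line arguments
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Locale variables that a child process such as soffice inherits
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print(name, "=", value)
```

If the shell reports a UTF-8 encoding while the web process reports an ASCII one, the locale is a likely suspect for the corrupted characters.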

Any help is sincerely appreciated!

There are 2 answers

Graham Dumpleton On 05 October 2017 at 04:02

If you are using mod_wsgi daemon mode, make sure you set the lang/locale; otherwise the process will inherit a default encoding of ASCII from the operating system.

http://blog.dscpl.com.au/2014/09/setting-lang-and-lcall-when-using.html

This would propagate through to subprocesses as well.
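Because a child process inherits its environment from the parent, the locale can also be passed to the subprocess explicitly. This is a workaround sketch, not the mod_wsgi configuration recommended in this answer. The helper names `utf8_env` and `convert_to_text` are hypothetical, `en_US.UTF-8` is an assumed locale that must actually be installed on the server, and the sketch assumes soffice writes UTF-8 text when run under a UTF-8 locale.

```python
import os
import subprocess


def utf8_env(locale_name="en_US.UTF-8"):
    """Return a copy of the current environment with a UTF-8 locale set.

    locale_name is an assumption; it must be a locale installed on the server.
    """
    env = dict(os.environ)
    env["LANG"] = locale_name
    env["LC_ALL"] = locale_name
    return env


def convert_to_text(file_path, out_dir, text_file_path):
    """Run the soffice conversion from the question with an explicit locale."""
    args = ["soffice", "--headless", "--convert-to", "txt:Text",
            "--outdir", out_dir, file_path]
    subprocess.call(args, env=utf8_env())
    # Assumption: under a UTF-8 locale the converted file is UTF-8 encoded
    with open(text_file_path, "r", encoding="utf-8") as fo:
        return fo.read()
```

Passing `env=` affects only the child process, and the explicit `encoding="utf-8"` keeps the read independent of the web server's own locale.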

If you are not using daemon mode, you really should consider doing so, as it is preferred over mod_wsgi's embedded mode. In embedded mode it is somewhat harder to change the lang/locale, because that must be done in the Apache startup scripts, and how you do that depends on the platform and distro.

Dan On 10 October 2017 at 01:00 BEST ANSWER

The solution was to upgrade mod_wsgi using:

pip install mod_wsgi --upgrade
