I'm running a script from a Linux terminal with Python 3.6.8 and the script started failing when I tried to expand it with a function definition. I whittled it down to the basics and found that the device fails to connect when there is a function definition followed by a print statement in the code, but not when there's a print statement followed by a function definition.
This code successfully connects to (and disconnects from) the device:
import DeviceInterface
device_class = DeviceInterface.Device()
print()
def dummy_function_that_does_nothing():
    pass
with device_class:
    pass
This code, which swaps the function definition and print statement, gives a device connection error:
import DeviceInterface
device_class = DeviceInterface.Device()
def dummy_function_that_does_nothing():
    pass
print()
with device_class:
    pass
These examples are the exact file contents of the scripts being run (nothing added or omitted for this post). The DeviceInterface module is a ctypes wrapper around a C-based .so library. That library uses Aravis v0.6.4. The connection failure is caused by a null pointer being returned from a call to arv_camera_new().
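For readers unfamiliar with how a null pointer from C surfaces in a ctypes wrapper, here is a minimal sketch. It does not use DeviceInterface or Aravis; it calls the standard C function getenv() through ctypes purely as a stand-in for a pointer-returning call such as arv_camera_new(). The helper name checked_getenv and the environment variable names are hypothetical, and loading symbols with ctypes.CDLL(None) is assumed to work as it does on Linux.

```python
import ctypes

# Load the symbols already linked into the running process (Linux behavior).
libc = ctypes.CDLL(None)

# Declare getenv() as a pointer-returning function. With restype set to
# c_void_p, ctypes converts a NULL return value into Python's None.
libc.getenv.argtypes = [ctypes.c_char_p]
libc.getenv.restype = ctypes.c_void_p


def checked_getenv(name):
    """Call getenv() and raise if the C function returns a null pointer."""
    ptr = libc.getenv(name.encode())
    if ptr is None:
        # This is the kind of check where a wrapper would report a failure.
        raise RuntimeError("C call returned a null pointer for %r" % name)
    return ptr  # a non-null pointer comes back as a plain int address


if __name__ == "__main__":
    try:
        checked_getenv("SURELY_UNSET_VARIABLE_FOR_THIS_SKETCH")
    except RuntimeError as exc:
        print(exc)
```

A wrapper built this way only sees that the pointer is null; the reason for the failure stays inside the C library.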
I would expect no difference between the two versions above. There seems to be something deeper going on in Python or the Linux libraries that I don't understand.
Why would the behavior differ depending on whether print() comes before or after the function definition? I have workarounds, so my question is not about how to get the code working, but about understanding at a low level why the order makes a difference.
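One way to check what differs at the Python level is to compile both scripts and compare their top-level bytecode. The sketch below only compiles the source text and never runs it, so DeviceInterface does not have to be importable. The names PRINT_FIRST, DEF_FIRST, and opnames are introduced here for illustration. The comparison can only show whether the compiler treats the two orderings differently; it says nothing about what happens inside the C library.

```python
import dis

# The two scripts from the question, as source text. They are only compiled,
# never executed, so the DeviceInterface module does not need to be installed.
PRINT_FIRST = """\
import DeviceInterface
device_class = DeviceInterface.Device()
print()
def dummy_function_that_does_nothing():
    pass
with device_class:
    pass
"""

DEF_FIRST = """\
import DeviceInterface
device_class = DeviceInterface.Device()
def dummy_function_that_does_nothing():
    pass
print()
with device_class:
    pass
"""


def opnames(source):
    """Compile module source and return its top-level opcode names in order."""
    code = compile(source, "<sketch>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


if __name__ == "__main__":
    a = opnames(PRINT_FIRST)
    b = opnames(DEF_FIRST)
    print("same instructions, ignoring order:", sorted(a) == sorted(b))
    print("same order:", a == b)
```

If the instructions match apart from their order, the difference between the two scripts is only when the print() call happens relative to the creation of the function object.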
Reproducibility
Unfortunately, I haven't found a way to reproduce the problem without a library that I do not have the rights to distribute. I'm hoping this reaches someone who knows how Python behaves differently when a function definition is followed by a print statement (versus a print statement followed by a function definition). If I understood the difference between the two versions, I could likely come up with a more generic way to reproduce the problem.
Other things I've tried
- I've inserted delays in various places, but none had an effect on whether the device successfully connected, so it doesn't seem to be a timing issue as I originally suspected.
- I ran both versions a number of times, and the problem is consistently tied to the order of the function definition and the print statement (the connection does not succeed or fail at random).
- If I remove the print statement entirely, it succeeds regardless of where I put the function definition.
- I thought it might have to do with garbage collection killing a socket. I tried disabling the garbage collection with gc.disable() at the start of the script, but it didn't change the behavior.
- This code, which adds an additional function definition, successfully connects:
import DeviceInterface
device_class = DeviceInterface.Device()
def dummy_function_that_does_nothing():
    pass
print()
def dummy_function_that_does_nothing_again():
    pass
with device_class:
    pass
- This code, which adds an additional function definition and another print statement, fails to connect:
import DeviceInterface
device_class = DeviceInterface.Device()
def dummy_function_that_does_nothing():
    pass
print()
def dummy_function_that_does_nothing_again():
    pass
print()
with device_class:
    pass
- Changing the print statement to print(flush=True) or print(sys.stderr) did not change the behavior. However, print(end="") caused the problem to go away.
- Running Python with unbuffered stdout/stderr (python3 -u odd_behavior_test.py) caused the failure to go away.
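The print variants in the list above differ in what they actually send to the output stream. The minimal sketch below illustrates this: print() writes a single newline, while print(end="") writes an empty string, so no bytes reach the stream at all. A line-buffered io.TextIOWrapper over io.BytesIO stands in for a terminal-attached stdout; that stand-in and the helper name bytes_emitted are assumptions for illustration, not the real sys.stdout. It shows only the Python side of the difference and does not explain why the C library reacts to it.

```python
import io


def bytes_emitted(**print_kwargs):
    """Return the bytes that reach the underlying binary stream after one print call.

    A line-buffered TextIOWrapper over BytesIO is used as a stand-in for a
    stdout attached to a terminal. It is an illustration only.
    """
    raw = io.BytesIO()
    fake_stdout = io.TextIOWrapper(
        raw, encoding="utf-8", newline="\n", line_buffering=True
    )
    # Same call shape as in the question, redirected to the stand-in stream.
    print(file=fake_stdout, **print_kwargs)
    return raw.getvalue()


if __name__ == "__main__":
    print("print():           ", repr(bytes_emitted()))
    print("print(flush=True): ", repr(bytes_emitted(flush=True)))
    print('print(end=""):     ', repr(bytes_emitted(end="")))
```

In this sketch, the variants that still fail in the list above are the ones that push a byte through the stream, and print(end="") is the one that pushes nothing.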