I have a Python 2.7 SocketServer that responds to requests over a congested local network (think of a cluster of computers running this Python server, all talking to a single controller through a switch).
On the other side of the TCP connection is a NodeJS application acting as a client for the server.
The issue occurs in roughly 1 in 1,000 requests, seemingly at random. The NodeJS client reports an "ECONNRESET" error, but most of the time it has already received all of the data. In about 1 in 10 of these errors, the "ECONNRESET" is thrown before all the data has been received.
After a fair bit of investigation, I have found that the issue lies in the SocketServer. I have tried using a C client as well as a Python client and both report their version of the same error.
I have also found (see code below) that if I put a "sleep(0.25)" in the socket server after all the writes but before the return within the handle function, this error no longer occurs (tested with ~3,000,000 runs).
This leads me to believe that there is some odd interaction in which the socket server forces the connection to close while one of the earlier packets still needs retransmission, or something along those lines, but the documentation on SocketServer is fairly light on this.
    from time import sleep
    import SocketServer

    class ThreadedServer(SocketServer.ThreadingMixIn, SocketServer.TCPServer):
        pass

    class ThreadedRequestHandler(SocketServer.BaseRequestHandler):
        def handle(self):
            req = parseRequest()
            res = processRequest()
            self.request.send(res)
            sleep(0.25)
            return

    def main():
        server = ThreadedServer((parameters.HOST, server_port), ThreadedRequestHandler)
        server.serve_forever()

    if __name__ == "__main__":
        main()
I have of course removed the business logic, but the issue centers on the sleep after the send. With it there, there are no issues; without it, there is sometimes an unexpected server close or the last packets are not sent. This leads me to believe that it has something to do with dropped packets and the server forcing the close, but it could be something else.
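For reference, here is a variation I could test in place of the sleep. It is only a sketch, and it rests on an assumption I have not confirmed: that the reset happens because the handler returns (and the socket is closed) while data is still in flight or unread. Instead of sleeping, the handler uses sendall, half-closes the write side with shutdown(SHUT_WR), and then reads until the client closes. The names GracefulHandler and demo, the echo "business logic", and the 2 second timeout are all made up for illustration.

```python
import socket
import threading

try:
    import socketserver                      # Python 3
except ImportError:
    import SocketServer as socketserver      # Python 2.7


class ThreadedServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    daemon_threads = True
    allow_reuse_address = True


class GracefulHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: half-close and drain instead of sleeping."""

    def handle(self):
        req = self.request.recv(4096)        # stand-in for parseRequest()
        res = b"echo:" + req                 # stand-in for processRequest()
        self.request.sendall(res)            # keeps sending until all bytes are queued
        self.request.shutdown(socket.SHUT_WR)  # signal end of response (FIN)
        self.request.settimeout(2.0)         # assumed upper bound on the wait
        try:
            # Read until the client closes its side (recv returns b"").
            while self.request.recv(4096):
                pass
        except socket.error:
            pass                             # timeout or reset: give up and return


def demo():
    """Run one request against the server on an OS-assigned local port."""
    server = ThreadedServer(("127.0.0.1", 0), GracefulHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    client = socket.create_connection(server.server_address)
    client.sendall(b"ping")
    chunks = []
    while True:
        data = client.recv(4096)
        if not data:                         # server half-closed: response complete
            break
        chunks.append(data)
    client.close()
    server.shutdown()
    server.server_close()
    return b"".join(chunks)
```

If this removes the error the same way the sleep does, that would point at the close happening too early rather than at dropped packets as such.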
Any help would be much appreciated, and if anything is unclear I can clarify.