ansible: 'A worker was found in a dead state': Python exits with SIGABRT when using Python.NET in an Ansible task


I am trying to use .NET C# code from Python in Ansible. The goal of the C# code is to download large files (100 MB) and store them locally.

When I run the Python code in a local test, it works. But when I run the same code via Ansible, the following error is thrown:

  • error! A worker was found in a dead state

[System-Overview for better understanding][1]

from pythonnet import load
load("coreclr")
import clr

clr.AddReference("yyyyyy.yyyy.yyy")

.....

def doWork(self, x, xx, feedUrl, xxx, xxxx):
    from xxxxx import Repository, PackageId, PackageVersion
    from System import Uri, Exception

    feedUri = Uri(str(feedUrl))
    pi = PackageId(str(x))
    pv = PackageVersion(str(xx))

    repo = Repository(feedUri, str(xxx), "", str(xxxx))

    info = repo.GetPackage(pi, pv)

    return info

Ansible code where the error is thrown

Question:

As described above, the Python code works in a Python test. When the same Python code runs under Ansible, something is different.

  • Why is the worker process closed with 'exitcode=-SIGABRT'?
  • What is the difference between executing the code in a test and executing it under Ansible?
  • What can I do to track the error down?

Environment

  • Virtual machine, Ubuntu 22.04
  • 12 GB RAM

What I tried so far:

  1. Made the C# code strictly synchronous.

  2. C# debug code: I added debug code to the C# class and tracked the error down to this call:

    var response = httpClient.Send(request);

  3. Added debug code to Ansible (task_queue_manager).

    With this code I get the error detail:

    worker: <WorkerProcess name='WorkerProcess-15' pid=16145 parent=16069 stopped exitcode=-SIGABRT>

  4. Increased the RAM and watched memory usage.

    Watching the RAM (via htop) shows nothing suspicious, from my point of view.
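For reference, a negative `exitcode` on a `multiprocessing` process means the child was terminated by that signal, so `-SIGABRT` says the worker aborted rather than returned. The following is a minimal sketch, not the Ansible code: `crashing_worker`, `run_worker` and `describe_exitcode` are hypothetical names, and `os.abort()` is only a stand-in for whatever native code aborts in the real worker. It also shows the standard `faulthandler` module, which can dump the Python traceback of the crashing process to a file.

```python
import faulthandler
import multiprocessing as mp
import os
import signal
import tempfile


def crashing_worker(log_path):
    """Hypothetical worker: enables faulthandler, then aborts."""
    # Keep the file object alive; faulthandler writes to its file descriptor.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file)
    os.abort()  # stand-in for the native crash; raises SIGABRT


def run_worker(target, log_path):
    """Run target in a forked child process and return its exit code."""
    ctx = mp.get_context("fork")  # assumption: Linux, as in the question
    proc = ctx.Process(target=target, args=(log_path,))
    proc.start()
    proc.join()
    return proc.exitcode


def describe_exitcode(exitcode):
    """Translate a negative exit code into the name of the signal."""
    if exitcode is not None and exitcode < 0:
        return signal.Signals(-exitcode).name
    return str(exitcode)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    code = run_worker(crashing_worker, path)
    print("worker ended with:", describe_exitcode(code))
    with open(path) as fh:
        print(fh.read())  # Python-level traceback of the aborted child
```

In the real setup, the `faulthandler.enable(...)` call would have to run inside the worker process before the .NET call. Where exactly to place it in an Ansible run is left open here.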


Edit:

Here is what I found out today:

  1. The HTTP request made by the HttpClient is terminated with a 'TCP Zero Window' message. TCP Zero Window means the client is not processing the received data, or is processing it too slowly.
  2. I rebuilt the Ansible code in my test and was able to reproduce the issue. It seems to be related to running the code in a Python process (multiprocessing) instead of a thread: the Thread code works, the Process code does not.
            # works
            thread = threading.Thread(target=self.DoWork)
            thread.start()
            thread.join()

            # does not work
            p = Process(target=self.DoWork)
            p.start()
            p.join()

Ansible uses processes for its workers, which matches this result, but I still do not know what the underlying problem is.
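One way to narrow this down further is to run the same function under a thread, a forked process and a spawned process, and compare the results. The sketch below assumes a hypothetical stand-in `do_work` in place of the real .NET call; `run_in_thread` and `run_in_process` are illustrative names. A forked child inherits the parent's memory but only the thread that called fork, so a runtime that was already loaded in the parent may behave differently in the child. Whether that is the cause here is an assumption this experiment could confirm or rule out.

```python
import multiprocessing as mp
import threading


def do_work():
    """Hypothetical stand-in for the method that calls into .NET."""
    return None


def run_in_thread(target):
    """Run target in a thread of the current process."""
    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return "thread finished"


def run_in_process(target, start_method):
    """Run target in a child process created with the given start method."""
    ctx = mp.get_context(start_method)
    proc = ctx.Process(target=target)
    proc.start()
    proc.join()
    # 0 means a clean exit; a negative value is the terminating signal number
    return proc.exitcode


if __name__ == "__main__":
    print("thread:", run_in_thread(do_work))
    for method in ("fork", "spawn"):
        print(method, "exitcode:", run_in_process(do_work, method))
```

If the real function fails only under "fork", that would point at state inherited from the parent process. If it fails under both start methods, the cause is more likely something else that differs between a thread and a child process.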
