Python Memory Leak - Why is it happening?

For some background on my problem, I'm importing a module, data_read_module.pyd, written by someone else, and I cannot see the contents of that module.

I have one file, let's call it myfunctions. Ignore the ### for now; I'll explain the commented-out portions later.

import data_read_module

def processData(fname):
    data = data_read_module.read_data(fname)
    ''' process data here '''     
    return t, x
    ### return 1

I call this within the framework of a larger program, specifically a Tkinter GUI. For the purposes of this post, I've pared it down to the bare essentials. Within the GUI code, I call the above as follows:

import myfunctions 

class MyApplication:
    def __init__(self,parent):
        self.t = []
        self.x = []

    def openFileAndProcessData(self):
        # self.t = None
        # self.x = None
        self.t,self.x = myfunctions.processData(fname)
        ## myfunctions.processData(fname)

I noticed that every time I run openFileAndProcessData, Windows Task Manager reports that my memory usage increases, so I thought that I had a memory leak somewhere in my GUI application. So the first thing I tried was the

# self.t = None
# self.x = None 

that you see commented out above. That had no effect. Next, I tried calling myfunctions.processData without assigning the output to any variables, as follows:

## myfunctions.processData(fname)

This also had no effect. As a last-ditch effort, I changed the processData function so that it simply returns 1 without processing any of the data that comes from data_read_module.pyd. Unfortunately, even this results in more memory being taken up with each successive call to processData, which narrows the problem down to data_read_module.read_data.

I thought that this was exactly the type of thing Python takes care of automatically. Referring to this website, it seems that memory taken up by a function will be released when the function terminates. In my case, I would expect the memory used in processData to be released after each call (with the exception of the output that I am keeping track of with self.t and self.x). I understand I won't get a fix for this kind of issue without being able to see inside data_read_module.pyd, but I'd like to understand how this can happen to begin with.

There is 1 answer

kindall

A .pyd file is basically a DLL. You're calling code written in C, C++, or another such compiled language. If that code allocates memory and doesn't release it properly, you will get a memory leak. The fact that the code is being called from Python won't magically fix it.
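
One way to check which side of the boundary the growth is on is to measure only the memory that Python itself allocates. The sketch below uses the standard tracemalloc module; process_data_stub and python_heap_growth are hypothetical names made up for illustration, and the stub is a pure-Python stand-in, not the real data_read_module. tracemalloc only sees allocations made through Python's own allocators, so memory that a compiled extension obtains directly from the C runtime does not show up in it.

```python
import tracemalloc


def process_data_stub(n_items=200_000):
    # Hypothetical stand-in for processData: builds a large temporary
    # list, then returns a small value, so nothing large outlives the call.
    data = list(range(n_items))
    return len(data)


def python_heap_growth(func, calls=5):
    """Return (growth, peak) in bytes of Python-tracked memory over repeated calls."""
    tracemalloc.start()
    try:
        func()  # warm-up call so one-time allocations are not counted as growth
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(calls):
            func()
        after, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    # growth: what is still held after the calls; peak: the high-water mark
    return after - before, peak - before


growth, peak = python_heap_growth(process_data_stub)
print(f"still held after 5 calls: {growth} bytes; peak during calls: {peak} bytes")
```

For the stub, the peak is several megabytes while the amount still held afterwards stays close to zero, which is the release-on-return behaviour the question expected. If the same measurement around the real processData also stays flat while Task Manager keeps climbing, that would suggest the growth comes from allocations made inside the .pyd, outside Python's control.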