Shared memory deleted at exit


I have two scripts, one that creates and writes a shared memory block, and a second one that reads that shared memory. The problem is that when the first script ends, the shared memory is deleted even though I do not unlink it. Here is my first script:

import argparse
import csv
import os
import subprocess
import sys
import time

from classes.rtData import RtData
from multiprocessing import shared_memory

if '__main__' == __name__:
    sys.stdout.write('starting server ... \n')
    service = RtData()

    shm_a = shared_memory.SharedMemory(name='rtdata', create=True, size=4)  # 4 bytes, matching the listing below
    data = shm_a.buf
    data[0] = 42
    shm_a.close()

When I add a breakpoint, I can see that the shared memory is created:

$ ls -l /dev/shm/
total 0
-rw------- 1 facundo facundo 4 Sep 28 09:41 rtdata

But when the script exits, the shared memory is deleted (so I cannot read it with the second script).
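A minimal reader along these lines (a sketch, reusing the 'rtdata' name above) is enough to reproduce the failure:

import sys

from multiprocessing import shared_memory

if '__main__' == __name__:
    # Attach to the block created by the first script (create=False).
    shm_b = shared_memory.SharedMemory(name='rtdata', create=False)
    sys.stdout.write('read value: %d\n' % shm_b.buf[0])
    shm_b.close()  # detach without unlinking; the writer owns the block

Run after the first script has exited, the attach fails with FileNotFoundError because the block is already gone from /dev/shm.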


There are 4 answers

Answer by Rudson Rodrigues:

It seems that you are missing a manager process.

For example (from https://www.geeksforgeeks.org/multiprocessing-python-set-2/):

import multiprocessing

def print_records(records): 
    """ 
    function to print record(tuples) in records(list) 
    """
    for record in records: 
        print("Name: {0}\nScore: {1}\n".format(record[0], record[1])) 

def insert_record(record, records): 
    """ 
    function to add a new record to records(list) 
    """
    records.append(record) 
    print("New record added!\n") 

if __name__ == '__main__': 
    with multiprocessing.Manager() as manager: 
        # creating a list in server process memory 
        records = manager.list([('Sam', 10), ('Adam', 9), ('Kevin',9)]) 
        # new record to be inserted in records 
        new_record = ('Jeff', 8) 

        # creating new processes 
        p1 = multiprocessing.Process(target=insert_record, args=(new_record, records)) 
        p2 = multiprocessing.Process(target=print_records, args=(records,)) 

        # running process p1 to insert new record 
        p1.start() 
        p1.join() 

        # running process p2 to print records 
        p2.start() 
        p2.join() 
Answer by user1077915:

I found the reason. The documentation states that

When one process no longer needs access to a shared memory block that might still be needed by other processes, the close() method should be called

I've found this bug report related to this issue https://bugs.python.org/issue39959#msg368770

I've tested this in my process by calling unregister(shared_memory_name, 'shared_memory') in the consuming process, and it works fine.
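A minimal sketch of that workaround on the consuming side (assuming the 'rtdata' block from the question; unregister() lives in the undocumented multiprocessing.resource_tracker module, and on POSIX the name registered internally may carry a leading slash, which is why the private shm._name is passed here):

from multiprocessing import shared_memory
from multiprocessing.resource_tracker import unregister

shm = shared_memory.SharedMemory(name='rtdata', create=False)
# Tell this process's resource tracker to forget the block, so it is not
# unlinked when this process exits.
unregister(shm._name, 'shared_memory')

print(shm.buf[0])  # read what the producer wrote
shm.close()        # detach without destroying the block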

Answer by ShadowRanger:

Here's what's happening, why it happens, and why the solution you found "works".

Shared memory can, on some operating systems, persist indefinitely if not explicitly unlinked, even if no running process is using it. This is a nasty resource leak that doesn't get cleaned up until the machine is rebooted. In C, that's on the programmer; if you didn't want it leaked, you should have been more careful. In Python, this violates general expectations (even if the programmer fails to do proper cleanup, Python will do it for you, emitting a ResourceWarning to the few developers who enable all warnings), so they didn't want to leave a potential leak scenario like this in there.

The solution to it is (on POSIX systems) a resource tracker process (from the undocumented multiprocessing.resource_tracker module) that is used automatically for any interprocess shared resources (right now, just shared memory and named semaphores). The resource tracker is launched as a separate process; the parent opens a pipe, hands the read end to the tracker, and keeps the write end, which the parent process (and its other children) write to whenever they want to register a resource for eventual cleanup.

The tracker process intentionally outlives the parent, constantly reading the lines written by the parent process to register/unregister resources for cleanup. Once every process holding the write end of the pipe has exited, though, its read loop hits end-of-file and terminates; any objects still registered for cleanup are cleaned up automatically, and then the resource tracker itself exits.
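A quick way to see that tracker process from a POSIX Python session (a sketch relying on internal, undocumented attributes that may change between versions):

import os
from multiprocessing import resource_tracker, shared_memory

# Creating (and thus registering) a block starts the tracker if needed.
shm = shared_memory.SharedMemory(create=True, size=16)
print('this process:   ', os.getpid())
print('tracker process:', resource_tracker._resource_tracker._pid)  # a different PID

shm.close()
shm.unlink()  # clean up explicitly so nothing is left for the tracker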

What this means is:

  1. If you don't unlink the shared memory yourself, the resource tracker will do it for you shortly after you, and all your child processes (that did not intentionally close the pipe to the resource tracker), have exited.
  2. If you want to prevent this, as you found, you can manually communicate with the resource tracker to unregister the shared memory: run from multiprocessing.resource_tracker import unregister, then, after creating the shared memory, call unregister(shared_memory_name, 'shared_memory') to immediately tell the resource tracker "Never mind, I'll handle cleaning it up manually." (just make sure you actually do; see the sketch after this list)
  3. This is needed even for more normal shared memory use cases (where process A launches first and creates/opens the shared memory, unrelated process B launches and opens it, process A exits, and then process C launches and wants to open the same shared memory; that open likely fails, because the memory was already unlinked and only exists in an unnamed state until process B closes its remaining handle(s) to it). The bug you found, and the still-open duplicate it was closed in favor of, propose documented solutions, but right now no such solution exists, so depending on undocumented internals is your only option.
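For item 2, a minimal producer-side sketch (assuming the 'rtdata' name from the question; the same leading-slash caveat as in the previous answer applies, hence the private shm._name):

from multiprocessing import shared_memory
from multiprocessing.resource_tracker import unregister

shm = shared_memory.SharedMemory(name='rtdata', create=True, size=4)
# Opt out of automatic cleanup: the block will now outlive this process.
unregister(shm._name, 'shared_memory')

shm.buf[0] = 42
shm.close()
# The block stays in /dev/shm until some process calls unlink() on it
# (or the machine reboots), so remember to clean it up eventually.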
Answer by Akira Cleber Nakandakare:

According to the other comments, this is a bug in Python 3.8 (and still present in Python 3.10).

I tried to follow user1077915's solution but couldn't make it work. Reading more, I found turicas' solution and, although this fix works only on Linux, it is fine for me.

The whole code with example is here: https://bugs.python.org/file49859/mprt_monkeypatch.py

As far as I understand, this code monkey-patches Python's multiprocessing internals, so an update to Python may eventually break it.

For convenience, below is the full code, with an example. I tested it successfully with Python 3.8.10 and Python 3.10.7 on Ubuntu for Windows.
I'm just sharing the code here. All credits go to the author.

from multiprocessing import Process, resource_tracker
from multiprocessing.shared_memory import SharedMemory


def remove_shm_from_resource_tracker():
    """Monkey-patch multiprocessing.resource_tracker so SharedMemory won't be tracked

    More details at: https://bugs.python.org/issue38119
    """

    def fix_register(name, rtype):
        if rtype == "shared_memory":
            return
        return resource_tracker._resource_tracker.register(name, rtype)
    resource_tracker.register = fix_register

    def fix_unregister(name, rtype):
        if rtype == "shared_memory":
            return
        return resource_tracker._resource_tracker.unregister(name, rtype)
    resource_tracker.unregister = fix_unregister

    if "shared_memory" in resource_tracker._CLEANUP_FUNCS:
        del resource_tracker._CLEANUP_FUNCS["shared_memory"]


def create_shm():
    remove_shm_from_resource_tracker()

    print("create_shm started")
    shared_mem = SharedMemory(create=True, size=1024, name="python_testshm")
    shared_mem.buf.obj.write(b"X" * 1024)
    shared_mem.close()
    print("create_shm finished")

def destroy_shm():
    remove_shm_from_resource_tracker()

    print("destroy_shm started")
    shared_mem = SharedMemory(create=False, name="python_testshm")
    result = shared_mem.buf.tobytes()
    shared_mem.close()
    shared_mem.unlink()
    print("destroy_shm finished")

def main():
    p1 = Process(target=create_shm)
    p1.start()
    p1.join()
    p2 = Process(target=destroy_shm)
    p2.start()
    p2.join()


if __name__ == "__main__":
    main()