Perl XS garbage collection


I had to deal with a really old codebase at my company that exposes C++ APIs via Perl.

In one of the code reviews, I suggested that it was necessary to free the memory that was being allocated in C++.

Here is the skeleton of the code:

char* convert_to_utf8(char *src, int length) {
    .
    .
    .
    length = get_utf8_length(src);
    char *dest = new char[length];
    .
    .
    // No delete
    return dest;
}

Perl xs definition:

PROTOTYPE: ENABLE

char * _xs_convert_to_utf8(src, length)
    char *src
    int length

CODE:
    RETVAL = convert_to_utf8(src, length);

OUTPUT:
    RETVAL

So I commented that the memory allocated in the C++ function will not be garbage collected by Perl. Two Java developers think it will crash, since they believe Perl will garbage collect the memory allocated by C++. I suggested the following code:

CLEANUP:
    delete[] RETVAL;

Am I wrong here?

I also ran this code and showed them the increasing memory utilization, with and without the CLEANUP section. But they are asking for documentation that proves it, and I couldn't find any.

Perl Client:

use ExtUtils::testlib;
use test;

for (my $i=0; $i<100000000;$i++) {
    my $a = test::hello();
}

C++ code:

#define PERL_NO_GET_CONTEXT
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

#include "ppport.h"
#include <stdio.h>

char* create_mem() {
    char *foo = (char*)malloc(sizeof(char)*150);
    return foo;
}

XS code:

MODULE = test       PACKAGE = test

char * hello()
CODE:
    RETVAL = create_mem();
OUTPUT:
    RETVAL
CLEANUP:
    free(RETVAL);

There are 2 answers

Answer from Calle Dybedahl (best answer)

I'm afraid the people who wrote (and write) the Perl XS documentation probably consider it too obvious to document explicitly: Perl cannot magically detect memory allocations made in other languages (like C++). There is a passage in the perlguts documentation page saying that all memory to be used via the Perl XS API must be allocated with Perl's own macros, which may help you argue your case.

Answer from ikegami

When you write XS code, you're writing C (or sometimes C++) code. You still need to write proper C/C++, which includes deallocating allocated memory when appropriate.
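
As a rough illustration of "deallocating when appropriate", here is a minimal C++ sketch of matching each allocation with the right release call. The helpers make_with_new and make_with_malloc are hypothetical stand-ins for convert_to_utf8 and create_mem from the question; nothing here touches the Perl API.

```cpp
// Sketch only: make_with_new and make_with_malloc are hypothetical helpers
// standing in for convert_to_utf8 and create_mem from the question.
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Allocated with new[], so the matching release is delete[].
char* make_with_new(const char* text) {
    std::size_t n = std::strlen(text) + 1;   // Room for the terminating NUL.
    char* dest = new char[n];
    std::memcpy(dest, text, n);
    return dest;
}

// Allocated with malloc, so the matching release is free.
char* make_with_malloc(const char* text) {
    std::size_t n = std::strlen(text) + 1;
    char* dest = static_cast<char*>(std::malloc(n));
    assert(dest != nullptr);
    std::memcpy(dest, text, n);
    return dest;
}

// Copy the results out, then release each buffer with the matching call.
std::string use_both(const char* text) {
    char* a = make_with_new(text);
    char* b = make_with_malloc(text);
    std::string joined = std::string(a) + "/" + b;
    delete[] a;      // Pairs with new[].
    std::free(b);    // Pairs with malloc.
    return joined;
}
```

Calling use_both("abc") copies both buffers into a std::string before releasing them, which is the same order of operations a CLEANUP section gives you: convert first, free afterwards.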


The glue function you want XS to generate looks like this:

void hello() {
    dSP;                       // Declare and init SP, the stack pointer used by mXPUSHs.
    char* mem = create_mem();
    mXPUSHs(newSVpv(mem, 0));  // Create a scalar, mortalize it, and push it on the stack.
    free(mem);                 // Free memory allocated by create_mem().
    XSRETURN(1);
}

newSVpv makes a copy of mem rather than taking possession of it, so the above clearly shows that free(mem) is needed to deallocate mem.
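
To make the copy-versus-ownership point concrete without needing the Perl headers, here is a small standalone C++ sketch. ToyScalar and toy_new_from_cstr are hypothetical names invented for this illustration; they only imitate the one property that matters here (the scalar keeps its own copy of the bytes) and are not how Perl implements scalars.

```cpp
// Toy model only: ToyScalar and toy_new_from_cstr are hypothetical stand-ins,
// not part of the Perl API. They mimic a single property: the scalar copies
// the C string instead of adopting the caller's pointer.
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

struct ToyScalar {
    std::string value;   // The scalar's own private copy of the bytes.
};

// In the spirit of newSVpv(mem, 0): measure with strlen, then copy.
ToyScalar toy_new_from_cstr(const char* mem) {
    assert(mem != nullptr);
    return ToyScalar{std::string(mem, std::strlen(mem))};
}

// Allocate a buffer the way a C library might; the caller owns it.
char* create_greeting() {
    char* mem = static_cast<char*>(std::malloc(6));
    assert(mem != nullptr);
    std::memcpy(mem, "hello", 6);  // 5 characters plus the terminating NUL.
    return mem;
}

// The wrapper pattern: copy first, then release the original buffer.
ToyScalar wrap_greeting() {
    char* mem = create_greeting();
    ToyScalar sv = toy_new_from_cstr(mem);
    std::free(mem);                // Safe: sv holds its own copy.
    return sv;
}
```

Because the toy scalar owns a copy, the value returned by wrap_greeting() stays valid after the original buffer is freed. If the free call were left out, nothing else would ever release that buffer.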


In XS, you could write that as

void hello()
CODE:
    {                          // A block is needed since we're declaring vars.
        char* mem = create_mem();
        mXPUSHs(newSVpv(mem, 0));
        free(mem);
        XSRETURN(1);
    }

Or you could take advantage of XS features such as RETVAL and CLEANUP.

SV* hello()
    char* mem;                 // We can get rid of the block by declaring vars here.
CODE:
    mem = create_mem();
    RETVAL = newSVpv(mem, 0);  // Values returned by SV* subs are automatically mortalized.
OUTPUT:
    RETVAL
CLEANUP:                       // Happens after RETVAL has been converted
    free(mem);                 //   and the converted value has been pushed onto the stack.

Or you could also take advantage of the typemap, which defines how to convert the returned value into a scalar.

char* hello()
CODE:
    RETVAL = create_mem();
OUTPUT:
    RETVAL
CLEANUP:
    free(RETVAL);

All three of these are perfectly acceptable.


A note on mortals.

Mortalizing is a delayed reference count decrement. If you were to decrement the reference count of the SV created by hello before hello returns, the SV would be deallocated before hello returns. By mortalizing it instead, it won't be deallocated until the caller has had a chance to inspect it or take possession of it (by increasing its reference count).
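
The idea of a delayed decrement can be sketched in plain C++ with a toy reference-counted object and a list of pending decrements. All names below (ToySV, toy_mortalize, toy_free_temps and so on) are hypothetical and exist only for this illustration; this is a simplified model, not Perl's actual implementation.

```cpp
// Toy model only: every name here is hypothetical and merely imitates the
// idea of mortals; this is not how Perl implements them.
#include <cassert>
#include <vector>

struct ToySV {
    int refcnt = 1;      // A new value starts with one reference.
    int payload = 0;
};

static int g_live = 0;                 // Number of ToySV objects currently allocated.
static std::vector<ToySV*> g_temps;    // Decrements that have been postponed.

ToySV* toy_new(int payload) {
    ToySV* sv = new ToySV;
    sv->payload = payload;
    ++g_live;
    return sv;
}

// Take possession: add a reference.
void toy_inc(ToySV* sv) { ++sv->refcnt; }

// Drop a reference; free the object when the count reaches zero.
void toy_dec(ToySV* sv) {
    assert(sv->refcnt > 0);
    if (--sv->refcnt == 0) {
        delete sv;
        --g_live;
    }
}

// Mortalize: schedule the decrement instead of performing it now.
ToySV* toy_mortalize(ToySV* sv) {
    g_temps.push_back(sv);
    return sv;
}

// Run the postponed decrements, after the caller has had its chance.
void toy_free_temps() {
    for (ToySV* sv : g_temps) toy_dec(sv);
    g_temps.clear();
}

// A function that returns a value it does not want to keep owning.
ToySV* toy_hello() { return toy_mortalize(toy_new(42)); }

int toy_live_count() { return g_live; }
```

In this model, a value returned by toy_hello() is still alive when the caller receives it. If the caller calls toy_inc() before toy_free_temps() runs, the value survives; otherwise it is freed when the postponed decrement is applied.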