Can an R extension safely allocate memory in the face of exceptional conditions?


I am about to write an extension package for R in C++ and wonder how dynamic memory management is intended to be used without risk of memory leaks. I have read

and immediately get to three questions:

  1. Does R gracefully unwind the C++ stack frame in case of R-exceptions, e.g. when R_alloc runs out of memory or Rf_error is called due to some other condition? – Otherwise, how am I supposed to clean up already R_alloc'ed and PROTECTed or simply Calloc'ed memory? For example, will

    #include<R.h>
    // […]
    void someMethod () {
      char *buffer1 = NULL;
      char *buffer2 = NULL;
      try {
        ClassA a;
        buffer1 = R_Calloc( 10000, char );
        buffer2 = R_Calloc( 10000, char );
        // […]
      } finally {
        try {
          if ( NULL != buffer1 ) {
            R_Free( buffer1 );
          }
        } finally {
          if ( NULL != buffer2 ) {
            R_Free( buffer2 );
          }
        }
      }
    }
    

    guarantee to call the destructor ~ClassA for a and R_Free for buffer1 and buffer2? And if not, what would be the R textbook way to guarantee that?

  2. Could standard C++ (nowadays deprecated) std::auto_ptr or modern std::unique_ptr be employed to simplify the memory allocation idiom?
  3. Is there a proven C++ idiom/best practice to use R's memory allocation in the C++ standard template library, e.g. some suitable allocator template, so that STL classes allocate their memory from the R heap?

Answer by Bernhard Bodenstorfer

Since Rf_error does indeed skip the C++ stack frames and thus bypasses destructor calls, I found it necessary to do more research in the documentation. A look into the RODBC package, together with experiments monitoring memory use to confirm the findings, led me to the following answers:

1: Immediately store the pointer in an R external pointer and register a finaliser for it.

The idiom is illustrated in the following somewhat simplistic example:

    #define STRICT_R_HEADERS    true

    #include <new>              // placement new
    #include <string>
    #include <R.h>
    #include <Rinternals.h>     // defines SEXP

    using namespace std;

    class A {
        string name;
        public:
        A ( const char * const name ) : name( name ) { Rprintf( "Construct %s\n", name ); }
        ~A () { Rprintf( "Destruct %s\n", name.c_str() ); }
        const char* whoami () const { return name.c_str(); }
    };

    extern "C" {
        // Finaliser: destroys the A object and releases its memory at most once.
        void finaliseAhandle ( SEXP handle ) {
            A* pointer = static_cast<A*>( R_ExternalPtrAddr( handle ) );
            if ( NULL != pointer ) {
                pointer->~A();
                R_Free( pointer );
                R_ClearExternalPtr( handle );
            }
        }

        SEXP createAhandle ( const SEXP name ) {
            A* pointer = R_Calloc( 1, A );
            SEXP handle = PROTECT( R_MakeExternalPtr(
                pointer,
                R_NilValue, // for this simple example no use of tag and prot
                R_NilValue
            ) );
            bool constructed = false;
            try {
                new(pointer) A( CHAR( STRING_ELT( name, 0 ) ) );
                constructed = true;
            } catch (...) {
                // Only swallow the C++ exception here. Rf_error long-jumps,
                // so it is raised below, after the handler has been left.
            }
            if ( !constructed ) {
                R_Free( pointer );
                R_ClearExternalPtr( handle );
                Rf_error( "construction of A(\"%s\") failed", CHAR( STRING_ELT( name, 0 ) ) );
            }
            R_RegisterCFinalizerEx( handle, finaliseAhandle, TRUE );
            // … more code may follow here, including calls to Rf_error.
            UNPROTECT(1);
            return handle;
        }

        SEXP nameAhandle ( const SEXP handle ) {
            A* pointer = static_cast<A*>( R_ExternalPtrAddr( handle ) );
            if( NULL != pointer ) {
                // Return a character vector (STRSXP); a bare CHARSXP from
                // mkChar is not a valid value to hand back to R code.
                return Rf_mkString( pointer->whoami() );
            }
            return R_NilValue;
        }

        SEXP destroyAhandle ( const SEXP handle ) {
            if( NULL != R_ExternalPtrAddr( handle ) ) {
                finaliseAhandle( handle );
            }
            return R_NilValue;
        }
    }

Setting the address to NULL via R_ClearExternalPtr( handle ) prevents R_Free( pointer ) from being called twice.

Mind that the suggested idiom still rests on an assumption: the constructor must not fail in the sense of R, i.e. by calling Rf_error. If this cannot be avoided, my advice would be to postpone the constructor invocation until after the finaliser registration, so that the finaliser will in any case be able to R_Free the memory. However, logic must then be included so that the destructor ~A is not called unless the A object has been validly constructed. In easy cases, e.g. when A comprises only primitive fields, this may not be an issue. In more complicated cases, I suggest wrapping A in a struct that remembers whether the A constructor completed successfully, and allocating memory for that struct instead. Of course, we must still rely on the A constructor to fail gracefully, freeing all memory it had allocated, regardless of whether this was done by R_Calloc or malloc or the like. (Experimentation showed that memory from R_alloc is automatically freed in case of Rf_error.)
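
The wrapper-struct suggestion might be sketched as follows. This is only a minimal illustration under stated assumptions, not code taken from RODBC or from R itself: Guarded, allocateGuarded, constructGuarded and finaliseGuarded are hypothetical names, and std::calloc / std::free stand in for R_Calloc / R_Free so that the snippet compiles without the R headers. In a real package, the external pointer and its finaliser would be set up right after the allocation step and before the construction step.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <stdexcept>
#include <string>

// Counts live A objects so the behaviour can be observed without R.
static int liveObjects = 0;

class A {
    std::string name;
public:
    explicit A ( const char* n ) : name( n ) {
        if ( name.empty() ) throw std::invalid_argument( "empty name" );
        ++liveObjects;
    }
    ~A () { --liveObjects; }
    const char* whoami () const { return name.c_str(); }
};

// Hypothetical wrapper: raw storage for an A plus a pointer that stays
// NULL until the constructor of A has completed.
struct Guarded {
    A* object;
    alignas(A) unsigned char storage[ sizeof(A) ];
};

// Step 1: allocate zeroed memory (stand-in for R_Calloc). In R, the
// external pointer and its finaliser would be created right after this.
Guarded* allocateGuarded () {
    void* raw = std::calloc( 1, sizeof(Guarded) );
    if ( raw == NULL ) throw std::bad_alloc();
    return new(raw) Guarded();          // value-initialised: object == NULL
}

// Step 2: construct A in place; the pointer is recorded only on success.
void constructGuarded ( Guarded* g, const char* name ) {
    g->object = new( g->storage ) A( name );
}

// Step 3: what the finaliser would do. It is safe whether or not the
// constructor of A ever completed.
void finaliseGuarded ( Guarded* g ) {
    if ( g == NULL ) return;
    if ( g->object != NULL ) g->object->~A();
    std::free( g );                     // stand-in for R_Free
}

// Failed construction: memory is released, ~A is skipped.
int demoFailedConstruction () {
    Guarded* g = allocateGuarded();
    bool failed = false;
    try {
        constructGuarded( g, "" );      // the constructor throws
    } catch ( const std::invalid_argument& ) {
        failed = true;
    }
    assert( failed && g->object == NULL );
    finaliseGuarded( g );
    return liveObjects;
}

// Successful construction: ~A runs exactly once in the finaliser.
bool demoSuccessfulConstruction () {
    Guarded* g = allocateGuarded();
    constructGuarded( g, "a" );
    const bool aliveDuring = ( liveObjects == 1 );
    finaliseGuarded( g );
    return aliveDuring && liveObjects == 0;
}
```

Storing the pointer returned by placement new, rather than a separate boolean, keeps "constructed" and "where the object lives" in one field, so the finaliser has a single condition to test.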

2: No.

Neither class has anything to do with registering R external pointer finalisers.

3: Yes.

As far as I have seen, it is considered best practice to cleanly separate the realms of C++ and R. Rcpp encourages the use of wrappers (https://stat.ethz.ch/pipermail/r-devel/2010-May/057387.html, cxxfunction in http://dirk.eddelbuettel.com/code/rcpp.html) so that C++ exceptions will not hit the R engine.
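
To illustrate what such a separating wrapper could look like, here is a hedged sketch. It is not Rcpp's actual implementation: callGuarded and entryPoint are hypothetical names, and the snippet avoids the R headers so that it is self-contained. The idea is to run all C++ work inside one try block, copy any exception message into a plain character buffer, and only report the error once every C++ object has been destroyed. In a real .Call entry point the failure branch would be the place to call Rf_error( "%s", message ).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Copies text into a caller-supplied buffer, always NUL-terminated.
static void copyMessage ( char* message, std::size_t size, const char* text ) {
    std::strncpy( message, text, size - 1 );
    message[ size - 1 ] = '\0';
}

// Hypothetical helper: runs body() and converts any C++ exception into a
// message in a plain buffer. Returns true on success. No exception leaves
// this function, and all locals of body() are destroyed before it returns.
template <typename Body>
bool callGuarded ( Body body, char* message, std::size_t size ) {
    assert( message != NULL && size > 0 );
    message[0] = '\0';
    try {
        body();
        return true;
    } catch ( const std::exception& e ) {
        copyMessage( message, size, e.what() );
    } catch ( ... ) {
        copyMessage( message, size, "unknown C++ exception" );
    }
    return false;
}

// Stand-in for a .Call entry point. In a real package the return type
// would be SEXP and the failure branch would call Rf_error( "%s", message ).
int entryPoint ( int n, char* message, std::size_t size ) {
    int result = 0;
    const bool ok = callGuarded( [&result, n] () {
        std::vector<int> work( 3, n );  // destroyed before callGuarded returns
        if ( n < 0 ) throw std::domain_error( "n must be non-negative" );
        result = work.at( 0 ) + work.at( 1 ) + work.at( 2 );
    }, message, size );
    return ok ? result : -1;
}

// Small helpers that make the behaviour observable.
int demoValue ( int n ) {
    char message[64];
    return entryPoint( n, message, sizeof message );
}

std::string demoMessage ( int n ) {
    char message[64];
    entryPoint( n, message, sizeof message );
    return std::string( message );
}
```

The message lives in a plain array rather than a std::string on purpose: an Rf_error raised afterwards would skip the destructor of any C++ object still alive in that frame.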

In my opinion, an allocator could be programmed to use R_Calloc and R_Free. However, to counter the effects of potential Rf_error during such calls, the allocator would require some interface to garbage collection. I imagine locally tying the allocator to a PROTECTed SEXP of type externalptr which has a finaliser registered by R_RegisterCFinalizerEx and points to a local memory manager which can free memory in case of Rf_error.
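
A rough sketch of that idea, under loudly stated assumptions: BlockRegistry and RegistryAllocator are hypothetical names, std::malloc / std::free stand in for R_Calloc / R_Free, and the coupling to a PROTECTed external pointer is left out so that the snippet compiles without R. The registry plays the role of the local memory manager; its releaseAll() is what a finaliser registered with R_RegisterCFinalizerEx would call. I have not verified this design inside an actual R package.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <set>
#include <vector>

// Hypothetical local memory manager: remembers every block handed out so
// that a finaliser could release whatever is still outstanding.
class BlockRegistry {
    std::set<void*> blocks;     // bookkeeping itself uses the C++ heap
public:
    BlockRegistry () {}
    BlockRegistry ( const BlockRegistry& ) = delete;
    BlockRegistry& operator= ( const BlockRegistry& ) = delete;
    ~BlockRegistry () { releaseAll(); }

    void* acquire ( std::size_t bytes ) {
        void* p = std::malloc( bytes > 0 ? bytes : 1 );  // stand-in for R_Calloc
        if ( p == NULL ) throw std::bad_alloc();
        try {
            blocks.insert( p );
        } catch ( ... ) {
            std::free( p );
            throw;
        }
        return p;
    }
    void release ( void* p ) {
        if ( blocks.erase( p ) > 0 ) std::free( p );     // stand-in for R_Free
    }
    // Frees everything still registered and reports how many blocks that was.
    std::size_t releaseAll () {
        const std::size_t n = blocks.size();
        for ( void* p : blocks ) std::free( p );
        blocks.clear();
        return n;
    }
    std::size_t outstanding () const { return blocks.size(); }
};

// Minimal C++11 allocator that routes all requests through a registry.
template <typename T>
struct RegistryAllocator {
    typedef T value_type;
    BlockRegistry* registry;

    explicit RegistryAllocator ( BlockRegistry* r ) : registry( r ) {}
    template <typename U>
    RegistryAllocator ( const RegistryAllocator<U>& other ) : registry( other.registry ) {}

    T* allocate ( std::size_t n ) {
        return static_cast<T*>( registry->acquire( n * sizeof(T) ) );
    }
    void deallocate ( T* p, std::size_t ) { registry->release( p ); }
};

template <typename T, typename U>
bool operator== ( const RegistryAllocator<T>& a, const RegistryAllocator<U>& b ) {
    return a.registry == b.registry;
}
template <typename T, typename U>
bool operator!= ( const RegistryAllocator<T>& a, const RegistryAllocator<U>& b ) {
    return !( a == b );
}

// Normal path: the vector returns its memory itself, nothing is left over.
std::size_t demoNormalPath () {
    BlockRegistry registry;
    {
        RegistryAllocator<int> alloc( &registry );
        std::vector<int, RegistryAllocator<int> > v( alloc );
        for ( int i = 0; i < 1000; ++i ) v.push_back( i );
        assert( registry.outstanding() >= 1 );
    }
    return registry.outstanding();
}

// Abandoned path: a block is acquired but never deallocated, as would
// happen if Rf_error jumped past a container's destructor. The registry
// can still free it.
std::size_t demoAbandonedPath () {
    BlockRegistry registry;
    RegistryAllocator<double> alloc( &registry );
    double* orphan = alloc.allocate( 16 );
    orphan[0] = 1.0;
    return registry.releaseAll();
}
```

Two caveats apply to this sketch: the registry must outlive every container that uses it, and after releaseAll() any container that was skipped over must never be touched again, since its memory is gone while its destructor has not run.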