Column means 3d matrix (cube) Rcpp


I have a program in which I need to repeatedly calculate the column means of each slice of a cube X(nRow, nCol, nSlice) in Rcpp, with the resulting means forming a matrix M(nCol, nSlice). The following code produces a compilation error:

#include <RcppArmadillo.h>

// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp; 
using namespace arma;

// [[Rcpp::export]]

mat cubeMeans(arma::cube X){
   int nSlice = X.n_slices;
   int nCol = X.n_cols;
   int nRow = X.n_rows;
   arma::vec Vtmp(nCol);
   arma::mat Mtmp(nRow, nCol);
   arma::mat Means(nCol, nSlice);
   for (int i = 0; i < nSlice; i++){
      Mtmp = X.slice(i);
      for(int j = 0; j < nCol; j++){
         Vtmp(j) = sum(Mtmp.col(j))/nRow; 
      }
      Means.col(i) = Vtmp;
   }
  return(wrap(Means));
}

'/Rcpp/internal/Exporter.h:31:31: error: no matching function for call to 'arma::Cube::Cube(SEXPREC*&)'

I couldn't quite figure it out. I didn't get the error when the input of the function was a matrix (and it returned a vector). However, I then included the above function, without exporting it, as part of my main program, i.e.

#include <RcppArmadillo.h>

// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;

mat cubeMeans(arma::cube X){
  int nSlice = X.n_slices;
  ...
  return(Means);
}

// [[Rcpp::export]]

main part of program

The program compiled successfully, but it is painfully slow (almost as slow as the R version of the program using colMeans). Is there a better way to calculate column means on a cube, and why am I getting that compilation error?

I'd appreciate any help.

Regards,

nrussell On BEST ANSWER

I also received this error when attempting to use an arma::cube as an Rcpp function parameter. Based on the compiler error, I believe this is because there is currently no Rcpp::wrap<arma::cube> defined (which would be needed to handle the R object you pass to the function).† After reading a couple of related examples online, it looks like the typical workaround is to read in your R array as a NumericVector and, since it retains its dim attribute, use that to set your arma::cube dimensions. Although an extra step or two is required to account for the missing wrap specialization†, the Armadillo version I put together seems to be quite a bit faster than my R solution:

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
arma::mat cube_means(Rcpp::NumericVector vx) {

  Rcpp::IntegerVector x_dims = vx.attr("dim");
  arma::cube x(vx.begin(), x_dims[0], x_dims[1], x_dims[2], false);

  arma::mat result(x.n_cols, x.n_slices);
  for (unsigned int i = 0; i < x.n_slices; i++) {
    result.col(i) = arma::conv_to<arma::colvec>::from(arma::mean(x.slice(i)));  
  }

  return result;
}

/*** R

rcube_means <- function(x) t(apply(x, 2, colMeans))

xl <- array(1:10e4, c(100, 100 ,10))
all.equal(rcube_means(xl), cube_means(xl))
#[1] TRUE

R> microbenchmark::microbenchmark(
    "R Cube Means" = rcube_means(xl),
    "Arma Cube Means" = cube_means(xl),
    times = 200L)
Unit: microseconds
            expr      min       lq      mean   median       uq       max neval
    R Cube Means 6856.691 8204.334 9843.7455 8886.408 9859.385 97857.999   200
 Arma Cube Means  325.499  380.540  643.7565  416.863  459.800  3068.367   200

*/

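Here I am taking advantage of the fact that arma::mean, when applied to an arma::mat, calculates column means by default (arma::mean(x.slice(i), 1) would give you the row means of that slice). The final false argument in the arma::cube constructor tells Armadillo to reuse the NumericVector's memory rather than copy it.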
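To make the memory layout behind this workaround concrete, here is a minimal sketch in plain C++ (no Rcpp or Armadillo). It assumes the array is stored column-major, as R stores it, so element (r, c, s) sits at offset r + nrow * (c + ncol * s). The function name flat_cube_col_means is hypothetical and exists only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Rcpp or Armadillo): column means of each
// slice of a column-major nrow x ncol x nslice array held in a flat buffer.
// The result is a flat, column-major ncol x nslice matrix, so the mean of
// column c in slice s is stored at index c + ncol * s.
std::vector<double> flat_cube_col_means(const std::vector<double>& data,
                                        std::size_t nrow,
                                        std::size_t ncol,
                                        std::size_t nslice) {
  // The buffer must hold exactly one value per (row, col, slice) triple.
  assert(data.size() == nrow * ncol * nslice);

  std::vector<double> means(ncol * nslice, 0.0);
  if (nrow == 0) return means;  // avoid dividing by zero for empty columns

  for (std::size_t s = 0; s < nslice; ++s) {
    for (std::size_t c = 0; c < ncol; ++c) {
      // Column c of slice s is a contiguous run of nrow values.
      const std::size_t start = nrow * (c + ncol * s);
      double sum = 0.0;
      for (std::size_t r = 0; r < nrow; ++r) {
        sum += data[start + r];
      }
      means[c + ncol * s] = sum / static_cast<double>(nrow);
    }
  }
  return means;
}
```

Because each column is contiguous in memory, the inner loop is a simple linear scan, which is also why wrapping the existing buffer instead of copying it costs nothing extra.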


Edit: † On second thought, I'm not really sure whether this has anything to do with Rcpp::wrap. The issue seems instead to be related to a missing Exporter<> specialization for arma::cube; the error points to line 31 of Rcpp's Exporter.h:

template <typename T>
class Exporter{
public:
  Exporter( SEXP x ) : t(x){}
  inline T get(){ return t ; }

private:
  T t ;
} ;

Regardless, the NumericVector / dimension-setting approach I used seems to be a functional solution for now.
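One way to read the compiler message is that the generic template constructs T directly from the SEXP, which only compiles when T has such a constructor. The sketch below imitates that situation in plain C++. Every name in it (FakeSexp, MatLike, CubeLike) is a hypothetical stand-in rather than a real Rcpp or Armadillo type, and it only illustrates the mechanism, not how RcppArmadillo is actually implemented.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for SEXP.
struct FakeSexp { int payload; };

// Hypothetical type that CAN be built from the handle.
struct MatLike {
  explicit MatLike(FakeSexp s) : value(s.payload) {}
  int value;
};

// Hypothetical type that CANNOT be built from the handle.
struct CubeLike {
  explicit CubeLike(int v) : value(v) {}
  int value;
};

// Same shape as the generic template quoted above: t(x) requires a
// constructor T(FakeSexp), otherwise instantiation fails to compile.
template <typename T>
class Exporter {
public:
  explicit Exporter(FakeSexp x) : t(x) {}
  T get() { return t; }
private:
  T t;
};

// A specialization supplies the conversion the generic template lacks.
template <>
class Exporter<CubeLike> {
public:
  explicit Exporter(FakeSexp x) : t(x.payload) {}
  CubeLike get() { return t; }
private:
  CubeLike t;
};

// Compile-time checks of the two cases described in the comments above.
static_assert(std::is_constructible<MatLike, FakeSexp>::value,
              "the generic Exporter works for MatLike");
static_assert(!std::is_constructible<CubeLike, FakeSexp>::value,
              "without the specialization, Exporter<CubeLike> would not compile");
```

Removing the specialization and instantiating Exporter<CubeLike> should produce a "no matching function for call" error of the same kind as the one in the question.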


Based on the output dimensions you described in your question, I assumed you wanted each column of the resulting matrix to be the vector of column means of the corresponding array slice (column 1 = column means of slice 1, and so on), i.e.

R> x <- array(1:27, c(3, 3, 3))
R> rcube_means(x)
     [,1] [,2] [,3]
[1,]    2   11   20
[2,]    5   14   23
[3,]    8   17   26
R> cube_means(x)
     [,1] [,2] [,3]
[1,]    2   11   20
[2,]    5   14   23
[3,]    8   17   26

but it would be trivial for you to alter this if needed.