spatSample() choosing NA cells even when na.rm = TRUE


I assume this is the cause of my error. I have a stack of 18 rasters which underwent the following preprocessing steps:

  • Cropping to a specific region using a shapefile
  • Trimming to the extent of the region
  • Masking to remove unwanted values within the region
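For readers who want to reproduce that kind of preprocessing, here is a minimal sketch of the three steps with terra. The raster, the WKT polygon standing in for the shapefile, and all object names are hypothetical, and the mask step assumes the unwanted cells are simply those outside the polygon, which may differ from the actual masking used:

```r
library(terra)

# Hypothetical input: a global 1-degree raster with random values
set.seed(1)
r <- rast(nrows = 180, ncols = 360)
values(r) <- runif(ncell(r))

# Hypothetical region polygon (stands in for the shapefile), same CRS as r
region <- vect("POLYGON ((0 0, 20 0, 20 10, 10 20, 0 10, 0 0))", crs = crs(r))

r_crop <- crop(r, region)        # 1. crop to the region's bounding box
r_trim <- trim(r_crop)           # 2. drop outer rows/columns that are all NA
r_mask <- mask(r_trim, region)   # 3. set cells outside the polygon to NA
```

After the last step the raster contains NA cells inside its extent, which is the situation the sampling below has to deal with.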

I'll provide an MRE below, but the original rasters look like this (all have the same shape, extent, CRS and position of NA cells):

[Image: plot of the original rasters]

In order to perform PCA with prcomp(), I first reduced the dataset to 300 cells using spatSample():

sampled_stack <- spatSample(
  stack,
  size = 300,                      # Reduce stack to 300 samples
  method = "random",               # Select samples randomly
  as.raster = TRUE,                # Get new SpatRaster w/ same extent but fewer cells
  na.rm = TRUE                     # Ignore NA cells when sampling
)

I need as.raster to be TRUE. The procedure works, but plotting sampled_stack already shows that na.rm = TRUE didn't work, as there are NA cells. For prcomp():

pca_stack <- prcomp(
  sampled_stack,
  center = TRUE,                   # Shift data to zero centering
  scale = TRUE                     # Scale data to have unit variance
)

Error in svd(x, nu = 0, nv = k) : infinite or missing values in 'x'

The error is self-explanatory and dozens of questions here address it. I am relatively sure that in my case it is caused by the NA cells in my sampled stack. However, adding na.rm = TRUE, which was the solution in this post, didn't work.

How else can I create a randomly reduced sample that ignores my NA cells to then perform prcomp()?

I am using terra 1.7-65 (I cannot update the package due to constraints in my workplace).

Minimal reproducible example w/ randomly generated raster, then random addition of sufficient NA cells:

library(terra)

set.seed(42)
r <- rast(nrows = 100, ncols = 100)
values <- sample(c(NA, runif(45)), size = ncell(r), replace = TRUE)
values[sample(ncell(r), size = 2500)] <- NA
values(r) <- values
stacked_rasters <- c(r, r * 2, r / 2)

Answer by Robert Hijmans (best answer)

This is not unexpected. The documentation states:

na.rm: logical. If TRUE, NAs are removed. Only used with random sampling of cell values. That is with method="random", as.raster=FALSE, cells=FALSE

You do not explain why you need as.raster = TRUE, since you can do:

pca_stack <- prcomp(spatSample(stacked_rasters, 100, na.rm=TRUE))
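To spell that out on the example data, here is a minimal sketch of two ways to get NA-free samples for prcomp(). The layer names "a", "b", "c" and the object names are made up for illustration. The second option, an alternative not mentioned above, assumes that a raster holding only the sampled cells (everything else NA) is an acceptable substitute for as.raster = TRUE:

```r
library(terra)

# Rebuild the question's example stack (layer names are hypothetical)
set.seed(42)
r <- rast(nrows = 100, ncols = 100)
v <- sample(c(NA, runif(45)), size = ncell(r), replace = TRUE)
v[sample(ncell(r), size = 2500)] <- NA
values(r) <- v
stacked_rasters <- c(r, r * 2, r / 2)
names(stacked_rasters) <- c("a", "b", "c")

# Option 1: sample cell values as a data.frame; na.rm = TRUE is honoured here
samp <- spatSample(stacked_rasters, size = 300, method = "random", na.rm = TRUE)
pca1 <- prcomp(samp, center = TRUE, scale. = TRUE)

# Option 2: list the non-NA cells yourself, then sample rows and keep cell numbers
df <- as.data.frame(stacked_rasters, cells = TRUE, na.rm = TRUE)
picked <- df[sample(nrow(df), 300), ]
pca2 <- prcomp(picked[, names(stacked_rasters)], center = TRUE, scale. = TRUE)

# If a raster is still needed: write the sampled values into an all-NA copy
m <- matrix(NA_real_, nrow = ncell(stacked_rasters), ncol = nlyr(stacked_rasters))
m[picked$cell, ] <- as.matrix(picked[, names(stacked_rasters)])
sampled_raster <- rast(stacked_rasters)   # same geometry, no values yet
values(sampled_raster) <- m
```

Note that sampled_raster still contains NA cells by construction, so it is only useful for plotting or for locating the samples; the PCA itself should be run on the data.frame.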