I have one netCDF file (.nc) with 16 years (1998 - 2014) worth of daily precipitation (5844 layers). The 3 dimensions are time (size 5844), latitude (size 19) and longitude (size 20). Is there a straightforward approach in R to compute, for each raster cell:
- Monthly & yearly average
- A cumulative comparison (e.g. Jan-Mar compared to the average of all Jan-Mar periods)
So far I have:
library(ncdf4)
library(raster)
Rname <- 'F:/extracted_rain.nc'
rainfall <- nc_open(Rname)
readRainfall <- ncvar_get(rainfall, "rain") # "rain" is the name of the (float) variable
raster_rainfall <- raster(Rname, varname = "rain") # also tried brick()
asdatadates <- as.Date(rainfall$dim$time$vals/24, origin='1998-01-01') #The time interval is per 24 hours
My first challenge will be the computation of monthly averages for each raster cell. I'm not sure how best to proceed while keeping the ultimate goal (cumulative comparison) in mind. How can I easily access only the days from a certain month?
raster(readRainfall[,,500]) # doesn't seem like a straightforward approach
Hopefully I made my question clear; a first push in the right direction would be much appreciated. Sample data here
Here is one approach using the zoo package.
In your example dataset you only have 18 layers, all coming from January 1998. However, the following should also work with more layers (months). First, we will build a function that operates on one vector of values (i.e. a pixel time series): it converts the input to a zoo object using the dates vector and then calculates the monthly mean using aggregate(). The function returns a vector with length equal to the number of months in dates.
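A minimal sketch of such a function, assuming the layer dates are available as a Date vector with one entry per layer. The name monthly_mean, the explicit dates argument and the synthetic three-month series are illustrative assumptions, not taken from the original data:

```r
library(zoo)

# Hypothetical daily dates; with the real file these would come from the
# netCDF time dimension (one Date per layer).
dates <- seq(as.Date("1998-01-01"), as.Date("1998-03-31"), by = "day")

# Monthly mean of one pixel time series `x` (one value per entry in `dates`).
monthly_mean <- function(x, dates) {
  pixel_ts <- zoo(x, dates)
  # Group the daily values by year-month and average each group
  out <- as.numeric(aggregate(pixel_ts, as.yearmon, mean, na.rm = TRUE))
  out[is.nan(out)] <- NA  # months without any valid observation
  out
}

# Synthetic series: 1 mm/day in January, 2 in February, 3 in March
x <- rep(c(1, 2, 3), times = c(31, 28, 31))
monthly_mean(x, dates)

# Selecting only the January days (of any year) with base R
jan <- format(dates, "%m") == "01"
mean(x[jan])
```

Grouping by as.yearmon keeps each month of each year separate, so with several years of data the result has one value per year-month rather than one per calendar month. The logical index jan shows one way to pick out the days of a single calendar month.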
Then, depending on whether you want the output to be a vector / matrix / data frame or want to stay in the raster format, you can either apply the function over the cell values after retrieving them with getValues(), or use the calc() function from the raster package to create a raster output (this will be a multi-layer raster with as many layers as there are months in your data).
When you're working with large raster datasets you can also apply your functions in parallel using the clusterR() function. See ?clusterR.
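Both options could look roughly like the sketch below. It uses a small synthetic brick as a stand-in for the rainfall file, so the object names (r, dates, monthly_mean, means_matrix, means_brick) and the data values are assumptions for illustration only:

```r
library(raster)
library(zoo)

# Synthetic stand-in for the rainfall brick (an assumption for illustration):
# 19 x 20 cells, one layer per day, constant 1 / 2 / 3 in Jan / Feb / Mar 1998.
dates <- seq(as.Date("1998-01-01"), as.Date("1998-03-31"), by = "day")
daily_value <- rep(c(1, 2, 3), times = c(31, 28, 31))
r <- brick(array(rep(daily_value, each = 19 * 20),
                 dim = c(19, 20, length(dates))))

# Monthly mean of one pixel time series; `dates` is taken from the enclosing
# environment because apply() and calc() pass only the cell values.
monthly_mean <- function(x) {
  out <- as.numeric(aggregate(zoo(x, dates), as.yearmon, mean, na.rm = TRUE))
  out[is.nan(out)] <- NA  # months without any valid observation
  out
}

# Option 1: plain matrix output, one row per cell and one column per month
v <- getValues(r)
means_matrix <- t(apply(v, 1, monthly_mean))

# Option 2: raster output, one layer per month; forceapply = TRUE asks calc()
# to call the function cell by cell rather than on a whole block of values
means_brick <- calc(r, monthly_mean, forceapply = TRUE)

dim(means_matrix)
nlayers(means_brick)
```

The forceapply argument is used here as a precaution: the function only makes sense for a single pixel series, so calc() should not try to call it on a matrix of several cells at once.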