Monthly average from netCDF files in R

4.9k views Asked by At

I have one netCDF file (.nc) with 16 years(1998 - 2014) worth of daily precipitation (5844 layers). The 3 dimensions are time (size 5844), latitude (size 19) and longitude (size 20) Is there a straightforward approach in R to compute for each rastercell:

  • Monthly & yearly average
  • A cummulative comparison (e.g. jan-mar compared to the average of all jan-mar)

So far I have:

library(ncdf4)
library(raster)

Rname <- 'F:/extracted_rain.nc'
rainfall <- nc_open(Rname)
readRainfall <- ncvar_get(rainfall, "rain") #"rain" is float name
raster_rainfall <- raster(Rname, varname = "rain") # also tried brick()
asdatadates <- as.Date(rainfall$dim$time$vals/24, origin='1998-01-01') #The time interval is per 24 hours

My first challenge will be the compuatation of monthly averages for each raster cell. I'm not sure how best to proceed while keeping the ultimate goal (cummulative comparison) in mind. How can I easily access only days from a certain month?

raster(readRainfall[,,500])) # doesn't seem like a straightforward approach

Hopefully I made my question clear, a first push in the right direction would be much appreciated. Sample data here

3

There are 3 answers

2
joberlin On BEST ANSWER

Here is one approach using the zoo-package:

### first read the data
library(ncdf4)
library(raster)
library(zoo)

### use stack() instead of raster
stack_rainfall <- stack(Rname, varname = "rain")

### i renamed your "asdatadates" object for simplicity
dates <- as.Date(rainfall$dim$time$vals/24, origin='1998-01-01') 

In your example dataset you only have 18 layers, all coming from January 1998. However, the following should also work with more layers (months). First, we will build a function that operates one one vector of values (i.e. pixel time series) to convert the input to a zoo object using dates and the calculates the mean using aggregate. The function returns a vector with the length equal to the number of months in dates.

monthly_mean_stack <- function(x) {
    require(zoo)
    pixel.ts <- zoo(x, dates)
    out <- as.numeric(aggregate(pixel.ts, as.yearmon, mean, na.rm=TRUE))
    out[is.nan(out)] <- NA     
    return(out)
}

Then, depending on whether you want the output to be a vector / matrix / data frame or want to stay in the raster format, you can either apply the function over the cell values after retrieving them with getValues, or use the calc-function from the raster-package to create a raster output (this will be a raster stack with as many layers as there a months in your data)

v <- getValues(stack_rainfall) # every row displays one pixel (-time series)


# this should give you a matrix with ncol = number of months and nrow = number of pixel
means_matrix <- t(apply(v, 1, monthly_mean_stack))

means_stack <- calc(stack_rainfall, monthly_mean_stack)

When you're working with large raster datasets you can also apply your functions in parallel using the clusterR function. See ?clusterR

0
TJeff On

I think easiest to convert to raster brick and then into a data.frame.

Then can pull stats quite easily using general code DF$weeklymean <- rowMeans(DF[, ])

0
ClimateUnboxed On

The question asked for a solution in R, but in case anyone is looking to do this task and wants a simple alternative command-line solution, these kind of statistics are the bread and butter of CDO

Monthly averages:

cdo monmean in.nc monmean.nc

Annual averages:

cdo yearmean in.nc yearmean.nc

Make the average of all the Jan, Feb etc:

cdo ymonmean in.nc ymonmean.nc

The monthly anomaly relative to the long term annual cycle:

cdo sub monmean.nc ymonmean.nc monanom.nc

Then you want a specific month, just select with selmon, or seldate.

you can call these functions from R using the system command.