i'm doing a simple enough operation, reprojecting a raster from WGS84 to British National Grid, but i am wondering about some of the results post reprojection. The sum of the resulting rasters are quite different; is this due to the resolution and the way the bilinear interpolation in 'projectRaster' works?
I started with a csv that has global data covering the range of -180 to 180 degrees in both latitude and longitude (integer values only), with some z-values. this is subsetted, made into a raster of projection WGS84 and converted to BNG (subset below):
x <- rep(c(-10:3), times = 10)
n <- 14
y <- rep(48:57, each=n)
z <- rnorm(n=140, mean=20, sd=5)
ind <- which(z %in% sample(z, 45))
z[ind]<-NA
df <- data.frame("x"=x,"y"=y,"value"=z)
bng <- '+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +datum=OSGB36 +units=m +no_defs'
wgs84 <- '+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs'
coordinates(df) = ~x + y
ras = rasterFromXYZ(df, crs=wgs84)
cellStats(ras,sum)
plot(ras)
ras_bng <- projectRaster(ras,crs=bng)
plot(ras_bng)
cellStats(ras_bng,sum)
the sum goes from 1919 to 2625 (in my case), a fair chunk.
Is it purely from the reprojection creating so many extra cells around the 'edges'? if i were to reproject to a raster of much smaller resolution (5km), would this reduce the differing sums considerably?
thanks, S
> ras
class : RasterLayer
dimensions : 10, 14, 140 (nrow, ncol, ncell)
resolution : 1, 1 (x, y)
extent : -10.5, 3.5, 47.5, 57.5 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0
data source : in memory
names : layer
values : 3.243918, 32.21532 (min, max)
> ras_bng
class : RasterLayer
dimensions : 12, 19, 228 (nrow, ncol, ncell)
resolution : 67900, 111000 (x, y)
extent : -375677.8, 914422.2, -343553.2, 988446.8 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +datum=OSGB36 +units=m +no_defs +towgs84=446.448,-125.157,542.060,0.1502,0.2470,0.8421,-20.4894
data source : in memory
names : layer
values : 8.174968, 28.31331 (min, max)
This is because the
NA
values in 'ras'.The difference is small when you take out
z[ind]<-NA
. This is becauseprojectRaster
implicitly uses "na.rm=TRUE"; perhaps an argument is needed to change that. This is not as simple as in other cases, as, for example, only one of the cells interpolated from may beNA
, in which case it should probably be computed.In practice it is rare to find this many NA values, typically they are confined to the edges (of land) only.
By the way
WGS84
is a datum,it is not a projection. The projection you are using would be 'longitude/latitude' (also known by other names) except that this is also not a projection since the whole point of a projection is to go from such angular coordinates to planar coordinates. So, you are transforming one coordinate reference system ('long/lat, WGS84') to another (BNG).