How to predict gam random field markov model to grid?

196 views Asked by At

I am working on a similar problem as the example in the mgcv package.

I have created a similar model and would like to predict to grid level instead of the districts identified in the training data. Is this possible? If so, how?

I have created a prediction dataset as an example.

library(mgcv)
data(columb)
data(columb.polys)

b <- gam(crime ~ s(district,bs="mrf",xt=xt),data=columb,method="REML")
plot(b,scheme=1)


#Create prediction dataset

df <- data.frame(district=numeric(0),x=numeric(0),y= numeric(0)) #Create empty df to store x, y and IDs for each polygon

# Extract x and y coordinates from each polygon and assign district ID
for (i in 1:length(columb.polys)) {
  district <- i-1
  x <- columb.polys[[i]][,1]
  y <- columb.polys[[i]][,2]
  df <- rbind(df,cbind(district,x,y)) 
}

sp <- df %>%
  group_by(district) %>%
  do(poly=dplyr::select(., x, y) %>%Polygon()) %>%
  rowwise() %>%
  do(polys=Polygons(list(.$poly),.$district)) %>%
  {SpatialPolygons(.$polys)}


#Convert sp to sf object
sp_sf <- st_as_sf(sp, proj4string='+proj=utm +zone=48N +ellps=WGS84 +units=m')
sp_sf <- st_set_crs(sp_sf, '+proj=utm +zone=48N +ellps=WGS84 +units=m')

#Make grid
grid <- st_make_grid(sp_sf, 
                           cellsize = 0.05, 
                              what = "centers", crs = '+proj=utm +zone=48N +ellps=WGS84 +units=m')

#intersection of grid and Columbus, Ohio
pred_grid <- st_intersection(grid, st_buffer(sp_sf, dist=0))

pred_b <- predict(b, pred_data)

Error in data[[txt]] : subscript out of bounds

Thank you!

1

There are 1 answers

0
Gavin Simpson On

This isn't possible with the MRF, at least so far as I know. In the model you show, you only have the MRF smooth, so the conditional distribution of the response is modelled as estimated values for each district under the constraint that the nearby districts are similar to one another-ish. To use the model you need to predict for the districts, not some other data.

I'm assuming in this line

pred_b <- predict(b, pred_data)

you meant

pred_b <- predict(b, pred_grid)

If your real problem has a mixture of data on a grid and at a region level, you can predict on the grid, but you'll need to assign each grid location to a region so that you can pass in a column like District which is what feeds into the the MRF term.