Calculating the number of people who live within or outside a certain distance from hospitals


I'm new to geospatial stats and can't figure out a simple question:

I have two datasets with spatial coordinates. One has coordinates of hospitals and clinics in a particular district. The other has coordinates of all households in that district.

Here's some mock data:

hospital_coord <- data.frame(
  longitude = c(80.15998, 72.89125, 77.65032, 77.60599),
  latitude  = c(12.90524, 19.08120, 12.97238, 12.90927)
)

people_coord <- data.frame(
  longitude = c(72.89537, 77.65094, 73.95325, 72.96746,
                77.65058, 77.66715, 77.64214, 77.58415,
                77.76180, 76.65470, 76.65480, 76.65490,
                76.65500, 76.65560, 76.65560),
  latitude  = c(19.07726, 13.03902, 18.50330, 19.16764,
                12.90871, 13.01693, 13.00954, 12.92079,
                13.02212, 12.81447, 12.81457, 12.81467,
                12.81477, 12.81487, 12.81497)
)

I would like to do the following:

  • Calculate what percentage of households are more than 2 kilometres from the nearest clinic/hospital
  • Create a column in the dataframe indicating whether each household is within or outside that 2km distance

Answer by Calum You (best answer):

I think this does what you want, using the more recent sf package rather than geosphere from the question linked. The approach is as follows:

  1. Convert the latitude/longitude points into geometry objects using st_as_sf
  2. Set the coordinate reference system to a standard long/lat one, since the data is in long/lat (this is WGS84, EPSG code 4326)
  3. Use st_distance to compute the distance between each household and each hospital as a units matrix, in metres.
  4. Convert that units matrix into a regular tibble, because units objects are a pain to deal with, and strip the units with as.numeric
  5. Use mutate_at to flag, for each household, whether each hospital is more than 2km away (T) or within 2km (F)
  6. Finally, use pmap and any to check each row and see if at least one hospital is within 2km, that is, whether at least one flag in the row is F

It looks like only the first household is within 2km of a hospital.
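As a rough sanity check on that claim, the distance between the first household and the second hospital can be worked out by hand with the haversine formula. This is only a base-R sketch and not part of the sf workflow: the helper name haversine_m and the spherical Earth with a mean radius of 6,371 km are assumptions made for illustration, so the figure may differ slightly from what st_distance reports.

```r
# Hypothetical helper: great-circle distance in metres via the haversine formula.
# Assumes a spherical Earth with a mean radius of 6,371 km.
haversine_m <- function(lon1, lat1, lon2, lat2, radius_m = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_m * asin(sqrt(a))
}

# First household vs. second hospital from the mock data.
d <- haversine_m(72.89537, 19.07726, 72.89125, 19.08120)
d < 2000  # TRUE: this pair is roughly 0.6 km apart, well under the threshold
```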

library(tidyverse)
library(sf)
hospital <- tibble(
  longitude = c(80.15998, 72.89125, 77.65032, 77.60599),
  latitude = c(12.90524, 19.08120, 12.97238, 12.90927)
  )
people <- tibble(
  longitude = c(72.89537, 77.65094, 73.95325, 72.96746, 77.65058,
                77.66715, 77.64214, 77.58415, 77.76180, 76.65470,
                76.65480, 76.65490, 76.65500, 76.65560, 76.65560),
  latitude = c(19.07726, 13.03902, 18.50330, 19.16764, 12.90871,
               13.01693, 13.00954, 12.92079, 13.02212, 12.81447,
               12.81457, 12.81467, 12.81477, 12.81487, 12.81497)
  )

hospital_sf <- hospital %>%
  st_as_sf(coords = c("longitude", "latitude")) %>%
  st_set_crs(4326)

people_sf <- people %>%
  st_as_sf(coords = c("longitude", "latitude")) %>%
  st_set_crs(4326)

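# Distance in metres between every household (rows) and every hospital (columns)
distances <- st_distance(people_sf, hospital_sf) %>%
  as_tibble() %>%
  mutate_at(vars(V1:V4), as.numeric) %>%
  # TRUE means this hospital is more than 2km from the household
  mutate_at(vars(V1:V4), function (x) x > 2000) %>%
  # Within 2km if at least one hospital is NOT more than 2km away
  mutate(within_2km = pmap_lgl(., function(V1, V2, V3, V4) any(!c(V1, V2, V3, V4))))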
# A tibble: 15 x 5
   V1    V2    V3    V4    within_2km
   <lgl> <lgl> <lgl> <lgl> <lgl>     
 1 T     F     T     T     T         
 2 T     T     T     T     F         
 3 T     T     T     T     F         
 4 T     T     T     T     F         
 5 T     T     T     T     F         
 6 T     T     T     T     F         
 7 T     T     T     T     F         
 8 T     T     T     T     F         
 9 T     T     T     T     F         
10 T     T     T     T     F         
11 T     T     T     T     F         
12 T     T     T     T     F         
13 T     T     T     T     F         
14 T     T     T     T     F         
15 T     T     T     T     F
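
The table above covers the second part of the question. The first part, the percentage of households more than 2km from the nearest facility, follows from the same distance matrix. The following is a minimal sketch rather than a definitive implementation: it takes the minimum distance per household, so it does not depend on the number of hospitals, and it attaches the result to the household data frame. The column name nearest_m and the variable pct_outside are introduced here for illustration.

```r
library(sf)

# Mock data from the question (longitude/latitude, WGS84).
hospital <- data.frame(
  longitude = c(80.15998, 72.89125, 77.65032, 77.60599),
  latitude  = c(12.90524, 19.08120, 12.97238, 12.90927)
)
people <- data.frame(
  longitude = c(72.89537, 77.65094, 73.95325, 72.96746, 77.65058,
                77.66715, 77.64214, 77.58415, 77.76180, 76.65470,
                76.65480, 76.65490, 76.65500, 76.65560, 76.65560),
  latitude  = c(19.07726, 13.03902, 18.50330, 19.16764, 12.90871,
                13.01693, 13.00954, 12.92079, 13.02212, 12.81447,
                12.81457, 12.81467, 12.81477, 12.81487, 12.81497)
)

hospital_sf <- st_as_sf(hospital, coords = c("longitude", "latitude"), crs = 4326)
people_sf   <- st_as_sf(people,   coords = c("longitude", "latitude"), crs = 4326)

# Households x hospitals matrix of distances in metres, with the units class dropped.
dist_m <- units::drop_units(st_distance(people_sf, hospital_sf))

# Distance from each household to its nearest hospital, and the 2km flag.
people$nearest_m  <- apply(dist_m, 1, min)
people$within_2km <- people$nearest_m <= 2000

# Percentage of households more than 2km from the nearest hospital.
pct_outside <- 100 * mean(!people$within_2km)
pct_outside
```

With the mock data this should flag only the first household as within 2km, so 14 of the 15 households (about 93%) fall outside the 2km distance.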