Count number of points in a 500-meter buffer in R


I can count the number of points within a 500 m buffer of each point with the code below. However, I want to count only the points within the 500 m buffer that share the same value of a given variable.

For example, the variable Race has 3 values (white, hispanic, black), and I want to find the number of points within 500 meters that have the same race as the reference point.

library(geosphere)

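# distm() expects longitude first, then latitude (assuming X = longitude, Y = latitude)
coordinates <- cbind(data$X, data$Y)
# Calculate distances (in meters) between each point and all other points
distances <- distm(coordinates, fun = distHaversine)

# Count the number of points within 500 meters (each count includes the point itself)
data$proximity <- rowSums(distances <= 500)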

There is 1 answer

margusl

You can split your dataset ( split(data, ~Race) ) and work with 3 distance matrices instead of one. This is a bit more convenient with sf & dplyr, though: there you can group the points by some attribute and then, for each location, count the points of the same group that lie within the given distance.

library(sf)
#> Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
library(dplyr)
library(ggplot2)

# generate example dataset
set.seed(42)
data <- data.frame(
  id = as.factor(1:7),
  X = runif(7, -.005, .005),
  Y = runif(7, -.005, .005),
  race = sample(c("white", "hispanic", "black"), 7, replace = TRUE))
data
#>   id             X             Y     race
#> 1  1  0.0041480604 -0.0036533340    white
#> 2  2  0.0043707541  0.0015699229    white
#> 3  3 -0.0021386047  0.0020506478 hispanic
#> 4  4  0.0033044763 -0.0004225822 hispanic
#> 5  5  0.0014174552  0.0021911225 hispanic
#> 6  6  0.0001909595  0.0043467225    black
#> 7  7  0.0023658831 -0.0024457118    black

# convert to sf object
data_sf <- 
  st_as_sf(data, coords = c("X", "Y"), crs = "WGS84")

# count points within distance, count includes origin point
# n_within : all points within 500m radius
# n_within_grouped : points from the same group within 500m radius
counts_sf <- 
  data_sf |>
  mutate(n_within = st_is_within_distance(geometry, dist = 500) |> lengths()) |>
  mutate(n_within_grouped = st_is_within_distance(geometry, dist = 500) |> lengths(), .by = race) 

Below are the resulting dataset with ungrouped and grouped counts and a plot with the point buffers. For example, check point "5": the total size of its neighbourhood is 5, while its grouped count (the number of triangle points within 5's buffer, itself included) is 3.

counts_sf
#> Simple feature collection with 7 features and 4 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -0.002138605 ymin: -0.003653334 xmax: 0.004370754 ymax: 0.004346722
#> Geodetic CRS:  WGS 84
#>   id     race                       geometry n_within n_within_grouped
#> 1  1    white POINT (0.00414806 -0.003653...        3                1
#> 2  2    white POINT (0.004370754 0.001569...        4                1
#> 3  3 hispanic POINT (-0.002138605 0.00205...        3                2
#> 4  4 hispanic POINT (0.003304476 -0.00042...        5                2
#> 5  5 hispanic POINT (0.001417455 0.002191...        5                3
#> 6  6    black POINT (0.0001909595 0.00434...        3                1
#> 7  7    black POINT (0.002365883 -0.00244...        4                1

ggplot(counts_sf) +
  # 500m buffers around points
  geom_sf(data = st_buffer(counts_sf, 500), aes(color = id, fill = id), alpha = .1) +
  geom_sf(aes(color = id, shape = race), size = 3) +
  geom_sf_text(aes(label = id), nudge_x = -.0004)+
  scale_color_brewer(palette = "Dark2") +
  scale_fill_brewer(palette = "Dark2") +
  ggspatial::annotation_scale() +
  theme_minimal()

Created on 2024-03-31 with reprex v2.1.0
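
For readers who want to stay with the geosphere approach from the question, here is a minimal sketch of a grouped count. It is a variant of the split idea above, not the same thing: instead of building one distance matrix per race, it keeps the single matrix and masks it with a same-race indicator. The example data and the column names n_within and n_within_grouped mirror the sf example; the sketch assumes X holds longitude and Y holds latitude. Because distHaversine and sf's s2 backend use slightly different Earth radii, counts for pairs very close to 500 m apart may not match the sf output exactly.

```r
library(geosphere)

# example data, same structure as in the sf example
# (assumption: X = longitude, Y = latitude)
set.seed(42)
data <- data.frame(
  id = as.factor(1:7),
  X = runif(7, -.005, .005),
  Y = runif(7, -.005, .005),
  race = sample(c("white", "hispanic", "black"), 7, replace = TRUE))

# distm() expects longitude in the first column, latitude in the second
distances <- distm(cbind(data$X, data$Y), fun = distHaversine)

# logical n x n matrix: TRUE where points i and j have the same race
same_race <- outer(data$race, data$race, "==")

# both counts include the origin point (its distance to itself is 0)
data$n_within <- rowSums(distances <= 500)
data$n_within_grouped <- rowSums(distances <= 500 & same_race)
data
```

Subtract 1 from either column if the reference point itself should not be counted. Note that the full n x n matrix grows quadratically with the number of points, so for large datasets the grouped sf approach or per-group matrices are likely the better fit.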