Crossing two sets of coordinates in order to calculate minimum distance? (tidyr geosphere R)

I have a set of 2,220 nest coordinates (var1) and another set of 26 landmark coordinates (var2) in the same bounded area. I want to find the distance from each of the 2,220 nest coordinates to every point in the set of 26, in order to create a new data frame with the columns (nest coordinates, minimum-distance landmark coordinates, distance in m).

I am stuck trying to cross the two sets to produce a set where all of the landmark coordinates are paired with each of the nest coordinates.

**nest**                **landmark**         **distance**

lat1, lon1              lat1, lon1            34
lat1, lon1              lat2, lon2            18
lat1, lon1              lat3, lon3            82
....
lat1, lon1              lat26, lon26          61
lat2, lon2              lat1, lon1            94
lat2, lon2              lat2, lon2            38
...
lat2220, lon2220        lat26, lon26          46

I've tried crossing(var1, var2) where var1 and var2 are both matrices containing lat & lon values and then calculating the Haversine distance between each resulting row (see below). This seems to work, but I don't think it's giving me the exact outcome I'm expecting. The number of resulting rows from the crossing is not consistent with the product of the nrow of these sets.

I also want to be able to split the resulting set with all the distance values into groups of 26, where each group contains the nest coordinates (repeated for each row), one of the 26 landmark coordinates, and the distance between the two points. From there, I will select for the row with the minimum distance.

newset <- crossing(nests, landmarks)
mindist <- distHaversine(newset[1], newset[2], r=6378137)
newsetwdist <- cbind(newset, mindist)

sv <- split(newsetwdist,rep(1:56056,each=26))
#56056 was the resulting number of rows, even though I expected 57,720.

var3 <- lapply(sv, "[", 3) #returns a nested list of all distances for each nest
var4 <- lapply(var2, "[[", "mindist")

df = as.data.frame(do.call(rbind, lapply(var4, unlist)))
min.dist.from.landmark <- apply(df, 1, FUN=min)

This seems like it should be an easy fix; any help would be appreciated.

Answer by rjen (best answer)

Using data and a data format that I produced for the occasion, you can do the following.

library(dplyr)
library(purrr)
library(tidyr)
library(geosphere)

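# Pair every nest row with every landmark row
crossing(nest, landmark) %>%
  # Build c(longitude, latitude) vectors, the order geosphere expects
  mutate(nest_long_lat = map2(nest_long, nest_lat, ~ c(.x, .y)),
         mark_long_lat = map2(mark_long, mark_lat, ~ c(.x, .y)),
         # Geodesic distance in metres between each landmark and nest
         distance = unlist(map2(mark_long_lat, nest_long_lat, ~ distGeo(.x, .y)))) %>%
  # Within each nest, flag the row(s) with the smallest distance
  group_by(nest_long_lat) %>%
  mutate(min_distance = distance == min(distance)) %>%
  ungroup() %>%
  # Drop the helper list columns
  select(-nest_long_lat, -mark_long_lat)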

# # A tibble: 57,720 x 6
#          nest_lat nest_long mark_lat mark_long distance min_distance
#          <dbl>    <dbl>     <dbl>    <dbl>     <dbl>    <lgl>       
# 1        46.5      49.1     48.4      49.8     215350.  TRUE        
# 2        46.5      49.1     48.6      48.7     229592.  FALSE       
# 3        46.5      49.1     48.8      49.9     255689.  FALSE       
# 4        46.5      49.1     48.9      48.4     268789.  FALSE       
# 5        46.5      49.1     49.3      50.1     312691.  FALSE       
# 6        46.5      49.1     49.3      49.2     309549.  FALSE       
# 7        46.5      49.1     49.6      51.6     390862.  FALSE       
# 8        46.5      49.1     49.7      50.8     371686.  FALSE       
# 9        46.5      49.1     49.8      50.6     377182.  FALSE       
# 10       46.5      49.1     49.9      49.9     376530.  FALSE       
# # … with 57,710 more rows
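
The `min_distance` flag marks the closest landmark but keeps every nest/landmark pair. If the goal is one row per nest, as described in the question, the following is a minimal sketch rather than a definitive version. It assumes dplyr 1.0.0 or later (for `slice_min()`) and relies on `distGeo()` being vectorised over the rows of two-column (longitude, latitude) matrices. The `nest_id` column is a hypothetical addition so that two nests with identical coordinates stay in separate groups, and `expand_grid()` is swapped in for `crossing()` so that duplicate rows are kept. The data are made up.

```r
library(dplyr)
library(tidyr)
library(geosphere)

# Small made-up data in the same format as the Data section
nest <- tibble(nest_lat = rnorm(n = 5, mean = 50),
               nest_long = rnorm(n = 5, mean = 50))
landmark <- tibble(mark_lat = rnorm(n = 3, mean = 50),
                   mark_long = rnorm(n = 3, mean = 50))

closest <- nest %>%
  mutate(nest_id = row_number()) %>%   # hypothetical id, one per nest row
  expand_grid(landmark) %>%            # every nest paired with every landmark
  mutate(distance = distGeo(cbind(nest_long, nest_lat),
                            cbind(mark_long, mark_lat))) %>%  # metres
  group_by(nest_id) %>%
  slice_min(distance, n = 1, with_ties = FALSE) %>%  # keep the nearest landmark
  ungroup()
```

`closest` then holds one row per nest with the coordinates of its nearest landmark and the distance to it.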

Data

# Random example coordinates centred on 50 (sd = 1)
nest <- tibble(nest_lat = rnorm(n = 2220, mean = 50),
               nest_long = rnorm(n = 2220, mean = 50))

landmark <- tibble(mark_lat = rnorm(n = 26, mean = 50),
                   mark_long = rnorm(n = 26, mean = 50))
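
On the row count in the question: `crossing()` de-duplicates and sorts its inputs, whereas `expand_grid()` keeps every row. 56,056 is 2,156 × 26, which would be consistent with 64 of the 2,220 nest rows being exact duplicates of others, though that is a guess since the original data were not posted. A small sketch with made-up coordinates shows the difference:

```r
library(tidyr)

# Made-up nests; the third row repeats the first
nest <- tibble(nest_lat = c(46.5, 47.1, 46.5),
               nest_long = c(49.1, 49.3, 49.1))
landmark <- tibble(mark_lat = c(48.4, 48.6),
                   mark_long = c(49.8, 48.7))

# crossing() drops the duplicate nest: 2 unique nests x 2 landmarks
n_crossing <- nrow(crossing(nest, landmark))

# expand_grid() keeps all rows: 3 nests x 2 landmarks
n_expand <- nrow(expand_grid(nest, landmark))
```

If every nest row should be kept even when coordinates repeat, `expand_grid()` may be the safer choice here.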