I have a csv file with the pond areas and Latitude and Longitude coordinates for 17,305 ponds. For each pond I would like to identify the coordinates of all the ponds within 1 km of it. I am an R novice so I thought I could adapt some nearest neighbour code. I found this loop in The R Book by Crawley:
x<-runif(100)
y<-runif(100)
par(pty="s")
plot(x,y,pch=16)
distance<-function(x1, y1, x2, y2) sqrt((x2 − x1)^2 + (y2 − y1)^2)
r<-numeric(100)
nn<-numeric(100)
d<-numeric(100)
for (i in 1:100) {
for (k in 1:100) d[k]<-distance(x[i],y[i],x[k],y[k])
r[i]<-min(d[-i])
nn[i]<-which(d==min(d[-i]))
}
for (i in 1:100) lines(c(x[i],x[nn[i]]),c(y[i],y[nn[i]]))
I adapted it and used the deg.dist function in fossil which uses the Haversine formula instead of using Pythagoras.
install.packages("fossil")
library(fossil)
Pond_A<-read.csv("C:\\ PondArea_data\\Pond_areas.csv")
r<-numeric(17305)
nn<-numeric(17305)
d<-numeric(17305)
for (i in 1:17305){
for (k in 1:17305) d[k]<-with(Pond_A,deg.dist(Longitude[i],Latitude[i],Longitude[k],Latitude[k]))
r[i]<-min(d[-i])
nn<-which(d<=1)
}
This appears to give me the identities of all the ponds in 1 km of the last pond. But try as I might I have not been able to work out how to get the answer for all the ponds. I would be very grateful if someone could give me a solution and perhaps explain why it works.
Thanks,
Aidan
You can create a Boolean matrix using gWithinDistance in the rgeos package. The row/col values represent the rownames of the sp object. You can then coerce the matrix to a dataframe and assign back to the sp object. For this example I use the meuse data from the sp package.