I have a spatial point data frame -> spatial_points
and a polygon -> spatial_poly
I can subset all points within the polygon using
subset_within <- spatial_points[spatial_poly,] which is nice and intuitive.
But if I want to subset all points outside the polygon, I can't use
subset_ouside <- spatial_points[-spatial_poly,]
This question has been asked before, and the answer was to use gDifference()
from the rgeos
package. Fine.
My question is, why does [ ] work for selecting within, not the inverse? I don't really understand the error message
Error in h(simpleError(msg, call)) : error in evaluating the argument 'i' in selecting a method for function '[': invalid argument to unary operator
Just curious. Thanks.
EDIT
Here is an example borrowed from Subset spatial points with a polygon
require(rgeos)
require(sp)
##create spdf
coords=expand.grid(seq(150,151,0.1),seq(-31,-30,0.1))
spdf=data.frame("lng"=coords[,1],"lat"=coords[,2])
coordinates(spdf) = ~lng+lat
proj4string(spdf)<- CRS("+init=epsg:4326")
plot(spdf)
##create poly
poly1 = SpatialPolygons(list(Polygons(list(Polygon(cbind(c(150.45,150.45,150.75,150.75,150.45),c(-30.75,-30.45,-30.45,-30.75,-30.75)))),ID=1)))
proj4string(poly1)<- CRS("+init=epsg:4326")
##get points withing polygon
points_within <-spdf[poly1,] # this works
plot(spdf)
plot(poly1, add=T)
plot(points_within,col="blue",pch=16,add=T)
##get points outside polygon
points_outside <-spdf[-poly1,] # this does not work - why??
In this simple example one can use gDifference()
, which works in this example. However, my SpatialPointDataframe is very large, and using gDifference crashes R.
When you do
df[2, 1]
in R, you are actually calling a function. The function is'['(df, 1, 2)
. It's just the parser hides this from you, and this allows you to write code in a more natural way.If you think about it, the
[
operator does different things depending on the type of object you are using, even if the operations are conceptually similar. The actual code that returns a subset of a numeric vector is different from the code that returns the subset of a matrix, or of a list. In fact, there are some objects in R for which calling the[
function doesn't make sense and isn't implemented. For example, if you try to call it on a function name:If you create a complex new class in R with various different members, you need to define what the
[
operator means, and you need to implement it. What does it mean to subset aSpatialPoints
class by aSpatialPolygon
class? R has no way to know that on its own, so when the author of thesp
package created theSpatialPolygons
class, he had to write the methods that do the subsetting based on the operands that are passed to the operator[
. You can see the source code here.If you trace the logic through, you will see that in the case of
spdf[poly1,]
, the subset is determined by the use of other spatial functions, and boils down toAnd these numeric subsets are then used to subset the actual polygons to return a new object consisting of just the subset. That means we could get the
points_outside
in a similar way:But to answer your main question, which is why does
spdf[-poly1,]
not work, you have to realise that this actually means'['(spdf, -poly1)
. For this to be evaluated, it would first be necessary to evaluate-poly1
, but if you try that, then you get:Of course, it doesn't really make sense to apply the
-
operator to aSpatialPoints
object on its own. Take the points away from what?In truth, it would be possible to write the function so it works in this way, but it would require a complex bit of non-standard evaluation. You could submit it as a feature request on that GitHub page, but personally I'd be happy using the function above.
I hope that makes things clearer.