I'm trying to get only non-NA values returned from osmdata. Take email addresses, for example: the query below returns mostly missing emails. How can I set up the query so that it returns only non-missing values? Setting value = "!null" didn't work either.
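library(osmdata)
# Query every object in San Jose that carries an 'email' tag
san <- opq(bbox = 'San Jose, California') %>%
  add_osm_feature(key = 'email') %>%
  osmdata_sf()
df <- san$osm_points
nrow(df)                # total number of points returned
sum(!is.na(df$email))   # points that actually have an email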
The osmdata package follows the same hierarchical structure as the OpenStreetMap data themselves. If you look into your data a bit further you'll see the following:

The osm_multipolygons are the highest-level objects in the hierarchy, and all of them have email addresses. Each of those also consists of numerous polygons, not all of which will necessarily have email addresses, and so there are 3 polygons with no email. The points list by default, both in OSM itself and in osmdata, includes every single point that is part of every higher-level object, and so unavoidably includes very many points with no email addresses. So there is no way you can issue a query only for objects in each category which do not have missing values (see further information in repo issue #221).

The result you desire can nevertheless be obtained via the unique_osmdata() function, which reduces each of the different types of objects down to only those unique values as requested in the original call. This should give you what you want:
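
A minimal sketch of that step, repeating the query from the question so that it runs on its own. The name san_unique is only illustrative, and the counts depend on the current state of the OSM data, so no output is shown here.

```r
library(osmdata)

# Same query as in the question (needs network access to the Overpass API).
san <- opq(bbox = 'San Jose, California') %>%
  add_osm_feature(key = 'email') %>%
  osmdata_sf()

# Drop points, lines and polygons that are only present as members of a
# higher-level object, keeping the unique items of each type.
san_unique <- unique_osmdata(san)   # 'san_unique' is a hypothetical name

# Re-check the points: total rows versus rows with a non-missing email.
pts <- san_unique$osm_points
nrow(pts)
sum(!is.na(pts$email))
```

If a few rows with a missing email still remain after this, an ordinary subset such as pts[!is.na(pts$email), ] removes them after the fact.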