Extract count of intersections for Open Street Map way IDs, by route type

1.7k views Asked by At

Edited with additional details added

I have a shapefile of 2,061 Open Street Map (OSM) road segments. Each segment in my shapefile is is identified by its OSM Way ID.

Here is an example of five segments from my data:

data = 
structure(list(osm_id = structure(1:5, .Label = c("17990110", 
"17993246", "17994983", "17994985", "17995338"), class = "factor"), 
    name = structure(c(1L, 3L, 4L, 5L, 2L), .Label = c("109th Avenue Northeast", 
    "85th Avenue Northeast", "Bunker Lake Boulevard Northeast", 
    "Foley Boulevard", "Northdale Boulevard Northwest"), class = "factor")), row.names = c(NA, 
5L), class = c("sf", "data.frame"))

For each of these 2061 road segments, I want to count the number of intersections with other roads, separately for each road type (residential, primary, tertiary...).

For example, this OSM way intersects 27 other ways, 11 of which are "residential," and 3 of which are "secondary" highways.

This is secondary to the analysis, but ultimately, for intersections where multiple types of roads join, I will select the "largest" type of road. For example, this node joins a service road and residential road; I would like to select the residential road road. I think I can create my own hierarchy list for this and deal with it later.

I am trying to use R Open Sci package osmdata

So far I can get at #2 (signalized intersections) using osmdata:

node_dat <- opq_osm_id(type = "node", id = '17990110')%>%
  opq_string()%>%
  osmdata_sf

node_dat_pts <- node_dat$osm_points

nrow(node_dat_pts[node_dat_pts$traffic_signals %in% "signal",])

This reveals that there are three traffic signals along this OSM segment. (Although, in reality, there are only two signalized intersections -- two signals are associated with a divided highway...but that might be a problem for another time).

I'm an R native, which is why the osmdata package is so appealing to me, but I'm open to exploring querying in Overpass API. TBH I found the example on how to get intersection nodes on the website not quite applicable -- and I'm unclear of how to scale up this process to the long list of 2000+ way IDs that I have (but if the docs or an example exist, point me to it).

I'm also open to exploring other tool libraries in Python; the Python osmnx package has what seem like excellent tools to tool to calculate "intersection density," but for specified polygons, like city boundaries, and does not seem to have functionality to create calls within Python by way or node ID.

I also know that I could probably do this in ArcGIS or QGIS, but because I have these OSM IDs handy in my database already, it just seems like a waste to load a whole dang shapefile of intersections for a large metropolitan region and do some kind of buffering process to get the information I need. Plus if I had a script handy to extract some pieces of information by way or node ID, I could more easily broaden the kinds of data I extract to get other tidbits of great information recorded for OSM elements.

Thank you spatial data community!

2

There are 2 answers

1
mpadge On BEST ANSWER

Traffic signals should always be tagged "highway" = "traffic_signals", but individual nodes may also be tagged with a key of "traffic_signals". Thus the first step to get all traffic signals can be done like this:

library(osmdata)
hw <- opq("minneapolis") %>%
    add_osm_feature(key = "highway") %>%
    osmdata_sf()
signal_nodes <- opq("minneapolis") %>%
    add_osm_feature(key = "traffic_signals") %>%
    osmdata_sf()
index <- which (!is.na (hw$osm_points$traffic_signals) |
                grepl ("traffic_signals", hw$osm_points$highway)) # grepl because can be "traffic_signals:<value>"
signal_node_ids <- unique (c (signal_nodes$osm_points$osm_id, hw$osm_points$osm_id [index]))

That then holds all the OSM ID values of nodes describing traffic signals.

One straightforward way to obtain junction densities is to convert the sf representation of highways to a dodgr network, which is a simple data.frame with each row being a network edge. The poly2line step converts strict sf polygons such as roundabouts into linestring objects, while the dodgr_contract_graph() call reduces the network to junction vertices only.

library(dodgr)
hw <- osm_poly2line(hw)$osm_lines %>%
    weight_streetnet(keep_cols = "highway", wt_profile = 1) %>% # wt_profile irrelevant here
    dodgr_contract_graph()

table(net$highway) will give you the frequencies of different kinds of highways. You can then inspect junction frequencies for particular types as follows

net_footway <- net[net$highway == "footway", ]
table(table(c(net_footway$from_id, net_footway$to_id)))

Values of 1 indicate one-way terminal nodes; values of 2 indicate two-way terminal nodes; values of 4 indicate cross-junctions between two edges; and so on. That table goes up to 14, because footways can be quite complex, and there is obviously a junction of seven footways somewhere in Minneapolis. Those IDs are OSM ids, so you can easily check which of them are in the signal_node_ids values to determine which have traffic signals.


Remaining issues to address:

  1. Intersections between singularly defined "highway" types are easy, but intersections between different types would require more complicated modifications to this code. Although straightforward, you'd need to consistently sub-set the dodgr data.frame in which edges are directed: $from_id -> $to_id.
  2. Associating signals with particular junctions will likely require some kind of spatial buffering, because as you suggest a single intersection may have multiple nodes with "traffic_signals". Note that there is no "right" way to do this, because, for example, junctions may have separate signals for pedestrians and automobiles, and decisions of whether to consider these to be "the same" signals will always entail some degree of subjectivity.
1
Ashley Asmus On

I ended up creating a few custom functions that rely upon the osmdat package. I discovered that osmdat allows the user to pass custom API calls to Overpass. After lots of trial and error in Overpass Turbo I figured out the Overpass syntax "well enough" to extract the information I needed. I think these three separate functions could be combined into a single Overpass API call, but that's beyond me.

So I first created a function to get a list of all the ways that were related to my "focal" way (1 of the 2,061 segments in my data frame):

get_related_ways <- function(wayid, bboxstring){
  related_ways_query <- 
    paste0("[bbox:", bboxstring,"];\n",
           "way(id:", wayid, ");\n",
           "rel(bw);\n", # get all sibling 
           "way(r);\n",
           "out;")

  related_ways <- osmdata_sf(related_ways_query)
  related_ways <- related_ways$osm_lines
  related_ways <- related_ways$osm_id
  return(related_ways)
}

I then created a function to extract just the node IDs for my focal way.

get_nodes_from_way <- function(wayid){
  nodes_from_way <- opq_osm_id(id = wayid, "way")%>%osmdata_sf()
  nodes_from_way <- nodes_from_way$osm_points
  nodes_from_way$geometry <- NULL
  setnames(nodes_from_way, old = 'osm_id', new = 'node_from_way_id')
  return(nodes_from_way)
}

And a third function to get the IDs of all ways intersecting my focal way. This function requires as input the node IDs of the focal way.

get_intersecting_ways <- function(nodeid, bboxstring){

  node_to_intways_query <- 
    paste0("[bbox:", bboxstring,"];\n",
           "node(id:", nodeid, ");\n",
           "way(bn)[highway];\n",
           "out;")

  intways_from_node <- osmdata_sf(node_to_intways_query)
  intways_from_node <- intways_from_node$osm_lines
  intways_from_node$geometry <- NULL
  intways_from_node <- intways_from_node[,names(intways_from_node) %in%c("osm_id", "name", "highway", "ref")]

  setnames(intways_from_node, old = 'osm_id', new = 'intersecting_way_id')
  setnames(intways_from_node, old = 'highway', new = 'intersecting_way_type')
  setnames(intways_from_node, old = 'ref', new = 'intersecting_way_ref')
  setnames(intways_from_node, old = 'name', new = 'intersecting_way_name')

  return(intways_from_node)
}

For all three functions, I have a "bboxstring" or bounding box string passed to Overpass, hoping to speed up the queries. Ha. Hoping...

Anyways. I created a nested for loop (don't judge me, I know purrr exists, I just find them intuitive!) using these three functions. I I also tried to machete my way through parallelizing this using foreach and doParallel, and separated my data set into 100 chunks of 26 ways each. It is still VERY slow. The Overpass API is slow maybe? More likely I have done something wrong in setting this up, this is only my 3rd or 4th time using doParallel.

for(this_part in unique(cmp_osmdat$partnum)){

osm_character_ids <- as.character(cmp_osmdat$osm_id)

  # test:
  # osm_character_ids <- osm_character_ids[1:3]

  # for each parallel process to get our intersecting ways ("all ways")
  all_ways <- 
    foreach(w = seq_along(osm_character_ids), 
            # require list of packages from above:
            .packages = packs, 
            .errorhandling = "remove", # remove data frames that throw an error
            # print the stuff: 
            .verbose = TRUE) %dopar% { 

              environmentIsLocked(asNamespace("curl"))
              unlockBinding(sym = "has_internet", asNamespace("curl"))
              assign(x = "has_internet", value = {function() T}, envir = asNamespace("curl"))


              this_way_id <- osm_character_ids[[w]]

              # find ways that are related to this one (same road, different segments)
              # so that we can filter these out as intersections:
              these_related_ways <- get_related_ways(this_way_id, this_bbox_string)

              # get nodes of this way:
              these_nodes_from_way <- get_nodes_from_way(this_way_id)

              # adding a column to store this way id, for easy rbind later 
              # (foreach doesn't store list names?)
              these_nodes_from_way$way_id <- this_way_id

              # create an empty list to store interesecting ways for each node:
              these_intersecting_ways <- list()

              # get intersecing ways from nodes: 
              for(n in seq_along(these_nodes_from_way$node_from_way_id)){
                this_node <- these_nodes_from_way$node_from_way_id[[n]]
                # put intersecting ways into our empty list (the name of the list item will be the node ID)
                these_intersecting_ways[[this_node]] <- get_intersecting_ways(this_node, this_bbox_string)
              } # end get intersecting ways from node

              # combine intersecting ways of each node into one data table:
              these_intersecting_ways_df <- rbindlist(these_intersecting_ways, idcol = 'node_from_way_id', use.names = T, fill = T)

              # get rid of intersections with this way's realtives (other segments of the same road):
              these_intersecting_ways_df <- these_intersecting_ways_df[!these_intersecting_ways_df$intersecting_way_id %in% these_related_ways,]

              # to get node information, merge intersecting ways to our node data: 
              nodes_and_ways <- merge(these_intersecting_ways_df, these_nodes_from_way, by = 'node_from_way_id')

              # return node and intersection data
              return(nodes_and_ways)

            } # end foreach

  nodes_and_ways_df <- rbindlist(all_ways, use.names = T, fill = T)

  # save file, one for each part (results in 10 csvs)
  write.csv(nodes_and_ways_df, 
            file = paste0("intersection_density_results/intersection-density-data-part-", this_part, ".csv"), row.names = F)

} # end 10 parts

stopCluster(cl)

The general logic of this process is:

  1. Select all the WayIDs from the data frame chunk (1-100)
  2. Grab a Way ID (the "Focal way") from my list of 26 ways in the chunk
  3. Find the Way IDs of the focal way's "siblings."
  4. Extract the node ids that comprise the focal way (along with information on where there are signals - TY, osmdata
  5. For each node id of the focal way, find the way ids of the ways that intersect it. Also grab the classification of those ways.
  6. Get rid of the "intersecting ways" that are actually siblings of the Focal Way -- these segments that are continuations of the Focal Way. (For example, I would delete this way from the list of intersecting ways if my Focal Way was this way
    1. rbindlist forever

This should take about 2-3 hours to run for all 2061 segments. That's a long time; but, even direct queries in Overpass Turbo are slow, so...maybe this is about right.