I have a rooted tree with spatially explicit edges (ln_sfnetwork
) with additional edges created by adding a point layer (pt
).
I would like to give all of the edges between each point on the network the same ID so I can calculate the total length of the network between points. I have a manual solution but this has to be done with large datasets > 20,000 points.
install.packages("remotes"); library("remotes")
install_github("luukvdmeer/sfnetworks")
library(sf)
library(sfnetworks)
library(dplyr)
ln <- structure(list(River_ID = c(159, 160, 161, 186, 196), geometry = structure(list(
structure(c(
289924.625, 289924.5313, 289922.9688, 289920.0625,
289915.7499, 289912.7188, 289907.4375, 289905.3438, 289901.1251,
289889, 289888.5, 289887.5938, 289886.5, 289886.4063, 289885.3124,
289884.0938, 289884.0001, 289882.8125, 289881.625, 289878.6875,
289877.9688, 289876.25, 289874.5625, 289874.25, 289872.7188,
289871.2813, 289871.1875, 289870.0313, 289869, 289868.5939,
289867.8436, 289865.8438, 289864.0625, 289862.5939, 289862.375,
289861.5, 289860.7812, 289860.5625, 289859.5313, 289858.375,
289857.7813, 289855.4063, 289854.25, 289850.8749, 289846.4376,
289841.9064, 289836.0625, 289828.1562, 289822.8438, 289816.625,
289812.4376, 289807.9064, 289798.75, 289793.125, 289786.2188,
289781.375, 289777.3124, 289770.0313, 289765.4375, 289762.2188,
289759.25, 289755.5938, 289753.0625, 289747.9687, 289743.7499,
289741.5938, 289739.5, 289736.1874, 289732.75, 289727, 289723.7499,
289719.625, 289715.5626, 289713.7499, 202817.531300001, 202817.2031,
202815.1094, 202812.468699999, 202809.3906, 202806.7656,
202799.7969, 202797.906300001, 202794.093800001, 202783.515699999,
202783.125, 202782.4844, 202781.906300001, 202781.8125, 202781.3594,
202781.093800001, 202780.9999, 202780.5469, 202780, 202777.625,
202777.0469, 202775.718800001, 202774.1875, 202773.906300001,
202772.1875, 202770.4531, 202770.25, 202768.5156, 202766.6719,
202766, 202764.0469, 202759.6719, 202755.8749, 202752.781300001,
202752.1875, 202749.953199999, 202748.297, 202747.906300001,
202746.0625, 202744.2344, 202743.5625, 202740.4375, 202738.8125,
202734.5, 202727.9844, 202723.5625, 202719.1875, 202714.9845,
202713.031300001, 202710.6875, 202710.0469, 202711.406300001,
202714.5626, 202716.9845, 202718.718900001, 202719.5469,
202718.734300001, 202716.4531, 202715.125, 202713.7344, 202712.093800001,
202709.8749, 202708.875, 202709.2655, 202710.7031, 202712.375,
202712.375, 202712.2344, 202711.0469, 202707.906300001, 202705.406300001,
202703.0469, 202701.468800001, 202700.7656
), .Dim = c(
74L,
2L
), class = c("XY", "LINESTRING", "sfg")), structure(c(
289954.375,
289953.5, 289950.6562, 289949.7499, 289949, 289948.125, 289946.0625,
289945.9688, 289944.5313, 289943.4063, 289941.3438, 289939.4375,
289937.4375, 289935.1875, 289932.75, 289930.625, 289928.8125,
289928.25, 289926.7188, 289925.5313, 289925.7813, 289925.625,
289925.4063, 289925.1251, 289924.625, 202872.75, 202872.031400001,
202868.7031, 202867.343699999, 202864.906199999, 202861.515699999,
202858.297, 202854.406300001, 202851.9375, 202849.468800001,
202847.703, 202846.75, 202845.4531, 202843.6719, 202843.0625,
202841.593900001, 202839.7344, 202839.2344, 202838, 202835.9375,
202832.875, 202825.7344, 202822.9531, 202819.4531, 202817.531300001
), .Dim = c(25L, 2L), class = c("XY", "LINESTRING", "sfg")), structure(c(
290042.6563, 290042.3437, 290041.5313, 290038.4376,
290037.625, 290036.5313, 290035.5313, 290034.8438, 290034.5313,
290033.7188, 290032.9375, 290032.125, 290030.3437, 290030.0313,
290028.625, 290027.5626, 290027.3438, 290026.7188, 290024.5313,
290023.625, 290020.625, 290018.0001, 290014.9375, 290012.0938,
290008.5625, 290004.375, 290000.0001, 289999.875, 289997.625,
289993.7188, 289990.5, 289987.1562, 289985.4063, 289980.375,
289973.3124, 289966.375, 289961.8438, 289959, 289954.375,
202884.0625, 202884.25, 202884.843800001, 202888.4531, 202889.75,
202891.0469, 202892.0469, 202892.656300001, 202892.843800001,
202893.2501, 202893.5469, 202893.656300001, 202893.4531,
202893.4531, 202893.343699999, 202893.093800001, 202893.0469,
202892.843800001, 202891.953199999, 202891.5469, 202889.843800001,
202888.218800001, 202885.1094, 202880.9219, 202877.5625,
202873.968800001, 202872.5469, 202872.5156, 202872.625, 202874.5469,
202876.734300001, 202878.1719, 202877.953199999, 202876.3125,
202873.468800001, 202872.031400001, 202872.906199999, 202873.0781,
202872.75
), .Dim = c(39L, 2L), class = c(
"XY", "LINESTRING",
"sfg"
)), structure(c(
290054.125, 290053.4375, 290052.5313,
290051.625, 290050.0313, 290048.125, 290044.125, 290040.4376,
290039.4375, 290036.9688, 290031.4375, 290027.5312, 290024.8125,
290021.7499, 290020.9688, 290018.3437, 290015, 290010.25,
290006.0313, 290002.4376, 290000.0001, 289999.2187, 289996.6875,
289995.3438, 289994.125, 289991.1875, 289989.2187, 289987.9688,
289986.125, 289980.5313, 289975.0314, 289970.9063, 289968.5625,
289961.0312, 289948.0001, 289939.625, 289933.1563, 289928.3125,
289926.5313, 289924.625, 202835.953199999, 202835.656300001,
202835.4531, 202835.343699999, 202835.5469, 202835.7656,
202836.25, 202836.4531, 202836.5469, 202836.5469, 202835.953199999,
202836.031400001, 202836.625, 202837.7969, 202838.4844, 202839.343699999,
202836.25, 202832.7656, 202832.3125, 202833.4844, 202834.4844,
202834.8125, 202834.2344, 202832.625, 202830.625, 202828.593800001,
202828.968800001, 202831.0625, 202833.2655, 202835.5781,
202838, 202838.906199999, 202839.125, 202836.4531, 202830.781300001,
202827.093800001, 202823.625, 202818.5, 202817.5625, 202817.531300001
), .Dim = c(40L, 2L), class = c("XY", "LINESTRING", "sfg")), structure(c(
290042.625, 290042.0313, 290041.2187, 290040.3125,
290038.4063, 290037.7188, 290035.8125, 290033.7188, 290030.9063,
290028.2187, 290021.5313, 290021.2187, 290014.2188, 290013.4063,
290012.3125, 290010.0625, 290007.9375, 290005.9688, 290004.125,
290000.0001, 289999.4063, 289998.3125, 289997.5312, 289996.8438,
289993.625, 289993.0314, 289989.7188, 289989.3438, 289987.625,
289987.2187, 289984.0313, 289978.125, 289977.9375, 289974.3437,
289972.7188, 289970.9375, 289967.9375, 289965.2187, 289965.1563,
289962.3437, 289960.5313, 289959.1251, 289959.0314, 289959.3438,
289959.4375, 289959.4375, 289959.3438, 289959.2187, 289958.9375,
289958.5313, 289956.125, 289954.375, 202953.781300001, 202952.4844,
202951.281300001, 202950.0781, 202948.1875, 202947.5781,
202945.8749, 202944.281300001, 202941.781300001, 202940.1875,
202936.375, 202936.1875, 202931.968800001, 202931.4844, 202930.875,
202929.093800001, 202927.1094, 202925.031300001, 202922.734300001,
202917.2031, 202916.4375, 202915.2031, 202914.5469, 202914.4531,
202911.4531, 202910.843800001, 202908.0469, 202907.75, 202906.75,
202906.5469, 202904.843800001, 202901.843800001, 202901.75,
202900.0469, 202899.156400001, 202898.0469, 202894.656300001,
202892.0469, 202891.9844, 202889.343699999, 202887.656300001,
202885.75, 202884.5469, 202883.343699999, 202882.5469, 202881.343699999,
202880.0469, 202879.343699999, 202877.656300001, 202876.25,
202874.25, 202872.75
), .Dim = c(52L, 2L), class = c(
"XY",
"LINESTRING", "sfg"
))
), n_empty = 0L, crs = structure(list(
input = "OSGB 1936 / British National Grid", wkt = "PROJCRS[\"OSGB 1936 / British National Grid\",\n BASEGEOGCRS[\"OSGB 1936\",\n DATUM[\"OSGB 1936\",\n ELLIPSOID[\"Airy 1830\",6377563.396,299.3249646,\n LENGTHUNIT[\"metre\",1]]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n ID[\"EPSG\",4277]],\n CONVERSION[\"British National Grid\",\n METHOD[\"Transverse Mercator\",\n ID[\"EPSG\",9807]],\n PARAMETER[\"Latitude of natural origin\",49,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8801]],\n PARAMETER[\"Longitude of natural origin\",-2,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8802]],\n PARAMETER[\"Scale factor at natural origin\",0.9996012717,\n SCALEUNIT[\"unity\",1],\n ID[\"EPSG\",8805]],\n PARAMETER[\"False easting\",400000,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",8806]],\n PARAMETER[\"False northing\",-100000,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",8807]]],\n CS[Cartesian,2],\n AXIS[\"(E)\",east,\n ORDER[1],\n LENGTHUNIT[\"metre\",1]],\n AXIS[\"(N)\",north,\n ORDER[2],\n LENGTHUNIT[\"metre\",1]],\n USAGE[\n SCOPE[\"unknown\"],\n AREA[\"UK - Britain and UKCS 49°46'N to 61°01'N, 7°33'W to 3°33'E\"],\n BBOX[49.75,-9.2,61.14,2.88]],\n ID[\"EPSG\",27700]]"
), class = "crs"), class = c(
"sfc_LINESTRING",
"sfc"
), precision = 0, bbox = structure(c(
xmin = 289713.7499,
ymin = 202700.7656, xmax = 290054.125, ymax = 202953.781300001
), class = "bbox"))), row.names = c(NA, -5L), class = c(
"sf",
"data.frame"
), sf_column = "geometry", agr = structure(c(River_ID = NA_integer_), .Label = c(
"constant",
"aggregate", "identity"
), class = "factor"))
pt <- structure(list(lat = c(
202805.8942, 202836.136, 202872.9487,
202905.3284
), lng = c(
289912.0584, 290014.8446, 290001.2364,
289984.9382
), id = 1:4, geometry = structure(list(structure(c(
289912.058400425,
202805.894199679
), class = c("XY", "POINT", "sfg")), structure(c(
290014.844597566,
202836.136003318
), class = c("XY", "POINT", "sfg")), structure(c(
290001.236395958,
202872.948712436
), class = c("XY", "POINT", "sfg")), structure(c(
289984.938209474,
202905.32838227
), class = c("XY", "POINT", "sfg"))), n_empty = 0L, crs = structure(list(
input = "OSGB 1936 / British National Grid", wkt = "PROJCRS[\"OSGB 1936 / British National Grid\",\n BASEGEOGCRS[\"OSGB 1936\",\n DATUM[\"OSGB 1936\",\n ELLIPSOID[\"Airy 1830\",6377563.396,299.3249646,\n LENGTHUNIT[\"metre\",1]]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n ID[\"EPSG\",4277]],\n CONVERSION[\"British National Grid\",\n METHOD[\"Transverse Mercator\",\n ID[\"EPSG\",9807]],\n PARAMETER[\"Latitude of natural origin\",49,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8801]],\n PARAMETER[\"Longitude of natural origin\",-2,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8802]],\n PARAMETER[\"Scale factor at natural origin\",0.9996012717,\n SCALEUNIT[\"unity\",1],\n ID[\"EPSG\",8805]],\n PARAMETER[\"False easting\",400000,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",8806]],\n PARAMETER[\"False northing\",-100000,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",8807]]],\n CS[Cartesian,2],\n AXIS[\"(E)\",east,\n ORDER[1],\n LENGTHUNIT[\"metre\",1]],\n AXIS[\"(N)\",north,\n ORDER[2],\n LENGTHUNIT[\"metre\",1]],\n USAGE[\n SCOPE[\"unknown\"],\n AREA[\"UK - Britain and UKCS 49°46'N to 61°01'N, 7°33'W to 3°33'E\"],\n BBOX[49.75,-9.2,61.14,2.88]],\n ID[\"EPSG\",27700]]"
), class = "crs"), class = c(
"sfc_POINT",
"sfc"
), precision = 0, bbox = structure(c(
xmin = 289912.058400425,
ymin = 202805.894199679, xmax = 290014.844597566, ymax = 202905.32838227
), class = "bbox"))), row.names = c(NA, 4L), class = c(
"sf",
"data.frame"
), sf_column = "geometry", agr = structure(c(
lat = NA_integer_,
lng = NA_integer_,
id = NA_integer_
), .Label = c(
"constant",
"aggregate", "identity"
), class = "factor"))
Make it an sfnetwork
ln_sfnetwork <- as_sfnetwork(ln)
Add the point geometry
ln_sfnetwork <- st_network_blend(ln_sfnetwork, st_geometry(pt))
Use the point geom to create new edges by essentially splitting the sfnetwork
ln_sf <- ln_sfnetwork %>%
activate("edges") %>%
mutate(new_river_id = as.character(1:n())) %>%
st_as_sf()
The manual bit to group all of the edges between the nodes from pt
. This groups all of the edges occuring between the points.
ln_sf[["new_river_id"]][2] <- "1"
ln_sf[["new_river_id"]][c(1, 3, 7, 5, 9)] <- "2"
ln_sf[["new_river_id"]][6] <- "3"
ln_sf[["new_river_id"]][4] <- "4"
ln_sf[["new_river_id"]][8] <- "5"
Desired output
Then group_by
to get the combined length
ln_sf_merged <- ln_sf %>%
group_by(new_river_id) %>%
summarise(fraglen = sum(seglen)) %>%
as.data.frame() %>%
select(-geometry)
Finally
out_sf <- ln_sf %>%
left_join(ln_sf_merged, by = "new_river_id")
Interesting problem and nice to see
sfnetworks
being used on other type of networks than road networks! I gave this some thought. Beware, the answer is long and may be complicated sometimes.It is clear what you want the output of your example to be, but the example does not cover many different possibilities of how the network structure can look like after blending in the given points. Therefore it is hard to generalize the rules that should be used for merging. But I made some assumptions regarding that:
As far as I understand you want to group edges based on what the first point is that you pass when travelling downstream on the river network. Hence, if the paths downstreams (towards the root of the tree) from edge 1 and edge 2 both pass through point 1 before passing any of the other points, they should be in the same group. All edges from which downstream travelling reaches the root without passing through any of the points should be together in a single group as well.
So the ultimate goal is to assign a group index to each of the edges. In
tidygraph
(the library on whichsfnetworks
is build), there are several grouping functions that can be applied to either the nodes or edges of the network. Such grouping functions are meant to be used inside tidyverse verbs likemutate()
andfilter()
, where the network that is being worked on is known and not needed as an argument to the grouping function. Hence, you can runnetwork %>% mutate(group = group_components())
, without the need to explicitly forward the network as argument togroup_components()
.All nice and well, but as far as I know
tidygraph
does not have a grouping function implemented that addresses your problem. That means, we should create our own! I gave it a try. It may not be exactly what you need, but at least it should give you a good starting point to build further on.The idea is that we first calculate the distances from the nodes that correspond with the points you added to the network, to all nodes in the network. Then, we select for each point a set of nodes such that the travel time from that point to these nodes is lower than from any other point. The edges that connect the nodes in such a set form the group of edges belonging to that point.
I created another example network which covers more possible cases. The first part of the code below is mainly constructing that network. Then we get to writing the custom group function, and finally applying it to my example network, and also to yours.
Created on 2021-02-03 by the reprex package (v0.3.0)