I have two data frames with routes and linestrings:
df1 = {
"Route": ["AL013-AL015", "AL013-AL014", "AL013-AL011"],
"Linestring": ["LINESTRING (20.40350 42.06510, 19.70210 42.16300)", "LINESTRING (20.40350 42.06510, 19.84780 41.78380)", "LINESTRING (20.40350 42.06510, 20.25610 41.60390)"],
}
df2 = {
"Route": ["NO0A3-NO071", "NO0A3-NO091", "NO0A3-NO0A3"],
"Linestring": ["LINESTRING (8.53910 62.52120, 14.78250 66.69440)", "LINESTRING (8.53910 62.52120, 8.70540 59.49660)", "LINESTRING (8.53910 62.52120, 8.53910 62.52120)"],
}
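For reference, I turn these into GeoDataFrames by parsing the WKT strings (a minimal sketch; EPSG:4326 is my assumption based on the lon/lat-looking coordinates):

import geopandas as gpd
import pandas as pd

# Parse the WKT strings into shapely geometries; EPSG:4326 is assumed
# from the coordinate ranges.
df1 = gpd.GeoDataFrame(
    pd.DataFrame(df1),
    geometry=gpd.GeoSeries.from_wkt(df1["Linestring"]),
    crs="EPSG:4326",
)
df2 = gpd.GeoDataFrame(
    pd.DataFrame(df2),
    geometry=gpd.GeoSeries.from_wkt(df2["Linestring"]),
    crs="EPSG:4326",
)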
The problem is that they are large: df1 has roughly 2 million rows and df2 has about 300k rows. I want to attach the nearest df1 route to every row of df2 with geopandas.sjoin_nearest, like this:
df_new = gpd.sjoin_nearest(df2, df1, how='left')
However, the join takes a very long time. Is there any way to speed it up? While searching I came across spatial indexing in GeoPandas, but I am not sure whether it works with linestrings. Could someone explain how to apply spatial indexing to linestrings, or suggest any other way to speed up the join?
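For what it's worth, here is a sketch of what I tried with the spatial index so far (GeoDataFrame.sindex.nearest exists in recent GeoPandas versions; df1 and df2 are the GeoDataFrames built above, and I am not sure this is the intended approach for linestrings):

# Query df1's STRtree for the nearest geometry to each row of df2.
# With return_all=False every input geometry gets exactly one match;
# both returned rows are positional indices (row 0 -> df2, row 1 -> df1).
input_idx, tree_idx = df1.sindex.nearest(df2.geometry, return_all=False)

# Attach the nearest df1 route to each df2 row via positional lookup.
df2["nearest_route"] = df1["Route"].to_numpy()[tree_idx]

From what I can tell this is roughly what sjoin_nearest does internally, so I don't know whether it would actually be faster.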