Problem importing geodata from a File GDB with geopandas in Python 3.x


I'm working on some code to select and export geodata based on a bounding box. The data I want to select comes from 2 separate layers in a huge File GDB (16 GB) covering the entire Netherlands. I use a bounding box to avoid reading the entire dataset before making a selection.

This method works great when applied to a gpkg database, but with a file geodatabase the processing time is far longer (0.2 s vs 300 s for a 200x200 meter selection). The File GDB I'm using has a spatial index set for the layers I'm reading. I'm using geopandas to read and select. Below you'll find an example for the layer 'Adres':

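import geopandas as gpd

def ImportGeodata(FilePath, BoundingBox):
    # BoundingBox is a (minx, miny, maxx, maxy) tuple in the CRS of the layer
    importBag = gpd.read_file(FilePath, layer='Adres', bbox=BoundingBox)
    # Copy the identifier into a separate 'mergeid' column
    importBag['mergeid'] = importBag['identificatie']
    return importBag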
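
The timing difference described above could be reproduced with something like the following. This is only a minimal sketch: the helper names (bbox_around, time_read, compare_sources), the file names and the coordinates in the usage comment are hypothetical, and the only geopandas call it relies on is gpd.read_file with the layer and bbox arguments already used in the function above.

```python
import time


def bbox_around(x, y, size=200.0):
    """Return (minx, miny, maxx, maxy) for a square of `size` metres centred on (x, y)."""
    half = size / 2.0
    return (x - half, y - half, x + half, y + half)


def time_read(reader, path, layer, bbox):
    """Call reader(path, layer=..., bbox=...) once and return (seconds, result)."""
    start = time.perf_counter()
    frame = reader(path, layer=layer, bbox=bbox)
    return time.perf_counter() - start, frame


def compare_sources(paths, layer, bbox, reader=None):
    """Time the same bounding-box read against several sources.

    Returns {path: (seconds, number_of_rows)}.
    """
    if reader is None:
        # Imported lazily so the helpers above work without geopandas installed
        import geopandas as gpd
        reader = gpd.read_file
    timings = {}
    for path in paths:
        seconds, frame = time_read(reader, path, layer, bbox)
        timings[path] = (seconds, len(frame))
    return timings


# Hypothetical usage (file names and coordinates are placeholders):
# box = bbox_around(155000.0, 463000.0)
# print(compare_sources(["data.gpkg", "data.gdb"], "Adres", box))
```

Running both sources through the same function at least rules out differences in how the selection itself is made.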

Am I overlooking something? Or is this a limitation when importing from a huge File GDB? I can't find an obvious mistake here. For now the workaround is another script that imports the layers I need and dumps them into a gpkg. The problem is that this runs for 3 to 4 hours (the resulting gpkg is almost 6 GB). I don't want to keep doing that, since it would have to be repeated once every month or so to process each new version of the dataset.
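
For context, the workaround boils down to something like the sketch below. It is a simplified illustration, not the actual script: export_layers is a hypothetical name, and it assumes each layer fits in memory when read in full with gpd.read_file and is then written with GeoDataFrame.to_file using the GPKG driver.

```python
def export_layers(source, target, layers, reader=None):
    """Copy the given layers from `source` into one GeoPackage at `target`.

    Returns a list of (layer_name, number_of_rows) for the exported layers.
    """
    if reader is None:
        # Imported lazily so the function can be exercised without geopandas
        import geopandas as gpd
        reader = gpd.read_file
    exported = []
    for layer in layers:
        # No bbox here: the whole layer is read into memory
        frame = reader(source, layer=layer)
        # Each layer is written to the same GeoPackage under its own name
        frame.to_file(target, layer=layer, driver="GPKG")
        exported.append((layer, len(frame)))
    return exported


# Hypothetical usage (paths and the second layer name are placeholders):
# export_layers("data.gdb", "data.gpkg", ["Adres", "SecondLayer"])
```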

Curious what you guys come up with.

There are 0 answers