Very slow performance with DotSpatial shapefile

I'm trying to read all of the feature data from a particular shapefile. In this case, I'm using DotSpatial to open the file, and I'm iterating through the features. This particular shapefile is only 9 MB in size, and the dbf file is 14 MB. There are roughly 75k features to loop through.

Note, this is all done programmatically through a console app, so there is no rendering or anything involved.

When loading the shapefile, I reproject it and then iterate. The loading and reprojecting are super quick. However, as soon as the code reaches my foreach block, it takes nearly 2 full minutes to load the data and uses roughly 2 GB of memory when debugging in Visual Studio. This seems very excessive for what is a reasonably small data file.

I've run the same code outside of Visual Studio, from the command line; however, it still takes roughly 2 full minutes and about 1.3 GB of memory for the process.

Is there any way to speed this up at all?

Below is my code:

// Load the shape file and project to GDA94
Shapefile indexMapFile = Shapefile.OpenFile(shapeFilePath);
indexMapFile.Reproject(KnownCoordinateSystems.Geographic.Australia.GeocentricDatumofAustralia1994);

// Gets slow here and takes forever to reach the first item
foreach(IFeature feature in indexMapFile.Features)
{
    // Once inside the loop, it's blazingly quick.
}

Interestingly, when I use the VS immediate window, it's super super fast, no delay at all...

There are 2 answers

Juzzbott On

I've managed to figure this out...

For some reason, calling foreach on the features is painfully slow.

However, since these files have a 1-to-1 mapping between features and data rows (each feature has a corresponding data row), I've modified the code slightly to the following. It's now very quick: less than a second to start the iterations.

// Load the shape file and project to GDA94
Shapefile indexMapFile = Shapefile.OpenFile(shapeFilePath);
indexMapFile.Reproject(KnownCoordinateSystems.Geographic.Australia.GeocentricDatumofAustralia1994);

// Get the map index from the Feature data
for(int i = 0; i < indexMapFile.DataTable.Rows.Count; i++)
{

    // Get the feature by index (ElementAt is a LINQ extension method; requires "using System.Linq;")
    IFeature feature = indexMapFile.Features.ElementAt(i);

    // Now it's very quick to iterate through and work with the feature.
}

I wonder why this would be. I think I need to look at the iterator on the IFeatureList implementation.

Cheers, Justin

ipernas On

I hit the same problem with very large files (1.2 million features): populating the .Features collection never finishes.

But if you request each feature by index instead, you avoid the memory and delay overhead.

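        // Assumes fs is the already-opened shapefile (an IFeatureSet).
        // Requires: using System.Collections.Generic; using System.Text;
        // pLinesList and lCnt are declared here so the snippet stands on its own.
        var pLinesList = new List<string>();
        int lCnt = 0;
        int lRows = fs.NumRows();
        for (int i = 0; i < lRows; i++)
        {
            // Get the feature by index instead of populating fs.Features
            IFeature pFeat = fs.GetFeature(i);

            // Build one pipe-delimited line: new GUID | MAPA attribute | geometry
            StringBuilder sb = new StringBuilder();
            sb.Append(Guid.NewGuid().ToString());
            sb.Append("|");
            sb.Append(pFeat.DataRow["MAPA"]);
            sb.Append("|");
            sb.Append(pFeat.BasicGeometry.ToString());
            pLinesList.Add(sb.ToString());
            lCnt++;

            // Report progress every 10 features
            if (lCnt % 10 == 0)
            {
                ConsoleColor pOld = Console.ForegroundColor;
                Console.ForegroundColor = ConsoleColor.DarkGreen;
                Console.Write("\r{0} of {1} ({2}%)", lCnt, lRows, 100.0 * lCnt / lRows);
                Console.ForegroundColor = pOld;
            }
        }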

Look for the GetFeature method.
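
For reference, here is a minimal sketch that applies this index-based approach to the code from the question. It only uses calls that already appear in this thread (Shapefile.OpenFile, Reproject, NumRows, GetFeature). The class name and the command-line path argument are placeholders, the using directives assume the usual DotSpatial.Data and DotSpatial.Projections namespaces, and it has not been checked against a specific DotSpatial version.

```csharp
using System;
using DotSpatial.Data;
using DotSpatial.Projections;

// Hypothetical class name, for illustration only.
internal static class IndexedFeatureReader
{
    private static void Main(string[] args)
    {
        // Placeholder: the path to the .shp file is passed as the first argument.
        string shapeFilePath = args[0];

        // Load the shapefile and project to GDA94, as in the question.
        Shapefile indexMapFile = Shapefile.OpenFile(shapeFilePath);
        indexMapFile.Reproject(KnownCoordinateSystems.Geographic.Australia.GeocentricDatumofAustralia1994);

        // Ask for each feature by index rather than enumerating indexMapFile.Features.
        int rowCount = indexMapFile.NumRows();
        for (int i = 0; i < rowCount; i++)
        {
            IFeature feature = indexMapFile.GetFeature(i);

            // Work with feature (and feature.DataRow) here.
        }

        Console.WriteLine("Read {0} features.", rowCount);
    }
}
```

The loop never touches the Features collection, which is where both answers above saw the delay and the memory growth.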