Calling an executable with arguments and getting its STDOUT more efficiently in a Python script


I'm writing a script that takes longitude and latitude values and runs them through an executable called gdallocationinfo. The executable takes the longitude and latitude as arguments and writes the value for that coordinate to STDOUT. I've been reading about subprocesses, and I was wondering whether this is the most efficient way to implement this if I want to run a lot of points. It seems to be taking a really long time.

import subprocess

def gdalgetpointdata(longitude,latitude):
    # Raw string so the backslashes in the Windows path are not treated as escapes
    proc = subprocess.Popen(["gdallocationinfo",r"C:\Users\data\pop","-wgs84","-valonly","{0}".format(longitude),"{0}".format(latitude)], stdout=subprocess.PIPE, shell=True)
    (out, err) = proc.communicate()
    return int(out)

Is there a better way to call this executable without having to make a new sub-process every time I run my function? Is there something I could do to speed it up?

As a side note, I know that if you run the executable from the command line, it keeps accepting input on STDIN and writing outputs to STDOUT until you tell it to quit.


There are 2 answers

Mike T

The utility can do multiple coordinates with one call. For example, prepare a simple coords.txt text file with coordinate pairs:

1.0 2.0
3.0 4.0
5.0 6.0

Then pipe it in and out of gdallocationinfo from an OSGeo4W shell:

gdallocationinfo -wgs84 -valonly raster.tif < coords.txt > values.txt

which will make a values.txt file with the value for each coordinate. You can do the same with Popen with PIPE stdin and stdout arguments.

coords = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
coord_txt = ''.join(['{0} {1}\n'.format(*c) for c in coords])
p = subprocess.Popen(
    ['gdallocationinfo', r'C:\path\to\raster.tif', '-wgs84', '-valonly'],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, universal_newlines=True, shell=True)
values_txt, err = p.communicate(coord_txt)
values = values_txt.splitlines()

values will be a list of strings, the same length as coords; convert each one with int() or float() as needed.
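The same idea can be wrapped in a reusable function. This is only a sketch, not a tested implementation: the helper names format_coords, parse_values and gdal_get_point_data are made up for illustration, and it assumes Python 3.5+ (for subprocess.run), a single-band raster with integer values, and that every point falls inside the raster.

```python
import subprocess

def format_coords(coords):
    """Return one 'longitude latitude' line per coordinate pair."""
    return ''.join('{0} {1}\n'.format(lon, lat) for lon, lat in coords)

def parse_values(output, expected_count):
    """Convert the utility's output lines to ints and check the count."""
    values = [int(line) for line in output.splitlines()]
    if len(values) != expected_count:
        raise ValueError('expected {0} values, got {1}'.format(
            expected_count, len(values)))
    return values

def gdal_get_point_data(raster_path, coords, exe='gdallocationinfo'):
    """Look up all coords with a single call to the executable."""
    result = subprocess.run(
        [exe, '-wgs84', '-valonly', raster_path],
        input=format_coords(coords),   # sent to the child's stdin
        stdout=subprocess.PIPE,
        universal_newlines=True,       # text mode for stdin and stdout
        check=True,                    # raise CalledProcessError on failure
    )
    return parse_values(result.stdout, len(coords))
```

Passing the argument list without shell=True avoids the extra shell process, and check=True turns a non-zero exit status into an exception instead of a confusing int() error later.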

jfs

Here's a version of @Mike T's answer that feeds the input and reads the output incrementally instead of all at once (not tested):

#!/usr/bin/env python
from __future__ import print_function
from subprocess import Popen, PIPE
from threading import Thread

# Example input: (longitude, latitude) pairs
coords = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]

def pump_input(pipe, coords):
    with pipe:
        for longitude, latitude in coords:
            print(longitude, latitude, file=pipe)

p = Popen(['gdallocationinfo', r'C:\path\to\raster.tif', '-wgs84', '-valonly'],
          shell=True, #NOTE: don't use a list argument with shell=True on POSIX
          stdin=PIPE, stdout=PIPE, bufsize=-1,
          universal_newlines=True)
Thread(target=pump_input, args=[p.stdin, coords]).start()
with p.stdout:
    for line in p.stdout:
        print(int(line))
if p.wait() != 0:
    raise RuntimeError('gdallocationinfo exited with status %d' % p.returncode)

It might be more efficient to use GDAL's Python bindings instead.
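A rough sketch of what that could look like (not tested against real data). It assumes the osgeo package and NumPy are installed, that the raster is north-up (no rotation) and already georeferenced in WGS84 longitude/latitude so no reprojection is needed, and that only band 1 is of interest. The function names world_to_pixel and read_values are hypothetical.

```python
import math

def world_to_pixel(geotransform, x, y):
    """Map georeferenced (x, y) to (column, row) for a north-up raster."""
    origin_x, pixel_width, row_rot, origin_y, col_rot, pixel_height = geotransform
    if row_rot != 0 or col_rot != 0:
        raise ValueError('rotated rasters are not handled in this sketch')
    col = int(math.floor((x - origin_x) / pixel_width))
    row = int(math.floor((y - origin_y) / pixel_height))
    return col, row

def read_values(raster_path, coords):
    """Read the band-1 value under each (longitude, latitude) pair."""
    # Imported here so world_to_pixel can be used without GDAL installed.
    from osgeo import gdal
    dataset = gdal.Open(raster_path)
    if dataset is None:
        raise IOError('could not open {0}'.format(raster_path))
    geotransform = dataset.GetGeoTransform()
    band = dataset.GetRasterBand(1)
    values = []
    for longitude, latitude in coords:
        col, row = world_to_pixel(geotransform, longitude, latitude)
        # Read a 1x1 window; points outside the raster are not handled here.
        cell = band.ReadAsArray(col, row, 1, 1)
        values.append(cell[0, 0])
    return values
```

The dataset is opened once and every lookup happens in-process, so there is no per-point process start-up cost at all.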