My main use of generators is processing rows of CSV files stored on a remote server. They give me a consistent interface for processing the data in those files linearly.
I am using paramiko to access the SFTP server that stores the files, and paramiko has an outstanding issue: it does not properly close the connection if you have not closed the file itself.
I've got a simple interface for accessing a single file on the SFTP server (this is obviously pseudocode - I am omitting the connection error handling code and so on).
def sftp_read_file(filename):
    # pseudocode: paramiko.open stands in for opening the remote file
    with paramiko.open(filename) as file_obj:
        for item in csv.reader(file_obj):
            yield item

def csv_append_column(iter_obj, col_name, col_val):
    # header (csv.reader yields lists, so append a list rather than a tuple)
    yield next(iter_obj) + [col_name]
    for item in iter_obj:
        yield item + [col_val]
Let's say I would like to test a set of transformations done to the file by running the script for a limited number of rows:
def main():
    for i, item in enumerate(csv_append_column(sftp_read_file('sftp://...'), 'A', 'B')):
        print(item)
        if i > 0 and i % 100 == 0:
            break
The script will exit, but the interpreter will never terminate without SIGINT. What are my possible solutions?
This isn’t the most elegant solution, but maybe we could build off @tadhg-mcdonald-jensen’s suggestion by wrapping the generator in an object:
And then use it like this:
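A minimal sketch of such a wrapper and its use, under a few assumptions: the names `ClosingGenerator`, `read_rows`, and `first_rows` are hypothetical, and an in-memory `io.StringIO` stands in for the paramiko file object so the example runs without a server. The idea is that leaving the `with` block calls `close()` on the generator, which unwinds the generator's own `with` block and closes the file.

```python
import csv
import io


class ClosingGenerator:
    """Hypothetical wrapper: closes the wrapped generator on exit."""

    def __init__(self, gen):
        self._gen = gen

    def __enter__(self):
        return self._gen

    def __exit__(self, exc_type, exc_value, traceback):
        # close() raises GeneratorExit at the suspended yield, so the
        # generator's own `with` block exits and the file is closed.
        self._gen.close()
        return False  # do not swallow exceptions from the body


def read_rows(file_obj):
    # Stand-in for sftp_read_file: file_obj plays the role of the
    # remote file that must be closed.
    with file_obj:
        for row in csv.reader(file_obj):
            yield row


def first_rows(file_obj, limit):
    """Collect at most `limit` rows, then stop early."""
    rows = []
    with ClosingGenerator(read_rows(file_obj)) as gen:
        for i, row in enumerate(gen):
            rows.append(row)
            if i + 1 >= limit:
                break
    return rows


if __name__ == "__main__":
    remote = io.StringIO("a,b\n1,2\n3,4\n")
    print(first_rows(remote, 2))
    print(remote.closed)
```

The standard library's `contextlib.closing` offers the same behavior without a custom class, since it calls `close()` on whatever object it is given when the block exits.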
Alternatively, we can just wrap the generator itself if we don't need it to stream rows lazily:
Now we can call it like:
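A rough sketch of this eager variant and a call to it, again with hypothetical names (`read_rows`, `read_all_rows`, `preview`) and an `io.StringIO` standing in for the remote file. Exhausting the generator with `list()` lets its `with` block finish normally before any rows are handed to the caller.

```python
import csv
import io


def read_rows(file_obj):
    # Stand-in for sftp_read_file: file_obj plays the role of the
    # remote file that must be closed.
    with file_obj:
        for row in csv.reader(file_obj):
            yield row


def read_all_rows(file_obj):
    # list() runs the generator to completion, so the `with` block in
    # read_rows exits and the file is closed before we return.
    return list(read_rows(file_obj))


def preview(file_obj, limit):
    """Return the first `limit` rows after reading the whole file."""
    return read_all_rows(file_obj)[:limit]


if __name__ == "__main__":
    remote = io.StringIO("a,b\n1,2\n3,4\n")
    print(preview(remote, 2))
    print(remote.closed)
```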
This will make sure the with block exits and paramiko closes the sftp connection, but comes at the expense of reading all of the lines into memory at once.