I have what is probably a really basic Python question.
I'm trying to write a script that eliminates a bunch of blank rows in some .csv files, and the script I've written works on about 90% of my files, but a few throw the following error at me:
Traceback (most recent call last):
  File "/Users/stephensmith/Documents/Permits/deleterows.py", line 17, in <module>
    deleteRow(file, "output/" + file)
  File "/Users/stephensmith/Documents/Permits/deleterows.py", line 8, in deleteRow
    for row in csv.reader(input):
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/encodings/utf_8_sig.py", line 69, in _buffer_decode
    return codecs.utf_8_decode(input, errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa2 in position 6540: invalid start byte
Here's my code:
import csv
import os

def deleteRow(in_fnam, out_fnam):
    input = open(in_fnam, 'r')
    output = open(out_fnam, 'w')
    writer = csv.writer(output)
    for row in csv.reader(input):
        if any(row):
            writer.writerow(row)
    input.close()
    output.close()

for file in os.listdir("/Users/stephensmith/Documents/Permits/"):
    print(file)
    if file.endswith(".csv"):
        deleteRow(file, "output/" + file)
I've tried adding encoding='utf-8', encoding='ascii', and encoding='latin1' to both of my open() statements, but no luck. :-( Any idea what I'm doing wrong? The .csv files were created with Excel for Mac 2011, if that helps at all.
Perhaps you could try looping through the csv files that are crashing with something like:
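For instance, something along these lines (just a sketch; `find_suspicious_bytes` and the sample bytes are made up for illustration) reads the raw bytes and flags anything outside plain ASCII, which is where a UnicodeDecodeError usually originates:

```python
# Hypothetical helper: return (position, byte) pairs for every byte
# outside the plain-ASCII range, i.e. anything utf-8 might choke on
# as a start byte.
def find_suspicious_bytes(data):
    return [(pos, b) for pos, b in enumerate(data) if b > 0x7f]

# In practice you'd read one of your failing files with
# open(path, "rb").read(); this sample just embeds the 0xa2 byte
# from your traceback.
sample = b"name,cost\nwidget,\xa25.00\n"
for pos, b in find_suspicious_bytes(sample):
    print("byte 0x%02x at position %d" % (b, pos))
```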
to see if any suspicious characters are popping up.
If you were able to identify the suspicious characters this way, say the 0xa2 from your traceback pops up, you could clean the file by rewriting it with that character replaced:
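For example (again just a sketch; `clean_file` is a made-up helper, and 0xa2 stands in for whatever byte you actually find):

```python
# Hypothetical helper: copy a file byte-for-byte, replacing (or, with
# replacement=b"", simply dropping) the offending byte along the way.
def clean_file(in_path, out_path, bad_byte=b"\xa2", replacement=b""):
    with open(in_path, "rb") as f:
        data = f.read()
    with open(out_path, "wb") as f:
        f.write(data.replace(bad_byte, replacement))

# e.g. clean_file("problem.csv", "output/problem.csv")
```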
then try again with the cleaned file.