How do I remove the extra commas and get the correct format in the output CSV file?


I am running the following code:

import csv
import sys
from collections import OrderedDict

file_name='sample.txt'
with open(file_name,'rb') as f:               
    reader = csv.reader(f)  
    headers = reader.next()
    p=[]
    for row in reader:

        row[0] = row[0].zfill(6) 
        row[2] = row[2].zfill(6)
        row[3] = row[3].zfill(6)
        row[4] = row[4].zfill(6)
        row[1] = row[1][5:7] + "-" + row[1][8:10] + "-" + row[1][:4]
        p.append(row[:5])
print p

with open('sample_out.txt', 'wb') as ofile: 
    header = ['User_ID','Date','Num_1','Num_2','Com_ID']
    extra_headers = sys.argv
    header.extend(sys.argv[1:])
    n = len(sys.argv)
    writer = csv.DictWriter(ofile, fieldnames=header)
    writer.writeheader()
    col_fill = ''
    writer.writerows({col: row_item} for row in p for row_item,col in zip(row+[col_fill]*n,header))

I am passing the column names on the command line, e.g. python script.py BOL1 BOL2. This is the output file:

User_ID,Date,Num_1,Num_2,Com_ID,BOL1,BOL1
000101,,,,,,
,04-13-2015,,,,,
,,000012,,,,
,,,000021,,,
,,,,001011,,
,,,,,,
,,,,,,

How do I remove the extra commas and make the output readable?

There is 1 answer

Yann Vernier
writer.writerows({col: row_item} for row in p for row_item,col in zip(row+[col_fill]*n,header))

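You're writing rows that each contain only one column. Look at what's inside the braces: {col: row_item} builds a dictionary with a single key, and DictWriter writes every field missing from a dictionary as an empty string, so each cell ends up on its own line surrounded by commas. Perhaps you meant to use something like: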

{col:row_item for row_item,col in zip(row+[col_fill]*n,header)} for row in p

to generate dictionaries with the information from each row. Since we don't need to pad the dictionary with empty columns, and dict accepts an iterable of key,value pairs, this could be written as:

dict(zip(header,row)) for row in p
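
For comparison, here is a minimal sketch that runs both variants side by side. It assumes Python 3 (the question's code is Python 2), writes to an in-memory io.StringIO buffer instead of sample_out.txt, and hard-codes the header and a single sample row taken from the output shown in the question. The render helper is a hypothetical name introduced only for this illustration.

```python
import csv
import io

# Assumed inputs: the five fixed columns plus two extra column names,
# and one already-formatted data row like the one in the question.
header = ['User_ID', 'Date', 'Num_1', 'Num_2', 'Com_ID', 'BOL1', 'BOL2']
rows = [['000101', '04-13-2015', '000012', '000021', '001011']]


def render(dict_rows):
    """Write dict_rows with csv.DictWriter and return the CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=header, lineterminator='\n')
    writer.writeheader()
    writer.writerows(dict_rows)
    return buf.getvalue()


# Original approach: one single-key dict per cell, so every cell
# becomes its own CSV line and all other fields are written as ''.
broken = render(
    {col: item}
    for row in rows
    for item, col in zip(row + [''] * (len(header) - len(row)), header)
)

# Suggested approach: one dict per row. Keys missing from the dict
# (the extra columns) are filled with DictWriter's default restval, ''.
fixed = render(dict(zip(header, row)) for row in rows)

print(broken)
print(fixed)
```

When writing to a real file in Python 3, the csv module documentation recommends opening it in text mode with newline='' (for example open('sample_out.txt', 'w', newline='')) rather than the 'wb' mode used in the Python 2 code.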