Convert a delimited file to a list of objects using Python


I have a pipe-delimited data file, as shown below:

DAYPART_ID|NAME|LABEL|START_TIME|END_TIME|WEEKEDAYS|STYLE|DAYPART_SET_ID|ORDER
1|Early AM|6:00 am - 9:00 am|6|9|12345|gold|1|01
2|Daytime|9:00 am - 4:00 pm|9|16|12345|red|1|02

I need to convert it to a JSON list file like the following:

[
{
"STYLE": "gold", 
"NAME": "Early AM", 
"START_TIME": 6, 
"DAYPART_SET_ID": 1, 
"LABEL": "6:00 am - 9:00 am", 
"DAYPART_ID": 1, 
"END_TIME": 9, 
"ORDER": 01, 
"WEEKEDAYS": 12345
},
{
"STYLE": "red", 
"NAME": "Daytime", 
"START_TIME": 9, 
"DAYPART_SET_ID": 1, 
"LABEL": "9:00 am - 4:00 pm", 
"DAYPART_ID": 2, 
"END_TIME": 16, 
"ORDER": 02, 
"WEEKEDAYS": 12345
}
]

So although the result is a JSON file, it differs from what my code produces: the numeric fields have no quotes, the whole file is wrapped in square brackets, and each record's closing curly brace is followed by a comma.

I wrote the code below:

import csv
import json

csv.register_dialect('pipe', delimiter='|', quoting=csv.QUOTE_NONE)

with open('Infile', "r") as csvfile:
    with open(outtfile, 'w') as outfile:
        for row in csv.DictReader(csvfile, dialect='pipe'):
            data = row
            json.dump(data, outfile, sort_keys=False, indent=0, ensure_ascii=True)

But it did not give me the exact result I intended. Can anyone help here?


There is 1 answer

Łukasz Rogalski On

Your code dumps each row to the destination file separately. The individual objects have no knowledge of being part of a list, so the list syntax of a JSON file (the enclosing square brackets and the commas between records) is missing from your output. The solution is to read all the objects into a list first and then dump the list itself.
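To make the difference concrete, here is a minimal sketch that uses in-memory buffers instead of files. The `rows` sample and the buffer names `per_row` and `as_list` are illustrative and not taken from the question.

```python
import io
import json

# Two sample rows standing in for what csv.DictReader would yield (illustrative data).
rows = [{"NAME": "Early AM"}, {"NAME": "Daytime"}]

# Dumping row by row writes the objects back to back, with no brackets or commas.
per_row = io.StringIO()
for row in rows:
    json.dump(row, per_row)

# Dumping the list once lets json emit the surrounding [ ] and the separating commas.
as_list = io.StringIO()
json.dump(rows, as_list)

print(per_row.getvalue())  # {"NAME": "Early AM"}{"NAME": "Daytime"}
print(as_list.getvalue())  # [{"NAME": "Early AM"}, {"NAME": "Daytime"}]
```

Only the second output can be read back with `json.loads`; the first is a run of separate objects rather than one JSON document.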

For the numbers, simply list all the columns that should be of type int and convert their values before appending each object to the list.

import csv
import json

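# Pipe-delimited dialect; QUOTE_NONE treats quote characters as literal text.
csv.register_dialect('pipe', delimiter='|', quoting=csv.QUOTE_NONE)
# Columns whose values should be written as JSON numbers rather than strings.
numeric_columns = ['START_TIME', 'END_TIME', 'WEEKEDAYS', 'DAYPART_SET_ID', 'DAYPART_ID']
objects = []
with open('infile', "r") as csvfile:
    for o in csv.DictReader(csvfile, dialect='pipe'):
        for k in numeric_columns:
            o[k] = int(o[k])
        objects.append(o)

# Dump the whole list once so the output is a single JSON array.
with open('outfile', 'w') as dst:
    json.dump(objects, dst, indent=2)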
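The same approach can be tried without touching the filesystem. The sketch below is one possible packaging, not the only one: `read_records`, `SAMPLE`, and `NUMERIC_COLUMNS` are hypothetical names introduced here, and `io.StringIO` stands in for the real input file. `ORDER` is left as a string in this sketch, because JSON does not allow numbers with a leading zero such as `01`; converting it with `int` would give `1` instead.

```python
import csv
import io
import json

# Sample data copied from the question; StringIO stands in for the real input file.
SAMPLE = """DAYPART_ID|NAME|LABEL|START_TIME|END_TIME|WEEKEDAYS|STYLE|DAYPART_SET_ID|ORDER
1|Early AM|6:00 am - 9:00 am|6|9|12345|gold|1|01
2|Daytime|9:00 am - 4:00 pm|9|16|12345|red|1|02
"""

# Columns to convert to int (same set as in the answer above).
NUMERIC_COLUMNS = ("DAYPART_ID", "START_TIME", "END_TIME", "WEEKEDAYS", "DAYPART_SET_ID")


def read_records(src, numeric_columns=NUMERIC_COLUMNS):
    """Hypothetical helper: parse pipe-delimited text into a list of dicts."""
    records = []
    for row in csv.DictReader(src, delimiter="|", quoting=csv.QUOTE_NONE):
        record = dict(row)
        for column in numeric_columns:
            record[column] = int(record[column])
        records.append(record)
    return records


records = read_records(io.StringIO(SAMPLE))
# Serialize the whole list at once so the result is a single JSON array.
text = json.dumps(records, indent=2)
print(text)
```

Passing an open file object instead of the `StringIO` buffer should work the same way, since `csv.DictReader` only needs an iterable of lines.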