I'm trying to convert a JSON file to NDJSON. I'm reading the file from GCS (Google Cloud Storage). Sample data:
{
"Item1" : "INT",
"Item2" : "INT",
"Item3" : "text",
"Item4" : "text",
"Item5" : "Date"
}{
"Item1" : "INT",
"Item2" : "INT",
"Item3" : "text",
"Item4" : "text",
"Item5" : "Date"
}{
"Item1" : "INT",
"Item2" : "INT",
"Item3" : "text",
"Item4" : "text",
"Item5" : "Date"
}
The following is my code:
bucket = client.get_bucket('bucket name')
# Name of the object to be stored in the bucket
object_name_in_gcs_bucket = bucket.get_blob('file.json')
object_to_string = object_name_in_gcs_bucket.download_as_string()
#json_data = ndjson.loads(object_to_string)
json_list = [json.loads(row.decode('utf-8')) for row in object_to_string.split(b'\n') if row]
The error I'm receiving is at json_list:
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 3 (char 2)
Required output:
{"Item1" : "INT","Item2" : "INT","Item3" : "text","Item4" : "text","Item5" : "Date"}
{"Item1" : "INT","Item2" : "INT","Item3" : "text","Item4" : "text","Item5" : "Date"}
{"Item1" : "INT","Item2" : "INT","Item3" : "text","Item4" : "text","Item5" : "Date"}
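I think your main problem is that you are splitting on line endings instead of on the closing brace. Each object in your file spans several lines, so a single line on its own is not valid JSON and json.loads fails on it. Here is an example that accomplishes what I think you are trying to do.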
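A minimal sketch of one way to do this, assuming the file contains complete JSON objects written back to back as in your sample. Rather than splitting on the closing brace by hand (which would break on nested objects or on braces inside strings), it uses json.JSONDecoder.raw_decode from the standard library to read one object at a time. The function name concatenated_json_to_ndjson and the shortened sample are made up for illustration.

```python
import json


def concatenated_json_to_ndjson(text):
    """Turn back-to-back JSON objects into newline-delimited JSON (hypothetical helper)."""
    decoder = json.JSONDecoder()
    lines = []
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so step over it manually.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Parse one object starting at pos; raw_decode returns the object
        # and the index just past its end.
        obj, pos = decoder.raw_decode(text, pos)
        lines.append(json.dumps(obj))
    return "\n".join(lines)


# Shortened stand-in for the sample data in the question.
sample = '''{
"Item1" : "INT",
"Item2" : "INT"
}{
"Item1" : "INT",
"Item2" : "INT"
}'''

print(concatenated_json_to_ndjson(sample))
```

With your GCS code, you would pass in the decoded download, for example concatenated_json_to_ndjson(object_to_string.decode('utf-8')). Note that json.dumps writes "Item1": "INT" without a space before the colon, which differs cosmetically from your required output but is equivalent JSON.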
Output: