How can one create a Python dictionary (key : value) from selected objects in a JSON file with multiple JSON lines?


I have a file with multiple JSON lines as shown below.

{"status str":null,"id":563221, "filter":"low","text" : "Grass is green"}
{"status str":null,"id":612835, "filter":"high","text" : "Textual blue"}

My desired output should contain only the ID number and the text (e.g. "Grass is green") as a [key : value] pair, as in a Python dictionary:

563221 : "Grass is green"

612835 : "Textual blue"

I am currently using ObjectPath to query. Using the tuples, I can output all the data but I can't select sections of the data. Below is the code that I am using.

read_data = []
with open(fileName, 'r') as file_to_read:
    for line in filetoread:
        json_tree = objectpath.Tree(read_data)
        dict = {tuple(json_tree.execute('$.id')) : tuple(json_tree.execute('$.text'))}
        line = next(filetoread)
return dict

There are 3 answers

Patrick Vanhuyse On BEST ANSWER

You should use the json library to parse each line of the file into a Python object; you can then easily extract the data you need.

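import json

def read_id_text(fileName):
    result = {}  # not named "dict", which would shadow the built-in type
    with open(fileName, 'r') as file_to_read:
        for line in file_to_read:
            json_line = json.loads(line)  # parse one JSON line into a Python dict
            result[json_line['id']] = json_line['text']
    return result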

json.loads(json_string) parses the JSON text in json_string and returns the corresponding Python object (here, a dict).
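
To make the effect of json.loads concrete, here is a minimal sketch that parses one of the sample lines from the question. The variable names line and record are only illustrative.

```python
import json

# One line copied from the sample file in the question.
line = '{"status str":null,"id":563221, "filter":"low","text" : "Grass is green"}'

record = json.loads(line)  # parse the JSON text into a Python object

print(type(record).__name__)  # dict
print(record["id"])           # 563221 (an int, not a string)
print(record["status str"])   # None (JSON null maps to Python None)
```

Because the id is parsed as an integer, the resulting dictionary keys are integers as well, which matches the desired output in the question.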

kalehmann On

You almost got it. You need to deserialize each JSON line first using the json.loads function and then pass the result to objectpath.Tree.

For example:

import json
import objectpath

data = [
  '{"status str":null,"id":563221, "filter":"low","text" : "Grass is green"}',
  '{"status str":null,"id":612835, "filter":"high","text" : "Textual blue"}'
]

for line in data: 
    jt = objectpath.Tree(json.loads(line))
    d = {jt.execute('$.id') : jt.execute('$.text')} 
    print(d)

results in

{563221: 'Grass is green'}
{612835: 'Textual blue'}

Also, naming your variable dict is not a good idea, because it shadows the Python built-in class dict.
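
As a small illustration of why that name is risky, the sketch below rebinds dict inside a function (shadowing_demo is a hypothetical name used only here) and then tries to call the built-in:

```python
def shadowing_demo():
    dict = {563221: "Grass is green"}  # the local name now hides the built-in type
    try:
        dict(a=1)  # calls the dictionary instance, not the built-in class
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)  # 'dict' object is not callable
```

Outside the function the built-in is unaffected, but at module level the same assignment would hide it for the rest of the script.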

Applying the json.loads step to your code results in

read_data = [] 
with open(fileName, 'r') as file_to_read:
    for line in file_to_read:
        json_tree = objectpath.Tree(json.loads(line))
        read_data.append({json_tree.execute('$.id') : json_tree.execute('$.text')})

print(read_data)

Erik Šťastný On

I think using objectpath is unnecessary. You can do it in a really simple way with the json package.

Content of data.json:

{"status str":null,"id":563221, "filter":"low","text" : "Grass is green"}
{"status str":null,"id":612835, "filter":"high","text" : "Textual blue"}

code:

import json

file_name = "data.json"

with open(file_name, 'r') as file_to_read:
    for line in file_to_read:
        json_object = json.loads(line)
        dictionary = {json_object["id"]: json_object["text"]}
        print(dictionary)  # print inside the loop: one dictionary per line

Output:

{563221: 'Grass is green'}
{612835: 'Textual blue'}
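
If a single dictionary holding every id is preferred over one dictionary per line, a possible variation is sketched below. The helper name build_id_text_map and the blank-line handling are assumptions added for illustration; they are not part of the original code.

```python
import json

def build_id_text_map(lines):
    """Map each record's "id" to its "text"; lines is any iterable of JSON strings."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines, e.g. an empty line at the end of the file
            continue
        record = json.loads(line)
        result[record["id"]] = record["text"]
    return result

# Sample lines from data.json, with a blank line added to show the skip.
sample = [
    '{"status str":null,"id":563221, "filter":"low","text" : "Grass is green"}\n',
    '\n',
    '{"status str":null,"id":612835, "filter":"high","text" : "Textual blue"}\n',
]

id_to_text = build_id_text_map(sample)
print(id_to_text)  # {563221: 'Grass is green', 612835: 'Textual blue'}
```

An open file object can be passed in place of the list, since iterating over a file yields its lines.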