Is there a better way to access values in a JSON file than a 'for' loop?

I have a JSON file which looks like this:

[{'data': [{'text': 'add '},
   {'text': 'Stani, stani Ibar vodo', 'entity': 'entity_name'},
   {'text': ' songs in '},
   {'text': 'my', 'entity': 'playlist_owner'},
   {'text': ' playlist '},
   {'text': 'música libre', 'entity': 'playlist'}]},
 {'data': [{'text': 'add this '},
   {'text': 'album', 'entity': 'music_item'},
   {'text': ' to '},
   {'text': 'my', 'entity': 'playlist_owner'},
   {'text': ' '},
   {'text': 'Blues', 'entity': 'playlist'},
   {'text': ' playlist'}]},
 {'data': [{'text': 'Add the '},
   {'text': 'tune', 'entity': 'music_item'},
   {'text': ' to the '},
   {'text': 'Rage Radio', 'entity': 'playlist'},
   {'text': ' playlist.'}]}]

I want to concatenate the 'text' values within each 'data' entry, producing one string per entry.

I have tried the following:

lst = []

for item in data:
    p = item['data']
    p_st = ''
    for item_1 in p:
        p_st += item_1['text'] + ' '
    lst.append(p_st)

print(lst)

Out: ['add  Stani, stani Ibar vodo  songs in  my  playlist  música libre ', 'add this  album  to  my   Blues  playlist ', 'Add the  tune  to the  Rage Radio  playlist. ']

It works, but I am new to JSON and am wondering if there is a better way to do it? Some built-in methods or libraries for JSON maybe?

There are 3 answers

0
Nidheesh R On BEST ANSWER

Your code works for extracting the text values from the JSON data. If you want something more concise, a list comprehension combined with str.join gives nearly the same result (the only difference is that each string no longer ends with a trailing space) and is shorter and more readable. Here's how you can do it:

Using a list comprehension (the json module is not needed here, because the data is already parsed):

data = [{'data': [{'text': 'add '}, {'text': 'Stani, stani Ibar vodo', 'entity': 'entity_name'}, {'text': ' songs in '}, {'text': 'my', 'entity': 'playlist_owner'}, {'text': ' playlist '}, {'text': 'música libre', 'entity': 'playlist'}]},
        {'data': [{'text': 'add this '}, {'text': 'album', 'entity': 'music_item'}, {'text': ' to '}, {'text': 'my', 'entity': 'playlist_owner'}, {'text': ' '}, {'text': 'Blues', 'entity': 'playlist'}, {'text': ' playlist'}]},
        {'data': [{'text': 'Add the '}, {'text': 'tune', 'entity': 'music_item'}, {'text': ' to the '}, {'text': 'Rage Radio', 'entity': 'playlist'}, {'text': ' playlist.'}]}]

text_values = [' '.join(item['text'] for item in entry['data']) for entry in data]

print(text_values)

Using pandas:

import pandas as pd

data = [{'data': [{'text': 'add '}, {'text': 'Stani, stani Ibar vodo', 'entity': 'entity_name'}, {'text': ' songs in '}, {'text': 'my', 'entity': 'playlist_owner'}, {'text': ' playlist '}, {'text': 'música libre', 'entity': 'playlist'}]},
        {'data': [{'text': 'add this '}, {'text': 'album', 'entity': 'music_item'}, {'text': ' to '}, {'text': 'my', 'entity': 'playlist_owner'}, {'text': ' '}, {'text': 'Blues', 'entity': 'playlist'}, {'text': ' playlist'}]},
        {'data': [{'text': 'Add the '}, {'text': 'tune', 'entity': 'music_item'}, {'text': ' to the '}, {'text': 'Rage Radio', 'entity': 'playlist'}, {'text': ' playlist.'}]}]

# Create a DataFrame from the data
df = pd.DataFrame(data)

# Extract and join the 'text' values for each 'data' entry
text_values = df['data'].apply(lambda x: ' '.join(item['text'] for item in x))

print(text_values.tolist())

The pandas approach is more suitable if you plan to perform additional data analysis or manipulation on your JSON data, as it provides a powerful and flexible way to work with structured data.

2
Christopher Hatton On

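This will work for appending a new entry to the JSON file itself:

import json

# filename and new_data are placeholders: the path to the JSON file and the entry to add
with open(filename, 'r+') as file:
    # Load the existing JSON content (assumed to be a list at the top level, as in the question)
    file_data = json.load(file)
    # Append the new entry to that list
    file_data.append(new_data)
    # Move back to the start of the file before rewriting it
    file.seek(0)
    # Write the updated data back as JSON
    json.dump(file_data, file, indent=4)
    # Drop any leftover bytes in case the new content is shorter than the old
    file.truncate()

1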
ShadowRanger On

There's no special JSON facility that will help here, because you've already parsed the JSON and have plain old Python dicts, lists, and strs. (And no, the parsing process isn't modifiable in any trivial way to do what you want; this should be done after parsing.)
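To illustrate that point, here is a minimal sketch. The string `raw` is made up for this example (a shortened version of the question's data, written as real JSON with double quotes); everything else is the standard `json` module.

```python
import json

# Hypothetical JSON text, shortened from the question's data.
# Real JSON uses double quotes, unlike the Python repr shown in the question.
raw = '[{"data": [{"text": "Add the "}, {"text": "tune", "entity": "music_item"}]}]'

parsed = json.loads(raw)

# The parser returns built-in types, so ordinary list/dict/str tools apply.
print(type(parsed).__name__)                        # list
print(type(parsed[0]).__name__)                     # dict
print(type(parsed[0]["data"][0]["text"]).__name__)  # str
```

From this point on, nothing about the data is JSON-specific, which is why the answer lies in plain Python idioms rather than in a JSON library.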

That said, your code is non-idiomatic, and has some inefficiencies in it (ones that CPython tries to help with, but the optimization for repeated concatenation of str is brittle, non-portable, and still worse than doing it the right way with str.join). The improved code would look like this:

lst = [' '.join([item_1['text'] for item_1 in item['data']])
       for item in data]
print(lst)

That uses a list comprehension to produce the outer list, where each element is the space-separated concatenation of all the 'text' values for that item's 'data'. Using a listcomp for the outer part makes things a little faster (it's a micro-optimization that takes advantage of interpreter optimizations for listcomps, not a big-O improvement). Using ' '.join is a big-O algorithmic improvement, though: repeated string concatenation is O(n²) (CPython sometimes optimizes it to almost O(n), but not as well and not reliably), while bulk concatenation via ' '.join is guaranteed O(n). If your data is only a small number of strings, as shown, the difference may be negligible, but the code is still simpler and easier to read and maintain. If the data has many strings to concatenate, this may speed it up significantly.
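If you want to check the difference on your own machine, a rough sketch with `timeit` could look like this. The helper names `concat_loop` and `concat_join` and the input size are arbitrary choices for illustration, and the timings will vary by interpreter and version (on CPython the gap may be small because of the optimization mentioned above).

```python
import timeit

# Synthetic input: many short fragments (size chosen arbitrarily).
fragments = ["word"] * 10_000

def concat_loop(parts):
    # Repeated concatenation, as in the question's code.
    result = ''
    for part in parts:
        result += part + ' '
    return result

def concat_join(parts):
    # Single bulk concatenation.
    return ' '.join(parts)

# Both build the same text, apart from the loop's trailing space.
assert concat_loop(fragments).rstrip(' ') == concat_join(fragments)

print('loop:', timeit.timeit(lambda: concat_loop(fragments), number=100))
print('join:', timeit.timeit(lambda: concat_join(fragments), number=100))
```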

Note: This does mean the concatenated string will not end with a space. Odds are you don't want that trailing space anyway, but you can always add it back if you really want; a single extra concatenation won't ruin big-O.
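As a small sketch of adding the space back, assuming you really do need output identical to the original loop (the `data` here is a shortened copy of the question's sample):

```python
# Shortened sample in the same shape as the question's data.
data = [{'data': [{'text': 'Add the '},
                  {'text': 'tune', 'entity': 'music_item'},
                  {'text': ' to the '},
                  {'text': 'Rage Radio', 'entity': 'playlist'},
                  {'text': ' playlist.'}]}]

# Join once per entry, then append a single space to mimic the loop's output.
lst = [' '.join(part['text'] for part in item['data']) + ' '
       for item in data]
print(lst)
# ['Add the  tune  to the  Rage Radio  playlist. ']
```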