I am attempting to load a JSON file into a dataframe. My currently buggy code creates the JSON file as follows:
import json
import subprocess

fname = 'python.json'
with open(fname, 'r') as f, open('sentiment.json', 'w') as s:
    for line in f:
        tweet = json.loads(line)
        # Grab the tweet text
        tweet_words = tweet['text']
        # POST the text to the sentiment API via curl and capture the response
        output = subprocess.check_output(['curl', '-d', "text=" + tweet_words.encode('utf-8'), 'http://text-processing.com/api/sentiment/'])
        s.write(output + "\n")
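For reference, I believe the same request could be made with the requests library instead of curl (a rough sketch on my part, assuming requests is installed; this is not what my current code does):

import json

import requests

fname = 'python.json'
with open(fname, 'r') as f, open('sentiment.json', 'w') as s:
    for line in f:
        tweet = json.loads(line)
        # Post the tweet text as form data and parse the JSON response
        response = requests.post('http://text-processing.com/api/sentiment/',
                                 data={'text': tweet['text']})
        # Re-serialize so each line of the output file is one valid JSON object
        s.write(json.dumps(response.json()) + '\n')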
The curl output requested from the text-processing.com API is written into 'sentiment.json', one response per line. I then load that JSON using:
def load_json(file, skip):
    with open(file, 'r') as f:
        read = f.readlines()
        json_data = (json.loads(line) for i, line in enumerate(read) if i % skip == 0)
        return json_data
And then construct the dataframe using:
import pandas as pd

sentiment_df = load_json('sentiment.json', 1)
data = {'positive': [], 'negative': [], 'neutral': []}
for s in sentiment_df:
    data['positive'].append(s['probability']['pos'])
    data['negative'].append(s['probability']['neg'])
    data['neutral'].append(s['probability']['neutral'])
df = pd.DataFrame(data)
Error: ValueError: No JSON object could be decoded
I browsed through several related questions and, based on the answer here from WoodrowShigeru, I suspect it may have something to do with the 'utf-8' encoding in the first block of code.
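To narrow it down, here is a quick check I can run (a minimal sketch, assuming the responses are in 'sentiment.json'); it prints the first line that json.loads rejects, which should show whether the problem is the encoding or a line that is not JSON at all:

import json

with open('sentiment.json', 'r') as f:
    for i, line in enumerate(f):
        try:
            json.loads(line)
        except ValueError as e:
            # Show the offending line and the parser's complaint
            print("Line %d failed to parse: %r (%s)" % (i, line, e))
            break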
Does anyone know a good fix, or can you at least point me in the right direction? Thanks guys!
Edit 1
Your screenshot is not valid JSON, as a container must hold all comma-separated line items. However, the challenge is that your command-line call returns a string, output, which you then write to a text file. You need to create a list of dictionaries that is then dumped to a JSON file with json.dumps(). Consider doing so by casting each command-line string into a dictionary with ast.literal_eval() during the first text-file read, then appending each dictionary to a list. From there, read the JSON file into a pandas dataframe with json_normalize. Below uses example data:
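My reconstruction of that suggestion (a minimal sketch: the output file name 'sentiment_clean.json' and the example probability values are placeholders of my own, and pd.json_normalize needs pandas 1.0+; older versions expose it as pandas.io.json.json_normalize):

import ast
import json

import pandas as pd

# Step 1: turn each raw curl response line into a dictionary
records = []
with open('sentiment.json', 'r') as f:
    for line in f:
        line = line.strip()
        if line:  # skip blank lines
            records.append(ast.literal_eval(line))

# Step 2: dump the list of dictionaries as one valid JSON array
with open('sentiment_clean.json', 'w') as out:
    out.write(json.dumps(records))

# Step 3: flatten the nested 'probability' dicts into columns with json_normalize
# (example data shaped like the API responses)
example = [
    {'probability': {'pos': 0.7, 'neg': 0.2, 'neutral': 0.1}},
    {'probability': {'pos': 0.1, 'neg': 0.8, 'neutral': 0.1}},
]
df = pd.json_normalize(example)
print(df)  # columns: probability.pos, probability.neg, probability.neutral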