I am analysing text using IBM Watson's Tone Analyzer and I am trying to extract all the information relating to the sentence tone (e.g. sentence_id, text, tones, tone_id, tone_name, score) and add it to a dataframe (with columns sentence_id, text, tones, tone_id, score and tone_name). This is a sample of the output I have:
> [{'document_tone': {'tones': [{'score': 0.551743,
'tone_id': 'analytical',
'tone_name': 'Analytical'}]},
'sentences_tone': [{'sentence_id': 0,
'text': '@jozee25 race is the basis on which quotas are implemented.',
'tones': []},
{'sentence_id': 1, 'text': 'helloooooo', 'tones': []}]},
{'document_tone': {'tones': []}},
{'document_tone': {'tones': [{'score': 0.802429,
'tone_id': 'analytical',
'tone_name': 'Analytical'},
{'score': 0.60167, 'tone_id': 'confident', 'tone_name': 'Confident'}]},
'sentences_tone': [{'sentence_id': 0,
'text': '@growawaysa @cricketandre i have the answer on top yard from dpw:it is not currently "surplus to govt requirements".it is still being used for garaging until a new facility is ready in maitland.the',
'tones': [{'score': 0.631014,
'tone_id': 'analytical',
'tone_name': 'Analytical'}]},
{'sentence_id': 1,
'text': 'cost of the housing options will of course depend on prospects for cross subsidisation.',
'tones': [{'score': 0.589295,
'tone_id': 'analytical',
'tone_name': 'Analytical'},
{'score': 0.509368, 'tone_id': 'confident', 'tone_name': 'Confident'}]}]},
{'document_tone': {'tones': [{'score': 0.58393,
'tone_id': 'tentative',
'tone_name': 'Tentative'},
{'score': 0.641954, 'tone_id': 'analytical', 'tone_name': 'Analytical'}]}},
{'document_tone': {'tones': [{'score': 0.817073,
'tone_id': 'joy',
'tone_name': 'Joy'},
{'score': 0.920556, 'tone_id': 'analytical', 'tone_name': 'Analytical'},
{'score': 0.808202, 'tone_id': 'tentative', 'tone_name': 'Tentative'}]},
'sentences_tone': [{'sentence_id': 0,
'text': 'thanks @khayadlangaand colleagues for the fascinating tour yesterday.really',
'tones': [{'score': 0.771305, 'tone_id': 'joy', 'tone_name': 'Joy'},
{'score': 0.724236, 'tone_id': 'analytical', 'tone_name': 'Analytical'}]},
{'sentence_id': 1,
'text': 'eyeopening and i learnt a lot.',
'tones': [{'score': 0.572756, 'tone_id': 'joy', 'tone_name': 'Joy'},
{'score': 0.842108, 'tone_id': 'analytical', 'tone_name': 'Analytical'},
{'score': 0.75152, 'tone_id': 'tentative', 'tone_name': 'Tentative'}]}]},
This is the code I have written to get this output:
result = []
for i in helen['Tweets']:
    tone_analysis = tone_analyzer.tone(
        {'text': i},
        'application/json'
    ).get_result()
    result.append(tone_analysis)
First of all, as the JSON you posted is not well-formed, I am using the sample JSON from the Tone Analyzer API reference, available here.
Using that JSON and Pandas json_normalize, here's the code I came up with:
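As a minimal sketch of this approach (not necessarily identical to the code in the REPL), the function below flattens a list of results shaped like the sample in the question into one row per sentence and tone. The function name `sentence_tones_to_frame` and the extra `doc_index` column are hypothetical, added for illustration; `doc_index` is there because `sentence_id` restarts at 0 for every analysed text. It assumes pandas 1.0 or later, where `json_normalize` is available as `pd.json_normalize`.

```python
import pandas as pd


def sentence_tones_to_frame(results):
    """Flatten per-sentence tones into one row per (sentence, tone) pair."""
    sentences = []
    for doc_index, doc in enumerate(results):
        # Some results in the sample have no 'sentences_tone' key at all.
        for sentence in doc.get("sentences_tone", []):
            # Assumption: sentences with an empty 'tones' list are skipped.
            if sentence["tones"]:
                # doc_index is a hypothetical helper column (position in results).
                sentences.append({"doc_index": doc_index, **sentence})

    columns = ["doc_index", "sentence_id", "text", "tone_id", "tone_name", "score"]
    if not sentences:
        return pd.DataFrame(columns=columns)

    # Each entry of 'tones' becomes a row; the meta fields are repeated per row.
    df = pd.json_normalize(
        sentences,
        record_path="tones",
        meta=["doc_index", "sentence_id", "text"],
    )
    return df[columns]


if __name__ == "__main__":
    # Tiny hand-made input in the same shape as the sample in the question.
    sample = [
        {"document_tone": {"tones": []}},
        {
            "document_tone": {"tones": []},
            "sentences_tone": [
                {"sentence_id": 0, "text": "first", "tones": []},
                {
                    "sentence_id": 1,
                    "text": "second",
                    "tones": [
                        {"score": 0.59, "tone_id": "analytical", "tone_name": "Analytical"},
                        {"score": 0.51, "tone_id": "confident", "tone_name": "Confident"},
                    ],
                },
            ],
        },
    ]
    print(sentence_tones_to_frame(sample))
```

With the loop from the question, this would be called as `sentence_tones_to_frame(result)`. If you also want sentences without any detected tone, one option is to keep them in a separate frame and merge on `doc_index` and `sentence_id`.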
The output dataframe will be
I have also created a REPL so you can change the input and run the code in the browser: https://repl.it/@aficionado/DarkturquoiseUnnaturalDistributeddatabase
Refer to this Kaggle link to learn more about flattening JSON in Python using Pandas.