How to transform JSON International Financial Statistics into pandas data frame

Question

465 views · Asked by Jorge Alonso on 31 December 2024 at 10:51

I am struggling with data from the International Monetary Fund, which is in JSON format. After inspecting some of the posts, I couldn't figure out how to do it.

What I tried

import requests
import pandas as pd

# These are the variables I want to have as columns, plus setting a time index
var = ['NGDP_XDC', 'NCP_XDC', 'NCGG_XDC', 'NFI_XDC', 'NINV_XDC', 'NX_XDC',
       'NM_XDC', 'NSDGDP_XDC', 'NGDP_R_K_IX', 'NGDP_D_IX']

# URL for the IMF JSON RESTful web service, IFS database
base = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/IFS/'
period = 'A'
country = 'MX'
time = '?startPeriod=1970&endPeriod=2019'

# Join the indicator codes with '+' so they are all requested in one call
indicators = '+'.join(var)

# Get data from the above URL using the requests package
url = base + period + '.' + country + '.' + indicators + '.' + time

response = requests.get(url)
dictr = response.json()

So far so good. However, this is the step I am struggling with:


flat = dictr['CompactData']['DataSet']['Series']

temp = pd.json_normalize(flat)
temp = temp.drop(columns=['@FREQ', '@REF_AREA', '@UNIT_MULT', '@BASE_YEAR'])

I was expecting a flat table that I could pivot at will. However, this is what I get:


    @INDICATOR @TIME_FORMAT                                                Obs
0     NINV_XDC          P1Y  [{'@TIME_PERIOD': '1970', '@OBS_VALUE': '37.21...
1       NX_XDC          P1Y  [{'@TIME_PERIOD': '1970', '@OBS_VALUE':

I have no clue how to transform this into:

year variable1 ... variableN

1970    10     ...    45
1980    20     ...    12
. 
.
.
2019    15     ...    10


There are 2 answers

**baduker** · Answer 1 · 2020-10-10T17:54:31+00:00

Maybe this will nudge you in the right direction.

The value of ['CompactData']['DataSet']['Series'] is a list of dicts, and each of those dicts holds, under its 'Obs' key, the list of observation dicts that you're after.

So you have to flatten this:

series = response['CompactData']['DataSet']['Series']
flat = [item for sublist in [i['Obs'] for i in series] for item in sublist]
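To see what this flattening does without calling the API, here is a small sketch on a hand-made stand-in for the 'Series' value. The two series and their numbers are made up for illustration and are not real IMF data. It also shows a variant, not part of the code above, that copies the indicator code into each observation so it is not lost.

```python
# Hypothetical stand-in for response['CompactData']['DataSet']['Series'];
# the values are illustrative, not real IMF data.
series = [
    {'@INDICATOR': 'NX_XDC',
     'Obs': [{'@TIME_PERIOD': '1970', '@OBS_VALUE': '1.5'},
             {'@TIME_PERIOD': '1971', '@OBS_VALUE': '2.5'}]},
    {'@INDICATOR': 'NM_XDC',
     'Obs': [{'@TIME_PERIOD': '1970', '@OBS_VALUE': '3.0'}]},
]

# Same result as the nested comprehension: one flat list of observation dicts.
flat = [obs for s in series for obs in s['Obs']]

# Variant that keeps the indicator code next to each observation.
tagged = [{'@INDICATOR': s['@INDICATOR'], **obs}
          for s in series for obs in s['Obs']]
```

Note that flat only carries '@TIME_PERIOD' and '@OBS_VALUE', so after flattening there is no way to tell which variable a row came from; tagged keeps that information.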

Putting it all together:

import requests
import pandas as pd

# These are the variables I want to have as columns, plus setting a time index
var = [
    'NGDP_XDC', 'NCP_XDC', 'NCGG_XDC', 'NFI_XDC', 'NINV_XDC', 'NX_XDC',
    'NM_XDC', 'NSDGDP_XDC', 'NGDP_R_K_IX', 'NGDP_D_IX',
]

base = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/IFS/'
period = 'A'
country = 'MX'
time = '?startPeriod=1970&endPeriod=2019'

# Get data from the above URL using the requests package
url = f"{base}{period}.{country}.{'+'.join(var)}.{time}"
response = requests.get(url).json()

series = response['CompactData']['DataSet']['Series']
flat = [item for sublist in [i['Obs'] for i in series] for item in sublist]
print(pd.DataFrame(flat))

Output:

    @TIME_PERIOD        @OBS_VALUE @OBS_STATUS
0           1970   37.210816346586         NaN
1           1971  35.6027864361386         NaN
2           1972   36.123021665698         NaN
3           1973  50.9603299629663         NaN
4           1974   80.992068185601         NaN
..           ...               ...         ...
[499 rows x 3 columns]
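The frame above no longer says which indicator each row belongs to, so it cannot be pivoted into one column per variable. A minimal sketch of one way to get there, assuming every 'Obs' value is a list of dicts as in the output shown in the question: pd.json_normalize with record_path and meta keeps '@INDICATOR' on each row, and DataFrame.pivot then spreads the indicators into columns. The sample data is hand-made for illustration and is not real IMF data; with the real response, series would be response['CompactData']['DataSet']['Series'].

```python
import pandas as pd

# Hypothetical stand-in for response['CompactData']['DataSet']['Series'];
# the values are illustrative, not real IMF data.
series = [
    {'@INDICATOR': 'NX_XDC', '@TIME_FORMAT': 'P1Y',
     'Obs': [{'@TIME_PERIOD': '1970', '@OBS_VALUE': '1.5'},
             {'@TIME_PERIOD': '1971', '@OBS_VALUE': '2.5'}]},
    {'@INDICATOR': 'NM_XDC', '@TIME_FORMAT': 'P1Y',
     'Obs': [{'@TIME_PERIOD': '1970', '@OBS_VALUE': '3.0'}]},
]

# One row per observation, with the parent series' indicator code attached.
long = pd.json_normalize(series, record_path='Obs', meta=['@INDICATOR'])

# The API delivers everything as strings, so convert before pivoting.
long['@TIME_PERIOD'] = long['@TIME_PERIOD'].astype(int)
long['@OBS_VALUE'] = pd.to_numeric(long['@OBS_VALUE'])

# Years as the index, one column per indicator; missing years become NaN.
wide = long.pivot(index='@TIME_PERIOD', columns='@INDICATOR', values='@OBS_VALUE')
print(wide)
```

Because the pivot aligns on the year found in the data, a series with fewer observations than the others simply gets NaN for the missing years.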

**Jorge Alonso** · Answer 2 · 2020-10-11T15:47:12+00:00

I implemented your nudge in a much less elegant way, as I could not understand how to retrieve the variable codes and the time index from your procedure. This also works:

url = f"{base}{period}.{country}.{'+'.join(var)}.{time}"
response = requests.get(url).json()
series = response['CompactData']['DataSet']['Series']

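# Empty frame indexed by year; each indicator is merged onto it as a column
nipa = pd.DataFrame(index=range(1970, 2020))

for s in series:
    temp = pd.DataFrame(s['Obs'])
    # Index by the year reported in the data, since a series can have missing years
    temp.index = temp['@TIME_PERIOD'].astype(int)
    # Name the column after the series' own code; the response order can differ from var
    temp = temp[['@OBS_VALUE']].astype(float)
    temp = temp.rename(columns={'@OBS_VALUE': s['@INDICATOR']})
    nipa = pd.merge(nipa, temp, how='left', left_index=True, right_index=True)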
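The loop above can also be written as a single pd.concat call over a dict of per-indicator Series, which aligns them on the year automatically. This is a sketch under the same assumption that each 'Obs' value is a list of dicts; the sample data is made up for illustration and stands in for response['CompactData']['DataSet']['Series'].

```python
import pandas as pd

# Hypothetical stand-in for response['CompactData']['DataSet']['Series'];
# the values are illustrative, not real IMF data.
series = [
    {'@INDICATOR': 'NX_XDC',
     'Obs': [{'@TIME_PERIOD': '1970', '@OBS_VALUE': '1.5'},
             {'@TIME_PERIOD': '1971', '@OBS_VALUE': '2.5'}]},
    {'@INDICATOR': 'NM_XDC',
     'Obs': [{'@TIME_PERIOD': '1970', '@OBS_VALUE': '3.0'}]},
]

# Build one numeric Series per indicator, indexed by the reported year.
columns = {
    s['@INDICATOR']: (pd.DataFrame(s['Obs'])
                      .set_index('@TIME_PERIOD')['@OBS_VALUE']
                      .astype(float))
    for s in series
}

# Dict keys become column names; rows are aligned on the union of years.
nipa = pd.concat(columns, axis=1)
nipa.index = nipa.index.astype(int)
nipa = nipa.sort_index()
```

Taking the column name from each series' '@INDICATOR' avoids relying on the response returning the series in the same order as var.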
