I have the following XML response from http://dataportal.ins.tn/en/API:

https://www.dropbox.com/s/8x6tpbpd6m8j1f9/DimensionElements_response_2019-05-09_11-46.xml?dl=0


I use the code below to convert it to a DataFrame:

import requests
import xml.etree.ElementTree as ET
import pandas as pd

Dimension_Id = 'OBJ5263019'
Language = '1033'

Request_URL = 'http://dataportal.ins.tn/WebApi/GetDimensionElements'
Method_Post_Body = "<QueryMessage lcid='" + Language + "'> <DataWhere> <DimensionId>" + Dimension_Id + "</DimensionId> </DataWhere> </QueryMessage>"

Post_Response = requests.post(Request_URL, data=Method_Post_Body, headers={'Content-type': 'text/xml'})
XTree = Post_Response.content
XRoot = ET.XML(XTree)

XML_List = []
for Tag_1 in XRoot[1]:
    # iter() walks Tag_1 and all of its descendants in document order
    for Child in Tag_1.iter():
        XML_Dict = Child.attrib
        XML_List.append(XML_Dict)

# One row per element, one column per attribute
XML_Dimension_Items = pd.DataFrame(XML_List)

I want to generate a parent for each Element, using the "KEY" attribute of the enclosing Element as the parent.

In the example above:

The first Element does not have a parent, so I want to keep "Parent" = ''.

The second Element (KEY="27932019") has multiple sub-elements, so KEY="27932019" should be the parent code for its child elements. This should work for all nested elements.

Is there any way to achieve this?

1 Answer

SPy On Best Solutions

We can get the parent of each element with lxml, whose elements provide getparent():

import pandas as pd
import requests
from lxml import etree
from io import StringIO, BytesIO

Dimension_Id = 'RDS_DICT_REGIONS_NSO'
Language = '1033'

Request_URL = 'http://dataportal.ins.tn/WebApi/GetDimensionElements'
Method_Post_Body = "<QueryMessage lcid='" + Language + "'> <DataWhere> <DimensionId>" + Dimension_Id + "</DimensionId> </DataWhere> </QueryMessage>"

Post_Response = requests.post(Request_URL, data=Method_Post_Body, headers={'Content-type': 'text/xml'})

XRoot_P = etree.fromstring(Post_Response.content)

XML_List = []
for Tag_1 in XRoot_P[1]:
    for Child in Tag_1.iter():
        # Parent code: first identifying attribute found on the parent, '' if none
        Parent_Attrib = Child.getparent().attrib
        if 'CODE' in Parent_Attrib:
            Parent = Parent_Attrib['CODE']
        elif 'C_CODE' in Parent_Attrib:
            Parent = Parent_Attrib['C_CODE']
        elif 'KEY' in Parent_Attrib:
            Parent = Parent_Attrib['KEY']
        else:
            Parent = ''

        # Same lookup order for the element's own code
        if 'CODE' in Child.attrib:
            Col = 'CODE'
        elif 'C_CODE' in Child.attrib:
            Col = 'C_CODE'
        elif 'KEY' in Child.attrib:
            Col = 'KEY'
        else:
            continue  # no identifying attribute on this element, so skip it

        XML_Dict = {'CODE': Child.attrib[Col], 'Parent': Parent}
        XML_List.append(XML_Dict)

XML_Dimension_Parent = pd.DataFrame(XML_List)
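
If installing lxml is not an option, roughly the same result can be sketched with the standard-library ElementTree used in the question. Its elements have no getparent(), so the sketch first builds a child-to-parent lookup. The sample XML, the NAME attribute, and the helper names first_id and parent_rows are hypothetical and only mimic the nested Element/KEY layout described in the question; the real response may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the nested Element/KEY structure in the question.
SAMPLE_XML = """
<Response>
  <Header/>
  <Elements>
    <Element KEY="1" NAME="Total"/>
    <Element KEY="27932019" NAME="North">
      <Element KEY="100" NAME="Tunis"/>
      <Element KEY="200" NAME="Ariana">
        <Element KEY="210" NAME="Raoued"/>
      </Element>
    </Element>
  </Elements>
</Response>
"""

# Attribute names checked in order, as in the answer above.
ID_ATTRIBS = ('CODE', 'C_CODE', 'KEY')


def first_id(element):
    """Return the first identifying attribute value found, or '' if none."""
    if element is None:
        return ''
    for name in ID_ATTRIBS:
        if name in element.attrib:
            return element.attrib[name]
    return ''


def parent_rows(root):
    """Build one {'CODE', 'Parent'} row per element that has an identifier."""
    # ElementTree elements do not store their parent, so map child -> parent once.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    rows = []
    for element in root.iter():
        code = first_id(element)
        if code:
            rows.append({'CODE': code, 'Parent': first_id(parent_of.get(element))})
    return rows


rows = parent_rows(ET.fromstring(SAMPLE_XML))
for row in rows:
    print(row)
```

Passing rows to pd.DataFrame(rows) should give the same two-column layout as XML_Dimension_Parent. Top-level elements get Parent = '' because their container carries none of the identifying attributes.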