I'm setting up a Python script that asks for a list of input XML files, all in the same format, and reads out a specific line from each one.
Everything works as I want it to, but I get an error when reading from the XML file because of the content of the file itself.
I have got the script to work by editing the XML file, but that is not a solution for me because I need the script to run on thousands of files.
Here is the code I'm using:
    import os
    import tkinter as tk
    from tkinter import filedialog
    import xml.etree.ElementTree as ET

    root = tk.Tk()
    root.withdraw()
    file_path = filedialog.askopenfilenames()
    tup = 0
    count = len(file_path)
    for i in range(len(file_path)):
        filename = os.path.basename(file_path[tup])
        print('file =', os.path.basename(' '.join(file_path)))
        tree = ET.parse(file_path[tup])
        root = tree.getroot()
        for child in root:
            data = child.tag
            print(data)
            for data in root.findall(data):
                name = data.find('subdata2').text
                print('ID =', name)
        tup += 1
And here is an example of the XML:

    <?xml version="1.0"?>
    <Data xmlns="link">
        <subdata1 id = "something">
            <subdata2>data
                <subdata3>data</subdata3>
            </subdata2>
        </subdata1>
    </Data>
The problem comes from the namespace attached to the root element (xmlns="link"). It changes the tag of subdata1 from

    subdata1

to

    {link}subdata1

and this then changes the output from:

    ID = data

to:

    Traceback (most recent call last):
      File "debug.py", line 25, in <module>
        name = data.find('subdata2').text
    AttributeError: 'NoneType' object has no attribute 'text'
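Here is a minimal sketch that reproduces what I'm seeing without the file dialog. It assumes the example XML is inlined as a string; the names XML_TEXT, child_tags and unqualified are only for illustration and are not part of my real script:

```python
import xml.etree.ElementTree as ET

# Same structure as the example file, inlined so no file dialog is needed
# (XML_TEXT is a hypothetical stand-in for one of the real input files).
XML_TEXT = """<?xml version="1.0"?>
<Data xmlns="link">
    <subdata1 id="something">
        <subdata2>data
            <subdata3>data</subdata3>
        </subdata2>
    </subdata1>
</Data>"""

root = ET.fromstring(XML_TEXT)

# The child tags come back with the namespace folded into the name.
child_tags = [child.tag for child in root]
print(child_tags)  # ['{link}subdata1']

# Looking up the plain tag name finds nothing, so calling .text on the
# result is what raises the AttributeError in the traceback.
unqualified = root[0].find('subdata2')
print(unqualified)  # None
```

With the xmlns attribute removed from the root element, the same lookup returns the subdata2 element, which matches what happens when I edit the file by hand.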
Is there another way of extracting the data from this XML file that doesn't involve modifying the file itself?