I am using Python to extract coordinate data from a non-standard XML file.
The issue is that some of the coordinates
tags contain XML processing instructions, for example:
<?xml version="1.0" encoding="UTF-8"?>
<kml>
<Document>
<Folder>
<Placemark id="CAR1">
<coordinates>
28.236931382585154893,36.380279408060658852,200<?AtTime 0.000000?>
28.236862947306430982,36.380324400967296583,200<?AtTime 2.333333?>
28.236862521767100986,36.380323843439811071,200<?AtTime 2.500000?>
</coordinates>
</Placemark>
<Placemark id="CAR2">
<coordinates>
28.236931382585154893,36.380279408060658852,200
28.236862947306430982,36.380324400967296583,200
28.236862521767100986,36.380323843439811071,200
</coordinates>
</Placemark>
</Folder>
</Document>
</kml>
Below is my test code:
from xml.dom import minidom
myFile = "test.xml"
mydoc = minidom.parse(myFile)
items = mydoc.getElementsByTagName('Placemark')
for elem in items:
    print("coordinates ",elem.attributes['id'].value)
    coorTag = elem.getElementsByTagName('coordinates')
    if len(coorTag) > 0:
        for elem in coorTag:
            theData = elem.firstChild.nodeValue
            print(theData)
This generates the following output.
coordinates CAR1
28.236931382585154893,36.380279408060658852,200
coordinates CAR2
28.236931382585154893,36.380279408060658852,200
28.236862947306430982,36.380324400967296583,200
28.236862521767100986,36.380323843439811071,200
How can I obtain all the coordinate data even when the content of a given tag contains processing instructions?
Is this just a limitation of minidom, or would an alternative Python library be a better fit?
The output I am looking for is:
coordinates CAR1
28.236931382585154893,36.380279408060658852,200<?AtTime 0.000000?>
28.236862947306430982,36.380324400967296583,200<?AtTime 2.333333?>
28.236862521767100986,36.380323843439811071,200<?AtTime 2.500000?>
coordinates CAR2
28.236931382585154893,36.380279408060658852,200
28.236862947306430982,36.380324400967296583,200
28.236862521767100986,36.380323843439811071,200
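For reference, here is a minimal sketch of one possible approach that stays with minidom. It relies on the fact that `firstChild` returns only the first text node, which ends where the first processing instruction begins, so the sketch walks every child node of `<coordinates>` instead. The helper name `coordinates_text` and the inlined `SAMPLE` string (a shortened copy of the file above, without the XML declaration) are assumptions made for illustration, and the sketch assumes the processing instructions are direct children of `<coordinates>`.

```python
from xml.dom import minidom, Node

# Shortened copy of the sample file, inlined so the sketch is self-contained.
SAMPLE = """<kml>
<Document>
<Folder>
<Placemark id="CAR1">
<coordinates>
28.236931382585154893,36.380279408060658852,200<?AtTime 0.000000?>
28.236862947306430982,36.380324400967296583,200<?AtTime 2.333333?>
</coordinates>
</Placemark>
<Placemark id="CAR2">
<coordinates>
28.236931382585154893,36.380279408060658852,200
28.236862947306430982,36.380324400967296583,200
</coordinates>
</Placemark>
</Folder>
</Document>
</kml>"""


def coordinates_text(coord_elem):
    """Join all text nodes and processing instructions under one element.

    Hypothetical helper: it only looks at direct children of coord_elem.
    """
    parts = []
    for node in coord_elem.childNodes:
        if node.nodeType == Node.TEXT_NODE:
            parts.append(node.data)
        elif node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            # Rebuild the instruction as it appears in the file: <?target data?>
            parts.append("<?{} {}?>".format(node.target, node.data))
    # Drop the newlines that surround the coordinate block.
    return "".join(parts).strip()


doc = minidom.parseString(SAMPLE)
for placemark in doc.getElementsByTagName("Placemark"):
    print("coordinates", placemark.getAttribute("id"))
    for coords in placemark.getElementsByTagName("coordinates"):
        print(coordinates_text(coords))
```

To read from the real file, `minidom.parseString(SAMPLE)` would be replaced with `minidom.parse("test.xml")`. If only the timestamps are needed, `node.data` on a processing instruction node holds the part after the target name.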