Parsing xml with python (find tags with specific text)

I have been tasked with processing an XML file to find specific elements and export them to a CSV file.

I am specifically having trouble with some information that is kept in the same tags:

<name>text</name>
<value>value</value>

Each name tag contains different text, and I only need some of them. I have tried looping through the file with this code:

    try:
        descr = member.find('.//name').text
        if descr == 'description':
            plugin.append(descr)
    except AttributeError:
        descr = 'Unknown'
        plugin.append(descr)

But it only returns 'Unknown'.

My whole code is as follows (not finished):

import xml.etree.ElementTree as ET
import csv

tree = ET.parse('plugins.xml')
root = tree.getroot()

nessus_out = open('/home/rj/Documents/python/nessus_out.csv', 'w')

csvwriter = csv.writer(nessus_out)

for member in root.findall('nasl'):
    plugin = []

    id = member.find('script_id').text
    plugin.append(id)

    name = member.find('script_name').text
    plugin.append(name)

    family = member.find('script_family').text
    plugin.append(family)

    #for each in member.iterfind('nasl'):
    try:
        solution = member.xpath('.//name/text()')
        if solution == 'solution':
            plugin.append(solution)
    except AttributeError:
        solution = 'Unknown'
        plugin.append(solution)
    csvwriter.writerow(plugin)
nessus_out.close()

The ultimate goal is to search for "solution" and get the corresponding value from its tag.
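One way to approach that goal, sketched here under assumptions rather than offered as a definitive fix: collect every name/value pair of a plugin into a dictionary once, then look up the names that are needed. Note that elements from xml.etree.ElementTree have no xpath() method (that belongs to lxml), so the call in the code above raises AttributeError and the except branch always appends 'Unknown'. The helper name attribute_map and the sample XML, including the solution text, are hypothetical and only mirror the structure described in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data that mirrors the structure from the question.
SAMPLE = """
<nasl_plugins>
  <nasl>
    <script_id>101028</script_id>
    <attributes>
      <attribute>
        <name>solution</name>
        <value>Update the affected xen package.</value>
      </attribute>
      <attribute>
        <name>stig_severity</name>
        <value>I</value>
      </attribute>
    </attributes>
  </nasl>
</nasl_plugins>
"""


def attribute_map(nasl):
    """Map the text of each <name> to the text of its sibling <value>."""
    result = {}
    for attribute in nasl.iter('attribute'):
        name = attribute.findtext('name')
        if name is not None:
            value = attribute.findtext('value') or ''
            result[name.strip()] = value.strip()
    return result


root = ET.fromstring(SAMPLE)
for member in root.findall('nasl'):
    attrs = attribute_map(member)
    # dict.get supplies the fallback, so no try/except is needed.
    solution = attrs.get('solution', 'Unknown')
    severity = attrs.get('stig_severity', 'Unknown')
    description = attrs.get('description', 'Unknown')
```

Building the dictionary once per plugin keeps each lookup to a single line and avoids walking the attribute list again for every wanted name.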

The xml structure is as follows:

nasl_plugins
nasl_plugins/nasl
nasl_plugins/nasl/filename
nasl_plugins/nasl/script_id
nasl_plugins/nasl/script_name
nasl_plugins/nasl/script_family
nasl_plugins/nasl/attributes/attribute/name
nasl_plugins/nasl/attributes/attribute/value
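Given that layout, the limited XPath support in ElementTree can select a value by the text of its sibling name. The following is a minimal sketch, assuming the name text matches exactly (no surrounding whitespace); the helper name lookup and the sample XML are hypothetical. It searches with .//attribute so it works whether or not the attribute elements are wrapped in attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data following the paths listed in the question.
SAMPLE = """<nasl_plugins>
  <nasl>
    <script_id>101028</script_id>
    <attributes>
      <attribute><name>plugin_type</name><value>local</value></attribute>
      <attribute><name>cvss_base_score</name><value>10.0</value></attribute>
    </attributes>
  </nasl>
</nasl_plugins>"""


def lookup(nasl, wanted, default='Unknown'):
    """Return the <value> text of the <attribute> whose <name> equals wanted."""
    # The predicate [name='...'] keeps only <attribute> elements that have
    # a <name> child with exactly this text.
    path = f".//attribute[name='{wanted}']/value"
    return nasl.findtext(path, default)


root = ET.fromstring(SAMPLE)
nasl = root.find('nasl')
score = lookup(nasl, 'cvss_base_score')
missing = lookup(nasl, 'solution')  # not present, so the default is returned
```

This form reads well for one or two lookups; for many names per plugin, a single pass that builds a dictionary is likely cheaper.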

For Daniel:

XML snippet:

<nasl>
<filename>fedora_2017-c3149b5fcb.nasl</filename>
<script_id>101028</script_id>
<script_name>Fedora 25 : xen (2017-c3149b5fcb)</script_name>
<script_version>$Revision: 1.5 $</script_version>
<script_copyright>This script is Copyright (C) 2017-2018 Tenable Network Security, Inc.</script_copyright>
<script_family>Fedora Local Security Checks</script_family>
<cves>
 <cve>CVE-2017-10911</cve>
 <cve>CVE-2017-10912</cve>
 <cve>CVE-2017-10913</cve>
 <cve>CVE-2017-10915</cve>
 <cve>CVE-2017-10916</cve>
 <cve>CVE-2017-10917</cve>
 <cve>CVE-2017-10918</cve>
 <cve>CVE-2017-10919</cve>
 <cve>CVE-2017-10920</cve>
 <cve>CVE-2017-10923</cve>
</cves>
<bids>
</bids>
<xrefs>
 <xref>FEDORA:2017-c3149b5fcb</xref>
 <xref>IAVB:2017-B-0074</xref>
</xrefs>
<dependencies>
 <dependency>ssh_get_info.nasl</dependency>
</dependencies>
<required_keys>
 <key>Host/local_checks_enabled</key>
 <key>Host/RedHat/release</key>
 <key>Host/RedHat/rpm-list</key>
</required_keys>
<attributes>
<attribute>
  <name>plugin_type</name> 
  <value>local</value> 
</attribute> 
<attribute> 
  <name>plugin_modification_date</name> 
  <value>2018/02/02</value> 
</attribute> 
<attribute> 
  <name>stig_severity</name> 
  <value>I</value> 
</attribute> 
<attribute> 
  <name>cvss_base_score</name> 
  <value>10.0</value> 
</attribute> 
</attributes> 

What I'm looking for is the values of stig_severity, cvss_base_score and some others as well. My reasoning was to search for the name and then move down one line to get the value. As for the CSV, I need one line per plugin, in this format: id,name,family,solution,description,synopsis,cvss_base_score,plugin_type,stig_severity and then the values for the next plugin on the next line.
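The one-row-per-plugin layout could be produced along these lines. This is a sketch under assumptions, not a finished exporter: the helper name plugin_row and the trimmed sample XML are hypothetical, io.StringIO stands in for the real output file, and the score column uses cvss_base_score as it is spelled in the XML snippet. Names that a plugin does not carry are written as 'Unknown'.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Direct children of <nasl>, then the names looked up among the attributes.
FIXED = ['script_id', 'script_name', 'script_family']
NAMED = ['solution', 'description', 'synopsis',
         'cvss_base_score', 'plugin_type', 'stig_severity']

# Trimmed sample based on the snippet in the question.
SAMPLE = """<nasl_plugins>
  <nasl>
    <script_id>101028</script_id>
    <script_name>Fedora 25 : xen (2017-c3149b5fcb)</script_name>
    <script_family>Fedora Local Security Checks</script_family>
    <attributes>
      <attribute><name>plugin_type</name><value>local</value></attribute>
      <attribute><name>stig_severity</name><value>I</value></attribute>
      <attribute><name>cvss_base_score</name><value>10.0</value></attribute>
    </attributes>
  </nasl>
</nasl_plugins>"""


def plugin_row(nasl):
    """Build one CSV row: the fixed columns followed by the named values."""
    row = [nasl.findtext(tag, 'Unknown') for tag in FIXED]
    attrs = {a.findtext('name'): a.findtext('value', '')
             for a in nasl.iter('attribute')}
    row += [attrs.get(name, 'Unknown') for name in NAMED]
    return row


out = io.StringIO()  # stand-in for open(path, 'w', newline='')
writer = csv.writer(out)
for nasl in ET.fromstring(SAMPLE).findall('nasl'):
    writer.writerow(plugin_row(nasl))
```

Because every row is built from the same two column lists, each plugin always yields the same number of fields in the same order, even when some names are missing.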
