Trouble Parsing MusicXML Files from File Paths in Python

187 views Asked by At

I'm working on a Python project where I need to parse MusicXML files. So far, I've been manually copying the content from these files into a string variable, and the code works perfectly. However, I'd like to make my script more user-friendly by allowing it to accept file paths instead.

Here's how I'm attempting to parse the file from a given file path:

xml_file_path = r"C:\Users\myfilepath.musicxml"
tree = ET.parse(xml_file_path)
root = tree.getroot()
xml_content = ET.tostring(root, encoding="utf-8").decode("utf-8")

The problem I'm encountering is that when I run my code with the file path, it doesn't seem to work as expected. I'm not getting any error messages, but the parsing results are not what I'm looking for, and they often seem incomplete.

I would really appreciate it if someone could help me understand what might be going wrong here, or if there's a better way to directly parse MusicXML files from file paths in Python. Any tips, suggestions, or guidance would be incredibly helpful.

Thanks so much for your time!

1

There are 1 answers

4
Hermann12 On

You don’t explain, what you like to parse or what you like to do with the content?

import xml.etree.ElementTree as ET
import os 

def xml_parse(filename):
    tree = ET.parse(filename)
    root = tree.getroot()
    
    for elem in root.iter():
        print(elem.tag, elem.attrib, elem.text)



if __name__ == "__main__": 
    # insert the path where you are interessted e.g.
    # path = r"C:\Users\sselt\Documents\blog_demo" 
    path = r"."
    
    for (root,dirs,files) in os.walk(path, topdown=True): 
        for filename in files:
            if filename.endswith('.xml'):
                print(filename)
                xml_parse(filename)

Instead of os.walk() you can let pathlib() do the job:

import xml.etree.ElementTree as ET
from pathlib import Path

def xml_parse(filename):
    print(f"File: {filename}")
    tree = ET.parse(filename)
    root = tree.getroot()
    
    for elem in root.iter():
        if elem.tag == "score-instrument" and elem.get('id') == 'P1-I1':
            print(f"{elem.find('./instrument-name').text:>17}")
        
if __name__ == "__main__": 
    # insert the path where you are interessted e.g.
    p = Path('.')
    print(p.cwd())
    print(Path.home())
    print(Path.home().parent)
    l = list(p.glob('*.xml'))

    for filename in l:
        xml_parse(filename)

Output from my MusicXML file:

File: musicXML.xml
      Wood Blocks
File: test.xml
File: outxml.xml
File: helloWorld.xml

Should work on Linux and Windows.