Remove XML nodes from a TTML file in C#

Since I was not able to find an answer to my previous question, "Xmlstarlet ed encoding and powershell inside Process C#", I want to try another path.

I just need to be able to delete nodes from a TTML file (a type of XML used for subtitles). With xmlstarlet I was able to do it like this:

./xml.exe ed -N ns=http://www.w3.org/2006/04/ttaf1 -d '//ns:div[not(contains(@xml:lang,''Italian''))]' "C:\Users\1H144708H\Downloads\a.mul.ttml"

but I'm not able to do it without losing the UTF-8 encoding in Windows PowerShell (it worked for me in bash on Linux).

How can I do the same thing in C#? I know how to open, read, and write text files, of course, but I don't know whether there is a way to work with an XML document that uses a specific namespace, or how to delete every node whose xml:lang does not contain languageToKeep.

EDIT. Something like this:

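// Attempt so far: the "ns" prefix is not registered anywhere yet,
// so SelectNodes has no way to resolve it.
XmlDocument xml = new XmlDocument();
xml.Load(files[0]);
XmlNodeList nodes = xml.SelectNodes("//ns:div[not(contains(@xml:lang,'Italian'))]");
Console.WriteLine(nodes.Count); // number of matching nodes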

But I think I need to declare the namespace somewhere, and I don't know how.
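
For reference, here is a minimal sketch of how the namespace could be registered with XmlNamespaceManager so that the same XPath as in the xmlstarlet command works in C#. The namespace URI is the one from the command above; the class name TtmlFilter, the method names, and the outputPath parameter are hypothetical, and the sketch assumes languageToKeep contains no apostrophe.

```csharp
using System.Linq;
using System.Text;
using System.Xml;

static class TtmlFilter
{
    // Namespace URI taken from the xmlstarlet command in the question.
    const string TtmlNs = "http://www.w3.org/2006/04/ttaf1";

    // Removes every div whose xml:lang attribute does not contain languageToKeep
    // (divs without xml:lang are removed too). Returns how many were removed.
    public static int RemoveOtherLanguages(XmlDocument xml, string languageToKeep)
    {
        var nsmgr = new XmlNamespaceManager(xml.NameTable);
        // "ns" is an arbitrary prefix that only has to match the one used in the XPath.
        nsmgr.AddNamespace("ns", TtmlNs);
        // The "xml" prefix is predefined by XmlNamespaceManager, so @xml:lang resolves as is.

        // Assumption: languageToKeep has no apostrophe, otherwise the XPath literal breaks.
        string xpath = "//ns:div[not(contains(@xml:lang,'" + languageToKeep + "'))]";

        // Copy the matches to a list first so that removal does not disturb the enumeration.
        var toRemove = xml.SelectNodes(xpath, nsmgr).Cast<XmlNode>().ToList();
        foreach (XmlNode node in toRemove)
        {
            node.ParentNode.RemoveChild(node);
        }
        return toRemove.Count;
    }

    // Writes the document back as UTF-8 without a byte order mark.
    public static void SaveUtf8(XmlDocument xml, string outputPath)
    {
        var settings = new XmlWriterSettings { Encoding = new UTF8Encoding(false) };
        using (XmlWriter writer = XmlWriter.Create(outputPath, settings))
        {
            xml.Save(writer);
        }
    }
}
```

A possible call sequence would be to create an XmlDocument, set PreserveWhitespace to true if the spacing inside the subtitles matters, Load the file, call TtmlFilter.RemoveOtherLanguages(xml, "Italian"), and then TtmlFilter.SaveUtf8 with the target path.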

1 answer

LiefLayer (best answer)

In the end I just used a StreamReader to read the file line by line. A simple Contains check tells me where the xml:lang="Language" line is, and from that point I append every line to a string. Of course I added the head and the end of the file before the while loop, and I stop appending lines when I read a line that contains the closing tag of that div. I know that this is not the best way to do things, but it works for my case.
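
The line-based approach described above could look roughly like the sketch below. It is not the original code: the class and method names are hypothetical, the closing marker "</div>" passed by the caller is an assumption, and including the closing line in the result is a choice made here. It also only works when the opening and closing tags sit on their own lines, as in the file described.

```csharp
using System.Collections.Generic;
using System.IO;

static class TtmlLineFilter
{
    // Reads line by line and returns the lines from the first one containing
    // startMarker up to and including the next one containing endMarker.
    public static List<string> ExtractBlock(TextReader reader, string startMarker, string endMarker)
    {
        var kept = new List<string>();
        bool inside = false;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Start collecting at the line that carries the wanted language.
            if (!inside && line.Contains(startMarker))
            {
                inside = true;
            }
            if (inside)
            {
                kept.Add(line);
                // Stop once the closing tag of the block has been read.
                if (line.Contains(endMarker))
                {
                    break;
                }
            }
        }
        return kept;
    }
}
```

With a file, the reader would be a StreamReader opened with Encoding.UTF8, the start marker something like "xml:lang=\"Italian\"", and the returned lines would be written between the fixed head and end of the document.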