How to get the value of an "item" node

I have thousands of .XML files that contain code similar to the following:

<work-item>
    <field id="assignee">
        <list>
            <item>John Doe</item>
            <item>AZCDEF</item>
        </list>
    </field>...

I am using the following PowerShell script to try to get the value of the "item" nodes:

$files = Get-ChildItem -Path C:\temp\dev\workitems\ -include workitem.xml -Recurse | % { $_.FullName }
$files | foreach {
    $Doc = [xml]$MyXML = Get-Content $_
    write($_)
    #write($Doc.name)
    $XMLNode  = $MyXML.SelectNodes('//field[@id="assignee"]/list/item')

$XMLNode | foreach {
    If ($_.'#text'.length -ne 6) {
        write("Removing " + $_.'#text' + " From xml.")
        [void]$_.ParentNode.RemoveChild($_)
    }
}
$Doc.Save($_)   

}

However, I get the following error when saving the document:

Method invocation failed because [System.Object[]] doesn't contain a method named 'Save'.
At C:\ALM\PowerShell Scripts\Remove bad assignee.ps1:14 char:14
+ $Doc.Save <<<< ($_)
    + CategoryInfo          : InvalidOperation: (Save:String) [], RuntimeException
    + FullyQualifiedErrorId : MethodNotFound

The problem is I can't seem to get the value (John Doe or AZCDEF) of the item node so I can check its length and second character.

How do I save the document after I have removed the element?

There are 2 answers

Answer by mclayton

To get the text content of the node (including the text of any child nodes), you can use the InnerText property:

PS> $xml = [xml] @"
<work-item>
    <field id="assignee">
        <list>
            <item>John Doe</item>
            <item>AZCDEF</item>
        </list>
    </field>
</work-item>
"@

PS> $nodes = $xml.SelectNodes('//field[@id="assignee"]/list/item')

PS> $nodes | foreach { write($_.InnerText) }
John Doe
AZCDEF

Once you have the string, you can filter on its value, e.g.:

PS> $nodes | where-object { ($_.InnerText.Length -le 2) -or ($_.InnerText[1] -ne 'Z') }

#text
-----
John Doe

Bear in mind that strings are zero-indexed, so the second character is $_.InnerText[1].
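
As a quick illustration of the indexing, here is a minimal sketch using one of the sample values from the question (the variable names are made up for the example):

```powershell
# Sample value taken from the question's XML.
$value = 'AZCDEF'

# Index 0 is the first character, index 1 is the second.
$first  = $value[0]
$second = $value[1]

$first    # A
$second   # Z
```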

We now have an expression that selects the nodes to remove, and we can pipe its output into RemoveChild:

PS> $nodes `
| where-object { ($_.InnerText.Length -le 2) -or ($_.InnerText[1] -ne 'Z') } `
| foreach-object { $_.ParentNode.RemoveChild($_) }

PS> $xml.OuterXml
<work-item><field id="assignee"><list><item>AZCDEF</item></list></field></work-item>
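
The same filter can be applied to a file on disk and the result saved back, which is the part the question asks about. The following is only a sketch under some assumptions: the sample file name in the temp folder is hypothetical, and copying the matches into an array with @(...) before removing anything is a precaution, since as far as I understand the .NET documentation the list returned by SelectNodes is only reliable while the document is unchanged.

```powershell
# Hypothetical sample file; in the question this would be each workitem.xml path.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'workitem-sample.xml'
@"
<work-item>
    <field id="assignee">
        <list>
            <item>John Doe</item>
            <item>AZCDEF</item>
        </list>
    </field>
</work-item>
"@ | Set-Content -Path $path

# Load into an XmlDocument so that Save() is available afterwards.
$doc = New-Object System.Xml.XmlDocument
$doc.Load($path)

# Copy the matching nodes into an array before modifying the document.
$items = @($doc.SelectNodes('//field[@id="assignee"]/list/item'))

# Same filter as above: remove values that are too short or whose second character is not 'Z'.
$items |
    Where-Object { ($_.InnerText.Length -le 2) -or ($_.InnerText[1] -ne 'Z') } |
    ForEach-Object { [void]$_.ParentNode.RemoveChild($_) }

# Write the modified document back to the same full path.
$doc.Save($path)

# Collect what is left, for inspection.
$remaining = @($doc.SelectNodes('//field[@id="assignee"]/list/item') | ForEach-Object { $_.InnerText })
$remaining
```

The [void] cast just discards the removed node that RemoveChild returns, so it is not echoed to the console.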

Answer by Steve Gray

I got it working using the following code.

$files = Get-ChildItem -Path C:\temp\dev\workitems\ -include workitem.xml -Recurse | % { $_.FullName }
$files | foreach {
    write($_)
    [xml]$MyXML = Get-Content $_
    $XMLNode  = $MyXML.SelectNodes('//field[@id="assignee"]/list/item')

    $XMLNode | foreach {
        write($_.InnerText)
        If (($_.InnerText.Length -le 2) -or ($_.InnerText.Length -ne 6) -or ($_.InnerText[1] -ne 'Z')) {
            write("Removing " + $_.InnerText + " From xml.")
            [void]$_.ParentNode.RemoveChild($_)
        }
    }
    $MyXML.Save($_)   
}
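
For context on the original error: the message says that the object Save was called on was a [System.Object[]], which suggests the variable held the raw lines returned by Get-Content rather than an XML document. The sketch below (with a hypothetical sample file in the temp folder) shows the difference in types between the output of Get-Content and the result of the [xml] cast, which is the object that actually has a Save method.

```powershell
# Hypothetical three-line sample file, created only for this check.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'type-check-sample.xml'
'<list>', '  <item>AZCDEF</item>', '</list>' | Set-Content -Path $path

# Get-Content returns one string per line, i.e. an array for a multi-line file.
$lines = Get-Content -Path $path
$linesType = $lines.GetType().Name

# Casting the lines to [xml] produces an XmlDocument, which has Save().
$doc = [xml]$lines
$docType = $doc.GetType().Name

$linesType   # Object[]
$docType     # XmlDocument
```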