How to read XML containing character entities using XmlUrlResolver in PowerShell


The following PowerShell line works correctly until it encounters an XML file containing character entities:

$xml = [xml] (Get-Content $file.Name)

How do you read XML files in PowerShell so that the character entities are resolved from the DTD, instead of producing errors like these:

Cannot convert value "System.Object[]" to type "System.Xml.XmlDocument". Error: "Reference to undeclared entity 'nbsp'. Line 3, position 324."
Cannot convert value "System.Object[]" to type "System.Xml.XmlDocument". Error: "Reference to undeclared entity 'Oacute'. Line 3, position 239."

Reading XML files is easy when they are valid and don't contain character entities. My XML files reference a DTD that declares these character entities, but the DTD is not being used. Example start of an XML file:

<?xml version="1.0"?>
<!DOCTYPE catalog SYSTEM "manual.dtd">
<catalog ...

How do I turn on the XML resolver in PowerShell? The DTD file is in the same folder as the XML files.

I have code that works around this issue in C#, but how do I do the following in PowerShell?

XML = XMLString;
var dtdPath = HttpContext.Current.ApplicationInstance.Server.MapPath("~/App_Data") + "\\Manual.dtd";
XML = XML.Replace("manual.dtd", dtdPath);
XmlUrlResolver resolver = new XmlUrlResolver();
resolver.Credentials = System.Net.CredentialCache.DefaultCredentials;
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse;
settings.ValidationType = ValidationType.None;
settings.XmlResolver = resolver;
XmlReader reader = XmlReader.Create(new System.IO.MemoryStream(System.Text.UTF8Encoding.UTF8.GetBytes(XML)), settings);
var XMLPrimary = XDocument.Load(reader);

Here is my best guess at the PowerShell code to do this, but it's still not working. How do you set XmlUrlResolver in PowerShell?

# Resolver used to fetch the external DTD referenced by the DOCTYPE
$resolver = New-Object -TypeName System.Xml.XmlUrlResolver
$resolver.Credentials = [System.Net.CredentialCache]::DefaultCredentials

# Allow DTD processing and validate the document against the DTD
$readerSettings = New-Object -TypeName System.Xml.XmlReaderSettings
$readerSettings.DtdProcessing = [System.Xml.DtdProcessing]::Parse
$readerSettings.ValidationType = [System.Xml.ValidationType]::DTD
$readerSettings.XmlResolver = $resolver
$readerSettings.MaxCharactersFromEntities = 2048
$readerSettings.ValidationFlags = [System.Xml.Schema.XmlSchemaValidationFlags]::ProcessInlineSchema -bor [System.Xml.Schema.XmlSchemaValidationFlags]::ProcessSchemaLocation

# Report each validation problem and count it
$readerSettings.add_ValidationEventHandler(
{
    Write-Host $("`nError found in XML: " + $_.Message + "`n") -ForegroundColor Red
    $script:errorCount++
})

# Read through the whole file so that every node is parsed
$reader = [System.Xml.XmlReader]::Create($XmlFile.FullName, $readerSettings)
while ($reader.Read()) { }
$reader.Close()
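
For comparison, here is a minimal sketch of how the C# approach above might translate to PowerShell, loading the reader's output into an XmlDocument rather than only reading through the file. The function name Read-XmlWithDtd is hypothetical and used only for illustration. The sketch assumes that manual.dtd sits in the same folder as the XML file and that passing the file's full path to XmlReader.Create lets the relative DTD reference resolve against that folder. It mirrors the C# settings (ValidationType None), so the DTD is used for entity declarations only, not for validation.

```powershell
# Hypothetical helper: loads an XML file and lets the reader resolve
# entities declared in an external DTD located next to the file.
function Read-XmlWithDtd {
    param(
        [Parameter(Mandatory)]
        [string] $Path
    )

    # Use the full file system path so the relative SYSTEM identifier
    # (e.g. "manual.dtd") is resolved against the XML file's own folder.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    $settings = New-Object -TypeName System.Xml.XmlReaderSettings
    $settings.DtdProcessing  = [System.Xml.DtdProcessing]::Parse
    $settings.ValidationType = [System.Xml.ValidationType]::None
    $settings.XmlResolver    = New-Object -TypeName System.Xml.XmlUrlResolver

    $reader = [System.Xml.XmlReader]::Create($fullPath, $settings)
    try {
        # Load the document through the configured reader instead of
        # casting the file content with [xml].
        $doc = New-Object -TypeName System.Xml.XmlDocument
        $doc.Load($reader)
        return $doc
    }
    finally {
        $reader.Dispose()
    }
}
```

With this in place, the original one-liner would become something like `$xml = Read-XmlWithDtd -Path $file.FullName`. Note the use of FullName rather than Name, on the assumption that $file is a FileInfo object such as one returned by Get-ChildItem.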