How to get IXmlDomDocument2.XML to escape quotes properly?


I'm working on a problem where XML exported from our program doesn't escape quotes (turning " into &quot;), leading to problems on the receiving end. It escapes ampersands and angle brackets just fine, but not quotes.

When I dug around in the XML export code, I found that it was a pretty straightforward IXmlDomDocument2 DOM interface. But when I got to the step where it produces the XML string output by calling the .XML method, I ran smack into a wall of proprietariness that I can't trace into, since all the work is taking place inside of C:\Windows\System32\msxml3.dll.

So apparently Microsoft's IXmlDomDocument2 implementation knows how to escape some symbols but not others. And just to make it worse, the obvious but ugly solution (a preprocessing step that recursively traverses the entire document and replaces every quote in the values with '&quot;' before I call .XML) won't work, because the .XML method will see the & in each '&quot;' and escape it to '&amp;quot;'! Is there any way to fix this?

Answer by Robert Love:

This could be considered a bug in the XML parser used on the other end. The XML specification defines the predefined entities that can be used for escaping, but quotes only need to be escaped inside attribute values, and there MSXML does escape them, as shown here:

program Project2;

{$APPTYPE CONSOLE}

uses
  ActiveX,
  MSXML2_TLB,
  SysUtils;
var
  Dom : IXMLDOMDocument2;
  Root :  IXMLDOMNode;
  Attr : IXMLDOMNode;
begin
  CoInitialize(nil);
  try
    DOM := CoDOMDocument40.Create;
    Root := Dom.createElement('root');
    Attr := Dom.createAttribute('attr');
    Attr.text := '"';
    root.attributes.setNamedItem(Attr);
    root.text := '"Hello World"';
    DOM.appendChild(Root);
    // The quote in attr is written as &quot;; the quotes in the element text stay literal.
    writeln(Root.xml);
    readln;
  except
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);
  end;
end.
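
To make the rule concrete without depending on MSXML, here is a minimal sketch of the escaping the XML specification requires in each position. The helpers EscapeText and EscapeAttr are hypothetical names invented for this illustration, not part of MSXML or the Delphi RTL, and the sketch assumes attribute values are delimited with double quotes.

```pascal
program EscapeRules;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: escapes element text content.
// & and < must always be escaped; > is escaped as well, although it is
// strictly required only in the sequence ]]>.
function EscapeText(const S: string): string;
begin
  // The ampersand must be handled first so later entities are not re-escaped.
  Result := StringReplace(S, '&', '&amp;', [rfReplaceAll]);
  Result := StringReplace(Result, '<', '&lt;', [rfReplaceAll]);
  Result := StringReplace(Result, '>', '&gt;', [rfReplaceAll]);
end;

// Hypothetical helper: escapes a value placed inside a double-quoted attribute.
// Here the quote has to be escaped too, because it would end the attribute.
function EscapeAttr(const S: string): string;
begin
  Result := StringReplace(EscapeText(S), '"', '&quot;', [rfReplaceAll]);
end;

begin
  // Prints: <root attr="&quot;">"Hello World"</root>
  Writeln('<root attr="', EscapeAttr('"'), '">',
          EscapeText('"Hello World"'), '</root>');
end.
```

A conforming parser reads the literal quotes in the element text without trouble, which is why leaving them unescaped there is valid XML.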

But the reality is that you may not have control over the other side of the equation. In that case, you can get the desired output by escaping the quotes yourself and then undoing the extra ampersand escaping that .xml adds:

program Project2;

{$APPTYPE CONSOLE}

uses
  ActiveX,
  MSXML2_TLB,
  SysUtils;
// Replace each literal quote with the &quot; entity before it goes into the DOM.
function QuoteEscape(const v : String) : String;
begin
  result := StringReplace(V,'"','&quot;',[rfReplaceAll]);
end;


var
  Dom : IXMLDOMDocument2;
  Root :  IXMLDOMNode;
  Attr : IXMLDOMNode;
begin
  CoInitialize(nil);
  try
    DOM := CoDOMDocument40.Create;
    Root := Dom.createElement('root');
    Attr := Dom.createAttribute('attr');
    Attr.text := '"';
    root.attributes.setNamedItem(Attr);
    root.text :=  QuoteEscape('"Hello World"');
    DOM.appendChild(Root);
    // .xml escapes the & of each &quot; to &amp;quot;, so turn those back into &quot;.
    writeln(StringReplace(Root.xml,'&amp;quot;','&quot;',[rfReplaceAll]));
    readln;
  except
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);
  end;
end.
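
The reason the final replacement is needed, and one caveat with it, can be shown with plain strings. This is only a sketch: SerializeText is a hypothetical stand-in that mimics just the ampersand escaping the serializer applies to element text, not a reproduction of what msxml does internally.

```pascal
program DoubleEscape;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical stand-in for the serializer: escapes only the ampersand,
// which is the step that interferes with pre-escaped quotes.
function SerializeText(const S: string): string;
begin
  Result := StringReplace(S, '&', '&amp;', [rfReplaceAll]);
end;

// The post-processing step from the workaround above.
function UndoDoubleEscape(const S: string): string;
begin
  Result := StringReplace(S, '&amp;quot;', '&quot;', [rfReplaceAll]);
end;

var
  PreEscaped, Serialized: string;
begin
  // Step 1: quotes are replaced by the entity before serialization.
  PreEscaped := StringReplace('"Hello World"', '"', '&quot;', [rfReplaceAll]);

  // Step 2: the serializer escapes the & of every entity it finds.
  Serialized := SerializeText(PreEscaped);
  Writeln(Serialized);                    // &amp;quot;Hello World&amp;quot;

  // Step 3: the replacement restores the intended entity.
  Writeln(UndoDoubleEscape(Serialized));  // &quot;Hello World&quot;

  // Caveat: data that already contained the literal characters &quot;
  // is changed too, and the receiver will decode it as a quote.
  Writeln(UndoDoubleEscape(SerializeText('literal &quot; in data')));
  // literal &quot; in data
end.
```

If the exported data can legitimately contain the text &quot;, the blanket replacement is lossy, so it is worth checking whether that can happen before relying on it.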