XmlTextReader reads past the end of the XML document without recognizing it


I'm trying to create a simple app which reads an XML document using a SAX-like streaming reader (XmlTextReader) from a stream that contains not only the XML but also other data, such as binary blobs and text. The stream is simply chunk-based.

When entering my reading function, the stream is properly positioned at the beginning of the XML. I've reduced the issue to the following code example:

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?><Models />" + (char)0x014;

XmlTextReader reader = new XmlTextReader(new StringReader(xml));
reader.MoveToContent();
reader.ReadStartElement("Models");

These few lines cause an exception when calling ReadStartElement, due to the 0x014 at the end of the string.

Interestingly, the code runs just fine with the following input instead:

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?><Models></Models>" + (char)0x014;

I don't want to read the whole document, due to its size, nor do I want to change the input, as I need to stay backward compatible with older data.

The only solution I can think of so far is a custom stream reader that stops reading after the last end tag, but that would involve some major parsing effort.

Do you have any ideas on how to solve this issue? I've already tried LINQ to XML's XDocument, but that failed as well.

Thank you very much in advance, Cheers,

Romout


There is 1 answer

Edwin de Koning

I don't know if this is quite what you are looking for, but if you instead call:

reader.IsStartElement("Models");

then the reader only tests whether the <Models/> node is a start tag or an empty-element tag and whether its Name matches. The reader is not moved beyond it (the Read() method is not called).
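For illustration, here is a minimal sketch of that suggestion applied to the input from the question. The difference from ReadStartElement is that ReadStartElement checks the element and then advances to the next node; with an empty root element, the next thing the reader meets is the trailing 0x14 character. The Program class and the console output are only there to make the sample self-contained, and I have not tested it against the real chunked stream:

```csharp
using System;
using System.IO;
using System.Xml;

class Program
{
    static void Main()
    {
        // Same input as in the question: an empty root element followed by a stray 0x14 character.
        string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?><Models />" + (char)0x014;

        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            // IsStartElement calls MoveToContent() and then only inspects the current node;
            // unlike ReadStartElement, it does not call Read() to advance past the element.
            if (reader.IsStartElement("Models"))
            {
                Console.WriteLine(reader.Name);

                // For <Models /> there are no children and no end tag to read,
                // so the reader can be left where it is.
                Console.WriteLine(reader.IsEmptyElement);
            }
        }
    }
}
```

If IsEmptyElement is false, you can keep reading the child nodes as before and stop as soon as the matching end tag has been consumed, so the reader is never asked for a node beyond the root element.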