Read Parts of an XML File through a Stream Instead of All at Once

Question

194 views Asked by Endless At 13 December 2020 at 02:14

I've been working on an old piece of code for a project. I've managed to optimize it for 64-bit usage, but one issue remains: XmlSerializer.Deserialize breaks because the input text / deserialized data is too big (it overflows the 2 GB int limit).

I've tried to find a fix, but no answer was helpful.

Here's the code in question.

if (File.Exists(dir + "/" + fileName))
{
    string XmlString = File.ReadAllText(dir + "/" + fileName, Encoding.UTF8);
    BXML_LIST deserialized;
    using (MemoryStream input = new MemoryStream(Encoding.UTF8.GetBytes(XmlString)))
    {
        using (XmlTextReader xmlTextReader = new XmlTextReader(input))
        {
            xmlTextReader.Normalization = false;
            XmlSerializer xmlSerializer = new XmlSerializer(typeof(BXML_LIST));
            deserialized = (BXML_LIST)xmlSerializer.Deserialize(xmlTextReader);
        }
    }
    xml_list.Add(deserialized);
}

Following many questions asked here, I thought I could "split" the XML file (while keeping the same BXML_LIST type), deserialize each part, and finally combine the results to match the original content. That would avoid the overflow error from deserializing the whole file at once.

Thing is, I have no idea how to implement this. Any help or guidance would be amazing!

// Edit 1:

I've found a piece of code on another site; I don't know whether it would be a reliable way to combine the split XML files:

var xml1 = XDocument.Load("file1.xml");
var xml2 = XDocument.Load("file2.xml");
//Combine and remove duplicates
var combinedUnique = xml1.Descendants("AllNodes")
                          .Union(xml2.Descendants("AllNodes"));
//Combine and keep duplicates
var combinedWithDups = xml1.Descendants("AllNodes")
                           .Concat(xml2.Descendants("AllNodes"));

**Alexander Petrov** · Accepted Answer · 2020-12-13T04:33:52+00:00

Your code gives me the creeps: it wastes a lot of memory.

string XmlString = File.ReadAllText - This loads the entire file into memory, the first copy of the data.

Encoding.UTF8.GetBytes(XmlString) - This allocates a second copy of the same data, as a byte array.

new MemoryStream(...) - This wraps that byte array in a stream. The array is not copied again, but the stream gives you nothing that reading the file directly would not.

xmlSerializer.Deserialize - This allocates memory again, for the deserialized data. But there's no getting away from that.
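To put rough numbers on the first two copies, here is a minimal sketch. The class and method names (CopyCost, StringBytes, Utf8Bytes) are made up for illustration and are not part of the original code. It relies only on the fact that a .NET string stores two bytes per char (UTF-16), while the encoded byte array holds the UTF-8 form.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class CopyCost
{
    // Bytes occupied by the characters of a .NET string (UTF-16: 2 bytes per char).
    public static long StringBytes(string s) => (long)s.Length * sizeof(char);

    // Bytes occupied by the array that Encoding.UTF8.GetBytes(s) would allocate.
    public static long Utf8Bytes(string s) => Encoding.UTF8.GetByteCount(s);
}
```

For mostly-ASCII XML, the string alone takes about twice the file size, and the byte array adds the file size again. Both string and byte[] are indexed by int, so neither can hold much more than about 2 GB, which is a likely source of the overflow described in the question.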

Write it like this instead:

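BXML_LIST deserialized;
// XmlReader.Create opens the file and reads it incrementally,
// so no full copy of the text is buffered before deserialization.
using (XmlReader xmlReader = XmlReader.Create(dir + "/" + fileName))
{
    XmlSerializer xmlSerializer = new XmlSerializer(typeof(BXML_LIST));
    deserialized = (BXML_LIST)xmlSerializer.Deserialize(xmlReader);
}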

In this case, xmlSerializer reads the data from the file through xmlReader as a stream, piece by piece, instead of from a fully buffered copy.

This may be enough to solve your problem.
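If a single Deserialize call over the whole document is still too much, the "split and combine" idea from the question can be done without splitting the file on disk: walk the document with one XmlReader and deserialize each child element on its own. The sketch below is only an illustration under assumptions. The structure of BXML_LIST is not shown in the question, so the element name "Item", the Item class, and the ChunkedXml helper are all hypothetical stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical item type; the real children of BXML_LIST are not shown in the question.
public class Item
{
    [XmlAttribute("id")]
    public int Id { get; set; }
}

public static class ChunkedXml
{
    // Yields one Item at a time, so only the current element is materialized.
    public static IEnumerable<Item> ReadItems(XmlReader reader)
    {
        var serializer = new XmlSerializer(typeof(Item));
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "Item")
            {
                Item item;
                // ReadSubtree confines the serializer to this single element.
                using (XmlReader subtree = reader.ReadSubtree())
                {
                    item = (Item)serializer.Deserialize(subtree);
                }
                // The outer reader now sits at the end of this element;
                // the next Read() in the loop condition moves past it.
                yield return item;
            }
        }
    }
}
```

A caller would open the file once and consume the items as they arrive, for example `using (var reader = XmlReader.Create(path)) foreach (Item item in ChunkedXml.ReadItems(reader)) { ... }`, adding each one to whatever collection the rest of the program expects. Whether this fits depends on BXML_LIST really being a flat list of repeated elements.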
