I want to write an XML document to disk in a compact format. To this end, I use the .NET Framework method XmlDictionaryWriter.CreateBinaryWriter(Stream stream, IXmlDictionary dictionary). This method writes a custom compact binary XML representation that can later be read by XmlDictionaryReader.CreateBinaryReader. The method accepts an XmlDictionary that can contain common strings, so that those strings do not have to be written to the output each time. Instead of the string, the dictionary index is written to the file. CreateBinaryReader can later use the same dictionary to reverse the process.
However, the dictionary I pass is apparently not used. Consider this code:
using System.IO;
using System.Xml;
using System.Xml.Linq;

class Program
{
    public static void Main()
    {
        XmlDictionary dict = new XmlDictionary();
        dict.Add("myLongRoot");
        dict.Add("myLongAttribute");
        dict.Add("myLongValue");
        dict.Add("myLongChild");
        dict.Add("myLongText");

        XDocument xdoc = new XDocument();
        xdoc.Add(new XElement("myLongRoot",
            new XAttribute("myLongAttribute", "myLongValue"),
            new XElement("myLongChild", "myLongText"),
            new XElement("myLongChild", "myLongText"),
            new XElement("myLongChild", "myLongText")
        ));

        using (Stream stream = File.Create("binaryXml.txt"))
        using (var writer = XmlDictionaryWriter.CreateBinaryWriter(stream, dict))
        {
            xdoc.WriteTo(writer);
        }
    }
}
The produced output is this (binary control characters not shown)
@
myLongRootmyLongAttribute˜myLongValue@myLongChild™
myLongText@myLongChild™
myLongText@myLongChild™
myLongText
So apparently the XmlDictionary has not been used. All strings appear in their entirety in the output, even multiple times.
This problem is not limited to XDocument. In the minimal example above I used an XDocument to demonstrate it, but I originally stumbled upon it while using XmlDictionaryWriter together with a DataContractSerializer, as is commonly done. The results were the same:
[Serializable]
public class myLongChild
{
    public double myLongText = 0;
}
...
using (Stream stream = File.Create("binaryXml.txt"))
using (var writer = XmlDictionaryWriter.CreateBinaryWriter(stream, dict))
{
    var dcs = new DataContractSerializer(typeof(myLongChild));
    dcs.WriteObject(writer, new myLongChild());
}
The resulting output did not use my XmlDictionary.
How can I get XmlDictionaryWriter to use the supplied XmlDictionary?
Or have I misunderstood how this works?
With the DataContractSerializer approach, I tried debugging the .NET Framework code (Visual Studio / Options / Debugging / Enable .NET Framework source stepping). Apparently the writer does attempt to look up each of the above strings in the dictionary, as expected. However, the lookup fails in line 356 of XmlBinaryWriter.cs, for reasons that are not clear to me.
Alternatives I have considered:
There is an overload of XmlDictionaryWriter.CreateBinaryWriter that also accepts an XmlBinaryWriterSession. The writer then adds any new strings it encounters to the session dictionary. However, I only want to use a static dictionary, known beforehand, for both reading and writing.
I could wrap the whole thing in a GZipStream and let the compression take care of the repeated strings. However, this would not compress the first instance of each string, and it seems like a clumsy workaround overall.
Yes, there is a misunderstanding. XmlDictionaryWriter is primarily used for serialization of objects, and it is a child class of XmlWriter. XDocument.WriteTo(XmlWriter something) takes an XmlWriter as its argument. The call XmlDictionaryWriter.CreateBinaryWriter will create an instance of System.Xml.XmlBinaryNodeWriter internally. This class has methods both for "regular" writing and for the dictionary-based approach.
The latter is mostly used if you serialize an object via DataContractSerializer (notice that its WriteObject method has overloads taking both XmlDictionaryWriter and XmlWriter), while XDocument takes just an XmlWriter.
As for your problem: if I were you, I'd make my own XmlWriter.
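One way to reach the dictionary-based overloads for an XDocument is to walk the tree by hand and pass XmlDictionaryString values to the writer. The following is only a minimal sketch of that idea, not the code from the original answer. The names DictionaryXmlSketch, Lookup and Write are hypothetical, and the sketch assumes a document without namespaces, comments or processing instructions.

```csharp
using System.Xml;
using System.Xml.Linq;

static class DictionaryXmlSketch
{
    // Returns the dictionary entry for a string, or null if the string is not in the dictionary.
    static XmlDictionaryString Lookup(IXmlDictionary dict, string value)
    {
        XmlDictionaryString entry;
        return dict.TryLookup(value, out entry) ? entry : null;
    }

    // Writes an element recursively, preferring the XmlDictionaryString overloads
    // and falling back to the plain string overloads for unknown strings.
    // Assumption: no namespaces are used anywhere in the document.
    public static void Write(XElement element, XmlDictionaryWriter writer, IXmlDictionary dict)
    {
        XmlDictionaryString name = Lookup(dict, element.Name.LocalName);
        if (name != null)
            writer.WriteStartElement(name, XmlDictionaryString.Empty);
        else
            writer.WriteStartElement(element.Name.LocalName);

        foreach (XAttribute attribute in element.Attributes())
        {
            XmlDictionaryString attributeName = Lookup(dict, attribute.Name.LocalName);
            if (attributeName != null)
                writer.WriteStartAttribute(attributeName, XmlDictionaryString.Empty);
            else
                writer.WriteStartAttribute(attribute.Name.LocalName);

            XmlDictionaryString attributeValue = Lookup(dict, attribute.Value);
            if (attributeValue != null)
                writer.WriteString(attributeValue);
            else
                writer.WriteString(attribute.Value);

            writer.WriteEndAttribute();
        }

        foreach (XNode node in element.Nodes())
        {
            XElement child = node as XElement;
            XText text = node as XText;
            if (child != null)
            {
                Write(child, writer, dict);
            }
            else if (text != null)
            {
                XmlDictionaryString textValue = Lookup(dict, text.Value);
                if (textValue != null)
                    writer.WriteString(textValue);
                else
                    writer.WriteString(text.Value);
            }
        }

        writer.WriteEndElement();
    }
}
```

With the code from the question, this would be called as DictionaryXmlSketch.Write(xdoc.Root, writer, dict) in place of xdoc.WriteTo(writer).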
UPDATE (based on your comment)
If you do use DataContractSerializer, there are a few mistakes in your code.
1) POCO classes have to be decorated with the [DataContract] and [DataMember] attributes, and the serialized value should be a property, not a field. Also set the namespace to an empty value, or you'll have to deal with namespaces in your dictionary as well.
2) Provide an instance of a session as well; for a null session, the dictionary writer uses the default (XmlWriter-like) implementation.
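A minimal sketch of both points, assuming the class and file name from the question. This is an illustration rather than the original answer's code, and the class name SessionSketch is hypothetical.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Empty namespace, so only the element names need dictionary entries.
[DataContract(Namespace = "")]
public class myLongChild
{
    // A property instead of a public field, marked as a data member.
    [DataMember]
    public double myLongText { get; set; }
}

public static class SessionSketch
{
    public static void Main()
    {
        XmlDictionary dict = new XmlDictionary();
        dict.Add("myLongChild");
        dict.Add("myLongText");

        // The session lets the writer assign IDs to strings it encounters while writing.
        XmlBinaryWriterSession session = new XmlBinaryWriterSession();

        using (Stream stream = File.Create("binaryXml.txt"))
        using (XmlDictionaryWriter writer = XmlDictionaryWriter.CreateBinaryWriter(stream, dict, session))
        {
            var dcs = new DataContractSerializer(typeof(myLongChild));
            dcs.WriteObject(writer, new myLongChild());
        }
    }
}
```

Strings that go through the session are, as far as I know, not stored in the stream itself, so a reader would need an XmlBinaryReaderSession populated with the same strings and IDs in order to decode the file.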