.NET Binary Serialize object with references to other objects . . . what happens?


If you have an object instance A that references other objects (for example instances B and C), and you binary serialize A to a file, what happens? Do you now have serialized data that includes A, B and C?

How exactly does it work? What will I get if I deserialize the data? A, B, and C?

(Feel free to include internal workings explanations as well).

3

There are 3 answers

2
Cody Gray - on strike On BEST ANSWER

All of the objects that A references will be serialized as well. If you deserialize the data, you will end up with a complete, working copy of the whole set, including objects A, B, and C. That's probably the primary benefit of binary serialization, as opposed to XML serialization.
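To make that concrete, here is a minimal sketch of the round trip. The class names (A, B, C, GraphDemo) and the RoundTrip helper are made up for illustration, and it uses a MemoryStream instead of a file to stay self-contained; a FileStream would be used the same way. It assumes the .NET Framework, or another runtime where BinaryFormatter is still enabled; the type is marked obsolete in newer .NET versions.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical types standing in for the A, B and C of the question.
[Serializable]
public class B { public string Name = "b"; }

[Serializable]
public class C { public int Value = 42; }

[Serializable]
public class A
{
    public B First = new B();
    public C Second = new C();
}

public static class GraphDemo
{
    // Serializes a [Serializable] object graph to memory and reads it back.
    public static T RoundTrip<T>(T root)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, root);
            stream.Position = 0; // rewind before reading
            return (T)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var original = new A();
        original.First.Name = "changed";

        A copy = RoundTrip(original);

        Console.WriteLine(copy.First.Name);                 // changed
        Console.WriteLine(copy.Second.Value);               // 42
        Console.WriteLine(ReferenceEquals(copy, original)); // False
    }
}
```

Only A is passed to Serialize, yet B and C come back with it. The result is a separate copy of the graph, not the original instances.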

If any of the other classes your object holds a reference to are not marked with the [Serializable] attribute, you'll get a SerializationException at run-time (the image of which was shamelessly stolen from the web; run-time errors don't even look like this anymore in the current versions of VS):

    [Image: example of an unhandled SerializationException]

Beyond that, I'm not really sure what "internal things" you were hoping to understand. Serialization uses reflection to walk through the public and private fields of objects, converting them to a stream of bytes, which are ultimately written out to a data stream. During deserialization, the inverse happens: a stream of bytes is read in from the data stream and, together with the type information stored alongside it, is used to reconstruct an exact replica of the object. All of the fields in the object have the same values that they held before; the constructor is not called when an object is deserialized. The easiest way to think about it is that you're simply taking a snapshot of the object in place, which you can restore to its original state at will.
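A small sketch of the two points above: private fields survive the round trip, and the constructor does not run again on deserialization. The Counter class and its static ConstructorCalls counter are invented for this example, and it again assumes a runtime where BinaryFormatter is available (such as the .NET Framework).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type with a private field and a constructor that counts its calls.
[Serializable]
public class Counter
{
    // Static fields are not part of an instance, so they are not serialized.
    public static int ConstructorCalls;

    private int secret;

    public Counter(int secret)
    {
        this.secret = secret;
        ConstructorCalls++;
    }

    public int Secret { get { return secret; } }
}

public static class ConstructorDemo
{
    public static T RoundTrip<T>(T root)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, root);
            stream.Position = 0;
            return (T)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var original = new Counter(7);   // the only constructor call
        Counter copy = RoundTrip(original);

        Console.WriteLine(copy.Secret);              // 7
        Console.WriteLine(Counter.ConstructorCalls); // 1
    }
}
```

The practical consequence is that any setup done only in the constructor (opening handles, registering events, filling non-serialized caches) is not repeated for the deserialized copy.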

The class that is responsible for the actual serialization and deserialization is called a formatter (it always implements the IFormatter interface). Its job is to generate an "object graph", which is a generalized tree containing the object that is being serialized/deserialized as its root. As mentioned above, the formatter uses reflection to walk through this object graph, serializing/deserializing all object references contained by that object. The formatter is also intelligent enough to know not to serialize any object in the graph more than once. If two object references actually point to the same object, this will be detected and that object will only be serialized once. This and other logic prevents the formatter from entering an infinite loop on circular references.
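The "serialized only once" behavior can be observed from the outside. In the sketch below (Node, Pair and IdentityDemo are hypothetical names), two fields point at the same object and that object also points at itself. It assumes a runtime where BinaryFormatter is still available.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Node
{
    public string Name;
    public Node Next;
}

[Serializable]
public class Pair
{
    public Node Left;
    public Node Right;
}

public static class IdentityDemo
{
    public static T RoundTrip<T>(T root)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, root);
            stream.Position = 0;
            return (T)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var shared = new Node { Name = "shared" };
        shared.Next = shared; // circular reference

        // Both fields refer to the same Node instance.
        var pair = new Pair { Left = shared, Right = shared };

        Pair copy = RoundTrip(pair);

        // Still one object referenced twice, not two separate copies.
        Console.WriteLine(ReferenceEquals(copy.Left, copy.Right));     // True
        // The cycle is restored and did not cause an endless loop.
        Console.WriteLine(ReferenceEquals(copy.Left.Next, copy.Left)); // True
    }
}
```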

Of course, it's easy to have a good general understanding of how this process works. It's much harder to actually write the code that implements it yourself. Fortunately, that's already been done for you. Part of the point of the .NET Framework is that all this complicated serialization logic is built in, leaving you free from worrying about it. I don't claim to understand all of it myself, and you certainly don't need to either to take full advantage of the functionality it offers. Years of writing all that code by hand are finally over. You should be rejoicing, rather than worrying about implementation details. :-)

6
Felice Pollano On

The objects referenced by the main object have to be [Serializable] as well. Provided they are, everything is done automatically by the formatter.

0
DuckMaestro On

Firstly, object A's type must be tagged with the [Serializable] attribute. Serializing A will serialize all its member data, private or public, provided the members' types are also tagged with [Serializable] (or to use your example, provided that B and C's types are marked [Serializable]). Attempts to serialize data, directly or indirectly, of a type that is not [Serializable] will result in an exception.

A number of the built-in .NET types are already marked as [Serializable], including System.Int32 (int), System.Boolean (bool), etc.
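As a rough illustration of the failure case, the sketch below serializes a [Serializable] class that holds a reference to a type without the attribute, then a plain int for comparison. Unmarked, Holder and ExceptionDemo are hypothetical names. It assumes the .NET Framework or another runtime where BinaryFormatter is enabled; on runtimes where it has been disabled, a different exception is thrown before the attribute check is ever reached.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type that is deliberately NOT marked [Serializable].
public class Unmarked
{
    public int X = 1;
}

// Marked itself, but it references an unmarked type.
[Serializable]
public class Holder
{
    public Unmarked Item = new Unmarked();
}

public static class ExceptionDemo
{
    // Returns true if the graph serialized, false if a type was not serializable.
    public static bool TrySerialize(object root)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            try
            {
                formatter.Serialize(stream, root);
                return true;
            }
            catch (SerializationException ex)
            {
                Console.WriteLine(ex.Message); // names the offending type
                return false;
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(TrySerialize(new Holder())); // False
        Console.WriteLine(TrySerialize(42));           // True (System.Int32 is [Serializable])
    }
}
```

Note that the attribute on Holder alone is not enough: the check applies to every type the formatter reaches while walking the graph.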

You can read more about .NET serialization here: http://msdn.microsoft.com/en-us/library/4abbf6k0.aspx.