Automatic Pointer Swizzling by Java?


Suppose we have an arbitrary graph represented by nodes and pointers like this:

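import java.util.ArrayList;

class Node
{
    public ValueType data;       // payload; ValueType stands for any value type
    public ArrayList<Node> adj;  // adjacent nodes (outgoing edges)
}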

Now, I want to make a copy of it, or write it to disk and read it back (AKA serialize/deserialize). I know this can be done by hand with a graph search algorithm plus an associative array that maps old nodes to new ones, and it turns out that this technique is called pointer swizzling.

Here goes my question:

I have heard that in Java, declaring the class as Serializable gives you this feature automatically (which sounds like magic to me!).

Is this statement correct? Does Java automatically run BFS to traverse the graph and swizzle the pointers? In other words, does serialize/deserialize clone the object for me? (a completely brand new object with the same structure but new nodes and updated pointers)

If yes, then what if in some cases I just want to copy the pointers? What if I want to serialize the object just to keep the original pointers?

I appreciate any comments on this. :-)

2

There are 2 answers

1
Mehrdad Afshari On BEST ANSWER

I'll address your last question first. The purpose of serialization is not cloning an object graph in memory. It is to transform an object graph to a stream of bytes in order to do things like saving in a file or sending across the wire. The deserialization process might be done on a different computer, at a different time, in a different process, or even by a non-Java program, so it is not a reasonable expectation to get references to the same objects as before. It is the structure and contents of the object graph that are being saved and later restored, not in-memory addresses. For precisely this reason, it does not make sense for all objects to be serializable. For instance, serializing a Thread is not going to be useful, because it is not going to be meaningful outside the current instance of a program.
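To make the Thread point concrete, here is a minimal sketch. The class name NotSerializableDemo and the helper canSerialize are made up for illustration; the only library behaviour relied on is that ObjectOutputStream.writeObject throws NotSerializableException for an object whose class does not implement Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

public class NotSerializableDemo {
    // Hypothetical helper: tries to serialize an object into an in-memory buffer
    // and reports whether the serialization mechanism accepted it.
    static boolean canSerialize(Object o) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            // Thrown when the object (or something it references) is not Serializable.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize("hello"));      // String implements Serializable
        System.out.println(canSerialize(new Thread())); // Thread does not
    }
}
```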

The magic behind automatic serialization is not very complex. Ignoring custom serialization methods that you can write for your own classes to precisely control the serialization and deserialization behavior, yes, the system will effectively traverse the object graph in order to generate a stream of bytes. This traversal is generally done as a DFS, not BFS. Basically, you ask Java to serialize an object, passing a reference to it. That reference will serve as the root of the object graph. From there, Java will recursively serialize the fields of that object. Of course, it keeps track of the objects it has already written, so shared and circular references are written out as back-references rather than serialized again, and the deserializer is able to hook up the pointers and recreate the structure as it was before.
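As a rough illustration of the above, the sketch below round-trips a two-node cycle through an in-memory byte stream. The class GraphRoundTrip and the method roundTrip are names invented for this example, and String stands in for the question's ValueType. The result is a brand new set of nodes with the same structure, cycle included.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class GraphRoundTrip {
    // Same shape as the Node class in the question; String replaces ValueType.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        String data;
        ArrayList<Node> adj = new ArrayList<>();

        Node(String data) {
            this.data = data;
        }
    }

    // Serializes everything reachable from root into memory, then reads it back.
    static Node roundTrip(Node root) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Node) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node a = new Node("a");
        Node b = new Node("b");
        a.adj.add(b);
        b.adj.add(a); // cycle: a -> b -> a

        Node copy = roundTrip(a);
        System.out.println(copy != a);                          // true: a new object
        System.out.println(copy.adj.get(0).adj.get(0) == copy); // true: cycle preserved
    }
}
```

One caveat: because the default mechanism recurses through the fields, a very deep graph (a long chain of nodes, say) can run out of stack, in which case a custom writeObject/readObject or a hand-written traversal may be needed.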

0
parkovski On

I don't think it's quite how you're thinking of it, but pretty much. Serialization in Java is a fairly opaque process. All you really need to know is that, assuming a class and all the types of its members implement Serializable, Java knows how to convert an instance to a stream of bytes, and how to recreate the objects from that stream when you ask it to deserialize.

Coming from C++, it did seem like black magic at first. I was kind of sceptical of the whole process and didn't really trust the JVM to take care of it for me, because in C++ the runtime just doesn't know enough about a plain object to do this. But it's actually really nice, assuming you only need to access the data from Java.

Basically, there is no need to worry about pointers or what algorithm it uses underneath. You just tell it to write an object, and then later tell it to read one back, and you've essentially got the exact same in-memory structure as you had before.

One more thing: if you declare a field as transient, it won't be saved. After deserialization it holds its default value (null, 0 or false), so you'll have to restore it yourself. This is useful if you have fields that cache certain values that you don't want to waste space on, or fields with sensitive data that you don't want lying around.
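Here is a small sketch of the transient behaviour, assuming a made-up Account class with one sensitive field and one cached field. The private readObject method is the standard hook for custom deserialization; everything else (TransientDemo, Account, nameLength, roundTrip) is invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {
    // Hypothetical class used only to illustrate transient fields.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner;
        transient String password; // sensitive: never written to the stream
        transient int nameLength;  // cached value derived from owner

        Account(String owner, String password) {
            this.owner = owner;
            this.password = password;
            this.nameLength = owner.length();
        }

        // Called by the serialization mechanism when an Account is read back.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();      // restores the non-transient fields
            nameLength = owner.length(); // rebuild the cached value by hand
        }
    }

    // Serializes an account into memory and reads it back.
    static Account roundTrip(Account original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Account) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account restored = roundTrip(new Account("alice", "s3cret"));
        System.out.println(restored.owner);      // alice
        System.out.println(restored.password);   // null: transient, not saved
        System.out.println(restored.nameLength); // 5: rebuilt in readObject
    }
}
```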