What's the size cost of Java inheritance?

5.1k views Asked by At

There are various articles out there on the interwebs that try to empirically estimate the overhead of a java.lang.Object in particular JVM implementations. For example, I've seen the size overhead of a bare Object estimated at 8 bytes in some JVMs.

What I would like to know is whether a typical JVM implementation of the extends relationship introduces incremental size overhead at every level of the class hierarchy. In other words, suppose you have a class hierarchy with N levels of subclasses. Is the overhead of the in-memory representation of a class instance O(1) or O(N)?

I imagine it is O(1) because although the size of some of the hidden fluffy stuff you need to be a Java Object (vtable, chain of classes) will grow as the inheritance hierarchy grows, they grow per-class, not per-instance, and the JVM implementation can store constant-size pointers to these entities in a constant-size header attached to every Object.

So in theory, the overhead directly attached to the in-memory representation of any Java object should be O(1) for inheritance depth N. Does anyone know if it's true in practice?

5

There are 5 answers

3
tucuxi On BEST ANSWER

When in doubt, look at the source (well, a source; each JVM is free to choose how to do it, as the standard does not mandate any internal representation). So I had a look, and found the following comment within the implementation of JDK 7-u60's hotspot JVM:

// A Klass is the part of the klassOop that provides:
//  1: language level class object (method dictionary etc.)
//  2: provide vm dispatch behavior for the object
// Both functions are combined into one C++ class. The toplevel class "Klass"
// implements purpose 1 whereas all subclasses provide extra virtual functions
// for purpose 2.

// One reason for the oop/klass dichotomy in the implementation is
// that we don't want a C++ vtbl pointer in every object.  Thus,
// normal oops don't have any virtual functions.  Instead, they
// forward all "virtual" functions to their klass, which does have
// a vtbl and does the C++ dispatch depending on the object's

The way I read it, this means that, for this (very popular) implementation, object instances only store a pointer to their class. The cost, per-instance of a class, of that class having a longer or shorter inheritance chain is effectively 0. The classes themselves do take up space in memory though (but only once per class). Run-time efficiency of deep inheritance chains is another matter.

7
user207421 On

in theory, the overhead directly attached to the in-memory representation of any Java object should be O(1) for inheritance depth N. Does anyone know if it's true in practice?

It can't be O(1) unless there are zero instance members at every level. Every instance member requires space per instance.

3
Sotirios Delimanolis On

The JVM specification states

The Java Virtual Machine does not mandate any particular internal structure for objects.

So the specification does not care how you do it. But...

In some of Oracle’s implementations of the Java Virtual Machine, a reference to a class instance is a pointer to a handle that is itself a pair of pointers: one to a table containing the methods of the object and a pointer to the Class object that represents the type of the object, and the other to the memory allocated from the heap for the object data.

So in typical Oracle implementations it is O(1) for methods. This method table is the Method Area which is per class.

The Java Virtual Machine has a method area that is shared among all Java Virtual Machine threads. The method area is analogous to the storage area for compiled code of a conventional language or analogous to the "text" segment in an operating system process. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors, including the special methods (§2.9) used in class and instance initialization and interface initialization.

Also, about method entries

The method_info structures represent all methods declared by this class or interface type, including instance methods, class methods, instance initialization methods (§2.9), and any class or interface initialization method (§2.9). The methods table does not include items representing methods that are inherited from superclasses or superinterfaces.

0
Steve Jessop On

An instance generally requires the following data, although it's up to the implementation exactly what to do:

  • the instance fields of the class and its parent classes, which I assume you don't mean to include in the term "overhead"
  • some means to lock the object
  • if the garbage collector relocates objects, then some means to record the original hash of the object (for Object.hashCode)
  • some means to access type information

As you guess in your question, in a "normal" Java implementation the type information is stored per-class, not per instance. Part of the definition of "type" is that two instances of the same class necessarily have the same type information, there's no obvious reason not to share it. So you would expect that per-instance overhead to be constant, not to depend on the class hierarchy.

That is to say, adding extra empty classes or interfaces to a class should not increase the size of its instances. I don't think either the language or the JVM spec actually guarantees this, though, so don't make too many assumptions about what a "non-normal" Java implementation is allowed to do.

As an aside, the second and third things on my list can be combined via cunning trickery, so that both of them together are a single pointer. The article you link to refers to references taking 4 bytes, so the 8 bytes it comes up with for an object is one pointer to type information, one field containing either a hashcode or a pointer to a monitor, and probably some flags in the lowest 2 bits of one or both of those pointer fields. Object will (you'd expect) be larger on a 64-bit Java.

1
user949300 On

Double and Integer, which extend Number, which extends Object, do not have an O(n) behavior, that is, an Integer is not 3X the size of an Object, so I think the answer is O(1). e.g. see this old SO question