ClassWriter COMPUTE_FRAMES in ASM


I've been trying to understand how stack map frames work in Java by playing around with jumps in ASM. I created a simple method to try some things out (disassembled with Krakatau):

    L0:     ldc 'hello' 
    L2:     astore_1 
    L3:     getstatic Field java/lang/System out Ljava/io/PrintStream; 
    L6:     new java/lang/StringBuilder 
    L9:     dup 
    L10:    invokespecial Method java/lang/StringBuilder <init> ()V 
    L13:    ldc 'concat1' 
    L15:    invokevirtual Method java/lang/StringBuilder append (Ljava/lang/String;)Ljava/lang/StringBuilder; 
    L18:    aload_1 
    L19:    invokevirtual Method java/lang/StringBuilder append (Ljava/lang/String;)Ljava/lang/StringBuilder; 
    L22:    invokevirtual Method java/lang/StringBuilder toString ()Ljava/lang/String; 
    L25:    invokevirtual Method java/io/PrintStream println (Ljava/lang/String;)V 
    L28:    getstatic Field java/lang/System out Ljava/io/PrintStream; 
    L31:    new java/lang/StringBuilder 
    L34:    dup 
    L35:    invokespecial Method java/lang/StringBuilder <init> ()V 
    L38:    ldc 'concat2' 
    L40:    invokevirtual Method java/lang/StringBuilder append (Ljava/lang/String;)Ljava/lang/StringBuilder; 
    L43:    aload_1 
    L44:    invokevirtual Method java/lang/StringBuilder append (Ljava/lang/String;)Ljava/lang/StringBuilder; 
    L47:    invokevirtual Method java/lang/StringBuilder toString ()Ljava/lang/String; 
    L50:    invokevirtual Method java/io/PrintStream println (Ljava/lang/String;)V 
    L53:    return 

All it does is create a StringBuilder to join some strings with variables.

Since the invokespecial call at L35 has exactly the same stack as the invokespecial call at L10, I decided to add an ICONST_1; IFEQ L10 sequence just before L35 with ASM.

When I disassembled the result (again with Krakatau), I found it quite strange. ASM had computed the stack frame at L10 to be:

    .stack full
        locals Object [Ljava/lang/String; Object java/lang/String 
        stack Object java/io/PrintStream Top Top 
    .end stack

instead of

    stack Object java/io/PrintStream Object java/lang/StringBuilder Object java/lang/StringBuilder

as I had expected.

Furthermore, this class would not pass verification, since one cannot call StringBuilder#<init> on Top. According to the ASM manual, Top refers to an uninitialized value, but the value doesn't seem to be uninitialized here, judging both from the jump location and from the code before it. I don't understand what is wrong with the jump.

Is there something wrong with the jump I inserted that somehow makes the class impossible to compute frames for? Is this perhaps a bug with ASM's ClassWriter?


There are 2 answers

Holger (best answer)

Uninitialized instances are special. Consider that, once you dup the reference, you already have two references to the same instance on the stack, and you might perform even more stack manipulations, or transfer the reference to a local variable and from there copy it to other variables or push it again. Still, the target of the reference is supposed to be initialized exactly once before you use it in any way. To verify this, the identity of the object must be tracked, so that all these references to the same object turn from uninitialized to initialized when you perform an invokespecial <init> on it.

The Java programming language doesn’t use all the possibilities, but for legal code like
new Foo(new Foo(new Foo(), new Foo(b ? new Foo(a) : new Foo(b, c)))), the verifier must not lose track of which Foo instance has been initialized and which has not when the branch is made.

So each Uninitialized Instance stack frame entry is tied to the new instruction that created it. All entries keep that reference (which can be handled as simply as remembering the bytecode offset of the new instruction) when they are transferred or copied. Only after invokespecial <init> has been invoked on one of them do all references pointing to the same new instruction turn into an ordinary instance of the declaring class, which can subsequently be merged with other type-compatible entries.
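The bookkeeping described above can be sketched in a few lines of Java. This is only an illustrative model, not ASM's or the JVM's actual data structure: the names Entry, InitSketch and invokeInit are made up for this example, and it only handles a no-argument constructor whose receiver is on top of the stack.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of one stack frame entry (not a real ASM or JVM class). */
final class Entry {
    final String type;    // internal class name, e.g. "java/lang/StringBuilder"
    final int newOffset;  // bytecode offset of the creating `new`, or -1 once initialized

    private Entry(String type, int newOffset) {
        this.type = type;
        this.newOffset = newOffset;
    }

    static Entry uninitialized(String type, int newOffset) {
        return new Entry(type, newOffset);
    }

    static Entry initialized(String type) {
        return new Entry(type, -1);
    }

    boolean isUninitialized() {
        return newOffset >= 0;
    }

    @Override
    public String toString() {
        return isUninitialized()
                ? "Uninitialized(" + type + "@" + newOffset + ")"
                : "Object(" + type + ")";
    }
}

final class InitSketch {
    /**
     * Simulates `invokespecial ctorOwner.<init>()V`: pops the receiver and
     * marks every remaining entry created by the same `new` as initialized.
     */
    static List<Entry> invokeInit(List<Entry> stack, String ctorOwner) {
        Entry receiver = stack.get(stack.size() - 1);
        if (!receiver.isUninitialized() || !receiver.type.equals(ctorOwner)) {
            throw new IllegalStateException(
                    "cannot call " + ctorOwner + ".<init> on " + receiver);
        }
        List<Entry> result = new ArrayList<>();
        for (Entry e : stack.subList(0, stack.size() - 1)) {
            // Entries tied to the same `new` offset are aliases of the receiver.
            boolean sameObject = e.isUninitialized() && e.newOffset == receiver.newOffset;
            result.add(sameObject ? Entry.initialized(e.type) : e);
        }
        return result;
    }
}
```

With the stack from the question at L10 (PrintStream, then two entries tied to the new at L6), invokeInit leaves PrintStream plus one initialized StringBuilder. An entry tied to a different new instruction would stay uninitialized.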

This implies that a branch like the one you are trying to achieve is not possible. Two Uninitialized Instance entries of the same type, but created by different new instructions, are incompatible. And incompatible types are merged to a Top entry, which is basically an unusable entry. The code could even be correct if you didn’t attempt to use that entry at the branch target, so ASM is not doing anything wrong when it merges them to Top without complaining.
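To make the merge rule concrete, here is a minimal Java sketch of it, applied to the two stacks that meet at L10 in the question. VType, merge and mergeStacks are hypothetical names for this illustration and not part of ASM. The sketch also cheats on ordinary object types: instead of looking up the real common superclass, it falls back to java/lang/Object.

```java
import java.util.Objects;

/** Hypothetical verification type (a simplification, not ASM's Frame class). */
final class VType {
    static final VType TOP = new VType("Top", null, -1);

    final String kind;       // "Top", "Object" or "Uninitialized"
    final String className;  // internal class name, null for Top
    final int newOffset;     // offset of the creating `new`, -1 unless Uninitialized

    private VType(String kind, String className, int newOffset) {
        this.kind = kind;
        this.className = className;
        this.newOffset = newOffset;
    }

    static VType object(String className) {
        return new VType("Object", className, -1);
    }

    static VType uninitialized(String className, int newOffset) {
        return new VType("Uninitialized", className, newOffset);
    }

    /** Merges the entries that two control-flow paths bring to the same slot. */
    static VType merge(VType a, VType b) {
        if (a.equals(b)) {
            return a; // identical entries, including the same `new` offset
        }
        if (a.kind.equals("Object") && b.kind.equals("Object")) {
            // Assumption of this sketch: coarse stand-in for the common superclass.
            return object("java/lang/Object");
        }
        // Anything else is incompatible, e.g. uninitialized entries that
        // were created by different `new` instructions.
        return TOP;
    }

    static VType[] mergeStacks(VType[] a, VType[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("stack heights differ");
        }
        VType[] out = new VType[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = merge(a[i], b[i]);
        }
        return out;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof VType)) {
            return false;
        }
        VType t = (VType) o;
        return kind.equals(t.kind)
                && Objects.equals(className, t.className)
                && newOffset == t.newOffset;
    }

    @Override
    public int hashCode() {
        return Objects.hash(kind, className, newOffset);
    }

    @Override
    public String toString() {
        switch (kind) {
            case "Top": return "Top";
            case "Object": return "Object " + className;
            default: return "Uninitialized " + className + "@" + newOffset;
        }
    }
}
```

Merging the fall-through stack (PrintStream, then two entries from the new at L6) with the stack arriving via the inserted jump (PrintStream, then two entries from the new at L31) yields PrintStream, Top, Top, which matches the frame ASM computed in the question.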

Note that this also implies that any kind of loop that could lead to a stack frame having more than one uninitialized instance created by the same new instruction, is not allowed.

Rafael Winterhalter

The new java/lang/StringBuilder instruction does not create a valid StringBuilder but rather an uninitialized object, which is denoted with TOP in stack map frames. This value is used when a jump instruction is added during the construction of an object, for example:

new Foo(a ? b : c);

which is translated to several goto-statements.

The object is first considered a StringBuilder when the constructor is invoked on it, i.e. at invokespecial Method java/lang/StringBuilder <init> ()V. The JVM does not support initializing this object at a different location, as the verifier can only look at the TOP type, which does not reflect the actual type, an uninitialized StringBuilder. You could argue that the JVM should support this, but it would require larger arrays for the stack map frames to record both the type and the initialization state, which would probably not justify a capability that is not even used by the Java language.

To make this clear, consider the following case:

    new Foo
    dup 
    .stack full
        locals 
        stack Top Top 
    .end stack
    invokespecial Bar <init> ()V 

This would be valid if the JVM allowed unchecked initialization on TOP types but you clearly should not be allowed to call a Bar constructor on a Foo.