Eligibility for escape analysis / stack allocation with Java 7

4.5k views Asked by At

I am doing some tests with escape analysis in Java 7 in order to better understand what objects are eligible to stack allocation.

Here is the code I wrote to test stack allocation:

import java.util.ArrayList;
import java.util.Iterator;


public class EscapeAnalysis {

    private static final long TIME_TO_TEST = 10L * 1000L; // 10s

    static class Timestamp {
        private long millis;
        public Timestamp(long millis) {
            this.millis = millis;
        }
        public long getTime() {
            return millis;
        }
        public void setTime(long time) {
            millis = time;
        }
    }

    public static void main(String[] args) {
        long r = 0;
        System.out.println("test1");
        r += test1();
        System.out.println("test2");
        r += test2();
        System.out.println("test3");
        r += test3();
        System.out.println("test4");
        r += test4();
        System.out.println("test5");
        r += test5();
        System.out.println("test6");
        r += test6();
        System.out.println(r);
    }

    public static long test1() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            r += new Timestamp(System.currentTimeMillis()).getTime();
        }
        return r;
    }

    public static long test2() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            for (Iterator<Integer> it = l.iterator(); it.hasNext(); ) {
                r += it.next().longValue();
            }
        }
        return r;
    }

    public static long test3() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Timestamp ts = new Timestamp(System.currentTimeMillis());
            ts.setTime(42);
            r += ts.getTime();
        }
        return r;
    }

    public static long test4() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Iterator<Integer> it = l.iterator();
            r += it.next().longValue();
            r += it.next().longValue();
            r += it.next().longValue();
            r += it.next().longValue();
        }
        return r;
    }

    public static long test5() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Iterator<Integer> it = l.iterator();
            for (int i = 0; i < l.size(); ++i) {
                r += it.next().longValue();
            }
        }
        return r;
    }

    public static long test6() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            for (Timestamp ts = new Timestamp(System.currentTimeMillis());
                    ts.getTime() > 0;
                    ts.setTime(ts.getTime() + System.currentTimeMillis())) {
                r += ts.getTime();
            }
        }
        return r;
    }

}

And here is what it outputs with Java 7 on Linux

java -server -version                                                 
java version "1.7.0_02"
Java(TM) SE Runtime Environment (build 1.7.0_02-b13)
Java HotSpot(TM) 64-Bit Server VM (build 22.0-b10, mixed mode)

java -server -verbose:gc -XX:CompileThreshold=1 -cp bin EscapeAnalysis
test1
test2
[GC 15616K->352K(59776K), 0,0014270 secs]
[GC 15968K->288K(59776K), 0,0011790 secs]
[GC 15904K->288K(59776K), 0,0018170 secs]
[GC 15904K->288K(59776K), 0,0011100 secs]
[GC 15904K->288K(57152K), 0,0019790 secs]
[GC 15520K->320K(56896K), 0,0011670 secs]
[GC 15232K->284K(56256K), 0,0011440 secs]
test3
test4
test5
[GC 14876K->348K(55936K), 0,0005340 secs]
[GC 14620K->348K(56000K), 0,0004560 secs]
[GC 14300K->316K(55296K), 0,0004680 secs]
[GC 13948K->316K(55488K), 0,0003590 secs]
[GC 13692K->316K(54784K), 0,0004580 secs]
[GC 13436K->316K(54976K), 0,0005430 secs]
[GC 13180K->316K(54272K), 0,0004500 secs]
[GC 12924K->316K(54464K), 0,0005090 secs]
[GC 12668K->316K(53760K), 0,0004490 secs]
[GC 12412K->316K(53888K), 0,0004350 secs]
[GC 12156K->316K(53312K), 0,0005060 secs]
test6
6737499643744733086

I am using GC logs to known whether objects were allocated on the stack (idea from Escape analysis in Java) which might not be 100% reliable but seems to give good hints.

Baed on the output, stack allocation works for test1, test3, test4 and test6 and doesn't work for test2 and test5. I don't understand why this doesn't work with an iterator in for-loop although it works

  • with an iterator outside a for-loop (see test4),
  • with another object inside a for-loop (see test6).

I have read the code for the ArrayList iterator and I don't understand why it would not be eligible for stack allocation in tests 2 and 5 since it does neither escape the current method nor the current thread.

Any idea?

3

There are 3 answers

2
Matt On BEST ANSWER

EA is something the C2 compiler analyses based on the IR it generates therefore you need it to compile the method before enjoying the benefits. Each test is called once only so there is no chance for it to compile. Details on EA and the C2 IR in the hotspot internals wiki (https://wiki.openjdk.java.net/display/HotSpot/Overview+of+Ideal,+C2%27s+high+level+intermediate+representation and https://wiki.openjdk.java.net/display/HotSpot/EscapeAnalysis)

Here's a version that attempts to show the impact

import com.sun.management.ThreadMXBean;

import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Iterator;


public class EscapeAnalysisTest {

    private static final long TIME_TO_TEST = 10L * 1000L; // 10s

    static class Timestamp {
        private long millis;

        public Timestamp(long millis) {
            this.millis = millis;
        }

        public long getTime() {
            return millis;
        }

        public void setTime(long time) {
            millis = time;
        }
    }

    public static void main(String[] args) {
        System.out.println("****");
        doIt();
        System.out.println("****");
        doIt();
        System.out.println("****");
        doIt();
        System.out.println("****");
        doIt();
        System.out.println("****");
    }

    private static void doIt() {
        final ThreadMXBean mxbean = (ThreadMXBean) ManagementFactory.getThreadMXBean();
        final long tid = Thread.currentThread().getId();
        long r = 0;
        final long allocPre = mxbean.getThreadAllocatedBytes(tid);
        r += test1();
        long alloc1 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test1 - " + (alloc1 - allocPre));
        r += test2();
        final long alloc2 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test2 - " + (alloc2 - alloc1));
        r += test3();
        final long alloc3 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test3 - " + (alloc3 - alloc2));
        r += test4();
        final long alloc4 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test4 - " + (alloc4 - alloc3));
        r += test5();
        final long alloc5 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test5 - " + (alloc5 - alloc4));
        r += test6();
        final long alloc6 = mxbean.getThreadAllocatedBytes(tid);
        System.out.println("test6 - " + (alloc6 - alloc5));
        System.out.println(r);
    }

    public static long test1() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            r += new Timestamp(System.currentTimeMillis()).getTime();
        }
        return r;
    }

    public static long test2() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            for (Iterator<Integer> it = l.iterator(); it.hasNext(); ) {
                r += it.next().longValue();
            }
        }
        return r;
    }

    public static long test3() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Timestamp ts = new Timestamp(System.currentTimeMillis());
            ts.setTime(42);
            r += ts.getTime();
        }
        return r;
    }

    public static long test4() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Iterator<Integer> it = l.iterator();
            r += it.next().longValue();
            r += it.next().longValue();
            r += it.next().longValue();
            r += it.next().longValue();
        }
        return r;
    }

    public static long test5() {
        ArrayList<Integer> l = new ArrayList<Integer>(1000);
        for (int i = 0; i < 1000; ++i) {
            l.add(i);
        }
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            Iterator<Integer> it = l.iterator();
            for (int i = 0; i < l.size(); ++i) {
                r += it.next().longValue();
            }
        }
        return r;
    }

    public static long test6() {
        long r = 0;
        long start = System.currentTimeMillis();
        while (System.currentTimeMillis() - start < TIME_TO_TEST) {
            for (Timestamp ts = new Timestamp(System.currentTi());
                 ts.getTime() > 0;
                 ts.setTime(ts.getTime() + System.currentTimeMillis())) {
                r += ts.getTime();
            }
        }
        return r;
    }

}

which generates the following output when run with -server -XX:CompileThreshold=1

****
test1 - 109048
test2 - 89243416
test3 - 16664
test4 - 42840
test5 - 71982168
test6 - 1400
-5351026995119026839
****
test1 - 16432
test2 - 85921464
test3 - 16664
test4 - 42840
test5 - 66777600
test6 - 1368
7844020592566674506
****
test1 - 48
test2 - 18256
test3 - 272
test4 - 18264
test5 - 18264
test6 - 272
-2137858376905291730
****
test1 - 48
test2 - 18256
test3 - 272
test4 - 18264
test5 - 18264
test6 - 272
3273987624143297143
****

one danger here is that the compilation of this method has changed it more fundamentally, I haven't attempted to guard against this so some use of LogCompilation or PrintCompilation might be required to check.

0
the8472 On

Escape Analysis heavily relies on inlining of function calls.

Like with any other microbenchmark - especially on the server VM - warmup is required. If you remove -XX:CompileThreshold=1 and execute your main test in a loop you will notice that after 1-2 iterations it will stop collecting garbage because the compiler gathered enough profiling information to inline the methods and then perform escape analysis.

0
JanKanis On

I just investigated the same thing, but for Java 8. I put my answer in a duplicate question as I didn't find this one in time.

Summary from the full answer:

First of all, it's implementation dependent. This answer applies to OpenJDK 1.8 and probably also the Oracle JVM 1.8.

Secondly, as others have stated, stack allocation only happens when a method is compiled by the C2 compiler, which only happens once a method has been called enough times.

If so, objects can be stack allocated if

  • all method calls that use it are inlined
  • it is never assigned to any static or object fields, only to local variables (parameters to inlined method calls become local variables)
  • at each point in the program, which local variables contain references to the object must be JIT-time determinable, and not depend on any unpredictable conditional control flow.
  • If the object is an array, its size must be JIT-time constant and indexing into it must use JIT-time constants.

The inlining especially is not predictable if you don't know some of the specific quirks of Hotspot. See the linked answer for some details.

Edit: I tried running your test on java 8 (OpenJDK), and everything is inlined there. So there are differences in stack allocation between java 7 and 8.