I am having a hard time understanding what is going on with this benchmark. I want to measure how my sample class StringBand performs in comparison to StringBuilder. The idea with StringBand is to concatenate the strings in toString(), not on each append().
Sources
Here is the StringBand source, stripped down for the benchmark:
public class StringBandSimple {

    private String[] array;
    private int index;
    private int length;

    public StringBandSimple(int initialCapacity) {
        array = new String[initialCapacity];
    }

    public StringBandSimple append(String s) {
        if (s == null) {
            s = StringPool.NULL;
        }
        if (index >= array.length) {
            //expandCapacity();
        }
        array[index++] = s;
        length += s.length();
        return this;
    }

    public String toString() {
        if (index == 0) {
            return StringPool.EMPTY;
        }
        char[] destination = new char[length];
        int start = 0;
        for (int i = 0; i < index; i++) {
            String s = array[i];
            int len = s.length();
            //char[] chars = UnsafeUtil.getChars(s);
            //System.arraycopy(chars, 0, destination, start, len);
            s.getChars(0, len, destination, start);
            start += len;
        }
        return new String(destination);
    }
}
This code can use UnsafeUtil.getChars() to get the String's internal char[] without copying, see the code here. We can also use String.getChars(), and the result is still the same.
Here is the JMH test:
@State
public class StringBandBenchmark {

    String string1;
    String string2;

    @Setup
    public void prepare() {
        int len = 20;
        string1 = RandomStringUtil.randomAlphaNumeric(len);
        string2 = RandomStringUtil.randomAlphaNumeric(len);
    }

    @GenerateMicroBenchmark
    public String stringBuilder2() {
        return new StringBuilder(string1).append(string2).toString();
    }

    @GenerateMicroBenchmark
    public String stringBand2() {
        return new StringBandSimple(2).append(string1).append(string2).toString();
    }
}
Analysis
Here is my understanding of what is going on when adding two strings of 20 chars.
StringBuilder
- new char[20+16] is created (36 chars)
- arraycopy is called to copy the 20 string1 chars to the StringBuilder
- before the second append, StringBuilder expands the capacity, since 40 > 36
- therefore, new char[36*2+2] is created
- arraycopy of 20 chars to the new buffer
- arraycopy of 20 chars to append the second string, string2
- finally, toString() returns new String(buffer, 0, 40)
StringBand
- new String[2] is created
- both appends just keep the strings in the internal buffer, until toString() is called
- length is incremented twice
- new char[40] is created (total length of the resulting string)
- arraycopy of the 20 first string chars (UnsafeUtil provides the real char[] buffer of a string)
- arraycopy of the 20 second string chars
- finally, returns new String(buffer, 0, 40)
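The bookkeeping in the two lists above can be tallied with a small sketch. It only models the counts described in the question (char[] allocations and array copies, using the length + 16 initial capacity and the capacity * 2 + 2 growth rule); the class and method names are made up for illustration, and it says nothing about what the JIT actually executes.

```java
// Hypothetical helper that tallies the question's cost model; not part of the benchmark.
class ConcatCostSketch {

    // Growth rule quoted in the analysis: old capacity * 2 + 2,
    // but never less than what is actually needed.
    static int expand(int capacity, int needed) {
        return Math.max(capacity * 2 + 2, needed);
    }

    // Returns {char[] allocations, array copies, final buffer capacity}
    // for new StringBuilder(s1).append(s2), before toString() is called.
    static int[] builderCost(int len1, int len2) {
        int capacity = len1 + 16;   // StringBuilder(String) reserves 16 extra chars
        int allocations = 1;        // the initial buffer
        int copies = 1;             // string1 into the buffer
        if (len1 + len2 > capacity) {
            capacity = expand(capacity, len1 + len2);
            allocations++;          // the larger buffer
            copies++;               // old contents into the new buffer
        }
        copies++;                   // string2 into the buffer
        return new int[] {allocations, copies, capacity};
    }

    // Same tally for StringBand: one exactly sized char[] in toString(),
    // one copy per appended string. The String[] holder is not a char[]
    // and is therefore not counted here.
    static int[] bandCost(int len1, int len2) {
        return new int[] {1, 2, len1 + len2};
    }

    public static void main(String[] args) {
        int[] b = builderCost(20, 20);
        int[] s = bandCost(20, 20);
        System.out.println("StringBuilder: allocations=" + b[0]
                + ", copies=" + b[1] + ", capacity=" + b[2]);
        System.out.println("StringBand:    allocations=" + s[0]
                + ", copies=" + s[1] + ", capacity=" + s[2]);
    }
}
```

For two 20-char strings this gives two char[] allocations and three copies for StringBuilder, against one char[] allocation and two copies for StringBand, which matches the lists above.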
Expectations
With StringBand we have:
- one less arraycopy, which is the whole purpose of doing this
- less allocation size: new String[] and new char[] vs. two new char[]
- plus we don't have as many checks as in the StringBuilder methods (for size etc.)
So I would expect StringBand to perform at least as well as StringBuilder, if not faster.
Benchmark results
I'm running the benchmark on a MacBook Pro, mid 2013, using JMH v0.2 and Java 1.7b45.
Command:
java -jar build/libs/microbenchmarks.jar .*StringBand.* -wi 2 -i 10 -f 2 -t 2
The number of warmup iterations (2) is fine, as I can see that the second iteration already reaches the same performance.
Benchmark                                  Mode   Thr  Count  Sec       Mean  Mean error   Units
j.b.s.StringBandBenchmark.stringBand2      thrpt    2     20    1  37806.993     174.637  ops/ms
j.b.s.StringBandBenchmark.stringBuilder2   thrpt    2     20    1  76507.744     582.131  ops/ms
The results say that StringBuilder is twice as fast. The same happens when I, for example, raise the number of threads to 16, or explicitly use BlackHoles in the code.
Why?
Ok, as usual, "the owls are not what they seem". Reasoning about code performance by inspecting the Java code quickly gets weird. Reasoning by looking into the bytecode feels the same. Generated code disassembly should shed more light on this, even though there are minor cases where the assembly is too high-level to explain the phenomenon.
That is because platforms heavily optimize the code, at every level. Here is a hint about where you should look. Running your benchmark on an i5 2.0 GHz, Linux x86_64, JDK 7u40.
Baseline:
Yeah, surprising. Now, watch this. Nothing in my sleeves, except for...
-XX:-OptimizeStringConcat:
Forbidding the VM from applying string optimizations yields the "expected" result, as laid out in the original analysis. HotSpot is known to have optimizations around StringBuilders, effectively recognizing the usual idioms like new StringBuilder().append(...).append(...).toString() and producing more efficient code for the statement. Disassembling and figuring out what exactly happened with the string optimization applied is left as an exercise for the interested reader :)
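When comparing runs with and without the flag, it can help to confirm from inside the JVM which variant is actually in effect. The sketch below is one way to do that using the standard RuntimeMXBean; the class and method names are hypothetical, and the "last occurrence wins" handling of repeated flags is an assumption about how HotSpot treats duplicates.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper; reports whether -XX:-OptimizeStringConcat was passed to this JVM.
class StringConcatFlagCheck {

    // Scans the JVM arguments for the flag. Assumption: if the flag appears
    // several times, the last occurrence decides.
    static boolean disablesStringConcatOpt(List<String> jvmArgs) {
        boolean disabled = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:-OptimizeStringConcat")) {
                disabled = true;
            } else if (arg.equals("-XX:+OptimizeStringConcat")) {
                disabled = false;
            }
        }
        return disabled;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not the program arguments.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("OptimizeStringConcat explicitly disabled: "
                + disablesStringConcatOpt(jvmArgs));
    }
}
```

Running it as java -XX:-OptimizeStringConcat StringConcatFlagCheck should report the flag as disabled. Since the benchmark above is run with forks, such a check would have to execute inside the benchmark JVM itself to be meaningful.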