I am having a hard time understanding what is going on with this benchmark. I want to measure how my sample class StringBand performs in comparison to StringBuilder. The idea behind StringBand is to concatenate the strings in toString(), not in append().
Sources
Here is the StringBand source, stripped down for the benchmark:
public class StringBandSimple {

    private String[] array;
    private int index;
    private int length;

    public StringBandSimple(int initialCapacity) {
        array = new String[initialCapacity];
    }

    public StringBandSimple append(String s) {
        if (s == null) {
            s = StringPool.NULL;
        }
        if (index >= array.length) {
            //expandCapacity();
        }
        array[index++] = s;
        length += s.length();
        return this;
    }

    public String toString() {
        if (index == 0) {
            return StringPool.EMPTY;
        }
        char[] destination = new char[length];
        int start = 0;
        for (int i = 0; i < index; i++) {
            String s = array[i];
            int len = s.length();
            //char[] chars = UnsafeUtil.getChars(s);
            //System.arraycopy(chars, 0, destination, start, len);
            s.getChars(0, len, destination, start);
            start += len;
        }
        return new String(destination);
    }
}
The original code uses UnsafeUtil.getChars() to get the String's char[] without copying, see the code here. Using String.getChars() instead, as shown above, gives the same result.
Here is the JMH test:
@State
public class StringBandBenchmark {

    String string1;
    String string2;

    @Setup
    public void prepare() {
        int len = 20;
        string1 = RandomStringUtil.randomAlphaNumeric(len);
        string2 = RandomStringUtil.randomAlphaNumeric(len);
    }

    @GenerateMicroBenchmark
    public String stringBuilder2() {
        return new StringBuilder(string1).append(string2).toString();
    }

    @GenerateMicroBenchmark
    public String stringBand2() {
        return new StringBandSimple(2).append(string1).append(string2).toString();
    }
}
Analyses
Here is my understanding of what happens when concatenating two strings of 20 chars each.
StringBuilder
- new char[20+16] is created (36 chars)
- arraycopy is called to copy the 20 string1 chars into the StringBuilder
- before the second append, StringBuilder expands its capacity, since 40 > 36
- therefore, new char[36*2+2] is created
- arraycopy of 20 chars to the new buffer
- arraycopy of 20 chars to append the second string, string2
- finally, toString() returns new String(buffer, 0, 40)
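The capacity steps listed above can be observed directly through StringBuilder.capacity(). This is only a minimal sketch: the class name BuilderCapacityDemo and the repeat helper are made up for illustration, and while the initial capacity of length + 16 is documented, the size of the grown buffer is an implementation detail that may differ between JDKs.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original benchmark.
public class BuilderCapacityDemo {

    // Builds a string of n copies of c (stand-in for the random 20-char strings).
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    // Capacity right after new StringBuilder(s); documented as s.length() + 16.
    static int initialCapacity(String s) {
        return new StringBuilder(s).capacity();
    }

    // Capacity after appending a second string; the growth policy is
    // implementation-specific, so no exact value is assumed here.
    static int capacityAfterAppend(String first, String second) {
        return new StringBuilder(first).append(second).capacity();
    }

    public static void main(String[] args) {
        String s1 = repeat('a', 20);
        String s2 = repeat('b', 20);
        System.out.println("initial capacity:      " + initialCapacity(s1));
        System.out.println("capacity after append: " + capacityAfterAppend(s1, s2));
    }
}
```

If the second number is larger than 40, the buffer was reallocated and its existing 20 chars had to be copied over, which is the extra arraycopy in the list.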
StringBand
- new String[2] is created
- both appends just keep the strings in the internal buffer, until toString() is called
- length is incremented twice
- new char[40] is created (total length of the resulting string)
- arraycopy of the 20 chars of the first string (UnsafeUtil provides the real char[] buffer of a string)
- arraycopy of the 20 chars of the second string
- finally, returns new String(buffer, 0, 40)
Expectations
With StringBand we have:
- one less arraycopy, which is the whole purpose of doing this
- less allocation: new String[] and new char[] vs. two new char[]
- plus we don't have as many checks as in the StringBuilder methods (for size etc.)
So I would expect StringBand to perform at least as well as StringBuilder, if not faster.
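To put numbers on that expectation, here is a rough tally of chars copied and allocated under the model described in the analysis. It is a sketch only: the class ConcatCostTally and its methods are hypothetical, the growth rule (capacity * 2 + 2) is the one assumed in the analysis above, and the final String construction is left out because it is the same on both paths.

```java
// Hypothetical helper that just does the arithmetic from the analysis.
public class ConcatCostTally {

    // Initial buffer size of new StringBuilder(String): length + 16.
    static int builderInitialCapacity(int len1) {
        return len1 + 16;
    }

    // Chars moved by arraycopy on the StringBuilder path.
    static int builderCharsCopied(int len1, int len2) {
        int copied = len1;                        // seeding the builder with the first string
        if (len1 + len2 > builderInitialCapacity(len1)) {
            copied += len1;                       // existing contents moved to the grown buffer
        }
        return copied + len2;                     // appending the second string
    }

    // Chars allocated in char[] buffers on the StringBuilder path.
    static int builderCharsAllocated(int len1, int len2) {
        int capacity = builderInitialCapacity(len1);
        int allocated = capacity;
        if (len1 + len2 > capacity) {
            // Assumed growth rule from the analysis: old capacity * 2 + 2.
            allocated += Math.max(capacity * 2 + 2, len1 + len2);
        }
        return allocated;
    }

    // StringBand copies each string once into a buffer of the exact size.
    static int bandCharsCopied(int len1, int len2) {
        return len1 + len2;
    }

    // Only the char[] is counted; the String[2] holds two references, not chars.
    static int bandCharsAllocated(int len1, int len2) {
        return len1 + len2;
    }

    public static void main(String[] args) {
        System.out.println("StringBuilder copied:    " + builderCharsCopied(20, 20));    // 60
        System.out.println("StringBuilder allocated: " + builderCharsAllocated(20, 20)); // 110
        System.out.println("StringBand copied:       " + bandCharsCopied(20, 20));       // 40
        System.out.println("StringBand allocated:    " + bandCharsAllocated(20, 20));    // 40
    }
}
```

On paper, then, StringBand does less work for two 20-char strings, which is why the measured result is surprising.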
Benchmark results
I'm running the benchmark on a MacBook Pro, mid 2013, using JMH v0.2 and Java 1.7b45.
Command:
java -jar build/libs/microbenchmarks.jar .*StringBand.* -wi 2 -i 10 -f 2 -t 2
The number of warmup iterations (2) is fine, since the second iteration already reaches the same performance.
Benchmark                                  Mode  Thr  Count  Sec       Mean  Mean error  Units
j.b.s.StringBandBenchmark.stringBand2     thrpt    2     20    1  37806.993     174.637  ops/ms
j.b.s.StringBandBenchmark.stringBuilder2  thrpt    2     20    1  76507.744     582.131  ops/ms
The results say that StringBuilder is twice as fast. The same happens when I, for example, raise the number of threads to 16, or explicitly use BlackHoles in the code.
Why?
Ok, as usual, "the owls are not what they seem". Reasoning about code performance by inspecting the Java code quickly gets weird. Reasoning by looking into the bytecode feels the same. Generated code disassembly should shed more light on this, even though there are minor cases where the assembly is too high-level to explain the phenomenon.
That is because platforms heavily optimize the code, at every level. Here is a hint about where you should look. Running your benchmark on an i5 2.0 GHz, Linux x86_64, JDK 7u40.
Baseline:
Yeah, surprising. Now, watch this. Nothing in my sleeves, except for...
-XX:-OptimizeStringConcat:
Forbidding the VM from applying string optimizations yields the "expected" result, as laid out in the original analysis. HotSpot is known to have optimizations around StringBuilders, effectively recognizing the usual idioms like new StringBuilder().append(...).append(...).toString() and producing more efficient code for the statement.
Disassembling and figuring out what exactly happened with the string optimization applied is left as an exercise for the interested readers :)
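Before comparing runs, it can help to confirm which setting of the flag the JVM is actually using. The sketch below reads it through HotSpotDiagnosticMXBean; the class name ConcatFlagCheck is made up for illustration, and the code assumes a HotSpot-based JVM that exposes this flag (on other VMs, or builds without the C2 compiler, it simply reports null).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the original benchmark.
public class ConcatFlagCheck {

    // Returns the current value of a HotSpot VM flag as a string,
    // or null if the flag or the diagnostic bean is not available.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return null;
            }
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException | LinkageError e) {
            // Unknown flag, or a VM that does not provide this bean.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("OptimizeStringConcat = " + flagValue("OptimizeStringConcat"));
    }
}
```

Running it once as-is and once with -XX:-OptimizeStringConcat on the java command line should show the value change, assuming the flag exists on that VM.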