Are there any drawbacks in merging collections with Stream.concat()?


The goal: to merge two collections into (a third) one

I like the fluency of this solution (it only takes one statement):

Stream.concat(collection.stream(), anotherCollection.stream())
    .collect(/* some Collector */)

However, my latest PR commit, though accepted and merged, was edited to replace that line with this (notice there are now twice as many statements):

Collection</* some type */> mergedCollection = new /* Collection implementation */<>(collection);

mergedCollection.addAll(anotherCollection);
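To make the comparison concrete, here is a minimal sketch of both variants. It assumes the collections hold strings and that the target is an ArrayList; the names MergeSketch, viaConcat and viaAddAll are made up for illustration and do not come from the PR.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MergeSketch {

    // Variant 1: one statement; every element flows through the stream pipeline.
    static List<String> viaConcat(Collection<String> first, Collection<String> second) {
        return Stream.concat(first.stream(), second.stream())
                .collect(Collectors.toCollection(ArrayList::new));
    }

    // Variant 2: two statements; copy constructor followed by addAll().
    static List<String> viaAddAll(Collection<String> first, Collection<String> second) {
        List<String> merged = new ArrayList<>(first);
        merged.addAll(second);
        return merged;
    }

    public static void main(String[] args) {
        List<String> first = List.of("a", "b");
        List<String> second = List.of("c");
        System.out.println(viaConcat(first, second)); // [a, b, c]
        System.out.println(viaAddAll(first, second)); // [a, b, c]
    }
}
```

Both methods return a new, mutable list with the elements of the first collection followed by those of the second, so the difference lies in style and in how the elements get copied, not in the result.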

I don't want to bother the maintainer who made the change, so I decided to ask the SO community:

Are there any drawbacks in merging collections with Stream.concat()?


There are 2 answers

4
Nikolas Charalambidis On

The main disadvantage of Stream#concat is that it accepts only two streams. It does what it says: it concatenates two streams into one.

If you need to concatenate three or more streams, use Stream#of instead. Note, however, that it requires flat mapping:

final List<Object> list = Stream.of(collection1, collection2, collection3)
    .flatMap(Collection::stream)
    .toList();

Flat mapping itself is not bad, but it needs a stream for every element it flattens. That is not ideal when there are many nested structures: a temporary stream is created for each of them solely for the flat mapping, and performance might suffer as their number grows.

What to do? The best option is to use Stream#mapMulti, which avoids creating these additional streams:

final List<Object> list = Stream.of(collection1, collection2, collection3)
    .mapMulti(Collection::forEach)
    .toList();

To understand it better, here is the expanded notation using a lambda expression instead of a method reference:

final List<Object> list = Stream.of(collection1, collection2, collection3)
    .mapMulti((collection, consumer) -> collection.forEach(consumer))
    .toList();

Stream#mapMulti was introduced in Java 16, and I wrote another answer regarding its use.
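The snippets above collect into a List<Object>. As a hedged sketch of how the same two pipelines might look for a more specific element type, the example below assumes three lists of strings; MultiMergeSketch and its method names are hypothetical. As far as I understand the inference rules, the type of the mapMulti result cannot be derived from the chained toList() call, so an explicit type witness (<String>) is supplied here.

```java
import java.util.Collection;
import java.util.List;
import java.util.stream.Stream;

class MultiMergeSketch {

    // flatMap: each inner list is turned into its own temporary stream.
    static List<String> viaFlatMap(List<String> a, List<String> b, List<String> c) {
        return Stream.of(a, b, c)
                .flatMap(Collection::stream)
                .toList();
    }

    // mapMulti (Java 16+): each inner list pushes its elements to the downstream consumer.
    // The explicit <String> witness fixes the element type of the resulting stream.
    static List<String> viaMapMulti(List<String> a, List<String> b, List<String> c) {
        return Stream.of(a, b, c)
                .<String>mapMulti(Collection::forEach)
                .toList();
    }

    public static void main(String[] args) {
        List<String> a = List.of("a", "b");
        List<String> b = List.of("c");
        List<String> c = List.of("d");
        System.out.println(viaFlatMap(a, b, c));  // [a, b, c, d]
        System.out.println(viaMapMulti(a, b, c)); // [a, b, c, d]
    }
}
```

Note that Stream#toList returns an unmodifiable list, so neither method yields a collection that can be extended afterwards.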

2
Didier L On

As stated by Holger in a comment, both approaches are fine, but the second one (copy constructor plus addAll()) may give substantial performance benefits, if performance matters to the maintainers.

Indeed, whereas the Stream approach copies elements one by one as the collector accumulates them, most Collection implementations provide an optimized addAll().

For example, ArrayList.addAll() copies the incoming elements with System.arraycopy(), a native call that typically relies on pure memory operations.
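To illustrate the difference between the two copying styles, here is a simplified sketch on plain arrays. It is not ArrayList's actual source, only an approximation of the idea; BulkCopySketch, mergeOneByOne and mergeBulk are hypothetical names.

```java
import java.util.Arrays;

class BulkCopySketch {

    // Element by element, similar in spirit to a collector calling add() repeatedly.
    static String[] mergeOneByOne(String[] first, String[] second) {
        String[] merged = new String[first.length + second.length];
        int index = 0;
        for (String s : first) {
            merged[index++] = s;
        }
        for (String s : second) {
            merged[index++] = s;
        }
        return merged;
    }

    // Bulk copy: one allocation sized for both inputs, then a single arraycopy
    // for the second input, similar in spirit to an array-backed addAll().
    static String[] mergeBulk(String[] first, String[] second) {
        String[] merged = Arrays.copyOf(first, first.length + second.length);
        System.arraycopy(second, 0, merged, first.length, second.length);
        return merged;
    }

    public static void main(String[] args) {
        String[] first = {"a", "b"};
        String[] second = {"c"};
        System.out.println(Arrays.toString(mergeOneByOne(first, second))); // [a, b, c]
        System.out.println(Arrays.toString(mergeBulk(first, second)));     // [a, b, c]
    }
}
```

Whether the bulk variant is measurably faster depends on the collection sizes and the JVM, so it is worth benchmarking before treating it as a reason to rewrite working code.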