Exponentially Growing Stream
I have a Stream
that grows exponentially for creating permutations. So each call to addWeeks
increases the number of elements in the Stream
.
Stream<SeasonBuilder> sbStream = sbSet.stream();
for (int i = 1; i <= someCutOff; i++) {
sbStream = sbStream.map(sb -> sb.addWeeks(possibleWeeks))
.flatMap(Collection::stream);
}
// Collect SeasonBuilders into a Set
return sbStream.collect(Collectors.toSet()); // size > 750 000
Problems
- Each call to
addWeeks
returns aSet<SeasonBuilder>
and collecting everything into aSet
takes a while. addWeeks
is not static and needs to be called on eachSeasonBuilder
in the stream, each time through the looppublic Set<SeasonBuilder> addWeeks( final Set<Set<ImmutablePair<Integer, Integer>>> possibleWeeks) { return possibleWeeks.stream() .filter(containsMatchup()) // Finds the weeks to add .map(this::addWeek) // Create new SeasonBuilders with the new week .collect(Collectors.toSet());
Out of memory error..... when possible weeks has size = 15
Questions
- Should I be using a method chain other than
map
followed byflatmap
? - How can I modify
addWeeks
so that I don't have to collect everything into aSet
?- Should I return a
Stream<SeasonBuilder>
? Can Iflatmap
aStream
?
- Should I return a
Update:
Thanks for the help everyone!
I have put the code for the methods in a gist
Thanks to @Holger and @lexicore for suggesting returning a Stream<SeasonBuilder>
in addWeeks
. Minor performance increase, as was predicted by @lexicore
I tried using parallelStream()
and there was no significant change in performance
Context
I am creating all possible permutations of a Fantasy Football season, which will be used elsewhere for stats analysis. In a 4-team, 14-week season, for any given week, there could be three different possibilities
- (1 vs 2) , (3 vs 4)
- (1 vs 3) , (2 vs 4)
- (1 vs 4) , (2 vs 3)
To solve the problem, plug in the permutations, and we have all our possible seasons. Done! But wait... what if Team 1 only ever plays Team 2. Then the other teams would be sad. So there are some constraints on the permutations.
Every team must play each other roughly the same amount of times (i.e. Team 1 cannot play against Team 3 ten times in a single season). In this example - 4-teams, 14 weeks - each team is capped at playing another team 5 times. So some sort of filtering has to happen when creating permutations, to avoid non-valid seasons.
Where this gets more interesting is:
- 6 Team League -- 15 possible weeks
- 8 Team League -- 105 possible weeks
- 10 Team League -- 945 possible weeks
I am trying to optimize performance where possible, because there are a lot of permutations to create. A 4-team, 14-week season creates 756 756 (=14!/(5!5!4!)) possible seasons, given the constraints. 6-team or 8-team seasons just get crazier.
Your whole construction is very suspicious to begin with. If you're interested in performance it is unlikely that generating all permutations explicitly is a good approach.
I also don't believe that collecting to set and streaming again is the performance problem.
But nevertheless, to answer your question: why don't you return
Stream<SeasonBuilder>
fromaddWeeks
directly, why do you collect it to set first? Return the stream directy, without collecting:You won't need
map
/flatMap
then, just oneflatMap
:But this won't help your performance much anyway.