I am trying to understand Timsort algorithm, but I have trouble following the reason of implementing the stack invariant:
- A > B+C
- B > C
According to this document,
We would like to delay merging as long as possible in order to exploit patterns that may come up later, but we like even more to do merging as soon as possible to exploit that the run just found is still high in the memory hierarchy.
I understand that we want to do the merging as soon as possible in order to exploit the cache effect, but I don't understand the reason for why we want to delay it. What "patterns" does he mean by that?
 
                        
Patterns here is referring to patterns in the data being sorted. For example, Timsort is looking for increasing (or decreasing) runs in the data, which are either already sorted, of can be sorted trivially by reversing the run in-place. It then tries to use these runs for its mergesort partitions.
By the way, the Wikipedia article on Timsort may be useful to you: https://en.wikipedia.org/wiki/Timsort