I understand that to use std::sort(), the compare function must impose a strict weak ordering, otherwise it may crash by accessing an out-of-bounds address. (https://gcc.gnu.org/ml/gcc-bugs/2013-12/msg00333.html)
However, why would std::sort() access an out-of-bounds address when the compare function is not a strict weak ordering? What is it trying to compare?
Also I wonder if there are other pitfalls in STL that I should be aware of.
The first thing is that calling the algorithm with a comparator that does not comply with the requirements is undefined behavior and anything goes...
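To make "complying with the requirements" a little more concrete: one of the strict weak ordering rules is irreflexivity, meaning `comp(x, x)` must be false for every `x`. The sketch below checks only that one rule on a sample of values; `is_irreflexive_on` and `demo_irreflexivity` are hypothetical names chosen for illustration, not standard library functions.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: tests irreflexivity on a sample of values.
// Passing does not prove the comparator is a strict weak ordering;
// failing proves that it is not.
template <typename T, typename Compare>
bool is_irreflexive_on(const std::vector<T>& sample, Compare comp) {
    for (const T& x : sample) {
        if (comp(x, x)) return false;  // comp(x, x) must be false
    }
    return true;
}

// Usage: operator< passes, operator<= fails because x <= x is true.
inline void demo_irreflexivity() {
    const std::vector<int> sample{1, 2, 3};
    assert(is_irreflexive_on(sample, std::less<int>{}));
    assert(!is_irreflexive_on(sample, std::less_equal<int>{}));
}
```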
But other than that, I assume that you are interested in knowing what type of implementation might end up accessing out of bounds if the comparator is bad. Should the implementation not check the bounds before accessing the elements in the first place, i.e. before calling the comparator?
The answer is performance, and this is just one of the possible things that could lead to this type of issue. There are different implementations of sorting algorithms, but more often than not, `std::sort` is built on top of a variant of quicksort that falls back to a different sorting algorithm, such as heapsort, to avoid quicksort's worst-case performance.
The implementation of quicksort selects a pivot and then partitions the input around the pivot, then independently sorts both sides. There are different strategies for selecting the pivot, but a common one is the median of three: the algorithm reads the values of the first, middle and last elements, selects the median of the three and uses that as the pivot value.
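A minimal sketch of median-of-three selection, assuming an index-based interface over a `std::vector` with `last` as an inclusive index. The name `median_of_three` and this signature are illustrative assumptions, not what any particular standard library uses.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Returns the index (first, middle, or last) that holds the median of
// the three sampled values. Only comp is used to compare elements.
// Precondition (assumed): first <= last < v.size().
template <typename T, typename Compare>
std::size_t median_of_three(const std::vector<T>& v, std::size_t first,
                            std::size_t last, Compare comp) {
    const std::size_t mid = first + (last - first) / 2;
    const T& a = v[first];
    const T& b = v[mid];
    const T& c = v[last];
    if (comp(a, b)) {
        if (comp(b, c)) return mid;        // a < b < c
        return comp(a, c) ? last : first;  // b is the largest of the three
    }
    if (comp(a, c)) return first;          // b <= a < c
    return comp(b, c) ? last : mid;        // a is the largest of the three
}
```

Note that even this step trusts the comparator: with an inconsistent `comp`, the index returned need not hold a true median.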
Conceptually, partition walks from the left until it finds an element that is not smaller than the pivot; it then walks from the right trying to find an element that is smaller than the pivot. If the two cursors meet, the partition is complete. If two out-of-place elements are found, their values are swapped and the process continues in the range delimited by both cursors. The loop walking from the left to find the element to swap advances the cursor while `pos != end && value(pos) < pivot` holds.
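The left-to-right scan described above can be sketched as follows, with the bounds check still in place. The function name `checked_scan_left` and the index-based interface are assumptions made for illustration; real implementations work on iterators.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Advances pos while the element is "smaller than" the pivot according
// to comp, and never moves past end. Returns the stopping position.
template <typename T, typename Compare>
std::size_t checked_scan_left(const std::vector<T>& v, std::size_t pos,
                              std::size_t end, const T& pivot, Compare comp) {
    while (pos != end && comp(v[pos], pivot)) ++pos;
    return pos;
}
```

Because of the `pos != end` test, this version stays inside the range even when `comp` is not a valid ordering; the cost is one extra comparison per step.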
While in general partition cannot assume that the value of the pivot will be in the range, quicksort knows that it is; after all, it selected the pivot out of the elements in the range. A common optimization in this case is to swap the median into the last element of the range. This guarantees that `value(pos) < pivot` will become false before `pos == end` (worst case: at `pos == end - 1`, where the pivot itself sits). The implication here is that we can drop the check for the end of the range and use an `unchecked_partition` (pick your choice of name) with a simpler, faster condition: just `value(pos) < pivot`.
All perfectly good, except that `<` is spelled `comparator(value(pos), pivot)`. Now if the `comparator` is incorrectly implemented you might end up with `comparator(pivot, pivot) == true` and the cursor will run out of bounds.
Note that this is just one example of an optimization that removes a bounds check for performance: assuming a valid order, it is impossible to walk out of the array in the above loop if quicksort moved the pivot to the last element before calling this modified partition.
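The effect can be observed safely with a small experiment. In the sketch below the vector is deliberately longer than the logical range being scanned, so the extra elements stand in for "memory past the end" and the overrun shows up as an index instead of undefined behavior. The names `unchecked_scan_left` and `demo_unchecked_scan`, and the padding trick itself, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Scan with the end-of-range check removed. It terminates inside the
// range only if some element there makes comp(v[pos], pivot) false.
template <typename T, typename Compare>
std::size_t unchecked_scan_left(const std::vector<T>& v, std::size_t pos,
                                const T& pivot, Compare comp) {
    while (comp(v[pos], pivot)) ++pos;
    return pos;
}

inline void demo_unchecked_scan() {
    // Logical range is indices [0, 4); the pivot value 5 sits at index 3.
    // Indices 4..6 are padding so the overrun stays inside the vector.
    const std::vector<int> v{1, 2, 3, 5, 5, 5, 9};
    const std::size_t logical_end = 4;

    // Valid comparator: the scan stops at the pivot, inside the range.
    assert(unchecked_scan_left(v, 0, 5, std::less<int>{}) == 3);

    // Invalid comparator (<=): comp(pivot, pivot) is true, so the scan
    // walks past the pivot and only stops at the padding value 9.
    const std::size_t stop =
        unchecked_scan_left(v, 0, 5, std::less_equal<int>{});
    assert(stop == 6 && stop >= logical_end);
}
```

Without the padding, the second call would read past the end of the array, which is the kind of out-of-bounds access the question asks about.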
Back to the question: should the implementation not check the bounds before accessing the elements in the first place?
No, not if it removed the bounds check by proving that it cannot walk out of the array; but that proof is built on the premise that the comparator is valid.