Hi, I am trying to understand how efficient pd.DataFrame.idxmax is, to see whether it is worth replacing with a custom algorithm that might be more efficient (for example, one using binary search).
I would like to understand the algorithm behind this method, or at least its complexity, but I have had no luck so far. Any help would be appreciated, thanks.
According to the source, the authors state: "This method is the DataFrame version of ndarray.argmax." argmax has a time complexity of O(N), as shown here. It is then reasonable to assume that pd.DataFrame.idxmax has the same time complexity.
If you would like to implement your own search algorithm, keep in mind that binary search, as you proposed, requires a sorted array of items, which might not be the case for a DataFrame column. On unsorted data, every element has to be inspected at least once to find the maximum, so O(N) is the best you can do.
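
To make this concrete, here is a minimal sketch (not the actual pandas implementation) that reproduces idxmax with a plain NumPy argmax scan. The frame, the column names a and b, and the random data are made up for illustration. It also shows that if a column were already sorted, you would not need binary search either, since the maximum is simply the last element.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: two columns of random floats.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.random(1_000), "b": rng.random(1_000)})

# idxmax returns, for each column, the index label of its maximum.
labels = df.idxmax()

# The same result via a linear scan: argmax gives the position of the
# maximum in each column, which is then mapped back to index labels.
positions = df.to_numpy().argmax(axis=0)
manual = df.index[positions]

assert list(labels) == list(manual)

# If a column is already sorted in ascending order, the maximum is the
# last element, so it can be read directly without any search.
sorted_col = df["a"].sort_values()
assert sorted_col.index[-1] == df["a"].idxmax()
```

Note that sorting costs O(N log N), so sorting a column only to find its maximum is slower than a single idxmax call.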