I have to work with the learning history of a Keras model. As a basic task, I measured the performance of the built-in Python min() function, numpy.min(), and the ndarray.min() method, on both a list and an ndarray. On an ndarray, the built-in Python min() is no match for Numpy: numpy is 10 times faster (on a list, Numpy is almost 6 times slower, but that is not what this question is about). However, ndarray.min() is almost twice as fast as numpy.min(). The ndarray.min() documentation refers to numpy.amin(), which, according to the numpy.amin docs, is an alias for numpy.min(). I therefore assumed numpy.min() and ndarray.min() would perform identically. Why is their performance not equal?
from timeit import default_timer
import random
import numpy as np

a = random.sample(range(1, 1000000), 10000)            # plain Python list
b = np.array(random.sample(range(1, 1000000), 10000))  # ndarray

def time_mgr(func):
    # Runs the benchmark at decoration time and prints the mean time per call
    # for 10**3, 10**4 and 10**5 iterations.
    tms = []
    for i in range(3, 6):
        tm = default_timer()
        for j in range(10**i):
            func()
        tm = (default_timer() - tm) / 10**i * 10e6  # mean time per call, scaled
        tms.append(tm)
    print(func.__name__, tms)

@time_mgr
def min_list():
    min(a)

@time_mgr
def np_min_list():
    np.min(a)

@time_mgr
def min_nd():
    min(b)

@time_mgr
def np_min_nd():
    np.min(b)

@time_mgr
def np_nd_min():
    b.min()
Output, time in µs:
min_list [520.7690014503896, 515.3326001018286, 516.221239999868]
np_min_list [2977.614998817444, 3009.602500125766, 3014.1312699997798]
min_nd [2270.1649996452034, 2195.6873999442905, 2155.1631700014696]
np_min_nd [22.295000962913033, 21.675399970263243, 22.30485000181943]
np_nd_min [14.261999167501926, 12.929399963468313, 12.935079983435571]
Lists are horribly inefficient, and so are CPython objects: a list stores numbers as separate boxed objects and only holds pointers to them. This prevents most optimizations and causes a very bad memory access pattern. Not to mention that objects are reference-counted and protected by the Global Interpreter Lock (GIL). Any code touching CPython objects in a hot loop is doomed to be very slow, especially when that loop is interpreted, and this is exactly what happens in min_list and min_nd.
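You can see the layout difference directly (a quick sketch; exact byte counts depend on the platform and CPython version):

import sys
import numpy as np

a = list(range(10000))                 # list of boxed CPython int objects
b = np.arange(10000, dtype=np.int64)   # one contiguous buffer of raw int64

print(sys.getsizeof(a))       # only the pointer array (~80 KB on 64-bit CPython)
print(sys.getsizeof(a[0]))    # each element is a separate ~28-byte object on top
print(b.nbytes)               # 10000 * 8 bytes of raw, cache-friendly data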
Also, Numpy cannot operate on lists directly, so it first converts them to an array. This conversion is generally much slower than any basic computation on the resulting Numpy array, and it is performed again and again on every call to np.min(a), hence the bad performance of np_min_list.
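You can check that the conversion dominates by timing it separately from the reduction (a sketch; absolute timings will differ by machine):

import timeit
import numpy as np

a = list(range(10000))
b = np.array(a)

# np.min(a) pays for a list -> array conversion on every call;
# np.asarray(a) isolates just that conversion cost.
print(timeit.timeit(lambda: np.asarray(a), number=1000))  # conversion only
print(timeit.timeit(lambda: np.min(a), number=1000))      # conversion + reduction
print(timeit.timeit(lambda: np.min(b), number=1000))      # reduction only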
As for np.min vs b.min, they are equivalent, though the former can be a bit slower on small arrays because of the function-fetch overhead of CPython and a different path taken inside Numpy. Numpy is optimized for large arrays. On small arrays you will see many overheads which are hard to explain without delving into the Numpy implementation (such overheads are clearly implementation-dependent).

On my machine (an i5-9600KF CPU with CPython 3.8.1 and Numpy 1.24.4), the former takes 4.5 µs and the latter 2.5 µs, so a very small time. This is the typical overhead of Numpy (i.e. 1-10 µs per call). With a 100 times bigger array, I get 80 µs versus 83 µs (±2 µs), so the two are statistically indistinguishable. It also shows that most of the time spent in your small-array benchmark is pure overhead (~60% to ~80%).
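You can reproduce this kind of check with timeit on a 100 times bigger array (a sketch; the absolute numbers are machine-dependent):

from timeit import timeit
import numpy as np

big = np.random.randint(1, 1_000_000, size=1_000_000)

# On a large array the reduction itself dominates, so the per-call
# overhead difference between np.min and ndarray.min disappears.
print(timeit(lambda: np.min(big), number=100))
print(timeit(lambda: big.min(), number=100))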
If you want to reduce these overheads, you should use tools like Cython or Numba, which can compile (specialized) Python code, or simply not use Python in the first place and write the hot loop in a native language like C/C++/Rust.
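For instance, a minimal Numba sketch (assuming Numba is installed; numba_min is just an illustrative name) moves the loop out of the interpreter:

import numpy as np
from numba import njit

@njit
def numba_min(arr):
    # Compiled loop over a typed array: no boxed objects, no interpreter,
    # no GIL in the hot path (arr must be non-empty).
    m = arr[0]
    for x in arr:
        if x < m:
            m = x
    return m

b = np.random.randint(1, 1_000_000, size=10_000)
print(numba_min(b))  # note: the first call includes JIT compilation time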