SciPy cdist Speed Difference

Question

Asked by slaw on 24 September 2020 at 02:17 (538 views)

I am curious why the following two cdist calls differ so much in run time even though they produce the same results:

import numpy as np
from scipy.spatial.distance import cdist

x = np.random.rand(10_000_000, 50)
y = np.random.rand(50)

result_1 = cdist(x, y[np.newaxis, :])

result_2 = cdist(x, y[np.newaxis, :], 'minkowski', p=2.)

Computing result_1 is significantly faster than computing result_2.

There is 1 answer

**Alex** · Answer 1 · 2020-09-24T11:49:17+00:00

The C implementation of the Euclidean distance (source lines 50-66) uses multiplication and a single sqrt() call, while the Minkowski distance (source lines 381-391) relies on calls to the much slower pow() function.
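
To make the difference concrete, here is a rough Python paraphrase of the two inner loops. It is only a sketch and not the SciPy C source: the function names euclidean_sketch and minkowski_sketch are made up for illustration, and the real code operates on C double arrays.

```python
import math


def euclidean_sketch(u, v):
    """Sum of squared differences via multiplication, then one sqrt()."""
    s = 0.0
    for a, b in zip(u, v):
        d = a - b
        s += d * d  # one multiply per element
    return math.sqrt(s)  # a single sqrt at the end


def minkowski_sketch(u, v, p):
    """Sum of |difference|**p via pow(), then one more pow() for the root."""
    s = 0.0
    for a, b in zip(u, v):
        s += math.pow(abs(a - b), p)  # one pow() call per element
    return math.pow(s, 1.0 / p)  # final pow() for the p-th root


u = [0.0, 3.0]
v = [4.0, 0.0]
d_euc = euclidean_sketch(u, v)
d_mink = minkowski_sketch(u, v, 2.0)
print(d_euc, d_mink)
```

Both functions return the same distance for p=2, but the Minkowski version makes one general-purpose pow() call per element, where the Euclidean version needs only a multiplication.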

For reference, see discussion here and here comparing pow to multiplication and sqrt.

So although the Python source makes it look as if the Euclidean norm simply calls the Minkowski norm (source line 614), cdist actually dispatches directly to the C implementation, where the two code paths differ. The Python euclidean function is never called during execution.
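
One way to check this on your own machine is to time both calls on a smaller array. The snippet below is a minimal sketch: the array size and repeat count are arbitrary choices, and the size of the gap (or whether there is one at all) depends on the installed SciPy version, since a later release could special-case p=2.

```python
import timeit

import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
x = rng.random((100_000, 50))  # smaller than in the question, to keep the run short
y = rng.random((1, 50))        # cdist expects 2-D inputs

d_euclidean = cdist(x, y)                      # default metric is 'euclidean'
d_minkowski = cdist(x, y, 'minkowski', p=2.0)  # same distance, different code path

# The two results should agree up to floating-point rounding.
same = np.allclose(d_euclidean, d_minkowski)

t_euclidean = timeit.timeit(lambda: cdist(x, y), number=5)
t_minkowski = timeit.timeit(lambda: cdist(x, y, 'minkowski', p=2.0), number=5)

print(f"results match: {same}")
print(f"euclidean: {t_euclidean:.3f} s, minkowski (p=2): {t_minkowski:.3f} s")
```

If the Minkowski call is noticeably slower in your environment, the practical fix is to pass 'euclidean' (or rely on the default) whenever p is 2.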
