My question is about vectorizing my code. I have one array that holds 3D coordinates and one array that holds the edges that connect those coordinates:
In [8]: coords
Out[8]:
array([[ 11.22727013,  24.72620964,   2.02986932],
       [ 11.23895836,  24.67577744,   2.04130101],
       [ 11.23624039,  24.63677788,   2.04096866],
       [ 11.22516632,  24.5986824 ,   2.04045677],
       [ 11.21166992,  24.56095695,   2.03898215],
       [ 11.20334721,  24.5227356 ,   2.03556442],
       [ 11.2064085 ,  24.48479462,   2.03098583],
       [ 11.22059727,  24.44837189,   2.02649784],
       [ 11.24213409,  24.41513252,   2.01979685]])
In [13]: edges
Out[13]:
array([[0, 1],
       [1, 2],
       [2, 3],
       [3, 4],
       [4, 5],
       [5, 6],
       [6, 7],
       [7, 8]], dtype=int32)
Now I would like to calculate the sum of the Euclidean distances between the coordinates joined by each edge in the edges array, e.g. the distance from coords[0] to coords[1] + the distance from coords[1] to coords[2] + ...
I have the following code, which does the job:
import numpy as np
from scipy.spatial import distance

def networkLength(coords, edges):
    # Collect the Euclidean length of every edge, then add them up
    distancesNetwork = np.array([])
    for i in range(edges.shape[0]):
        distancesNetwork = np.append(distancesNetwork, distance.euclidean(coords[edges[i, 0]], coords[edges[i, 1]]))
    return sum(distancesNetwork)
I was wondering whether it is possible to vectorize the code rather than using a loop. What is the Pythonic way to do it? Thanks a lot!
Approach #1
We could slice out the first and second columns of edges altogether and use them to index into coords, instead of iterating over each element. The Euclidean distance computation then consists of element-wise squaring of the differences, summing along each row, and taking the element-wise square root. Finally, we need to sum all those values into one scalar, as in the original code. Thus, one vectorized implementation would be -
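A minimal sketch of the steps just described, assuming coords is an (N, 3) float array and edges an (M, 2) integer array as in the question. The function name network_length_sqrt and the small test data are made up for illustration.

```python
import numpy as np

def network_length_sqrt(coords, edges):
    # Index coords with whole columns of edges: both results have shape (M, 3)
    diff = coords[edges[:, 0]] - coords[edges[:, 1]]
    # Square element-wise, sum along each row, take the square root per edge,
    # then add up all edge lengths into one scalar
    return np.sqrt((diff ** 2).sum(axis=1)).sum()

# Made-up example: a 3-4-5 step followed by a step of length 12
coords = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [3.0, 4.0, 12.0]])
edges = np.array([[0, 1], [1, 2]], dtype=np.int32)
print(network_length_sqrt(coords, edges))  # 17.0
```

The fancy indexing coords[edges[:, 0]] makes a copy of the selected rows, so memory use grows with the number of edges rather than the number of points.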
NumPy also has a built-in for those distance computations, np.linalg.norm. In terms of performance, I would expect it to be comparable to what we have just listed. For the sake of completeness, that version amounts to calling np.linalg.norm with axis=1 on the same row-wise differences and summing the result.
Approach #2
Tweaking the earlier approach, we could use np.einsum, which would perform both the squaring and the summing along each row in one step and as such would be a bit more efficient. The implementation would look something like this -
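A minimal sketch of the einsum idea, under the same assumptions about the shapes of coords and edges as in the question. The function name network_length_einsum and the test data are illustrative only.

```python
import numpy as np

def network_length_einsum(coords, edges):
    # Difference vector between the two endpoints of every edge, shape (M, 3)
    diff = coords[edges[:, 0]] - coords[edges[:, 1]]
    # 'ij,ij->i' multiplies diff with itself element-wise and sums over j,
    # i.e. it yields the squared length of every row in a single call
    sq_len = np.einsum('ij,ij->i', diff, diff)
    return np.sqrt(sq_len).sum()

# Made-up example: a 3-4-5 step followed by a step of length 12
coords = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [3.0, 4.0, 12.0]])
edges = np.array([[0, 1], [1, 2]], dtype=np.int32)
print(network_length_einsum(coords, edges))  # 17.0
```

The likely gain comes from not materializing the intermediate array of squared differences; how large it is will depend on the array sizes and the NumPy build.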
Runtime test
Function definitions -
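One possible set of definitions for such a comparison, as a sketch. All function names are hypothetical, and length_loop is a NumPy-only stand-in for the loop in the question (it computes each distance by hand instead of calling scipy.spatial.distance.euclidean, so the block has no SciPy dependency).

```python
import numpy as np

def edge_diffs(coords, edges):
    # Difference vector between the two endpoints of every edge, shape (M, 3)
    return coords[edges[:, 0]] - coords[edges[:, 1]]

def length_loop(coords, edges):
    # Baseline: handle one edge at a time in a Python loop
    total = 0.0
    for a, b in edges:
        total += np.sqrt(((coords[a] - coords[b]) ** 2).sum())
    return total

def length_sqrt_sum(coords, edges):
    # Approach #1: explicit square, row sum, square root
    return np.sqrt((edge_diffs(coords, edges) ** 2).sum(axis=1)).sum()

def length_norm(coords, edges):
    # Approach #1 variant: row-wise 2-norm via np.linalg.norm
    return np.linalg.norm(edge_diffs(coords, edges), axis=1).sum()

def length_einsum(coords, edges):
    # Approach #2: squared row lengths in one einsum call
    d = edge_diffs(coords, edges)
    return np.sqrt(np.einsum('ij,ij->i', d, d)).sum()
```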
Verification and Timings -
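A sketch of how the check and the timings could be reproduced. The random input, its size, and the repeat counts are arbitrary choices made here, and no timing figures are quoted because they depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

# Arbitrary test input: 10,000 random points joined as one chain,
# mirroring the (0,1), (1,2), ... edge pattern of the question
rng = np.random.default_rng(0)
coords = rng.random((10_000, 3))
idx = np.arange(len(coords))
edges = np.column_stack((idx[:-1], idx[1:]))

def sqrt_sum(c, e):
    d = c[e[:, 0]] - c[e[:, 1]]
    return np.sqrt((d ** 2).sum(axis=1)).sum()

def norm(c, e):
    return np.linalg.norm(c[e[:, 0]] - c[e[:, 1]], axis=1).sum()

def einsum(c, e):
    d = c[e[:, 0]] - c[e[:, 1]]
    return np.sqrt(np.einsum('ij,ij->i', d, d)).sum()

candidates = {"sqrt-sum": sqrt_sum, "linalg.norm": norm, "einsum": einsum}

# Verification: every variant must agree with the first one
results = {name: f(coords, edges) for name, f in candidates.items()}
reference = results["sqrt-sum"]
for name, value in results.items():
    assert np.isclose(value, reference), name

# Timings: best of 5 runs of 100 calls each, reported per call
for name, f in candidates.items():
    best = min(timeit.repeat(lambda: f(coords, edges), number=100, repeat=5))
    print(f"{name:12s} {best / 100 * 1e6:8.1f} us per call")
```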