I have the following problem in Python that I need to solve:

Given two coordinate matrices (NumPy ndarrays) A and B, find for each coordinate vector a in A the corresponding coordinate vector b in B such that the Euclidean distance ||a-b|| is minimized. The coordinate matrices A and B can have a different number of coordinate vectors (that is, a different number of rows).

The method should return a matrix of coordinate vectors C, where the i-th vector c in C is the vector from B that minimizes the Euclidean distance to the i-th coordinate vector a in A.
For example, let's say

A = np.array([[1,1], [3,4]])

and B = np.array([[1,2], [3,6], [8,1]]).

The Euclidean distances between the vector [1,1] in A and the vectors in B are 1, 5.385165, and 7, so the first vector in C would be [1,2]. Similarly, the distances between the vector [3,4] in A and the vectors in B are 2.828427, 2, and 5.830952, so the second (and last) vector in C would be [3,6].

So C = [[1,2], [3,6]].
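(These distances can be double-checked with plain NumPy broadcasting, for example:)

```python
import numpy as np

A = np.array([[1,1], [3,4]])
B = np.array([[1,2], [3,6], [8,1]])

# dists[i, j] = Euclidean distance between A[i] and B[j]
dists = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
print(dists)
# [[1.         5.38516481 7.        ]
#  [2.82842712 2.         5.83095189]]
```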
How can I code this efficiently in Python?
You could use cdist from scipy.spatial.distance to efficiently get the Euclidean distances, then use np.argmin to get the indices corresponding to the minimum values, and use those indices to select rows from B for the final output.