How to find the closest corresponding vectors between two coordinate matrices?

2.6k views Asked by At

I have the following problem in Python I need to solve:

Given two coordinate matrices (NumPy ndarrays) A and B, find for all coordinate vectors a in A the corresponding coordinate vectors b in B, such that the Euclidean distance ||a-b|| is minimized. The coordinate matrices A and B can have different number of coordinate vectors (that is, different number of rows).

This method should return a matrix of coordinate vectors C where the ith vector c in C is the vector from B that minimizes the Euclidean distance with the ith coordinate vector a in A.

For example, lets say

A = np.array([[1,1], [3,4]]) and B = np.array([[1,2], [3,6], [8,1]])

The Euclidean distances between the vector [1,1] in A and the vectors in B are:

1, 5.385165, 7

So the first vector in C would be [1,2]

Similarly the distances for the vector [3,4] in A and the vectors in B are:

2.828427, 2, 5.830952  

So the second and last vector in C would be [3,6]

So C = [[1,2], [3,6]]

How to code this efficiently in Python?

1

There are 1 answers

1
Divakar On BEST ANSWER

You could use cdist from scipy.spatial.distance to efficiently get the euclidean distances and then use np.argmin to get the indices corresponding to minimum values and use those to index into B for the final output. Here's the implementation -

import numpy as np
from scipy.spatial.distance import cdist

C = B[np.argmin(cdist(A,B),1)] 

Sample run -

In [99]: A
Out[99]: 
array([[1, 1],
       [3, 4]])

In [100]: B
Out[100]: 
array([[1, 2],
       [3, 6],
       [8, 1]])

In [101]: B[np.argmin(cdist(A,B),1)]
Out[101]: 
array([[1, 2],
       [3, 6]])