I have two sets of 2D points, A and B. I want to find the first nearest neighbor in A for each point in B.
However, I am dealing with uncertain points (i.e. a point has a mean (2D vector) and a 2*2 covariance matrix).
I would thus like to use the Mahalanobis distance, but in scikit-learn (for example) I cannot pass a different covariance matrix for each point, as it expects a single covariance matrix for the whole metric.
Currently, considering only the average locations (i.e. the means of my 2D normal distributions), I have:

from sklearn.neighbors import NearestNeighbors

nearest_neighbors = NearestNeighbors(n_neighbors=1, metric='l2').fit(A)
distance, indices = nearest_neighbors.kneighbors(B)
With my uncertain points, instead of using the L2 norm as a distance, I would rather compute, between a point a in A and a point b in B, their Mahalanobis distance:
d(a, b) = sqrt( transpose(mu_a-mu_b) * C * (mu_a-mu_b))
where C = inv(cov_a + cov_b)
where mu_a (resp mu_b) and cov_a (resp. cov_b) are the 2D mean and 2*2 covariance matrix of uncertain point a (resp. b).
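To make the formula concrete, here is a minimal sketch of that pairwise distance, assuming mu_a and mu_b are NumPy arrays of shape (2,) and cov_a and cov_b are of shape (2, 2) (the function name is my own):

import numpy as np

def uncertain_mahalanobis(mu_a, cov_a, mu_b, cov_b):
    # d(a, b) = sqrt( transpose(mu_a - mu_b) * inv(cov_a + cov_b) * (mu_a - mu_b) )
    diff = mu_a - mu_b
    C = np.linalg.inv(cov_a + cov_b)
    return np.sqrt(diff @ C @ diff)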
I ended up using a custom distance:
Thus a point has 4 features:
- x and y coordinates
- x and y variances (the covariance matrix is diagonal in my case)
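A minimal sketch of how such a custom distance could be wired into NearestNeighbors via a callable metric, assuming each point is encoded as [x, y, var_x, var_y]; the function name, the toy A/B arrays and the choice of algorithm='brute' are mine (brute force because this distance is not guaranteed to satisfy the metric assumptions of the tree-based algorithms):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def mahalanobis_uncertain(u, v):
    # u and v are [x, y, var_x, var_y]; with diagonal covariances,
    # inv(cov_a + cov_b) reduces to 1 / (var_a + var_b) per axis.
    diff = u[:2] - v[:2]
    return np.sqrt(np.sum(diff ** 2 / (u[2:] + v[2:])))

# Columns: x, y, var_x, var_y
A = np.array([[0.0, 0.0, 0.1, 0.2],
              [5.0, 5.0, 0.3, 0.3]])
B = np.array([[4.5, 5.2, 0.2, 0.1]])

nearest_neighbors = NearestNeighbors(n_neighbors=1, algorithm='brute',
                                     metric=mahalanobis_uncertain).fit(A)
distance, indices = nearest_neighbors.kneighbors(B)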