I'm trying to calculate the distance matrix between histograms. I can only find code for calculating the distance between two histograms, and my data has more than 10. My data is a CSV file in which each column is a histogram whose values add up to 100. The file has about 65,000 entries; I ran the code on only 20% of the data, but it still does not work.
I've tried scipy.spatial.distance_matrix, but it ignores the fact that the data are histograms and treats them as ordinary numerical data. I've also tried the Wasserstein distance, but got the error "object too deep for desired array":
from scipy.stats import wasserstein_distance
distance = wasserstein_distance(df3, df3)
I expected the result to look something like this:
0 1 2 3 4 5 6
0 0.000000 259.730341 331.083554 320.302997 309.577373 249.868085
1 259.730341 0.000000 208.368304 190.441382 262.030304 186.033572
2 331.083554 208.368304 0.000000 112.255111 256.269253 227.510879
3 320.302997 190.441382 112.255111 0.000000 246.350482 205.346804
4 309.577373 262.030304 256.269253 246.350482 0.000000 239.642379
but I got an error instead:
ValueError: object too deep for desired array
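For context, wasserstein_distance compares two 1-D distributions at a time, so passing a whole DataFrame is the likely cause of the error. Below is a minimal sketch of the pairwise computation that would give a matrix like the one above. It rests on assumptions not stated in the question: every column is a histogram over the same bins, the bin positions are simply 0, 1, 2, ..., and the random `df` is a hypothetical stand-in for the CSV data (the helper `hist_distance` is also just an illustrative name).

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform
from scipy.stats import wasserstein_distance

# Hypothetical stand-in for the CSV: 6 histograms (columns) over 50 bins (rows),
# each column rescaled so that it sums to 100.
rng = np.random.default_rng(0)
counts = rng.random((50, 6))
df = pd.DataFrame(100 * counts / counts.sum(axis=0))

# Assumption: bins are equally spaced, so their positions are 0, 1, 2, ...
bins = np.arange(len(df))

def hist_distance(u_weights, v_weights):
    """Wasserstein distance between two histograms defined on the same bins."""
    # The bin positions are the values; the histogram heights are the weights.
    return wasserstein_distance(bins, bins, u_weights, v_weights)

# pdist expects one observation per row, so transpose to get one histogram per row.
condensed = pdist(df.T.to_numpy(), metric=hist_distance)

# Expand the condensed vector into a square, symmetric matrix with a zero diagonal.
dist_matrix = pd.DataFrame(squareform(condensed),
                           index=df.columns, columns=df.columns)
print(dist_matrix)
```

With a Python callable as the metric, pdist evaluates every pair in a Python loop, so this may be slow when there are many histograms; the number of bins matters less than the number of columns.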