Let's say I have 10 datasets with 30 elements each. We can simulate this as:
A = rand(30, 10);
so each dataset is in one column. Now, I want to find a set of n
datasets that are correlated (or uncorrelated, whatever...).
For n=2
I can simply use R = corr(A)
and find that, e.g., columns 1 and 3 show the highest correlation with each other. But what if I want to find a set of three or four datasets that are correlated (or uncorrelated) with each other? Is there a function for that, or do I have to loop over it somehow?
Thanks!
You can treat this as a random simulation problem. Pick three (or four) datasets at random and compute their cross-correlation score, which I define as the sum of the pairwise correlation scores. Repeat this many times and keep the set with the largest score.
Although it is not a deterministic process, it doesn't take long to converge, and with only 10 datasets it will very likely give you the global optimum.
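Here is a minimal sketch of that idea, not a definitive implementation. The trial count `nTrials` and the variable names are my own assumptions. I use `corrcoef` instead of `corr` only because it is available without the Statistics Toolbox; for a matrix input it gives the same Pearson correlation matrix. Since there are only nchoosek(10, 3) = 120 possible triples, the sketch also enumerates all of them so the random result can be checked against the exact answer.

```matlab
rng(0);                        % fix the seed so the run is repeatable
A = rand(30, 10);              % 10 datasets, one per column
n = 3;                         % size of the subset to look for
R = corrcoef(A);               % 10-by-10 pairwise correlation matrix

% Random search: draw subsets and keep the best score seen so far.
nTrials   = 2000;              % assumed number of trials
bestScore = -Inf;
bestSet   = [];
for t = 1:nTrials
    idx = sort(randperm(size(A, 2), n));     % n distinct column indices
    S   = R(idx, idx);
    % S is symmetric, so each pair appears twice off the diagonal.
    score = (sum(S(:)) - trace(S)) / 2;      % sum of pairwise correlations
    if score > bestScore
        bestScore = score;
        bestSet   = idx;
    end
end

% Exhaustive check over every subset of size n.
C = nchoosek(1:size(A, 2), n);
scores = zeros(size(C, 1), 1);
for k = 1:size(C, 1)
    S = R(C(k, :), C(k, :));
    scores(k) = (sum(S(:)) - trace(S)) / 2;
end
[exactScore, kBest] = max(scores);
exactSet = C(kBest, :);
```

To look for the least correlated set instead, one option is to score each subset with `abs(S)` and take the minimum rather than the maximum. For much larger problems the number of subsets grows quickly, and that is where the random search becomes the more practical of the two.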