I have a data set of 60 sensors making 1684 measurements. I wish to decrease the number of sensors used during the experiment and use the remaining sensors' data to predict (via machine learning) the readings of the removed sensors.
I have had a look at the data (see image) and found several strong correlations between the sensors, which should make it possible to remove X sensors and predict their behaviour from the remaining ones.
How can I “score” which set of X sensors can best be predicted from the remaining 60−X?
Are you familiar with Principal Component Analysis (PCA)? Like ANOVA it works with variance, but its goal is different: it finds a small number of components that capture most of the variance across many correlated variables. Dimensionality reduction is the general term for this kind of process.
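As a quick first check of how redundant your sensors are, you could look at the cumulative explained variance of a PCA fit. A minimal sketch with scikit-learn, assuming your readings are in an array `X` of shape (1684, 60) with one column per sensor (the random array here is just a placeholder):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.rand(1684, 60)               # placeholder for your sensor matrix

X_std = StandardScaler().fit_transform(X)  # put all sensors on the same scale
pca = PCA().fit(X_std)

# If only a handful of components explain most of the variance,
# the 60 sensors are highly redundant and good candidates for pruning.
cumvar = np.cumsum(pca.explained_variance_ratio_)
print(np.argmax(cumvar >= 0.95) + 1, "components explain 95% of the variance")
```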
These methods are usually aimed at a set of inputs predicting a single output, rather than a set of peer measurements. To adapt them to your case, I would start by treating each of the 60 sensors, in turn, as the "ground truth" target, and seeing which ones can be most reliably predicted from the remainder. Remove the most predictable sensor, then repeat the process on the reduced set until prediction quality falls below your desired threshold.
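A rough sketch of that loop, assuming the readings live in a pandas DataFrame `df` with one column per sensor, and using ridge regression with 5-fold cross-validated R² as the "predictability" score (any regressor and metric would do; the function names are just illustrative):

```python
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def most_predictable(df):
    """Return (sensor, score) for the sensor best predicted by the others."""
    scores = {}
    for target in df.columns:
        X = df.drop(columns=target)
        y = df[target]
        scores[target] = cross_val_score(Ridge(), X, y, cv=5, scoring="r2").mean()
    best = max(scores, key=scores.get)
    return best, scores[best]

def winnow(df, r2_threshold=0.9):
    """Iteratively drop sensors as long as they stay predictable above the threshold."""
    removed = []
    remaining = df.copy()
    while remaining.shape[1] > 1:
        sensor, score = most_predictable(remaining)
        if score < r2_threshold:
            break
        removed.append((sensor, score))
        remaining = remaining.drop(columns=sensor)
    return removed, list(remaining.columns)
```

The greedy loop is O(rounds × sensors) model fits, which is cheap at 60 sensors and 1684 rows; the list of `(sensor, score)` pairs it returns is exactly the "score" you asked about.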
You could also treat this winnowing as a search problem, e.g. a genetic algorithm over sensor subsets; random forests, with their built-in feature importances and cross-validated error, could serve as the scoring model in that phase.
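For that phase, the scoring step might look like the sketch below: the same cross-validated R² as above, but with a random forest, whose `feature_importances_` also tell you which of the kept sensors carry the predictive signal (again assuming the `df` DataFrame from the previous sketch):

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def rf_predictability(df, target):
    """Cross-validated R^2 for predicting `target` from all other sensors."""
    X = df.drop(columns=target)
    y = df[target]
    model = RandomForestRegressor(n_estimators=200, n_jobs=-1, random_state=0)
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

def rf_importances(df, target):
    """Which remaining sensors drive the prediction of `target`?"""
    X = df.drop(columns=target)
    model = RandomForestRegressor(n_estimators=200, n_jobs=-1, random_state=0)
    model.fit(X, df[target])
    return dict(zip(X.columns, model.feature_importances_))
```

A subset-search wrapper (genetic or otherwise) would then just maximise the average of `rf_predictability` over the sensors it proposes to remove.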