A recent paper published a pancreatic cancer classifier, and I want to apply it to my own expression set. The only information the authors provide is a data frame of centroids (rows: genes, columns: subtypes; https://doi.org/10.1053/j.gastro.2018.08.033, supplementary table 2). So far I haven't figured out how to reproduce this classification model for prediction.
All the packages I found calculate the centroids from expression data and labels, and then output a model to predict a new set. Unfortunately, the labels are not published with this article, so recalculating the centroids is not possible.
Question: How can I use the centroids to classify another expression set?
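You can use k-Nearest Neighbors with only the centroids. Use the centroids as the training data, one labelled point per class, and set k = 1, so that each sample is assigned the class of its nearest centroid. (With k equal to the number of centroids, every class would get exactly one vote and every prediction would be a tie.) Since you do not provide any data, I will give an example using the iris data. The specific centroids don't matter here, but they must be in a data frame with the same format as the data that you wish to classify, that is, one row per class and one column per feature. Your table is genes x subtypes, so transpose it first. You can call the classes whatever you want. I just called them A, B and C.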
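As a rough sketch of the idea, here is nearest-centroid assignment written out in plain Python (my choice of language for illustration, standard library only). The centroids are the per-species feature means of the iris data, relabelled A, B and C. The function name `classify` and the use of Euclidean distance are assumptions made for this example; the paper may define its classifier with a different distance (for instance a correlation-based one), so check its methods section before applying this to the published centroids.

```python
import math

# Centroids: one entry per class, one value per feature, in the order
# sepal length, sepal width, petal length, petal width.
# These are the per-species means of the iris data, relabelled A, B, C.
centroids = {
    "A": [5.006, 3.428, 1.462, 0.246],
    "B": [5.936, 2.770, 4.260, 1.326],
    "C": [6.588, 2.974, 5.552, 2.026],
}


def classify(sample, centroids):
    """Return the label of the centroid closest to `sample`.

    Euclidean distance is an assumption of this sketch, not
    necessarily the distance used in the paper.
    """
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))


# Three iris flowers, with features in the same order as the centroids.
samples = [
    [5.1, 3.5, 1.4, 0.2],
    [6.4, 3.2, 4.5, 1.5],
    [6.3, 3.3, 6.0, 2.5],
]

predictions = [classify(s, centroids) for s in samples]
print(predictions)  # ['A', 'B', 'C']
```

For an expression set, the same function should work once each sample is a vector of expression values restricted to the genes in the centroid table, in the same gene order and on a comparable scale.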