I have 4 topics, each represented by 10 keywords. I want to classify every document in my dataset into one of these 4 topics using the keywords extracted for each topic.
topic0 = ["gene","rna","expression","mouse","assay","activity","concentration","target","ace","lung"]
topic1 = ["age","pneumonia","hospital","risk","outcome","incidence","diagnosis","strain","lung","child"]
topic2 = ["intervention","wuhan","city","contact","people","scenario","peak","confirmed_case","quarantine","daily"]
topic3 = ["sequence","genome","host","structure","gene","specie","rna","read","strain","mutation"]
These are the keywords for each topic, and I have 1200 documents in my dataset. How do I classify them now?
Maybe some sort of similarity algorithm can be used for this, but I'm not sure which one or how to apply it. Please help, I'm confused!
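To make the idea concrete, here is a rough sketch of what I mean by a similarity approach: score each document against each topic's keyword list with cosine similarity over raw word counts, then pick the highest-scoring topic. The function names (`tokenize`, `cosine_score`, `classify`) are just names I made up for illustration, and the sketch assumes each document is a plain string that has been lemmatized the same way as the keywords (e.g. "species" -> "specie"). I don't know whether this is the right method, it is only a starting point.

```python
import math
import re
from collections import Counter

topics = {
    "topic0": ["gene", "rna", "expression", "mouse", "assay", "activity", "concentration", "target", "ace", "lung"],
    "topic1": ["age", "pneumonia", "hospital", "risk", "outcome", "incidence", "diagnosis", "strain", "lung", "child"],
    "topic2": ["intervention", "wuhan", "city", "contact", "people", "scenario", "peak", "confirmed_case", "quarantine", "daily"],
    "topic3": ["sequence", "genome", "host", "structure", "gene", "specie", "rna", "read", "strain", "mutation"],
}

def tokenize(text):
    # Lowercase and keep letters plus underscores, so "confirmed_case" stays one token.
    return re.findall(r"[a-z_]+", text.lower())

def cosine_score(tokens, keywords):
    # Cosine similarity between the document's word-count vector
    # and a binary vector that is 1 for each topic keyword.
    counts = Counter(tokens)
    keyword_set = set(keywords)
    dot = sum(counts[k] for k in keyword_set)
    doc_norm = math.sqrt(sum(c * c for c in counts.values()))
    kw_norm = math.sqrt(len(keyword_set))
    if doc_norm == 0 or kw_norm == 0:
        return 0.0
    return dot / (doc_norm * kw_norm)

def classify(text):
    # Returns (best_topic, scores); best_topic is None when no keyword matches at all.
    tokens = tokenize(text)
    scores = {name: cosine_score(tokens, kws) for name, kws in topics.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return None, scores
    return best, scores

label, scores = classify("The quarantine intervention in wuhan reduced daily contact")
print(label, scores)
```

One thing I already notice is that some keywords ("gene", "rna", "lung", "strain") appear in two topics, so documents that mostly use those words would get close or tied scores. I am not sure how to handle that.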