Visualizing Author Topic Similarities: t-SNE and Cluster Labeling


I am working with a dataframe of abstracts from various NLP conferences. Each row also holds the names of the abstract's authors and the keywords they associated with it, e.g.

  • abstract
  • author1, author2
  • kw1, kw2, kw3
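
To make the layout concrete, here is a minimal sketch of how rows of this shape could be turned into an author-by-keyword count matrix with pandas. The column names (`abstract`, `authors`, `keywords`), the author names, and the keyword values are made up for illustration, and it assumes authors and keywords are stored as Python lists rather than comma-separated strings.

```python
import pandas as pd

# Toy stand-in for the real dataframe; all names and values are hypothetical.
df = pd.DataFrame({
    "abstract": ["text of abstract 1", "text of abstract 2", "text of abstract 3"],
    "authors": [["author1", "author2"], ["author1"], ["author3"]],
    "keywords": [["parsing", "syntax"], ["parsing", "treebank"], ["mt", "bleu"]],
})

# One row per (author, keyword) pair: explode authors first, then keywords.
pairs = (
    df[["authors", "keywords"]]
    .explode("authors")
    .reset_index(drop=True)
    .explode("keywords")
)

# Rows are authors, columns are keywords, cells count how often an author used a keyword.
author_kw = pd.crosstab(pairs["authors"], pairs["keywords"])
print(author_kw)
```

Each row of `author_kw` is then a keyword profile for one author, which is the kind of input a clustering algorithm or t-SNE expects.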

My objective is to cluster authors who frequently write about similar topics, as indicated by shared keywords. For the visualisation I am thinking of using t-SNE. However, I am unsure how to assign cluster labels without manual intervention. Which algorithms would be suitable for this task*?

*For example, would K-means be a viable option, given that the number of clusters has to be provided in advance? Or should I opt for methods such as DBSCAN or Affinity Propagation? Should I treat the keywords themselves as clusters, at the risk of an explosion of clusters because of the large number of keywords?
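
A minimal sketch of one of the options mentioned above, assuming scikit-learn is available: DBSCAN groups the authors without a preset number of clusters, each cluster is named after its two most frequent keywords, and t-SNE supplies 2-D coordinates for plotting. The author names, keyword counts, `eps=0.3`, `min_samples=2`, and `perplexity=2` are illustrative assumptions chosen for the toy data, not recommended settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.manifold import TSNE
from sklearn.preprocessing import normalize

# Hypothetical author-by-keyword counts (rows: authors, columns: keywords).
authors = ["author1", "author2", "author3", "author4", "author5", "author6"]
keywords = np.array(["alignment", "bleu", "mt", "parsing", "syntax", "treebank"])
counts = np.array([
    [0, 0, 0, 3, 1, 1],
    [0, 0, 0, 2, 2, 0],
    [0, 0, 0, 2, 1, 1],
    [1, 1, 3, 0, 0, 0],
    [0, 2, 2, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
], dtype=float)

# L2-normalise each author's profile so prolific authors do not dominate.
X = normalize(counts)

# DBSCAN with cosine distance; label -1 marks authors treated as noise.
labels = DBSCAN(eps=0.3, min_samples=2, metric="cosine").fit_predict(X)

# Name each cluster after the two keywords its members use most often.
cluster_names = {}
for c in sorted(set(labels.tolist())):
    if c == -1:
        continue
    totals = counts[labels == c].sum(axis=0)
    top = np.argsort(-totals)[:2]
    cluster_names[c] = " / ".join(keywords[top])

# 2-D coordinates for plotting; perplexity must be smaller than the number of authors.
coords = TSNE(n_components=2, perplexity=2, init="random",
              metric="cosine", random_state=0).fit_transform(X)

for name, label, (x, y) in zip(authors, labels, coords):
    print(name, cluster_names.get(int(label), "noise"), round(float(x), 2), round(float(y), 2))
```

On this toy data the two groups come out named "parsing / syntax" and "mt / bleu". Naming clusters by their top keywords keeps the keywords as labels without turning every keyword into its own cluster. On real data `eps` would need tuning, and the t-SNE layout is only a visual aid, since the clustering is done in the original keyword space.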

