I am trying to plot the Network of the Word Distributions Over Topics (Topic Relation). using this code [source]:
post <- topicmodels::posterior(ldaOut)
cor_mat <- cor(t(post[["terms"]]))
cor_mat[ cor_mat < .05 ] <- 0
diag(cor_mat) <- 0
graph <- graph.adjacency(cor_mat, weighted=TRUE, mode="lower")
graph <- delete.edges(graph, E(graph)[ weight < 0.05])
E(graph)$edge.width <- E(graph)$weight*20
V(graph)$label <- paste("Topic", V(graph))
V(graph)$size <- colSums(post[["topics"]]) * 15
par(mar=c(0, 0, 3, 0))
set.seed(110)
plot.igraph(graph, edge.width = E(graph)$edge.width,
edge.color = "orange", vertex.color = "orange",
vertex.frame.color = NA, vertex.label.color = "grey30")
title("Strength Between Topics Based On Word Probabilities", cex.main=.8)
Sample of cor_mat
data:
1 2 3 4 5 6 7 ...
1 0.00000000 0.00000000 0.00000000 0.09612831 0.00000000 0.17248020 0.00000000
2 0.00000000 0.00000000 0.07206496 0.00000000 0.00000000 0.05755187 0.00000000
3 0.00000000 0.07206496 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
4 0.09612831 0.00000000 0.00000000 0.00000000 0.08459681 0.00000000 0.06895900
5 0.00000000 0.00000000 0.00000000 0.08459681 0.00000000 0.00000000 0.00000000
6 0.17248020 0.05755187 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
7 0.00000000 0.00000000 0.00000000 0.06895900 0.00000000 0.00000000 0.00000000
8 0.00000000 0.00000000 0.00000000 0.00000000 0.54849308 0.00000000 0.00000000
9 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.09745720 0.00000000
10 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
11 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
12 0.00000000 0.00000000 0.00000000 0.10329825 0.00000000 0.14057310 0.00000000
13 0.14664201 0.00000000 0.00000000 0.00000000 0.05803984 0.00000000 0.00000000
14 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
15 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
16 0.00000000 0.00000000 0.10290656 0.00000000 0.00000000 0.00000000 0.06293238
17 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
18 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
19 0.00000000 0.00000000 0.00000000 0.00000000 0.33483481 0.00000000 0.00000000
20 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
21 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
22 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.27720724 0.00000000
23 0.12487435 0.14806837 0.00000000 0.10355990 0.00000000 0.05086977 0.00000000
24 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.06622769 0.00000000
Unfortunately, the plot looks like this:
Any idea on how to make the network of topics more elegant, showing the links between topics rather than making them overlap each other?
The simple solution is to change the numbers w
eight*20
andcolSums(post[["topics"]])*15
to smaller numbers in order to avoid overlap issue. The code could be like thisAnd the result of that,