Adjacent topics graphs

73 views Asked by At

I am trying to plot the Network of the Word Distributions Over Topics (Topic Relation). using this code [source]:

post <- topicmodels::posterior(ldaOut)

cor_mat <- cor(t(post[["terms"]]))
cor_mat[ cor_mat < .05 ] <- 0
diag(cor_mat) <- 0

graph <- graph.adjacency(cor_mat, weighted=TRUE, mode="lower")
graph <- delete.edges(graph, E(graph)[ weight < 0.05])

E(graph)$edge.width <- E(graph)$weight*20
V(graph)$label <- paste("Topic", V(graph))
V(graph)$size <- colSums(post[["topics"]]) * 15

par(mar=c(0, 0, 3, 0))
set.seed(110)
plot.igraph(graph, edge.width = E(graph)$edge.width, 
    edge.color = "orange", vertex.color = "orange", 
    vertex.frame.color = NA, vertex.label.color = "grey30")
title("Strength Between Topics Based On Word Probabilities", cex.main=.8)

Sample of cor_mat data:

          1          2          3          4          5          6          7       ...
1  0.00000000 0.00000000 0.00000000 0.09612831 0.00000000 0.17248020 0.00000000
2  0.00000000 0.00000000 0.07206496 0.00000000 0.00000000 0.05755187 0.00000000
3  0.00000000 0.07206496 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
4  0.09612831 0.00000000 0.00000000 0.00000000 0.08459681 0.00000000 0.06895900
5  0.00000000 0.00000000 0.00000000 0.08459681 0.00000000 0.00000000 0.00000000
6  0.17248020 0.05755187 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
7  0.00000000 0.00000000 0.00000000 0.06895900 0.00000000 0.00000000 0.00000000
8  0.00000000 0.00000000 0.00000000 0.00000000 0.54849308 0.00000000 0.00000000
9  0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.09745720 0.00000000
10 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
11 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
12 0.00000000 0.00000000 0.00000000 0.10329825 0.00000000 0.14057310 0.00000000
13 0.14664201 0.00000000 0.00000000 0.00000000 0.05803984 0.00000000 0.00000000
14 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
15 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
16 0.00000000 0.00000000 0.10290656 0.00000000 0.00000000 0.00000000 0.06293238
17 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
18 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
19 0.00000000 0.00000000 0.00000000 0.00000000 0.33483481 0.00000000 0.00000000
20 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
21 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
22 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.27720724 0.00000000
23 0.12487435 0.14806837 0.00000000 0.10355990 0.00000000 0.05086977 0.00000000
24 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.06622769 0.00000000

Unfortunately, the plot looks like this: enter image description here

Any idea on how to make the network of topics more elegant, showing the links between topics rather than making them overlap each other?

1

There are 1 answers

0
Sultan On

The simple solution is to change the numbers weight*20 and colSums(post[["topics"]])*15 to smaller numbers in order to avoid overlap issue. The code could be like this

...    
E(graph)$edge.width <- E(graph)$weight* 5
V(graph)$label <- paste("Topic", V(graph))
V(graph)$size <- colSums(post[["topics"]]) * 2
...

And the result of that, enter image description here