Why does cluster_infomap in igraph R give different communities each time?

I am using the cluster_infomap function from igraph in R to detect communities in an undirected, unweighted network with ~19,000 edges, but I get a different number of communities each time I run the function. This is the code I am using:

   clusters <- list()
   clusters[["im"]] <- cluster_infomap(graph)
   membership_local_method <- membership(clusters[["im"]])
   length(unique(membership_local_method))

The result of the last line of code ranges from 805 to 837 in the tests I have performed. I tried using set.seed() in case it was an issue of random number generation, but this did not solve the problem.

My questions are (1) why do I get different communities each time, and (2) is there a way to make it stable?

Thanks!

Best answer, by lukeA:

cluster_infomap (see ?igraph::cluster_infomap for help) finds a "community structure that minimizes the expected description length of a random walker trajectory".

The algorithm relies on random number generation, so each run can return a different partition. Most of the time, you can make the result reproducible by setting a seed with set.seed (see ?Random for help) immediately before each call:

identical(cluster_infomap(g), cluster_infomap(g))
# [1] FALSE
identical({set.seed(1);cluster_infomap(g)},{set.seed(1);cluster_infomap(g)})
# [1] TRUE
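
The same pattern can be wrapped in a small helper so the seed is reset right before every call. This is only a minimal sketch: infomap_seeded is a hypothetical name (not an igraph function), and the random graph from sample_gnp is a stand-in for your own graph. Note that calling set.seed once at the top of a script does not guarantee that two successive calls agree, because the first call advances the random number generator's state.

```r
library(igraph)

# Hypothetical helper (not part of igraph): reset the RNG seed
# immediately before running Infomap so the call is reproducible.
infomap_seeded <- function(graph, seed = 1, ...) {
  set.seed(seed)
  cluster_infomap(graph, ...)
}

# Stand-in graph for illustration only; replace with your own graph.
set.seed(42)
g_demo <- sample_gnp(200, 0.03)

a <- infomap_seeded(g_demo, seed = 1)
b <- infomap_seeded(g_demo, seed = 1)

# Same seed before each call, so the memberships should match.
same_membership <- identical(membership(a), membership(b))
same_membership
```

Keeping the seed as an argument also makes it easy to rerun the clustering with several seeds and see how much the number of communities varies.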

or graphically:

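library(igraph)
set.seed(2)
g <- sample_pa(150)         # preferential attachment graph (ba.game is the older name)
coords <- layout_nicely(g)  # one fixed layout for all plots (layout.auto is the older name)
par(mfrow=c(2,2))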

# without seed: different results
for (x in 1:2) {
  plot(
    cluster_infomap(g), 
    as.undirected(g), 
    layout=coords, 
    vertex.label = NA, 
    vertex.size = 5
  )
}

# with seed: equal results
for (x in 1:2) {
  set.seed(1)
  plot(
    cluster_infomap(g), 
    as.undirected(g), 
    layout=coords, 
    vertex.label = NA, 
    vertex.size = 5
  )
}
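
Fixing the seed makes a run repeatable, but it does not show how much the partitions differ between unseeded runs. As a rough sketch of how one could check this, the code below compares two runs with compare() and then repeats the comparison with a larger nb.trials, the cluster_infomap argument that sets how many attempts are made to partition the network. The value 50 is an arbitrary choice for illustration, and whether more trials reduce the run-to-run variation on a given graph is an assumption to verify, not a guarantee.

```r
library(igraph)

set.seed(2)
g <- sample_pa(150)

# Two runs with the default number of trials (nb.trials = 10).
c1 <- cluster_infomap(g)
c2 <- cluster_infomap(g)

# Normalized mutual information between the two partitions:
# 1 means they are identical, lower values mean less agreement.
nmi_default <- compare(c1, c2, method = "nmi")
nmi_default

# Same comparison with more trials per call (50 is arbitrary).
c3 <- cluster_infomap(g, nb.trials = 50)
c4 <- cluster_infomap(g, nb.trials = 50)
nmi_more_trials <- compare(c3, c4, method = "nmi")
nmi_more_trials
```

If the similarity stays well below 1 even with many trials, the graph may simply have several partitions of nearly equal quality, and fixing the seed is then the practical way to get a repeatable result.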