When trying to apply Axelrod Model for Dissemination of Culture with the R igraph package, I get performance issues on the time of executions.
After generating an LxL lattice, I assign a vector of F random numbers ranging from 1 to q to every node.
Then I iterate by:
- choosing a random node s
- choosing a random neighbor n of node s
- calculate how many numbers they have in common (equals) as a percentage
- with probability equal to the previous percentage, I take one of n's feature that's different from s and assign that to s
Repeat that until convergence (i.e. every node has neighbours either completely equal or completely similar).
Here the code (F, q, R are all given non-critical values present in other cells):
calc_edges_weights <- function(g){
edge_weights <- replicate(length(E(g)), 1, simplify=FALSE)
for(i in 1:length(E(g))){
edge <- E(g)[i]
##Extract the two nodes
first_node <- ends(g, edge)[1]
second_node <- ends(g, edge)[2]
compatibility <- chance_interaction(V(g)$cultures[[first_node]], V(g)$cultures[[second_node]])
## Assign the values for representation
edge_weights[i] <- compatibility
}
return(edge_weights)
}
chance_interaction <- function(cul1, cul2){
common <- sum(cul1==cul2)
total <- length(cul1)
perc_eq <- common/total
return(perc_eq)
}
epoch <- 0
N <- L**2
g <- make_lattice(length = L, dim = 2)
V(g)$cultures <- replicate(N, sample(1:q, F, replace=TRUE), simplify=FALSE)
# plot_graph(g, labels=FALSE)
edge_weights <- calc_edges_weights(g)
while(TRUE){
epoch <- epoch + 1
##Choose one vertex at random
chosen <- sample(1:N, 1, replace=TRUE)
##Take neighbours of that vertex, pick one at random
neighbours <- neighborhood(g, chosen, order=R)[[1]]
chosen_neighbour <- neighbours[sample(1:length(neighbours), 1, replace=TRUE)]
##Calculate chance of interaction
chance_change <- chance_interaction(V(g)$cultures[[chosen]], V(g)$cultures[[chosen_neighbour]])
##If success in chance, update one of the remaining traits
if((runif(1, min=0, max=chance_change) < chance_change) && chance_change!=1){
loop_flag = TRUE
while(loop_flag){
feat_index <- sample(1:F, 1, replace=TRUE)
if(V(g)$cultures[[chosen]][feat_index] != V(g)$cultures[[chosen_neighbour]][feat_index]){
new_attr <- V(g)$cultures[[chosen]]
new_attr[feat_index] <- V(g)$cultures[[chosen_neighbour]][feat_index]
V(g)$cultures[[chosen]] <- new_attr
loop_flag <- FALSE
}
}
}
##Stop conditions
if(all(edge_weights %in% c(0, 1))){ break }
if(epoch%%(200) == 0){
edge_weights <- calc_edges_weights(g)
n_alpha <- sum(edge_weights != 0 & edge_weights != 1)/length(edge_weights)
cat(epoch, ",", n_alpha, "\n")
}
}
Any cat command is for showing purposes when testing.
You can use these as example values:
L <- 10
q <- 40
F <- 10
R <- 1
The problem here is that convergence is waaay slow and time of convergence grows exponentially with Lattice size. For example, with L = 10 I get a convergence after 30 seconds, whereas with L = 22 I get it after 2 hours (!!).
I tried changing the operations, finding other functions in the library but I can't seem to improve the efficiency.
Time analysis confirms an exponential increase with L size (as expected by the theory) but it should not be this slow.
I tried reducing the amount of times the edge_weights are calculated (since they are only useful to STOP the evolution) but to no avail.
Is there a better way to approach this issue?
An approach that can simulate the
L = 22case in a matter of seconds:By keeping track of the probability of the next trait transfer occurring across each edge, we can avoid looping through epochs that result in no transfers. This will increase performance, and we can still keep track of the number of epochs that pass using a negative binomial random variate (sampled using
rnbinom) to simulate the number of epochs resulting in no transfers between the epochs where transfers occur.The function below uses a
data.tableapproach. Each row of thedata.tablerepresents a directed edge. The column labeledwis the probability that the edge is selected during the next epoch and the edge is selected to transfer a trait. However, given the sampling method, an edge can be selected to transfer a trait when there are no traits eligible for transfer (because the receiving vertex already has all the traits present in the originating vertex), so we also keep track of the number of traits eligible for transfer across each edge (the column labeledn).The function:
Testing with timing. The
L = 10, q = 40, F = 10, R = 1case takes a couple seconds to simulate nearly 18k epochs.Changing
Rto2:The
L = 22case takes less than 20 seconds to simulate more than 100k epochs.Stress-test with
L = 100. It simulates more than 2M epochs in about 30 minutes.