Modeling valued directed ERGM using transitiveweights in R


I am currently using the ergm.count package in R to model a valued and directed network. The adjacency matrix can be downloaded through this link. This network consists of 84 nodes and 75 weighted ties.

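library(ergm.count)

# Read the adjacency matrix, dropping the first column
mat <- as.matrix(read.csv("./mat.csv", header = TRUE)[, -1])

# Build a directed network, storing the edge weights in the "cnts" attribute
net <- as.network(mat, directed = TRUE, matrix.type = "a",
                  ignore.eval = FALSE, names.eval = "cnts")

# Fit a Poisson-reference valued ERGM with a single transitiveweights term
m1 <- ergm(net ~ transitiveweights("min", "max", "min"),
           reference = ~Poisson,
           response = "cnts",
           control = control.ergm(parallel = 6, parallel.type = "PSOCK",
                                  main.method = "MCMLE"))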

However, when I run the above code, the fit appears to hang indefinitely and the CPU load stays very high.

[Screenshot: CPU load]

I have no idea what happened or what is wrong with my code. Is there any way to speed up or optimize the fitting process? My R version is 4.3.2, my ergm version is 4.5.0, and here is the fitting output so far:

Starting contrastive divergence estimation via CD-MCMLE:
Iteration 1 of at most 60:
Convergence test P-value:1.5e-04
The log-likelihood improved by 0.07426.
Iteration 2 of at most 60:
Convergence test P-value:6.2e-03
The log-likelihood improved by 0.03396.
Iteration 3 of at most 60:
Convergence test P-value:5.9e-01
Convergence detected. Stopping.
The log-likelihood improved by 0.001075.
Finished CD.
Starting Monte Carlo maximum likelihood estimation (MCMLE):
Iteration 1 of at most 60:

Edit:

After three hours of waiting with verbose logging enabled, the process stopped and showed this:

> source("/root/transfer_analysis/test.R", encoding = "UTF-8")
Evaluating network in model.
Initializing unconstrained Metropolis-Hastings proposal: ‘ergm.count:MH_DiscTNT’.
Initializing model...
Model initialized.
Using initial method 'CD'.
Fitting initial model.
Starting contrastive divergence estimation via CD-MCMLE:
Iteration 1 of at most 60:
Convergence test P-value:1.5e-06
The log-likelihood improved by 0.1578.
Iteration 2 of at most 60:
Convergence test P-value:2.6e-02
The log-likelihood improved by 0.02119.
Iteration 3 of at most 60:
Convergence test P-value:4.6e-01
The log-likelihood improved by 0.002026.
Iteration 4 of at most 60:
Convergence test P-value:1e-01
The log-likelihood improved by 0.01116.
Iteration 5 of at most 60:
Convergence test P-value:1.4e-01
The log-likelihood improved by 0.007586.
Iteration 6 of at most 60:
Convergence test P-value:1e+00
Convergence detected. Stopping.
The log-likelihood improved by < 0.0001.
Finished CD.
Starting Monte Carlo maximum likelihood estimation (MCMLE):
Density guard set to 10000 from an initial count of 75 edges.

Iteration 1 of at most 60 with parameter:
transitiveweights.min.max.min 
                    -1.483437 
Starting unconstrained MCMC...
Back from unconstrained MCMC.
Error in ergm.MCMLE(init, s, s.obs, control = control, verbose = verbose,  : 
  Unconstrained MCMC sampling did not mix at all. Optimization cannot continue.
Additional warning message:
In ergm_MCMC_sample(s, control, theta = mcmc.init, verbose = max(verbose -  :
  Unable to reach target effective size in iterations alotted.

I've looked at another question about 'Unable to reach target effective size in iterations alotted', but my model and formula seem quite simple by comparison.

Any help would be greatly appreciated!

There is 1 answer
Answer by Michał (best answer)

I suspect the estimation gets stuck because of the model specification. I advise you to explore your data before fitting complex models such as one with transitiveweights. I don't know what system you are trying to model, so what follows is necessarily substance-agnostic:

  1. Look at the shape of the distribution of edge values. Is the Poisson reference measure an appropriate specification?
  2. Try fitting simpler models, starting with a homogeneous one, i.e. with a constant only (e.g. sum). Do those fit the data? If not, in what way do they fail?
  3. Perhaps the model with transitiveweights should include sum as well, since sum plays the role of an intercept for the overall level of edge values.
  4. Do you have any nodal covariates that could be used to model the network alongside the triadic effect you already have?
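
The first three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the matrix is a synthetic stand-in with the same size as in the question (84 nodes, 75 weighted ties), since I don't have the real data, and the use of nonzero as a zero-inflation term is my own suggestion for a network this sparse, not something the data have confirmed. Replace the stand-in block with your own mat.

```r
library(ergm.count)  # attaches ergm and network as well

# --- Stand-in data (assumption): replace with your real adjacency matrix ---
set.seed(1)
n <- 84
mat <- matrix(0, n, n)
off_diag <- which(row(mat) != col(mat))   # all dyads except self-loops
ties <- sample(off_diag, 75)              # 75 distinct directed ties
mat[ties] <- rpois(75, lambda = 2) + 1    # strictly positive counts

net <- as.network(mat, directed = TRUE, matrix.type = "adjacency",
                  ignore.eval = FALSE, names.eval = "cnts")

# 1. Distribution of edge values over all dyads, zeros included.
#    A variance far above the mean, or a large spike at zero, suggests
#    that a plain Poisson reference may describe the data poorly.
vals <- mat[row(mat) != col(mat)]
print(table(vals))
print(c(mean = mean(vals), variance = var(vals)))

# 2. Homogeneous baseline: a single "sum" term (the valued intercept).
m0 <- ergm(net ~ sum, response = "cnts", reference = ~Poisson)
print(summary(m0))

#    Optional next step for a sparse network: add "nonzero" to absorb
#    the excess of zero-valued dyads.
m1 <- ergm(net ~ sum + nonzero, response = "cnts", reference = ~Poisson)
print(summary(m1))

# 3. Inspect the observed statistics before adding the triadic term;
#    a transitiveweights statistic at or near zero would be hard to estimate.
print(summary(net ~ sum + nonzero + transitiveweights("min", "max", "min"),
              response = "cnts"))
```

If the baseline models fit without trouble and the observed transitiveweights statistic is not degenerate, the triadic term can then be added to the sum + nonzero formula rather than fitted on its own.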