sna R package efficiency function inconsistency


I have three different matrices, and their Krackhardt efficiency values seem wrong to me. The first two matrices are topologically equivalent, but their efficiencies differ. Does anyone have an explanation for the inconsistency?

For the first matrix, efficiency is 1:

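library(sna)  # also attaches the network package
# col_ama and col_gri (used in the plots below) are colour values defined elsewhere in my script
A <- matrix(c(0,1,0,0,0,0,0,0,0,0,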
          0,0,1,0,0,0,0,0,0,0,
          0,0,0,1,0,0,0,0,0,0,
          0,0,0,0,1,0,0,0,0,0,
          0,0,0,0,0,1,0,0,0,1,
          0,0,0,0,0,0,1,0,0,0,
          0,0,0,0,0,0,0,1,0,0,
          0,0,0,0,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,1,0),ncol=10)
A_net <- network(A,directed=TRUE)
g_eff <- efficiency(A_net)
plot.network(A_net, vertex.col = "white", vertex.border = col_ama, 
         usearrows=FALSE, edge.col=col_gri, vertex.lwd = 3.5,
         vertex.cex = 3.5)
title(paste("Efficiency =",round(g_eff,3)))

Equivalent matrix with different efficiency:

A <- matrix(c(0,0,0,0,1,0,0,0,0,0,
          1,0,1,0,0,0,0,0,0,0,  
          0,0,0,0,0,0,1,0,0,0,
          0,0,1,0,0,0,0,0,0,0,
          0,0,0,0,0,1,0,0,0,0,
          0,0,0,0,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,1,0,
          0,0,0,1,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,0,1,
          0,0,0,0,0,0,0,0,0,0),ncol=10)
A_net <- network(A,directed=FALSE)
g_eff <- efficiency(A_net)
plot.network(A_net, vertex.col = "white", vertex.border = col_ama, 
         usearrows=FALSE, edge.col=col_gri, vertex.lwd = 3.5,
         vertex.cex = 3.5)
title(paste("Efficiency =",round(g_eff,3)))

The third matrix has two components, each with the minimum number of edges (n_i-1), but its efficiency is not one. It doesn't match the formula in the help either:

(1 - [ |E(G)| - Sum(N_i-1,i=1,..,n) ]/[ Sum((N_i-1)^2,i=1,..,n) ]   = 1-[8-(2+4)]/[4+16] = .9)

Third matrix:

A <- matrix(c(0,0,0,0,1,0,0,0,0,0,
          1,0,0,0,0,0,0,0,0,0,  
          0,0,0,0,0,0,1,0,0,0,
          0,0,1,0,0,0,0,0,0,0,
          0,0,0,0,0,1,0,0,0,0,
          0,0,0,0,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,1,0,
          0,0,0,1,0,0,0,0,0,0,
          0,0,0,0,0,0,0,0,0,1,
          0,0,0,0,0,0,0,0,0,0),ncol=10)
A_net <- network(A,directed=FALSE)
g_eff <- efficiency(A_net)
g_eff
plot.network(A_net, vertex.col = "white", vertex.border = col_ama, 
         usearrows=FALSE, edge.col=col_gri, vertex.lwd = 3.5,
         vertex.cex = 3.5)
title(paste("Efficiency =",round(g_eff,3)))

There is 1 answer

Carter On

It appears that your example is in error: in the first case you are working with a directed network (directed=TRUE), and in the second with an undirected one (directed=FALSE). Check your network coercion statements; BTW, you can just pass the matrices to efficiency() directly. Here is a demonstration that the first two are in fact equivalent:

> A <- matrix(c(0,1,0,0,0,0,0,0,0,0,
+           0,0,1,0,0,0,0,0,0,0,
+           0,0,0,1,0,0,0,0,0,0,
+           0,0,0,0,1,0,0,0,0,0,
+           0,0,0,0,0,1,0,0,0,1,
+           0,0,0,0,0,0,1,0,0,0,
+           0,0,0,0,0,0,0,1,0,0,
+           0,0,0,0,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,1,0),ncol=10)
> A
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0    0    0    0    0    0     0
 [2,]    1    0    0    0    0    0    0    0    0     0
 [3,]    0    1    0    0    0    0    0    0    0     0
 [4,]    0    0    1    0    0    0    0    0    0     0
 [5,]    0    0    0    1    0    0    0    0    0     0
 [6,]    0    0    0    0    1    0    0    0    0     0
 [7,]    0    0    0    0    0    1    0    0    0     0
 [8,]    0    0    0    0    0    0    1    0    0     0
 [9,]    0    0    0    0    0    0    0    0    0     1
[10,]    0    0    0    0    1    0    0    0    0     0
> B<-matrix(c(0,0,0,0,1,0,0,0,0,0,
+           1,0,1,0,0,0,0,0,0,0, 
+           0,0,0,0,0,0,1,0,0,0,
+           0,0,1,0,0,0,0,0,0,0,
+           0,0,0,0,0,1,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,1,0,
+           0,0,0,1,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,1,
+           0,0,0,0,0,0,0,0,0,0),ncol=10)
> B
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    0    0    0    0    0     0
 [2,]    0    0    0    0    0    0    0    0    0     0
 [3,]    0    1    0    1    0    0    0    0    0     0
 [4,]    0    0    0    0    0    0    0    1    0     0
 [5,]    1    0    0    0    0    0    0    0    0     0
 [6,]    0    0    0    0    1    0    0    0    0     0
 [7,]    0    0    1    0    0    0    0    0    0     0
 [8,]    0    0    0    0    0    0    0    0    0     0
 [9,]    0    0    0    0    0    0    1    0    0     0
[10,]    0    0    0    0    0    0    0    0    1     0
> efficiency(A)
[1] 1
> efficiency(B)
[1] 1
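
As a rough cross-check that needs only base R, one can count ties before and after symmetrizing. The `sym` helper below is a hypothetical stand-in, written on the assumption that an undirected coercion ends up treating every tie as mutual; it is not a function from sna or network:

```r
A <- matrix(c(0,1,0,0,0,0,0,0,0,0,
              0,0,1,0,0,0,0,0,0,0,
              0,0,0,1,0,0,0,0,0,0,
              0,0,0,0,1,0,0,0,0,0,
              0,0,0,0,0,1,0,0,0,1,
              0,0,0,0,0,0,1,0,0,0,
              0,0,0,0,0,0,0,1,0,0,
              0,0,0,0,0,0,0,0,0,0,
              0,0,0,0,0,0,0,0,0,0,
              0,0,0,0,0,0,0,0,1,0), ncol = 10)
B <- matrix(c(0,0,0,0,1,0,0,0,0,0,
              1,0,1,0,0,0,0,0,0,0,
              0,0,0,0,0,0,1,0,0,0,
              0,0,1,0,0,0,0,0,0,0,
              0,0,0,0,0,1,0,0,0,0,
              0,0,0,0,0,0,0,0,0,0,
              0,0,0,0,0,0,0,0,1,0,
              0,0,0,1,0,0,0,0,0,0,
              0,0,0,0,0,0,0,0,0,1,
              0,0,0,0,0,0,0,0,0,0), ncol = 10)

# Hypothetical helper: make every tie mutual (i -> j also implies j -> i).
sym <- function(m) ((m + t(m)) > 0) * 1

# Number of directed ties in each matrix, as given and after symmetrizing.
edge_counts <- rbind(
  A = c(directed = sum(A), symmetrized = sum(sym(A))),
  B = c(directed = sum(B), symmetrized = sum(sym(B)))
)
edge_counts
#   directed symmetrized
# A        9          18
# B        9          18
```

Both digraphs have 9 ties on 10 vertices, which is the minimum for a single weak component, so neither has excess ties. Symmetrizing doubles the count to 18.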

If you symmetrize - which is what you are doing in your example to one of the networks by setting "directed=FALSE" - then you get an efficiency of 0.889. Note that this matches what we should get:

1 - (18 - 9)/(choose(10,2)*2-9) = 0.889

(Remember that Krackhardt efficiency treats all networks as digraphs, so mutual edges count as two edges. Also, I note that you seem to have miscopied the formula from the man page, which may be part of your confusion.)
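
The arithmetic above can be sketched in base R as a small function of the tie count and the weak-component sizes. `eff_from_counts` is a hypothetical name used for illustration only; it follows the formula as written in this answer and is not sna's implementation:

```r
# Hypothetical helper: efficiency from a directed tie count and the sizes
# of the weak components, following the formula used in this answer.
eff_from_counts <- function(n_edges, sizes) {
  excess     <- n_edges - sum(sizes - 1)                  # ties beyond the minimum
  max_excess <- sum(sizes * (sizes - 1) - (sizes - 1))    # largest possible excess
  1 - excess / max_excess
}

eff_directed    <- eff_from_counts(9, 10)    # 9 ties, one component of 10
eff_symmetrized <- eff_from_counts(18, 10)   # each tie counted in both directions

eff_directed
# [1] 1
round(eff_symmetrized, 3)
# [1] 0.889
```

For a single component of size 10, `sizes * (sizes - 1) - (sizes - 1)` is 90 - 9 = 81, the same denominator as `choose(10,2)*2 - 9`.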

Your third matrix is again efficiency 1, since it has no excess edges:

> C<-matrix(c(0,0,0,0,1,0,0,0,0,0,
+           1,0,0,0,0,0,0,0,0,0, 
+           0,0,0,0,0,0,1,0,0,0,
+           0,0,1,0,0,0,0,0,0,0,
+           0,0,0,0,0,1,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,1,0,
+           0,0,0,1,0,0,0,0,0,0,
+           0,0,0,0,0,0,0,0,0,1,
+           0,0,0,0,0,0,0,0,0,0),ncol=10)
> C
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    0    0    0    0    0     0
 [2,]    0    0    0    0    0    0    0    0    0     0
 [3,]    0    0    0    1    0    0    0    0    0     0
 [4,]    0    0    0    0    0    0    0    1    0     0
 [5,]    1    0    0    0    0    0    0    0    0     0
 [6,]    0    0    0    0    1    0    0    0    0     0
 [7,]    0    0    1    0    0    0    0    0    0     0
 [8,]    0    0    0    0    0    0    0    0    0     0
 [9,]    0    0    0    0    0    0    1    0    0     0
[10,]    0    0    0    0    0    0    0    0    1     0
> efficiency(C)
[1] 1

Your problem stems from symmetrizing (coercing C into a network object using directed=FALSE). This adds extra edges, leading to

1- (16 - 3 - 5) / (choose(4,2)*2 - 3 + choose(6,2)*2 - 5) = 0.7647059

which is equivalent to what sna gives you:

> efficiency(symmetrize(C))
[1] 0.7647059
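
For anyone who wants to reproduce the per-component arithmetic without sna, here is a minimal base-R sketch. The helpers `weak_components` and `krack_eff` are hypothetical names written for this illustration; they implement the formula as used above and assume every weak component has at least two vertices:

```r
C <- matrix(c(0,0,0,0,1,0,0,0,0,0,
              1,0,0,0,0,0,0,0,0,0,
              0,0,0,0,0,0,1,0,0,0,
              0,0,1,0,0,0,0,0,0,0,
              0,0,0,0,0,1,0,0,0,0,
              0,0,0,0,0,0,0,0,0,0,
              0,0,0,0,0,0,0,0,1,0,
              0,0,0,1,0,0,0,0,0,0,
              0,0,0,0,0,0,0,0,0,1,
              0,0,0,0,0,0,0,0,0,0), ncol = 10)

# Label weak components by breadth-first search, ignoring tie direction.
weak_components <- function(adj) {
  und  <- (adj + t(adj)) > 0
  memb <- integer(nrow(und))            # 0 means "not visited yet"
  k <- 0
  for (s in seq_len(nrow(und))) {
    if (memb[s] != 0) next
    k <- k + 1
    memb[s] <- k
    frontier <- s
    while (length(frontier) > 0) {
      # unvisited vertices adjacent to anything in the current frontier
      nb <- which(colSums(und[frontier, , drop = FALSE]) > 0 & memb == 0)
      memb[nb] <- k
      frontier <- nb
    }
  }
  memb
}

# Efficiency: 1 - (excess ties) / (maximum possible excess ties),
# summed over weak components, counting every tie as directed.
krack_eff <- function(adj) {
  sizes      <- tabulate(weak_components(adj))
  excess     <- sum(adj != 0) - sum(sizes - 1)
  max_excess <- sum(sizes * (sizes - 1) - (sizes - 1))
  1 - excess / max_excess
}

C_sym <- ((C + t(C)) > 0) * 1           # every tie made mutual

tabulate(weak_components(C))
# [1] 4 6
krack_eff(C)
# [1] 1
krack_eff(C_sym)
# [1] 0.7647059
```

The component sizes 4 and 6 give the 3 and 5 minimum ties in the expression above, and the symmetrized matrix has 16 ties, hence 1 - 8/34.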

Hope that clears things up!