How to get R igraph values for a weighted PageRank to match Gephi

I am trying to get the PageRank values from igraph in R to match those I get from Gephi. I have followed this example: https://www.briggsby.com/personalized-pagerank and my igraph values match the weighted values in that example. However, Gephi produces different values for the weighted PageRank, and I'm unsure why. When I run an unweighted PageRank, igraph and Gephi give the same results.

The network I'm importing is deliberately simple so that I can check the math:

Source  Target  Weight
A       B       1.0
B       C       1.0
C       B       1.0
C       A       0.5
A       C       1.0
C       D       0.1
D       A       0.5

The code I'm using is as follows:

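library(igraph)
library(plyr)
set.seed(123)
# TestPageRank is the imported edge list (columns Source, Target, Weight)
mydf <- data.frame(from = TestPageRank$Source, to = TestPageRank$Target)
mygraph <- graph_from_data_frame(mydf, directed = TRUE)
# Edges are created in the row order of mydf, so the weights line up with them
pr_df <- data.frame(users = V(mygraph)$name,
                    page_rank = page_rank(mygraph, directed = TRUE, damping = 0.85,
                                          weights = TestPageRank$Weight)$vector,
                    degree = degree(mygraph))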

The PageRank values I get are as follows:

Node  igraph Weighted PageRank  Gephi Weighted PageRank
A     0.1960                    0.2373
B     0.3373                    0.2761
C     0.4075                    0.3732
D     0.0591                    0.1133

In this example, the ranking is at least the same, but when I apply this to my larger networks with thousands of nodes, the node ranking by PageRank is very different. Any thoughts on why this might be? Or how I can modify my R code to match the Gephi PageRank values?

Here's the updated code with import:

df <- structure(list(Source = c("A", "B", "C", "C", "A", "C", "D"), 
                     Target = c("B", "C", "B", "A", "C", "D", "A"), 
                     Weight = c(1,1, 1, 0.5, 1, 0.1, 0.5)), 
                class = "data.frame", row.names = c(NA, -7L))

g <- graph_from_data_frame(df)
page_rank(g, weights = E(g)$Weight, directed = TRUE, damping = 0.85)$vector
degree(g)

And the output from the above:

         A          B          C          D 
0.19602465 0.33730560 0.40752024 0.05914951 

There is 1 answer

Szabolcs

I am not able to reproduce your results with igraph. Please provide a minimal reproducible example, with copyable code. You will find guidance here.

Here is your datafile as copyable CSV:

Source,Target,Weight
A,B,1.
B,C,1.
C,B,1.
C,A,0.5
A,C,1.
C,D,0.1
D,B,0.5

We get this after using read.csv:

df <- structure(list(Source = c("A", "B", "C", "C", "A", "C", "D"), 
    Target = c("B", "C", "B", "A", "C", "D", "B"), Weight = c(1, 
    1, 1, 0.5, 1, 0.1, 0.5)), class = "data.frame", row.names = c(NA, 
-7L))
g <- graph_from_data_frame(df)
page_rank(g, weights = E(g)$Weight)
$vector
         A          B          C          D 
0.14857410 0.37354978 0.41816130 0.05971482 

Using the ARPACK method, which is an entirely distinct algorithm, we get the same result:

> page_rank(g, weights = E(g)$Weight, algo = 'arpack')
$vector
         A          B          C          D 
0.14857410 0.37354978 0.41816130 0.05971482 
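
As a further cross-check that does not go through igraph at all, the same vector can be approximated with a plain power iteration in base R. This is only a sketch under the usual weighted PageRank definition (each edge weight divided by the total out-weight of its source node, uniform teleportation, damping 0.85). The variable names are made up for illustration, and the code assumes every node has at least one outgoing edge, which holds for this small graph.

```r
# Edge list copied from the CSV above
edges <- data.frame(
  Source = c("A", "B", "C", "C", "A", "C", "D"),
  Target = c("B", "C", "B", "A", "C", "D", "B"),
  Weight = c(1, 1, 1, 0.5, 1, 0.1, 0.5),
  stringsAsFactors = FALSE
)

nodes <- sort(unique(c(edges$Source, edges$Target)))
n <- length(nodes)

# Weighted adjacency matrix: W[i, j] is the total weight of edges i -> j
W <- matrix(0, n, n, dimnames = list(nodes, nodes))
for (k in seq_len(nrow(edges))) {
  src <- edges$Source[k]
  dst <- edges$Target[k]
  W[src, dst] <- W[src, dst] + edges$Weight[k]
}

# Row-normalise so that each row holds the transition probabilities of one node
# (assumes no node has zero out-weight, otherwise this divides by zero)
P <- W / rowSums(W)

damping <- 0.85
pr <- rep(1 / n, n)  # start from the uniform distribution

# Repeatedly apply the PageRank update; 200 steps is far more than needed here
for (iter in seq_len(200)) {
  pr <- (1 - damping) / n + damping * as.vector(pr %*% P)
}
names(pr) <- nodes

print(round(pr, 6))
```

With the edge list above, the result should agree with the two igraph vectors to the printed precision, which is what one would expect if both igraph methods implement this definition.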

These numbers differ from what you quote, but I cannot tell why without a reproducible example.
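
One thing that can be checked directly from the two listings is whether the edge lists are actually identical. The following is a minimal sketch in base R; the data frame names `question_edges` and `answer_edges` are made up for illustration, and the rows are copied from the table in the question and from the CSV above, respectively.

```r
# Edge list as given in the question
question_edges <- data.frame(
  Source = c("A", "B", "C", "C", "A", "C", "D"),
  Target = c("B", "C", "B", "A", "C", "D", "A"),
  Weight = c(1, 1, 1, 0.5, 1, 0.1, 0.5),
  stringsAsFactors = FALSE
)

# Edge list as given in the CSV of this answer
answer_edges <- data.frame(
  Source = c("A", "B", "C", "C", "A", "C", "D"),
  Target = c("B", "C", "B", "A", "C", "D", "B"),
  Weight = c(1, 1, 1, 0.5, 1, 0.1, 0.5),
  stringsAsFactors = FALSE
)

# Flag every row in which source, target, or weight disagree
differs <- question_edges$Source != answer_edges$Source |
  question_edges$Target != answer_edges$Target |
  question_edges$Weight != answer_edges$Weight

# Show the mismatching rows side by side
mismatch <- cbind(question_edges[differs, ],
                  answer_Target = answer_edges$Target[differs])
print(mismatch)
```

With the two listings as quoted, only the last row differs: the question has the edge D to A, while the CSV has D to B. A single redirected edge is enough to change every PageRank value, so it is worth confirming which of the two matches the file that was actually imported into igraph and Gephi.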

I should note that I worked on igraph's PageRank code and I believe that it is exceedingly unlikely that it would give incorrect results.