Sankey Diagram for transitions


I am trying to replicate the code, and the issue, from the Stack Overflow question "Sankey diagram in R".

Adding some sample data

head(links) #Data.frame

Source     Target  Weight
Fb         Google  20
Fb         Fb      2
BBC        Google  21
Microsoft  BBC     16

head(nodes) 
Fb
BBC
Google
Microsoft 

Code for building a sankey transition flow

sankeyNetwork(Links = links, 
              Nodes = nodes, 
              Source = "Source",
              Target = "Target", 
              Value = "value", 
              fontSize = 12, 
              nodeWidth = 30)

The Stack Overflow post mentioned above says that the Source and Target values should be zero-indexed. However, when I try the same syntax, I get NAs in my Source and Target columns. What could be causing this?


There are 2 answers

CJ Yetman On BEST ANSWER

You can convert your Source and Target variables in your links data frame to the index of the nodes in your nodes data frame like so...

links <- read.table(header = TRUE, text = "
Source   Target  Weight
Fb        Google  20
Fb         Fb      2
BBC        Google 21
Microsoft  BBC    16
")

nodes <- read.table(header = TRUE, text = "
name
Fb
BBC
Google
Microsoft
")

# set the Source and Target values to the index of the node (zero-indexed) in
# the nodes data frame
links$Source <- match(links$Source, nodes$name) - 1
links$Target <- match(links$Target, nodes$name) - 1

print(links)
print(nodes)

# pass the name of the links column that holds the flow values to the Value
# parameter (here "Weight", not "value")
library(networkD3)
sankeyNetwork(Links = links, Nodes = nodes, Source = "Source", 
              Target = "Target", Value = "Weight",
              fontSize = 12, nodeWidth = 30)
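
match() returns NA for any label it cannot find in nodes$name, for example because of a typo, stray whitespace, or different capitalization. As a minimal sketch (not part of the original answer, and assuming the same column names as above), one way to rule that out is to build the nodes data frame from the links themselves and stop if any index is still NA:

```r
links <- data.frame(
  Source = c("Fb", "Fb", "BBC", "Microsoft"),
  Target = c("Google", "Fb", "Google", "BBC"),
  Weight = c(20, 2, 21, 16),
  stringsAsFactors = FALSE
)

# every label that appears as a source or a target becomes exactly one node
nodes <- data.frame(name = unique(c(links$Source, links$Target)),
                    stringsAsFactors = FALSE)

# zero-based position of each label within nodes$name
links$Source <- match(links$Source, nodes$name) - 1
links$Target <- match(links$Target, nodes$name) - 1

# match() yields NA for unmatched labels, so fail early if any remain
stopifnot(!anyNA(links$Source), !anyNA(links$Target))
```

Because the nodes are derived from the links, every label is guaranteed to match, and the resulting links and nodes can be passed to sankeyNetwork() exactly as in the call above.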
lawyeR On

This code produces the Sankey plot. See the comments for an explanation of what changed from your code. Another wonderful resource is "several methods with R to create Sankey (river) plots".

library(networkD3)  

# change to a numeric index starting at 0: Fb = 0, BBC = 1, MS = 2, Google = 3
links <- data.frame(Source = c(0, 0, 1, 2),
                    Target = c(3, 0, 3, 1),
                    Weight = c(20, 2, 21, 16))

# a nodes dataframe (or dataframe element of a list, as in the help) is needed;
# the row order must match the indices used in links
nodes <- data.frame(name = c("Fb", "BBC", "MS", "Google"))

sankeyNetwork(Links = links, 
              Nodes = nodes, 
              Source = "Source",
              Target = "Target", 
              Value = "Weight",   # changed from "value"
              fontSize = 12, 
              nodeWidth = 30)   
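
Hand-assigned indices can easily drift out of step with the nodes data frame. A quick sanity check, sketched here as an illustration rather than as part of the original answer, is to translate the indices back into labels and compare the result with the sample data in the question. The node order Fb, BBC, MS, Google is assumed, since that is the order these indices correspond to in the sample data; decoded is just an illustrative name.

```r
links <- data.frame(Source = c(0, 0, 1, 2),
                    Target = c(3, 0, 3, 1),
                    Weight = c(20, 2, 21, 16))
nodes <- data.frame(name = c("Fb", "BBC", "MS", "Google"),
                    stringsAsFactors = FALSE)

# the indices are zero-based but R vectors are one-based, hence the + 1
decoded <- data.frame(Source = nodes$name[links$Source + 1],
                      Target = nodes$name[links$Target + 1],
                      Weight = links$Weight,
                      stringsAsFactors = FALSE)

# each row should read like a row of the original links table
print(decoded)
```

If a decoded row does not match the original table, either the indices or the row order of nodes needs adjusting.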
