I have two datasets that show the flow of people between districts across three time periods, where the middle period (t2) is the only link between the two datasets. I would like to draw an alluvial diagram in R using ggplot2 and ggalluvial. The code below works, but the plot is ugly because the flows don't attach properly to the nodes (for lack of a better word). Ideally, a flow would leave a node at the same height at which it enters it. How can I reshape the data to make this happen? In the current example, "D" is twice the size it needs to be.
library(dplyr)
library(ggalluvial)
library(tidyr)
data_1 <- tibble(t1 = c("A", "B", "B", "C"),
                 t2 = c("D", "E", "F", "G"),
                 value = c(99, 50, 50, 100))
data_2 <- tibble(t2 = c("D", "E", "F", "G"),
                 t3 = c("H", "H", "I", "J"),
                 value = c(100, 100, 80, 100))
data_1_long <- data_1 %>%
  mutate(flow = 1:n()) %>%
  pivot_longer(-c(flow, value), names_to = "time", values_to = "district")
data_2_long <- data_2 %>%
  mutate(flow = (max(data_1_long$flow) + 1):(max(data_1_long$flow) + nrow(data_2))) %>%
  pivot_longer(-c(flow, value), names_to = "time", values_to = "district")
data_long <- bind_rows(data_1_long,
                       data_2_long)
plot_alluvial <- ggplot(data = data_long,
                        aes(x = time,
                            stratum = district,
                            alluvium = flow,
                            y = value,
                            label = district)) +
  geom_flow(stat = "alluvium", lode.guidance = "backfront",
            color = "darkgray") +
  geom_stratum() +
  geom_text(stat = "stratum") +
  theme(legend.position = "bottom")
plot_alluvial

I think the problem is that your data isn't really suited to an alluvial plot, which shows how a fixed set of entities moves between categories over time. Here, the number entering a category differs from the number leaving it. In other words, the total flowing into each t2 district does not match the total flowing out. For example, "E" has 50 going in, but 100 going out.
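As a quick check of that mismatch, the sketch below sums the inflow and outflow for each t2 district using the question's own data. The names `inflow`, `outflow`, and `balance` are just illustrative.

```r
library(dplyr)

# Data copied from the question
data_1 <- tibble(t1 = c("A", "B", "B", "C"),
                 t2 = c("D", "E", "F", "G"),
                 value = c(99, 50, 50, 100))
data_2 <- tibble(t2 = c("D", "E", "F", "G"),
                 t3 = c("H", "H", "I", "J"),
                 value = c(100, 100, 80, 100))

# Total arriving in each t2 district (from t1)
inflow <- data_1 %>%
  group_by(t2) %>%
  summarise(inflow = sum(value))

# Total leaving each t2 district (towards t3)
outflow <- data_2 %>%
  group_by(t2) %>%
  summarise(outflow = sum(value))

# Side-by-side comparison; a non-zero difference means the node is unbalanced
balance <- full_join(inflow, outflow, by = "t2") %>%
  mutate(difference = outflow - inflow)
balance
```

With this data only "G" balances; "D", "E", and "F" all send out more than they receive, so the two sides of each t2 stratum cannot line up.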
In cases like this, you may be better off with a weighted directed graph:
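A minimal sketch of such a graph, assuming the igraph package (my choice for illustration; other graph packages would work too). Each row of the two datasets becomes a directed edge whose width reflects `value`. The `vertices` table and its hand-picked `x`/`y` coordinates are assumptions made only to place the three time periods in columns.

```r
library(igraph)

# One edge per flow, stacking both datasets from the question
edges <- rbind(
  data.frame(from = c("A", "B", "B", "C"),
             to = c("D", "E", "F", "G"),
             value = c(99, 50, 50, 100)),
  data.frame(from = c("D", "E", "F", "G"),
             to = c("H", "H", "I", "J"),
             value = c(100, 100, 80, 100))
)

# Hand-made positions: x = time period, y = vertical slot within the period
vertices <- data.frame(
  name = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J"),
  x = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  y = c(3, 2, 1, 4, 3, 2, 1, 3, 2, 1)
)

# Extra columns of `edges` (here `value`) become edge attributes
g <- graph_from_data_frame(edges, directed = TRUE, vertices = vertices)

# Edge width and label both show the size of the flow
plot(g,
     layout = as.matrix(vertices[, c("x", "y")]),
     edge.width = E(g)$value / 10,
     edge.label = E(g)$value,
     edge.arrow.size = 0.5,
     vertex.size = 20)
```

Because each edge carries its own weight, a district's inflow and outflow no longer have to match for the plot to be readable.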