I have some population information from multiple cohorts in a list. Each cohort covers an overlapping time period. The data looks like the following:
> raw.data
$`1`
Year Pop
1 1920 1927433
2 1921 1914551
3 1922 1900776
$`2`
Year Pop
1 1921 1915576
2 1922 1902075
3 1923 1887613
$`3`
Year Pop
1 1922 1902111
2 1923 1887862
3 1924 1872695
I would like to convert it into a single data frame where the column names are the years and the diagonals are population data. The output should look like the following:
> resulting.data
1920 1921 1922 1923 1924
1 1927433 1915576 1902111 NA NA
2 NA 1914551 1902075 1887862 NA
3 NA NA 1900776 1887613 1872695
You can find example data for the input and desired output below:
raw.data <- structure(list(`1` = structure(list(Year = 1920:1922, Pop = c(1927433L, 1914551L, 1900776L)), .Names = c("Year", "Pop"), row.names = c(NA, 3L), class = "data.frame"), `2` = structure(list(Year = 1921:1923, Pop = c(1915576L, 1902075L, 1887613L)), .Names = c("Year", "Pop"), row.names = c(NA, 3L), class = "data.frame"), `3` = structure(list(Year = 1922:1924, Pop = c(1902111L, 1887862L, 1872695L)), .Names = c("Year", "Pop"), row.names = c(NA, 3L), class = "data.frame")), .Names = c("1", "2", "3"))
resulting.data <- structure(list(X1920 = c(1927433, NA, NA), X1921 = c(1915576, 1914551, NA), X1922 = c(1902111, 1902075, 1900776), X1923 = c(NA, 1887862, 1887613), X1924 = c(NA, NA, 1872695)), .Names = c("X1920", "X1921", "X1922", "X1923", "X1924"), row.names = c(NA, -3L), class = "data.frame")
I have looked at this link which offers a similar question but I was unsuccessful in adapting it to my needs. I have also attempted to use plyr to try to obtain the diagonals first and then combine them but then I'm not sure how to do the combining.
Use
do.call()
withrbind()
to convert the data to a single data frame, thenreshape2::dcast()
for the reshaping: