R dataset from long to wide - under a specific condition

45 views Asked by At

I want to transform a long and chronological dataset into a wide but chronological dataset by ID Let's see an example:

ID Product Date
1 Bike 1/1/2000
1 Tire 2/1/2000
2 Car 15/2/2000
2 Seat 17/2/2000
1 Chronometer 20/2/2000

Into the following table:

ID 1st 2nd 3rd etc
1 Bike Tire Chronometer
2 Car Seat

The order of the products bought must not be changed.

Can you help me guys?

Thanks a lot!

3

There are 3 answers

1
Ronak Shah On BEST ANSWER

arrange the data for each ID and Date, give a unique row number for each ID and cast the data to wide format.

library(dplyr)

df %>%
  mutate(Date = as.Date(Date, '%d/%m/%Y')) %>%
  arrange(ID, Date) %>%
  group_by(ID) %>%
  mutate(row = row_number()) %>%
  tidyr::pivot_wider(names_from = row, values_from = c(Product, Date))

#     ID Product_1 Product_2 Product_3   Date_1     Date_2     Date_3    
#  <int> <chr>     <chr>     <chr>       <date>     <date>     <date>    
#1     1 Bike      Tire      Chronometer 2000-01-01 2000-01-02 2000-02-20
#2     2 Car       Seat      NA          2000-02-15 2000-02-17 NA        

data

df <- structure(list(ID = c(1L, 1L, 2L, 2L, 1L), Product = c("Bike", 
"Tire", "Car", "Seat", "Chronometer"), Date = c("1/1/2000", "2/1/2000", 
"15/2/2000", "17/2/2000", "20/2/2000")), class = "data.frame", row.names = c(NA, -5L))
0
akrun On

We can use dcast from data.table

library(data.table)
dcast(setDT(df), ID ~ rowid(ID), value.var = c('Product', 'Date'))
#     ID Product_1 Product_2   Product_3    Date_1    Date_2    Date_3
#1:  1      Bike      Tire Chronometer  1/1/2000  2/1/2000 20/2/2000
#2:  2       Car      Seat        <NA> 15/2/2000 17/2/2000      <NA>

data

df <- structure(list(ID = c(1L, 1L, 2L, 2L, 1L), Product = c("Bike", 
"Tire", "Car", "Seat", "Chronometer"), Date = c("1/1/2000", "2/1/2000",
"15/2/2000", "17/2/2000", "20/2/2000")), class = "data.frame",
row.names = c(NA,
-5L))
0
ThomasIsCoding On

A base R option using reshape

reshape(
  transform(
    df,
    q = ave(1:nrow(df), ID, FUN = seq_along)
  ),
  direction = "wide",
  idvar = "ID",
  timevar = "q"
)

gives

  ID Product.1    Date.1 Product.2    Date.2   Product.3    Date.3
1  1      Bike  1/1/2000      Tire  2/1/2000 Chronometer 20/2/2000
3  2       Car 15/2/2000      Seat 17/2/2000        <NA>      <NA>

If you don't want to keep Date, you can try this

reshape(
  transform(
    subset(df, select = -Date),
    q = ave(1:nrow(df), ID, FUN = seq_along)
  ),
  direction = "wide",
  idvar = "ID",
  timevar = "q"
)

which gives

  ID Product.1 Product.2   Product.3
1  1      Bike      Tire Chronometer
3  2       Car      Seat        <NA>

Data

> dput(df)
structure(list(ID = c(1L, 1L, 2L, 2L, 1L), Product = c("Bike", 
"Tire", "Car", "Seat", "Chronometer"), Date = c("1/1/2000", "2/1/2000",
"15/2/2000", "17/2/2000", "20/2/2000")), class = "data.frame", row.names = c(NA,
-5L))