Duplicating rows in dataframe based on column value

Question

Asked by Danielle Edwards on 25 April 2021 at 13:59

I am trying to duplicate rows based on the value of a column. My dataframe (df) currently looks like:

Species name	Visits
Apis m	4
Bombus l	7

And so on (there are 34 more columns, which all need to be repeated). I want it to look like:

Species name
Apis m
Apis m
Apis m
Apis m
Bombus l
Bombus l
Bombus l
Bombus l
Bombus l
Bombus l
Bombus l

This is already a fairly large dataset of 1767 observations; there are 190 'Species name' values and each one has been visited several hundred times.

I'm very new to R (and coding!) so everything is very 'trial and error'. I found a solution on Stack Overflow using "splitstackshape" but am getting the error:

"Error in .subset2(x, i, exact = exact) : recursive indexing failed at level 2".

This is my code:

expandRows(df, df$Visits, 
           count.is.col = TRUE, drop = TRUE)

There are questions about other instances of this error, but none related to the expandRows function. The column is stored as an integer and I've removed any null values from the 'Visits' column.

Any pointers as to what my issue might be or other ideas of how to do this would be much appreciated.

Danielle

Edit: Reprex below. I'm not sure what 'could not find function' relates to, as the code appeared to run without the reprex. Also, note that the reprex uses the actual column names and data frame; I simplified them in the example above.

expandRows(BombusL, BombusL$No.of.Interaction.Records, count.is.col = TRUE, 
    drop = TRUE)
#> Error in expandRows(BombusL, BombusL$No.of.Interaction.Records, count.is.col = TRUE, : could not find function "expandRows"
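
A possible reading of the two errors, offered as a sketch rather than a confirmed diagnosis: reprex runs code in a fresh R session, so "could not find function" most likely means library(splitstackshape) was not called inside the reprex itself. The "recursive indexing failed" error is consistent with passing the vector df$Visits while count.is.col = TRUE, since in that mode expandRows expects the name (or position) of the count column. The toy data frame below is an assumption standing in for the real data.

```r
library(splitstackshape)  # has to be loaded inside the reprex as well

# Toy stand-in for the real data (assumed, not the asker's actual df)
df <- data.frame(
  Species.name = c("Apis m", "Bombus l"),
  Visits = c(4L, 7L)
)

# With count.is.col = TRUE, give the column name as a string, not df$Visits
expanded <- expandRows(df, "Visits", count.is.col = TRUE, drop = TRUE)

nrow(expanded)  # one row per visit
```

With drop = TRUE the count column is removed from the result, which matches the desired output shown in the question.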

There are 2 answers

Ran K On 25 April 2021 at 14:54

You can try uncount from the tidyr/tidyverse package

library(tidyr)

data <- data.frame(Species = c("Apis m","Nimbus"),Visits = c(4,7))
data %>% 
  uncount(Visits)
#>     Species
#> 1    Apis m
#> 1.1  Apis m
#> 1.2  Apis m
#> 1.3  Apis m
#> 2    Nimbus
#> 2.1  Nimbus
#> 2.2  Nimbus
#> 2.3  Nimbus
#> 2.4  Nimbus
#> 2.5  Nimbus
#> 2.6  Nimbus

Created on 2021-04-25 by the reprex package (v2.0.0)
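
Since the real data has many more columns, here is a small sketch of how uncount treats them. The Habitat column is a hypothetical stand-in for the 34 extra columns, and the .remove and .id arguments are optional extras that the answer above does not use.

```r
library(tidyr)

# Toy data; Habitat is a made-up column standing in for the extra columns
visits <- data.frame(
  Species = c("Apis m", "Bombus l"),
  Habitat = c("Meadow", "Hedgerow"),
  Visits  = c(4, 7)
)

# Every other column is carried along with each repeated row.
# .remove = FALSE keeps the count column; .id numbers the copies of each row.
expanded <- uncount(visits, Visits, .remove = FALSE, .id = "visit_no")

nrow(expanded)
table(expanded$Species)
```

Leaving .remove at its default of TRUE drops the Visits column, as in the output above.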

TarJae On 25 April 2021 at 15:11 (accepted answer)

Update (as uncount is already mentioned):

With your code:

df.expanded <- df[rep(row.names(df), df$Visits), 1:2]
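
The 1:2 above keeps only the first two columns. If all of the other columns need to be repeated too, a base R variant along these lines might help; it is a sketch on toy data, and the Site column is a hypothetical stand-in for the extra columns.

```r
# Toy data; Site is a made-up extra column
df <- data.frame(
  Species.name = c("Apis m", "Bombus l"),
  Site = c("A", "B"),
  Visits = c(4L, 7L)
)

# Repeat each row index Visits times
idx <- rep(seq_len(nrow(df)), times = df$Visits)

# Keep every column except the count column
df.expanded <- df[idx, setdiff(names(df), "Visits"), drop = FALSE]

# Reset the row names (otherwise they come out as 1, 1.1, 1.2, ...)
rownames(df.expanded) <- NULL
```

Note that rep() fails if Visits contains NA or negative values, so those rows would need to be removed first.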

Or you could use slice with seq_len(n()):

library(dplyr)
df %>%  
  slice(rep(seq_len(n()), Visits)) %>% 
  select(-Visits)

Output:

   Species.name
   <chr>       
 1 Apis m      
 2 Apis m      
 3 Apis m      
 4 Apis m      
 5 Bombus l    
 6 Bombus l    
 7 Bombus l    
 8 Bombus l    
 9 Bombus l    
10 Bombus l    
11 Bombus l
