Using R to Randomly Assign Treatment and Control Groups by single IDs

I would like to use R to solve an experimental design problem: randomly assigning my experimental units to treatment or control groups. The problem is the following:

Let's say that I have 120 plants with unique IDs (4 clones of 30 plants each), 3 time points, 2 pathogens, and 2 control groups. For each time point, I would like to assign within each clone: 3 plants to pathogen A, 3 to pathogen B, 2 to control A, and 2 to control B. That is 10 plants per clone per time point, so 4 × 3 × 10 = 120 plants in total.

clones <- c(rep("clone A", 30), rep("clone B", 30), rep("clone C", 30), rep("clone D", 30))
IDs <- 1:120
plants <- data.frame(IDs = IDs, 
                     clones = clones)

# How can I randomly assign the following to each ID?
control <- c("control A", "control B")
pathogen <- c("pathogen A", "pathogen B")
time_point <- c("T1", "T2", "T3")

Thanks for the help!

There is 1 answer

danlooo (best answer)
library(tidyverse)

# fix the random seed so that the assignment is reproducible
set.seed(1)

group <- c(
  rep("Pathogen A", 3),
  rep("Pathogen B", 3),
  rep("control A", 2),
  rep("control B", 2)
)

sampling <-
  # Every clone has all groups
  expand_grid(
    group,
    clone = c("Clone A", "Clone B", "Clone C", "Clone D"),
    time = c("T1", "T2", "T3")
  ) %>%
  arrange(clone) %>%
  mutate(id = row_number()) %>%
  # random group assignment stratified for each clone and time
  group_by(clone, time) %>%
  mutate(group = group %>% sample())

sampling
#> # A tibble: 120 × 4
#> # Groups:   clone, time [12]
#>    group      clone   time     id
#>    <chr>      <chr>   <chr> <int>
#>  1 control B  Clone A T1        1
#>  2 Pathogen A Clone A T2        2
#>  3 Pathogen B Clone A T3        3
#>  4 Pathogen B Clone A T1        4
#>  5 Pathogen A Clone A T2        5
#>  6 control B  Clone A T3        6
#>  7 control A  Clone A T1        7
#>  8 Pathogen B Clone A T2        8
#>  9 Pathogen A Clone A T3        9
#> 10 Pathogen A Clone A T1       10
#> # … with 110 more rows

sampling %>%
  group_by(clone) %>%
  summarise(
    min_id = min(id),
    max_id = max(id)
  )
#> # A tibble: 4 × 3
#>   clone   min_id max_id
#>   <chr>    <int>  <int>
#> 1 Clone A      1     30
#> 2 Clone B     31     60
#> 3 Clone C     61     90
#> 4 Clone D     91    120

sampling %>%
  filter(clone == "Clone A")
#> # A tibble: 30 × 4
#> # Groups:   clone, time [3]
#>    group      clone   time     id
#>    <chr>      <chr>   <chr> <int>
#>  1 control B  Clone A T1        1
#>  2 Pathogen A Clone A T2        2
#>  3 Pathogen B Clone A T3        3
#>  4 Pathogen B Clone A T1        4
#>  5 Pathogen A Clone A T2        5
#>  6 control B  Clone A T3        6
#>  7 control A  Clone A T1        7
#>  8 Pathogen B Clone A T2        8
#>  9 Pathogen A Clone A T3        9
#> 10 Pathogen A Clone A T1       10
#> # … with 20 more rows

Created on 2022-04-15 by the reprex package (v2.0.1)
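
A note on applying this to the original `plants` data frame: judging from the output above, the answer builds its own `id` column, and the time points simply cycle T1, T2, T3 in ID order, so only the group label is shuffled. If the time point should be randomized as well, and the assignment attached to the existing IDs, something like the following might work. This is only a minimal sketch in base R, not the answer's method: `slots`, `assign_clone` and `design` are names made up for illustration, and it assumes every clone has exactly 30 plants, as in the question.

```r
set.seed(1)  # any fixed seed works; 1 mirrors the answer above

# Plant table as defined in the question
plants <- data.frame(
  IDs = 1:120,
  clones = rep(c("clone A", "clone B", "clone C", "clone D"), each = 30)
)

# The 30 slots each clone has to fill:
# 3 time points x (3 + 3 + 2 + 2) group labels
slots <- expand.grid(
  group = rep(c("pathogen A", "pathogen B", "control A", "control B"),
              times = c(3, 3, 2, 2)),
  time = c("T1", "T2", "T3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: give the plants of one clone the 30 slots
# in random order, so both time point and group are randomized
assign_clone <- function(clone_plants) {
  shuffled <- slots[sample(nrow(slots)), ]
  clone_plants$time <- shuffled$time
  clone_plants$group <- shuffled$group
  clone_plants
}

# Apply the helper clone by clone and stack the results
design <- do.call(rbind, lapply(split(plants, plants$clones), assign_clone))
rownames(design) <- NULL

# Check: each clone x time cell should hold 3, 3, 2 and 2 plants
table(design$group, design$time, design$clones)
```

Because the slots are shuffled as whole rows, the 3/3/2/2 split per clone and time point holds by construction, and the final `table()` call is just a sanity check. Merging `design` back onto other plant metadata by `IDs` should then be straightforward.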