How to create a for loop of simulated correlation coefficients in R?

45 views Asked by At

I plan to run 10,000 simulations on an existing dataset, and want to find the correlation coefficient of each simulation and save it into a dataframe (theoretically ending up with 10,000 correlation coefficients). How can I write code to run the simulation and print out the coefficient each time in a for loop?

This is the latest code I tried:

for (i in 1:10000) {
  bnsamp <- data.frame(mvrnorm(n = 48, mu = mean_combined, Sigma = sigma))
  colnames(bnsamp) <- c("Amyg_Avg", "Sys_Jus")
  cor_coeff <- cor.test(x= bnsamp$Amyg_Avg, y= bnsamp$Sys_Jus, 
                        method= "pearson", alternative= "two.sided")
   cor_coef_est <- list(cor_coeff$estimate == i)
   cor_coef_p <- list(cor_coeff$p.value == i)
   coeff_list <- data.frame(cor_coef_est, cor_coef_p)
   print(coeff_list)
}

Ultimately, my problem is how to loop the recalculation of the correlation and print it every single time (for later to be plotted into a distribution). Thank you!

1

There are 1 answers

1
Rui Barradas On BEST ANSWER

Here are two ways of simulating the wanted values.

  1. With a for loop:
  2. with replicate

From help('replicate'), my emphasis:

replicate is a wrapper for the common use of sapply for repeated evaluation of an expression (which will usually involve random number generation).

library(MASS)

mean_combined <- c(1, 2)
sigma <- matrix(c(2, 1, 3, 2), ncol = 2L)

# replicates, but not the 10,000 in the question,
# only 1,000 is enough for code tests
R <- 1000L

# with a for loop:
# 1. create the results matrix beforehand
# 2. simulate from multivariate normal
# 3. run the test
# 4. and assign the results
set.seed(2023)
coeff_mat <- matrix(nrow = R, ncol = 2,
                    dimnames = list(NULL, c("Amyg_Avg", "Sys_Jus")))

for(i in seq.int(R)) {
  bnsamp <- mvrnorm(n = 48L, mu = mean_combined, Sigma = sigma)
  cor_coeff <- cor.test(x = bnsamp[, 1L], y = bnsamp[, 2L], 
                        method= "pearson", alternative= "two.sided")
  cor_coef_est <- cor_coeff$estimate
  cor_coef_p <- cor_coeff$p.value
  coeff_mat[i, ] <- c(cor_coef_est, cor_coef_p)
}

# with help('replicate')
# 1. simulate from multivariate normal
# 2. run the test
# 3. and assign the results
# 4. the final pipe to t() puts the matrix in the wanted form
set.seed(2023)
res <- replicate(R, {
  bnsamp <- mvrnorm(n = 48L, mu = mean_combined, Sigma = sigma)
  cor_coeff <- cor.test(x = bnsamp[, 1L], y = bnsamp[, 2L], 
                        method= "pearson", alternative= "two.sided")
  c(Amyg_Avg = unname(cor_coeff$estimate), Sys_Jus = cor_coeff$p.value)
}) |> t()

identical(res, coeff_mat)
#> [1] TRUE

head(res)
#>       Amyg_Avg      Sys_Jus
#> [1,] 0.3999366 4.856605e-03
#> [2,] 0.6007393 6.353057e-06
#> [3,] 0.6641624 2.652077e-07
#> [4,] 0.6094018 4.287226e-06
#> [5,] 0.4825884 5.132400e-04
#> [6,] 0.5514744 4.854691e-05

Created on 2023-11-06