How to use a for loop to perform Pearson correlation in R

I have a 673 x 232 dataset. I want to use a for loop to perform Pearson correlations and save the p-value and estimate for each one. My code is:

vec_lipid <- colnames(df3[,9:232])
df4 <- data.frame(vec_lipid)
df4$p.value <- "0"
df4$estimate <- "0"

for (i in length(vec_lipid)) {
  cortest[i] <- cor.test(df3$CSF_Ab42, df3$vec_lipid[i], method = "pearson")
  df4[i,2] <- cortest[i]$p.value
  df4[i,3] <- cortest[i]$estimate
  i = i+1
  }

However, it doesn't work. Could you please help me adjust the code? Thanks a lot in advance.

There are 2 answers

jkd On

If I understood your code correctly, you are only interested in the correlation between each of columns 9 to 232 and CSF_Ab42. In that case you could use the following code based on sapply, which avoids an explicit loop.

df4 <- sapply(df3[,-1:-8], \(x) {
  ct <- cor.test(df3$CSF_Ab42, x)
  unlist(ct[c("estimate","p.value")])
}) |> t() |> as.data.frame()

The rownames of df4 are the names of the df3 columns for which the correlation with CSF_Ab42 was computed. If you want them in a regular column, simply add: df4$lipid <- rownames(df4).
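
Since df3 is not available here, the following is only a sketch of the same sapply approach on the built-in iris dataset. The choice of iris, the use of Sepal.Length in place of CSF_Ab42, and the names res and variable are assumptions for illustration. It requires R 4.1 or later for \(x) and |>.

```r
# iris stands in for df3 (illustrative); Sepal.Length plays the role of CSF_Ab42
res <- sapply(iris[, 2:4], \(x) {
  ct <- cor.test(iris$Sepal.Length, x)
  unlist(ct[c("estimate", "p.value")])
}) |> t() |> as.data.frame()

# Move the row names into a regular column
res$variable <- rownames(res)
rownames(res) <- NULL

# unlist() names the columns "estimate.cor" and "p.value"
str(res)
```

Note that the estimate column ends up named estimate.cor, because unlist() combines the list element name with the name cor.test gives the coefficient.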

margusl On

You might start by breaking the problem into smaller pieces, making sure each one works as you expect, and only then combining them. For example: the loop runs only once, because length(vec_lipid) returns a single value (seq_along(vec_lipid) would iterate over every index); df3$vec_lipid[i] will not work, though df3[[vec_lipid[i]]] should; for cortest[i] <- ... to work, cortest must be created first; and in general you should not modify the for-loop control variable with something like i = i + 1.
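
Putting those points together, a corrected loop might look roughly like the sketch below. It uses the built-in iris dataset because df3 is not available; iris, Sepal.Length (standing in for CSF_Ab42), and the names vec_cols, res and variable are assumptions for illustration, not the asker's data.

```r
# Illustrative data: iris stands in for df3, Sepal.Length for CSF_Ab42
vec_cols <- colnames(iris[, 2:4])

# Pre-allocate numeric result columns (rather than the character "0")
res <- data.frame(
  variable = vec_cols,
  p.value  = NA_real_,
  estimate = NA_real_
)

# seq_along() iterates over 1, 2, ..., length(vec_cols)
for (i in seq_along(vec_cols)) {
  # [[ ]] extracts the column whose name is stored in vec_cols[i]
  ct <- cor.test(iris$Sepal.Length, iris[[vec_cols[i]]], method = "pearson")
  res$p.value[i]  <- ct$p.value
  res$estimate[i] <- ct$estimate
}

res
```

Storing each test in a temporary ct avoids having to create a cortest container up front, and the loop variable is never modified inside the body.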

Using the iris dataset as an example, a slight variation of the previous answer might look like this:

head(iris)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa

iris[2:4] |>
  lapply(cor.test, iris$Sepal.Length) |>
  do.call(rbind, args = _) |>
  subset(select = c(estimate, p.value))
#>              estimate   p.value     
#> Sepal.Width  -0.1175698 0.1518983   
#> Petal.Length 0.8717538  1.038667e-47
#> Petal.Width  0.8179411  2.325498e-37

Created on 2023-11-14 with reprex v2.0.2