Selecting factor loadings above threshold in R

I'm performing factor analysis using the following command from the psych package in R.

fa <- fa(convAll[-1], nfactors=5, rotate = "promax", fm="pa")

It generates factor loadings, which can be saved to a CSV file. A sample of such a file is provided below.

    PA5 PA3 PA1 PA2 PA4
adv 0.083567828 0.26568194  0.051709392 0.145763195 -0.118456783
adv_down    0.073272749 0.031264884 -0.082601123    0.196925251 0.012515693
adv_place   0.028650579 0.195255276 -0.028781995    0.011087121 -0.075995905
adv_time    0.140393013 0.256528641 -0.079074986    -0.049583628    0.077695781
amplifr -0.005328985    0.043233732 0.176904981 -0.026720026    -0.090867507
att_vb_other    0.09240641  -0.035350749    0.223306084 0.017628218 -0.014419588
comm_vb_other   0.063530526 -0.013204134    0.105246297 -0.028007553    0.558798415
conj_advl   0.048185731 0.11380117  0.014882315 0.329070824 -0.049132805
contract    0.379176251 0.103187601 0.173065276 -0.213728905    -0.074295022
coord_conj_cls  0.099132548 0.235969867 0.086063555 0.191272967 -0.047419106
coord_conj_phrs -0.094208803    -0.039195575    0.042876041 0.085711817 -0.072005987
disc_particle   0.23693194  0.063337377 0.020130766 -0.195263816    0.064033528
do_pro  0.328570052 -0.043968998    0.093690313 -0.074335324    -0.078537628
emphatic    0.115773696 0.183956168 0.198039834 -0.068159604    -0.127846385
fact_vb_other   0.059866245 -0.037114568    0.298395774 -0.079350697    0.053288398
hedge   0.014631137 0.114725108 0.060555295 -0.009892361    0.000415616
infinitive  -0.007423406    0.017473329 0.1534992   0.133033783 0.050682644
jj_attr -0.355339091    -0.379083698    -0.063350973    0.023637592 -0.220351424
jj_pred 0.174898501 0.002472112 0.075689444 0.102759711 -0.056187374
likely_vb_other 0.01709907  0.038883434 0.263396208 0.143448431 -0.041417434
mod_necess  0.233491105 -0.036824461    0.027589775 0.090104444 0.065779138
mod_poss    0.392744267 -0.053985013    -0.022362104    0.024825812 0.036541161
mod_pred    0.43496355  -0.030372919    -0.129436799    0.05482249  0.024503805
nn_abstact  0.050477208 -0.284252513    0.019715273 0.147725317 -0.038579005

I want to extract only those variables whose factor loadings exceed .295 in absolute value, whether positive or negative. For that purpose I have written the following function, which takes a factor loadings object as input and writes each factor to a CSV file after removing the values below the threshold.

write.factors <- function(loadings, cutoff_p = 0.295, cutoff_n = -0.295, file_name = "factors.csv"){
  f <- data.frame(unclass(loadings))
  for(c in 1:ncol(f)){
    variables <- rownames(f)
    ff <- data.frame(variables, f[,c])
    colnames(ff)[2] <- colnames(f)[c]
    nd <- subset(data.frame(ff, ff[,2] > cutoff_p | ff[,2] < cutoff_n))
    write.csv(file  = file_name, nd, append = TRUE)
    write.csv(file  = file_name, "\r\n", append = TRUE)
  }
}
write.factors(fa$loadings)

As you can see, the logic seems pretty simple, but I am unable to get the output because there is a warning about append being ignored. The objects I create within the function appear to be list objects, although, as you can see, I am trying to create data frames so that I can later remove the rows below the threshold and save them to the CSV file one by one. Any helpful comments would be highly appreciated.

There is 1 answer

Shakir On BEST ANSWER

After consulting various online sources, I made the following changes to my function. It now redirects output to the file with sink() and writes the result of each loop iteration incrementally.

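write.factors <- function(loadings, cutoff_p = 0.295, cutoff_n = -0.295, file_name = "factors.csv"){
  f <- data.frame(unclass(loadings))
  variables <- rownames(f)
  # Redirect console output to the file; restore it even if an error occurs
  sink(file_name)
  on.exit(sink())
  for(i in seq_len(ncol(f))){
    # Two-column data frame: variable name and its loading on factor i
    ff <- data.frame(variables, f[,i])
    colnames(ff)[2] <- colnames(f)[i]
    # Keep only loadings above the positive or below the negative cutoff
    nd <- subset(ff, ff[,2] > cutoff_p | ff[,2] < cutoff_n)
    nd <- droplevels(nd)
    # With no file argument, write.csv prints to the console, which sink() captures
    write.csv(nd)
    # Separator line between factors
    cat('____________________________\n')
  }
}
write.factors(fa$loadings)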
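
For comparison, here is a minimal sketch of the same idea without sink(). write.csv ignores append (hence the warning in the question), but it can write repeatedly to a connection that is already open for writing, so each factor's block lands in the same file. The function name write_factor_blocks, the single cutoff argument, the "variable" column label, and the toy matrix (a few rows copied from the sample above) are assumptions made for illustration; none of them come from psych.

```r
# Hypothetical helper, not part of psych: one CSV block per factor,
# keeping variables whose absolute loading exceeds `cutoff`.
write_factor_blocks <- function(loadings, cutoff = 0.295,
                                file_name = "factors.csv") {
  f <- as.data.frame(unclass(loadings))
  con <- file(file_name, open = "w")   # one connection for all blocks
  on.exit(close(con))                  # closed even if an error occurs
  for (j in seq_len(ncol(f))) {
    keep <- abs(f[[j]]) > cutoff       # covers both +cutoff and -cutoff
    nd <- data.frame(variable = rownames(f)[keep], loading = f[[j]][keep])
    names(nd)[2] <- names(f)[j]        # label the column with the factor name
    write.csv(nd, con, row.names = FALSE)
    writeLines("", con)                # blank line between factors
  }
  invisible(file_name)
}

# Toy loadings: three rows and two factors taken from the sample above.
toy <- matrix(c(0.379176251, 0.103187601,
                -0.355339091, -0.379083698,
                0.014631137, 0.114725108),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("contract", "jj_attr", "hedge"),
                              c("PA5", "PA3")))
out <- write_factor_blocks(toy, file_name = tempfile(fileext = ".csv"))
```

With the real analysis, the call would presumably be write_factor_blocks(fa$loadings). Using abs() replaces the separate positive and negative cutoffs, which is only equivalent when the two cutoffs are symmetric, as they are in the question.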