Reading and naming multiple .txt files in R

I want to read and name multiple .txt files in R. As a small example: I have 2 subfolders, each containing three .txt files with the same names. Subfolder 'test' has 'alpha.txt', 'bita.txt' and 'gamma.txt', and subfolder 'train' has 'alpha.txt', 'bita.txt' and 'gamma.txt'. I am using the following code:

files <- dir(recursive=TRUE,pattern ='\\.txt$')
List <- lapply(files,read.table,fill=TRUE)

which gives a List with 6 elements, each one a data frame. I know that the first element is 'alpha' from the test folder, the second is 'bita' from the test folder, and so on. But since the real data has many more files, I would like to read the data so that the environment contains the variables 'test_alpha', 'test_bita', 'test_gamma', 'train_alpha', 'train_bita', 'train_gamma'. Is there a way to do it?

Answer by Pierre L (best answer)

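I created two folders in my working directory, test and train. We create two matrices and write one to each folder. Note that write() prints five values per line by default, so the files do not keep the original 3x3 and 4x3 shapes.

m1 <- matrix(rnorm(9), 3, 3)
m2 <- matrix(runif(12), 4, 3)
# write() works on atomic vectors and matrices, not on data frames
write(m1, './test/alpha.txt')
write(m2, './train/alpha.txt')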

We run your code:

files <- dir(recursive=TRUE,pattern ='\\.txt$')
List <- lapply(files,read.table,fill=TRUE)

files
[1] "test/alpha.txt"  "train/alpha.txt"

This picks up exactly the files we need. Next we replace the forward slash with an underscore and strip the file extension.

newnames <- gsub('/', '_', files)
newnames1 <- sub('\\.txt$', '', newnames)  # anchor with $ so only the trailing extension is removed
newnames1
[1] "test_alpha"  "train_alpha"
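
The same renaming can also be sketched in one step on the six paths from the question. The files vector below is hard-coded for illustration (it is what dir(recursive=TRUE) would be expected to return for that folder layout), and tools::file_path_sans_ext() is swapped in for the second gsub call; it is an alternative, not part of the original answer.

```r
# Hypothetical paths, standing in for the output of dir(recursive = TRUE)
files <- c("test/alpha.txt", "test/bita.txt", "test/gamma.txt",
           "train/alpha.txt", "train/bita.txt", "train/gamma.txt")

# Drop the extension first, then turn the path separator into an underscore
newnames <- gsub("/", "_", tools::file_path_sans_ext(files), fixed = TRUE)
newnames
```

Each name keeps its folder as a prefix, so files that share a name in different subfolders stay distinguishable.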

This vector can now be assigned as the names of List, so each data frame is labelled with its folder and file name.

names(List) <- newnames1
List
$test_alpha
          V1          V2         V3         V4        V5
1 -0.6594299 -0.01881557  0.7076588 -0.7096888 0.3629274
2 -1.4401000  1.59659000 -1.9041430  2.3079960        NA

$train_alpha
         V1        V2        V3        V4        V5
1 0.9307107 0.6257928 0.6903179 0.5143920 0.6798936
2 0.3652738 0.9297527 0.1902556 0.7243708 0.4541548
3 0.5565041 0.5276907        NA        NA        NA
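
The question also asks for separate variables in the environment rather than a named list. One possible way, offered as a sketch rather than as part of the original answer, is list2env(), which creates one variable per named list element. Everything else below is an assumption made for illustration: the temporary directory, the toy tables with columns x and y, and the use of write.table() so the files keep their shape.

```r
# Work in a throwaway directory so the real working directory is untouched
root <- file.path(tempdir(), "txt_demo")
dir.create(file.path(root, "test"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "train"), showWarnings = FALSE)

# Toy data (illustrative only), written as tables with a header row
write.table(data.frame(x = 1:3, y = 4:6),
            file.path(root, "test", "alpha.txt"), row.names = FALSE)
write.table(data.frame(x = 7:8, y = 9:10),
            file.path(root, "train", "alpha.txt"), row.names = FALSE)

# Same approach as above: find the files, read them, derive the names
files <- dir(root, recursive = TRUE, pattern = "\\.txt$")
List <- lapply(file.path(root, files), read.table, header = TRUE, fill = TRUE)
names(List) <- gsub("/", "_", sub("\\.txt$", "", files))

# Create one variable per list element in the global environment
list2env(List, envir = globalenv())
```

After this, test_alpha and train_alpha exist as ordinary data frames. With many files it is often easier to leave the data in the named list and use List$test_alpha or List[["test_alpha"]], since the list can still be looped over with lapply.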