Create a table from a list of strings in python (.fastq files path mapping)


I have built a rough table from a list of file paths and file names, but I suspect my approach is not the best one. How should this code be written, and how can I turn the result into a pandas DataFrame with column headers?

My code:

import os
current_dir = os.getcwd()
files_in_dir = os.listdir(current_dir)
for file_name in files_in_dir:
    file_path = os.path.join(current_dir, file_name)
    if "R1" in file_path:
        sample = file_name.split('_')[0]
        file_path2 = file_path.replace("R1", "R2")
        print(sample + '\t' + file_path + '\t' + file_path2)

Output (not a proper table):

AG16      /home/user/folder1/AG16_R1.fastq    /home/user/folder1/AG16_R2.fastq
AG13      /home/user/folder1/AG13_R1.fastq    /home/user/folder1/AG13_R2.fastq
AG2       /home/user/folder1/AG2_R1.fastq     /home/user/folder1/AG2_R2.fastq
...

What I would like to add:

code above plus:

import pandas as pd
df = pd.DataFrame()
data = [sample, file_path, file_path2]
df = pd.DataFrame(data, columns=['sample', 'r1', 'r2'])
print(df)

Thank you very much!

1 answer

TheHungryCub (best answer)

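Append one row per sample to a list inside the loop, then build the DataFrame once after the loop. A flat list of three strings is read by pandas as three rows of one column, not one row of three columns, so it cannot match three column names. Try this: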

import os
import pandas as pd

current_dir = os.getcwd()
files_in_dir = os.listdir(current_dir)

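# One [sample, r1, r2] row per R1 file
data = []

for file_name in files_in_dir:
    # Test the file name, not the full path, so an "R1" in a parent folder cannot match
    if "R1" in file_name:
        sample = file_name.split('_')[0]
        file_path = os.path.join(current_dir, file_name)
        # Build the R2 path by swapping the tag in the file name only
        file_path2 = os.path.join(current_dir, file_name.replace("R1", "R2"))
        data.append([sample, file_path, file_path2])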

df = pd.DataFrame(data, columns=['sample', 'r1', 'r2'])
print(df)
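
If you want something a bit more defensive than the loop above, here is a minimal sketch. It assumes the files are named `<sample>_R1.fastq` and `<sample>_R2.fastq`, as in the output shown in the question. The helper names `collect_pairs` and `pairs_to_dataframe` are made up for this example. It matches only files ending in `_R1.fastq`, skips samples whose R2 file is missing, and sorts the rows so the order is reproducible.

```python
from pathlib import Path


def collect_pairs(directory):
    """Return one dict per sample that has both an R1 and an R2 file.

    Assumes names of the form <sample>_R1.fastq / <sample>_R2.fastq.
    """
    rows = []
    # sorted() gives a stable row order; directory listing order is arbitrary.
    for r1_path in sorted(Path(directory).glob("*_R1.fastq")):
        # Everything before the "_R1.fastq" suffix of the file name.
        stem = r1_path.name[: -len("_R1.fastq")]
        # Build the mate path from the file name only, never the folders.
        r2_path = r1_path.with_name(stem + "_R2.fastq")
        if not r2_path.is_file():
            continue  # no mate file, so leave this sample out
        rows.append({
            "sample": stem.split("_")[0],
            "r1": str(r1_path),
            "r2": str(r2_path),
        })
    return rows


def pairs_to_dataframe(directory):
    """Wrap collect_pairs() in a DataFrame with columns sample, r1, r2."""
    # Imported here so collect_pairs() can be used without pandas installed.
    import pandas as pd

    return pd.DataFrame(collect_pairs(directory), columns=["sample", "r1", "r2"])
```

Calling `print(pairs_to_dataframe(os.getcwd()))` should then give the same three columns as the code above. Passing a list of dicts lets pandas pick up the column names from the keys, and the explicit `columns=` keeps the headers in place even when no pairs are found.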