Multiple input files as single file output from biopython AlignIO


I'm writing code to convert alignments from multiple files to PHYLIP format and then write all of the alignments to a single file. I can't find a good way to have AlignIO.write() take multiple input files and produce a single output file. The following code works on a single file:

import glob
from Bio import AlignIO

path = "alignment?.nexus"

for filename in glob.glob(path):
    for alignment in AlignIO.parse(filename, "nexus"):
        AlignIO.write(alignment, "all_alignments", "phylip-relaxed")

Answer by Chris_Rands (best answer)

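When AlignIO.write() is given a string file name, it opens that file in write mode on every call, so each alignment overwrites the previous one. You can effectively append to the output file by passing an open file handle instead of a string file name: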

# Open the output once so every call writes to the same handle.
with open("all_alignments", "w") as output_handle:
    for filename in glob.glob(path):
        for alignment in AlignIO.parse(filename, "nexus"):
            AlignIO.write(alignment, output_handle, "phylip-relaxed")
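
To see why the handle matters, here is a minimal sketch that runs without Biopython. The `write_record` function is a hypothetical stand-in, not part of Biopython; it only mimics the "file name or open handle" behaviour described above, and the record strings are placeholders for alignments.

```python
import os
import tempfile


def write_record(record, target):
    """Hypothetical stand-in for a writer that accepts a file name or an open handle."""
    if isinstance(target, str):
        # A file name is opened in "w" mode, which truncates existing content.
        with open(target, "w") as handle:
            handle.write(record + "\n")
    else:
        # An already-open handle is written to at its current position.
        target.write(record + "\n")


records = ["alignment_1", "alignment_2", "alignment_3"]
workdir = tempfile.mkdtemp()

# Passing the file name on every call: only the last record survives.
by_name = os.path.join(workdir, "by_name.txt")
for record in records:
    write_record(record, by_name)

# Passing one open handle: every record is kept.
by_handle = os.path.join(workdir, "by_handle.txt")
with open(by_handle, "w") as handle:
    for record in records:
        write_record(record, handle)

with open(by_name) as f:
    lines_by_name = f.read().splitlines()
with open(by_handle) as f:
    lines_by_handle = f.read().splitlines()
```

After running this, `lines_by_name` holds only the last record, while `lines_by_handle` holds all three, which mirrors the difference between the question's loop and the handle-based loop.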

The alternative would be to yield all alignments (or store them in a list or similar) and then call .write() once afterwards with the iterable and string file name (and format) as arguments:

def yield_alignments():
    for filename in glob.glob(path):
        for alignment in AlignIO.parse(filename, "nexus"):
            yield alignment

AlignIO.write(yield_alignments(), "all_alignments", "phylip-relaxed")

The second approach requires more changes to your current structure, but it might be slightly faster, at least on older Biopython versions.
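
If you prefer not to write a generator function by hand, the same "collect everything, then write once" idea can be sketched with itertools.chain. The helper name `iter_records` and its `parse` parameter are assumptions made for illustration, and the fake in-memory "files" are only there so the sketch runs without Biopython.

```python
from itertools import chain


def iter_records(filenames, parse):
    """Lazily yield every record from every file, in the order the files are given.

    `parse` is a hypothetical callable that maps a file name to an iterable of records.
    """
    return chain.from_iterable(parse(name) for name in filenames)


# Fake in-memory "files" standing in for parsed alignment files.
fake_files = {
    "alignment1.nexus": ["aln_a"],
    "alignment2.nexus": ["aln_b", "aln_c"],
}

# sorted() fixes the file order, which would otherwise be arbitrary.
combined = list(iter_records(sorted(fake_files), fake_files.__getitem__))
```

With Biopython, something like `AlignIO.write(iter_records(sorted(glob.glob(path)), lambda name: AlignIO.parse(name, "nexus")), "all_alignments", "phylip-relaxed")` should behave the same way as the generator version. Wrapping glob.glob() in sorted() is optional, but glob.glob() returns files in arbitrary order, so sorting makes the order of alignments in the output reproducible.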