Python3 write selected zones in a new file


I have to go through my dictionary to get the taxon of each of my genes. Every time I encounter a taxon, I open the .fasta file with the same name, and in that file I look for the geneIDs I found for that taxon. This is possible because the file from which I built my dictionary contains entries like "taxon1|geneID1, taxon1|geneID2, taxon2|geneID1,...". When I find a given geneID in the corresponding fasta file, I have to store all the lines under that geneID so that I can write them to a new file. The fasta file looks like:

>taxon|gene1
ACTGCATCGCTAGCTAGAAATCGCTA
TACGATCAAACCTAGCGATCTTACGA
>taxon|gene2
TAGCTAGCTAGCTAGAATATCCCGAT
GCTAGCAATGCTCTTCCGGTAGCTAT

So when I find the right geneID in the fasta file, the lines I have to copy/store are the two sequence lines (the ones made of A, C, T and G) that follow it. I wrote a few functions to do all of this, but I'm stuck on the part that writes the stored data to the new file. Here is part of my code:

def readFastaFile(taxonomy, searchGeneId):
    fastaFile = open(taxonomy + ".fasta", "r")
    banco = False
    geneIdSequence = ""
    for line in fastaFile:
        if line[0] == ">":
            elements = line.split("|")
            taxonomy = elements[0]
            geneID = elements[1]
            if geneID == searchGeneId:
                banco = True
            else:
                banco = False
        else:
            if banco == True:
                geneIdSequence += line
    fastaFile.close()
    return geneIdSequence

def getSequencesFromFastasAndWriteThemInNewFastas(dictio):
    for groupName in dictio:
        taxonAndGene = dictio[groupName]
        groupFastaFile = open(groupName + ".fasta", "w")
        for taxon in taxonAndGene:
            geneIDs = taxonAndGene[taxon]
            #print("search" + str(geneIDs) + " in " + taxon + ".fasta")
            for geneId in geneIDs:
                readFastaFile(taxon, geneId)
                groupFastaFile.write()         #HERE is the part where i'm stuck
                groupFastaFile.write("\n")
        groupFastaFile.close()

My problem is in the last lines (marked by the #HERE comment): I don't know what to put between the () to write the data from the fasta files into my new file.

Thank you for your answers.


There is 1 answer

Arnaud 'KaRn1zC' On

I found a solution to my problem.

I added these lines:

tempoGeneID   = elements[1]
geneID        = tempoGeneID.rstrip("\n")   # safer than [0:-1]: only removes a trailing newline

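to my function:

def readFastaFile(taxonomy, searchGeneId):

to remove the "\n" at the end of each geneID read from a header line (with the newline still attached, the comparison with searchGeneId never succeeds). Then I added these lines: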

for geneID in geneIDs:
    groupFastaFile.write(">" + taxon + "|" + geneID + "\n" + readFastaFile(taxon, geneID))

to my function:

def getSequencesFromFastasAndWriteThemInNewFastas(dictio):

Now it works as expected. I'm sorry if my post took you some time to read. Have a great day.
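
For anyone who wants the two pieces in one place, here is a minimal sketch of the same idea. It is not the exact code from the post: the function names, the `input_dir` and `output_dir` parameters, and the choice to skip gene IDs that are not found are my own assumptions. The dictionary shape `{groupName: {taxon: [geneIDs]}}` is inferred from the loops in the question.

```python
import os


def read_fasta_sequence(fasta_path, search_gene_id):
    """Return the sequence lines stored under search_gene_id, or "" if absent."""
    sequence_lines = []
    in_record = False
    with open(fasta_path, "r") as fasta_file:
        for line in fasta_file:
            if line.startswith(">"):
                # A header looks like ">taxon|geneID\n". Strip the newline
                # before comparing, otherwise "gene1\n" != "gene1".
                header = line[1:].strip()
                _, _, gene_id = header.partition("|")
                in_record = (gene_id == search_gene_id)
            elif in_record:
                sequence_lines.append(line.rstrip("\n"))
    return "\n".join(sequence_lines)


def write_group_fastas(groups, input_dir=".", output_dir="."):
    """Write one FASTA file per group from {groupName: {taxon: [geneIDs]}}.

    input_dir and output_dir are hypothetical parameters added for this sketch.
    """
    for group_name, taxon_to_genes in groups.items():
        out_path = os.path.join(output_dir, group_name + ".fasta")
        with open(out_path, "w") as out_file:
            for taxon, gene_ids in taxon_to_genes.items():
                fasta_path = os.path.join(input_dir, taxon + ".fasta")
                for gene_id in gene_ids:
                    sequence = read_fasta_sequence(fasta_path, gene_id)
                    if sequence:  # assumption: silently skip gene IDs not found
                        out_file.write(">" + taxon + "|" + gene_id + "\n")
                        out_file.write(sequence + "\n")
```

Calling `write_group_fastas({"group1": {"taxon": ["gene2"]}})` would read `taxon.fasta` and write `group1.fasta` in the current directory. Using `with` closes the files even if an error occurs, and each taxon file is re-read once per gene ID, as in the original, which is fine for small files.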