Iterating over a VCF file using cyvcf2


I want to create a new CSV file containing some fields from each record of a VCF file (CHROM, POS, REF, ALT, and the GT of each sample), but I am having trouble getting the GT values. The print call marked with a comment in the code below raises an error, even though I followed this page (https://pyvcf.readthedocs.io/en/v0.4.6/INTRO.html) to write it. What should I put in vcf_records.append() after alt so that all the GT values are included? I would appreciate any help. Thanks!

    import cyvcf2
    import pandas as pd
    from tqdm import tqdm

    def foo(vcf_file, input_csv_file, output_csv_file, num_lines=10000):

        csv_df = pd.read_csv(input_csv_file, sep=';')

        vcf_reader = cyvcf2.VCF(vcf_file)
        total_records_vcf = num_lines

        vcf_records = []
        for i, vcf_record in tqdm(enumerate(vcf_reader), total=total_records_vcf):
            chrom = vcf_record.CHROM
            pos = vcf_record.POS
            ref = vcf_record.REF
            alt = vcf_record.ALT[0]
            for sample in vcf_record.samples:
                # This raises: AttributeError: 'cyvcf2.cyvcf2.Variant' object has no attribute 'samples'
                print(sample['GT'])

            # TODO: also add the GT values for all samples to this dict
            vcf_records.append({'chrom': chrom, 'pos': pos, 'ref': ref, 'alt': alt})

... rest of the code
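
One possible way to fill that gap, offered as a sketch rather than a definitive fix: the linked page documents PyVCF, while the code uses cyvcf2, whose `Variant` objects expose per-sample genotypes through `variant.genotypes` (one list per sample, for example `[0, 1, False]` for alleles 0 and 1, unphased, with `-1` for a missing allele) and whose sample names live on the reader as `vcf_reader.samples`. The names `format_gt` and `vcf_to_csv` below are hypothetical, the standard `csv` module is used in place of pandas to keep the example small, and the input CSV is left out because its role is in the elided part of the code.

```python
import csv


def format_gt(genotype):
    """Turn a cyvcf2 genotype such as [0, 1, False] into a VCF-style string like '0/1'.

    The last element is the phased flag; the elements before it are allele
    indices, with negative values meaning "missing".
    """
    *alleles, phased = genotype
    separator = "|" if phased else "/"
    return separator.join("." if allele < 0 else str(allele) for allele in alleles)


def vcf_to_csv(vcf_file, output_csv_file, num_lines=10000):
    """Write chrom, pos, ref, alt and one GT column per sample (hypothetical helper)."""
    # Imported here so that format_gt can be used without cyvcf2 installed.
    from cyvcf2 import VCF

    vcf_reader = VCF(vcf_file)
    sample_names = vcf_reader.samples  # sample names are on the reader, not the record

    with open(output_csv_file, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=";")
        writer.writerow(["chrom", "pos", "ref", "alt", *sample_names])
        for i, variant in enumerate(vcf_reader):
            if i >= num_lines:
                break
            # ALT can be empty for monomorphic sites; fall back to "." in that case.
            alt = variant.ALT[0] if variant.ALT else "."
            genotypes = [format_gt(gt) for gt in variant.genotypes]
            writer.writerow([variant.CHROM, variant.POS, variant.REF, alt, *genotypes])
```

In the original loop, the same idea would amount to building a dict such as `dict(zip(vcf_reader.samples, map(format_gt, vcf_record.genotypes)))` and merging it into the record passed to `vcf_records.append()`. If base strings such as `A/G` are preferred over allele indices, cyvcf2 also provides `variant.gt_bases`.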
        

There are 0 answers