BCFtools merge command collapsing non-identical variants to single record


I am trying to combine variant call files (VCFs) from 60 different individuals using BCFtools merge. The problem is that in the resulting file, non-identical variants that share the same start position are collapsed into a single record (line). The --collapse none argument seems to address this for other commands, such as isec, but it isn't available for merge. I would just pull those records apart manually after merging, but there doesn't seem to be a way to tell which allelic ratio corresponds to which variant after the merge.

My command: bcftools merge --file-list vcf_filenames -Oz -o merged_files.vcf.gz

Expected: non-identical variants are written to separate records (lines).

Actual: non-identical variants are output in the same record, with no way to tell which allelic ratio for each patient belongs to which variant call.

Single record for multiple unique events and unclear allelic ratio assignment


There is 1 answer

ekerde On

Have you looked at the -m option? From the manual:

-m none .. no new multiallelics, output multiple records instead

So it will create what you want: non-identical variants on different lines.
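For reference, here is a minimal sketch of the command from the question with -m none added. It only builds and prints the argument list and does not run bcftools. The function name build_merge_command is hypothetical, and the file names are the ones from the question.

```python
import shlex


def build_merge_command(file_list, output, merge_mode="none"):
    """Return the argument list for a bcftools merge call with the given -m mode.

    Hypothetical helper for illustration; it does not execute anything.
    """
    return [
        "bcftools", "merge",
        "-m", merge_mode,          # "none": no new multiallelics, separate records
        "--file-list", file_list,  # text file listing one input VCF per line
        "-Oz",                     # compressed VCF output
        "-o", output,
    ]


cmd = build_merge_command("vcf_filenames", "merged_files.vcf.gz")
# Print the shell-quoted form so it can be copied into a terminal.
print(shlex.join(cmd))
```

On a machine with bcftools installed, the same list could be passed to subprocess.run(cmd, check=True) instead of being printed.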

Another option is to give each SNP an ID (for example 'position:REF:ALT') and merge only records that share the same ID:

-m id .. merge by ID
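A rough sketch of the ID step, assuming plain-text (uncompressed) VCF lines with one ALT allele per record. The function name set_variant_ids is hypothetical, and prefixing the chromosome to 'position:REF:ALT' is my own assumption, added so that IDs stay unique across chromosomes. bcftools annotate also has a --set-id option that can assign IDs without custom code.

```python
def set_variant_ids(lines):
    """Yield VCF lines with the ID column rewritten as CHROM:POS:REF:ALT.

    Hypothetical helper: header lines (starting with '#') and blank lines
    pass through unchanged. Assumes tab-separated records with one ALT each.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        ending = "\n" if line.endswith("\n") else ""
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _old_id, ref, alt = fields[:5]
        fields[2] = f"{chrom}:{pos}:{ref}:{alt}"  # column 3 is the VCF ID field
        yield "\t".join(fields) + ending


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t.\tPASS\t.",
    "1\t100\t.\tA\tAT\t.\tPASS\t.",
]
for record in set_variant_ids(example):
    print(record)
```

With IDs like these in every input file, the two variants at position 100 get different IDs (1:100:A:G and 1:100:A:AT), so merging by ID should keep them on separate records.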