I have .fam, .bed and .bim files with markers for a few individuals. I need to convert them into a VCF file.
Could someone help me create the VCF file? Are there any open-source tools that can do this?
You could try PlinkSeq, or see this post: http://bhoom.wordpress.com/2012/04/06/convert-plink-format-to-vcf/
Briefly, the post lists user code for turning plink files into vcf format:
#!/bin/sh
##-- SCRIPT PARAMETER TO MODIFY--##
PLINKFILE=csOmni25
REF_ALLELE_FILE=csOmni25.refAllele
NEWPLINKFILE=csOmni25Ref
PLINKSEQ_PROJECT=csGWAS
## ------END SCRIPT PARAMETER------ ##
#1. convert the plink binary fileset to use the specified reference allele
plink --noweb --bfile $PLINKFILE --reference-allele $REF_ALLELE_FILE --make-bed --out $NEWPLINKFILE
#2. create plink/seq project
pseq $PLINKSEQ_PROJECT new-project
#3. load plink file into plink/seq
pseq $PLINKSEQ_PROJECT load-plink --file $NEWPLINKFILE --id $NEWPLINKFILE
#4. write out the vcf file. As of 4/6/2012, using vcftools version 0.1.8, the
#   documentation says a compressed vcf can be written with the --format BGZF
#   option, but vcftools doesn't recognize that option, so the output is piped
#   through gzip instead
pseq $PLINKSEQ_PROJECT write-vcf | gzip > $NEWPLINKFILE.vcf.gz
You can perform this operation with plink2 (https://www.cog-genomics.org/plink2/) with the following command:
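A typical invocation, as a minimal sketch: the fileset prefix `mydata` is a hypothetical placeholder (it stands for mydata.bed, mydata.bim and mydata.fam), and the flags assume PLINK 1.9, where `--recode vcf` writes a file named after the `--out` prefix.

```shell
#!/bin/sh
# Hypothetical prefix: expects mydata.bed, mydata.bim and mydata.fam
BFILE=mydata
OUT=mydata

# --recode vcf asks PLINK 1.9 to write $OUT.vcf
if command -v plink >/dev/null 2>&1; then
    plink --bfile "$BFILE" --recode vcf --out "$OUT"
else
    echo "plink not found on PATH" >&2
fi
```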
See here for more options: https://www.cog-genomics.org/plink2/data#recode

However, this will not generate a properly formatted VCF, because plink2 does not keep track of which allele is the reference allele, while the VCF format expects the first allele to be the reference allele. Indels are also often coded differently, though there is no guideline for how to code them in plink format.
For more advanced ways to perform the conversion, a combination of "bedtools getfasta" and "bcftools norm" can help you overcome the above shortcomings.
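As a rough sketch of the "bcftools norm" part only: the file names below are hypothetical, and it assumes a reference FASTA whose chromosome names match those in the VCF. The `--check-ref s` mode swaps REF/ALT where the recorded REF disagrees with the FASTA, but it does not fix strand problems, so the output should be checked before use.

```shell
#!/bin/sh
# Hypothetical file names; adjust to your own data
IN_VCF=mydata.vcf
REF_FASTA=reference.fa
OUT_VCF=mydata.ref.vcf.gz

# --check-ref s: set/fix sites whose REF allele does not match the FASTA
# -Oz: write bgzip-compressed VCF
if command -v bcftools >/dev/null 2>&1; then
    bcftools norm --check-ref s --fasta-ref "$REF_FASTA" -Oz -o "$OUT_VCF" "$IN_VCF"
else
    echo "bcftools not found on PATH" >&2
fi
```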