I would like to read all the calls in a specific region of the genome, regardless of genotype (matching the reference genome or alternate, in a coding or non-coding region), assuming that the whole genome was sequenced. Which of the following tables should I look at?
I am using Google BigQuery genomics data and need an explanation of the differences between the following table suffixes: *.genome_calls, *.variants, *.multisample_variants, *.single_sample_genome_calls
Many thanks, Eila
genome_calls - all genomic calls, not only variants. It may include low-quality calls.
multisample_variants - variants by sample. Each variant occupies a single row that lists all the samples harboring that mutation.
single_sample_genome_calls - a variants-by-samples matrix. A variant that exists in multiple samples has one row per sample.
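
To make the difference in row layout concrete, here is a minimal Python sketch on made-up rows. It does not query BigQuery. The field names (reference_name, start, end, call, call_set_name, genotype) and the half-open [start, end) coordinates are assumptions for illustration, so check them against the schema of your own tables. It filters multisample-style rows (one row per variant) to a region, then expands them to one row per sample, which is roughly the shape described for the single-sample table.

```python
# Toy data only: field names and coordinates are assumed, not taken from a real table.

def overlaps(row, chrom, region_start, region_end):
    """True if a half-open [start, end) row overlaps the half-open query region."""
    return (row["reference_name"] == chrom
            and row["start"] < region_end
            and row["end"] > region_start)

def flatten_calls(multisample_rows):
    """Turn one-row-per-variant records into one-row-per-sample records."""
    flat = []
    for row in multisample_rows:
        for call in row["call"]:
            flat.append({
                "reference_name": row["reference_name"],
                "start": row["start"],
                "end": row["end"],
                "call_set_name": call["call_set_name"],
                "genotype": call["genotype"],
            })
    return flat

# Two hypothetical variants; the first is carried by two samples.
multisample_rows = [
    {"reference_name": "chr1", "start": 100, "end": 101,
     "call": [{"call_set_name": "sampleA", "genotype": [0, 1]},
              {"call_set_name": "sampleB", "genotype": [1, 1]}]},
    {"reference_name": "chr1", "start": 5000, "end": 5001,
     "call": [{"call_set_name": "sampleA", "genotype": [0, 1]}]},
]

# Keep only rows overlapping chr1:50-200, then expand to one row per sample.
in_region = [r for r in multisample_rows if overlaps(r, "chr1", 50, 200)]
per_sample = flatten_calls(in_region)
```

Here the one variant inside the region stays a single row in the multisample layout and becomes two rows, one for each sample, after flattening.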