Google Genomics BigQuery: differences between tables (data description)


I would like to read all the calls in a specific region of the genome, regardless of genotype (equal to the reference genome or alternate, coding or non-coding region), assuming that the whole genome was sequenced. Which of the following tables should I look at?

I am using the Google Genomics data in BigQuery and need an explanation of the differences between the following table name suffixes: *.genome_calls, *.variants, *.multisample_variants, *.single_sample_genome_calls

Many thanks, Eila


There is 1 answer

eilalan On

genome_calls - all genomic calls, not only variants. May include low-quality calls.

multisample_variants - variants by sample, where each variant row lists all the samples that harbor that mutation.

single_sample_genome_calls - a variants-by-samples matrix. A variant that exists in multiple samples has one row per sample.
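To make the difference between the two row layouts concrete, here is a minimal sketch in plain Python. It does not touch BigQuery, and the field names (reference_name, start, ref, alt, samples, sample) and the values are made up for illustration; they are assumptions, not the real schema of these tables. It only shows one row per variant (as described for multisample_variants) being expanded into one row per variant and sample (as described for single_sample_genome_calls).

```python
# Toy rows only: field names and values are illustrative assumptions,
# not the actual schema of any Google Genomics BigQuery table.
multisample_rows = [
    {"reference_name": "chr1", "start": 1000, "ref": "A", "alt": "G",
     "samples": ["S1", "S2"]},
    {"reference_name": "chr1", "start": 2000, "ref": "C", "alt": "T",
     "samples": ["S2"]},
]


def to_single_sample_rows(rows):
    """Expand one-row-per-variant records into one row per (variant, sample)."""
    expanded = []
    for row in rows:
        for sample in row["samples"]:
            # Copy every variant field except the sample list itself.
            single = {k: v for k, v in row.items() if k != "samples"}
            single["sample"] = sample
            expanded.append(single)
    return expanded


single_sample_rows = to_single_sample_rows(multisample_rows)
for row in single_sample_rows:
    print(row)
```

In this toy data, the variant at position 1000 is carried by two samples, so it appears once in the multisample layout and twice in the per-sample layout.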