Fig. 5: Distribution of SNV and small INDEL variants within tandem repeats throughout GRCh38 using the HG002 Q100 variant benchmark. | Nature Communications

From: The GIAB genomic stratifications resource for human reference genomes

a The distribution of the number of variants per repeat. The y-axis shows the number of tandem repeats and the x-axis shows the number of variants in each tandem repeat. b, c Among repeats with only one variant, the fraction of each variant class by chromosome (b) and the distribution of intersecting variants, classified by type, according to repeat length (c). INDEL2: INDELs with length <= 2; INDEL49: INDELs with length > 2 and <= 49; SNV: single nucleotide variants; SV: structural variants. d Number of variants in tandem repeats, segmental duplications, and all other regions. e Variant density in regions of tandem repeats and segmental duplications. f Performance within new stratifications, using the HG002 Q100 benchmark and an Illumina DeepVariant query callset, for tandem repeat regions containing different numbers of variants. g Performance within regions with different genomic distances between variants. h Performance within regions with different coverage values, for a variant set called from a BAM file with a mean coverage of 40×. For (f-h), each bar represents the mean of the given metric. Error bars are 95% binomial confidence intervals computed with the Wilson method (see Methods).
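The error bars in panels f-h are described as Wilson binomial confidence intervals. As a rough illustration of what that computation involves, the following is a minimal Python sketch of the standard Wilson score interval for a proportion. The function name `wilson_interval` and its parameters are hypothetical and are not taken from the paper; the Methods may rely on a different implementation or library.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a binomial proportion.

    Illustrative sketch only: the name and signature are assumptions,
    not the implementation used for the figure.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")

    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / trials

    # Standard Wilson score formula: a shrunken center plus/minus a half-width.
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials))
    ) / denom

    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)
```

For a metric such as recall, `successes` would be the true-positive count and `trials` the sum of true positives and false negatives within a stratification. Unlike the simpler normal-approximation interval, the Wilson interval stays within [0, 1] and remains usable when a stratum contains few variants or the proportion is close to 0 or 1.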
