Extended Data Fig. 4: Performance with different filters. | Nature Genetics

From: Aberrant splicing prediction across human tissues

Precision-recall curves comparing the overall prediction performance across all GTEx tissues of SpliceAI, SpliceAI using SpliceMap annotation, SpliceAI using SpliceMap annotation along with quantitative reference levels of splicing, MMSplice using GENCODE annotation, MMSplice using SpliceMap annotation, MMSplice using SpliceMap annotation along with quantitative reference levels of splicing, and the integrative model AbSplice-DNA, using different filters for aberrantly spliced genes. a, Filter 1: FRASER default cutoffs (|ΔΨ| > 0.3, FDR < 0.05; 126,308 aberrant events). b, Filter 2: same as a, but restricted to genes that are aberrantly spliced in at least two different tissues from the same individual (32,886 aberrant events). c, Filter 3: same as a, but restricted to genes that have a rare variant within 250 bp of the splice sites (22,766 aberrant events). Although performance is highest with Filter 3, the relative improvements in precision at the same recall between the methods are the same as with Filter 2. In particular, restricting to variants within 250 bp of any detected split-read boundary (Filter 3) did not bias our analysis in favor of the splice-site-centric method MMSplice over SpliceAI. d, After applying Filter 3, outliers were stratified into ‘replicated’ (14,030 aberrant events), that is, appearing in at least two different tissues of the same individual, and ‘not replicated’ (8,736 aberrant events). All models showed significantly higher performance for aberrant splicing events replicated in two or more samples than for those reported in a single sample only.
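The three filters and the stratification in panel d can be read as successive selections over a table of splicing outlier calls. The following is a minimal sketch of that logic, not the authors' pipeline: the record layout, the field names (`individual`, `gene`, `tissue`, `delta_psi`, `fdr`, `variant_dist_bp`) and the toy values are hypothetical, and it assumes that replication in panel d is counted among the events that pass Filter 3.

```python
from collections import defaultdict

# Toy outlier calls. Field names and values are illustrative assumptions,
# not FRASER's output schema or real GTEx data.
# variant_dist_bp: distance from the nearest rare variant to the splice site,
# or None when the individual carries no rare variant near the gene.
events = [
    {"individual": "I1", "gene": "G1", "tissue": "Lung",  "delta_psi": 0.45,  "fdr": 0.01,  "variant_dist_bp": 12},
    {"individual": "I1", "gene": "G1", "tissue": "Liver", "delta_psi": -0.40, "fdr": 0.02,  "variant_dist_bp": 12},
    {"individual": "I1", "gene": "G2", "tissue": "Lung",  "delta_psi": 0.35,  "fdr": 0.03,  "variant_dist_bp": None},
    {"individual": "I2", "gene": "G1", "tissue": "Lung",  "delta_psi": 0.50,  "fdr": 0.001, "variant_dist_bp": 300},
    {"individual": "I2", "gene": "G3", "tissue": "Heart", "delta_psi": 0.20,  "fdr": 0.01,  "variant_dist_bp": 5},
    {"individual": "I3", "gene": "G4", "tissue": "Brain", "delta_psi": 0.60,  "fdr": 0.04,  "variant_dist_bp": 100},
]


def filter1(events, min_abs_dpsi=0.3, max_fdr=0.05):
    """Filter 1: cutoffs quoted in the caption (|delta psi| > 0.3, FDR < 0.05)."""
    return [e for e in events
            if abs(e["delta_psi"]) > min_abs_dpsi and e["fdr"] < max_fdr]


def replicated_keys(events):
    """(individual, gene) pairs seen in at least two different tissues."""
    tissues = defaultdict(set)
    for e in events:
        tissues[(e["individual"], e["gene"])].add(e["tissue"])
    return {key for key, seen in tissues.items() if len(seen) >= 2}


def filter2(events):
    """Filter 2: Filter 1, restricted to genes aberrant in >= 2 tissues of one individual."""
    passed = filter1(events)
    keys = replicated_keys(passed)
    return [e for e in passed if (e["individual"], e["gene"]) in keys]


def filter3(events, max_dist_bp=250):
    """Filter 3: Filter 1, restricted to events with a rare variant within 250 bp."""
    return [e for e in filter1(events)
            if e["variant_dist_bp"] is not None and e["variant_dist_bp"] <= max_dist_bp]


def stratify(events):
    """Panel d: split Filter 3 events into replicated and not replicated."""
    passed = filter3(events)
    keys = replicated_keys(passed)  # assumption: replication counted after Filter 3
    replicated = [e for e in passed if (e["individual"], e["gene"]) in keys]
    not_replicated = [e for e in passed if (e["individual"], e["gene"]) not in keys]
    return replicated, not_replicated


replicated, not_replicated = stratify(events)
print(len(filter1(events)), len(filter2(events)), len(filter3(events)),
      len(replicated), len(not_replicated))
```

By construction the two strata partition the Filter 3 set, which matches the counts in the caption: 14,030 + 8,736 = 22,766.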
