Extended Data Fig. 10: Genome size variation in the ScRAP.

a. Scatter plots showing the genome size of each strain, split by dataset, as a function of the number of Y’ elements (left), Ty elements (middle) and Y’ + Ty elements (right). b. Number of TE sequences per strain across the 142 haploid/collapsed genome assemblies. All sequences from the 5 Ty families are pooled together by category. c. Number of complete Ty elements per strain across the 142 haploid/collapsed genome assemblies. d. Distribution of the 126 insertion sites across the 100 haploid or homozygous genomes, considering either the complete Ty elements only or all types of TE sequences (complete, truncated and solo-LTRs). e. Scatter plot of the number of solo-LTRs per insertion site against the number of strains sharing that insertion site. The Pearson correlation coefficient and its associated two-tailed t-test p-value were calculated using the stat_cor function in R. f. Map of the de novo insertions of complete Ty elements across the 100 homozygous strains explored. The map shows the 61 insertion sites at which only complete elements, and never solo-LTRs, are found, which strongly suggests that these sites correspond to recent insertions. Strains are ordered according to the nuclear phylogenetic tree (Fig. 1).