Fig. 1: MSA of 239 primate species.
From: Identification of constrained sequence elements across 239 primate genomes

a, Cladogram of primate species included in the MSA. The number of sampled species per family is given in parentheses. b, Ideogram of the human genome depicting the average number of species covered by the MSA at 500-kb resolution. Telomeric, centromeric and heterochromatic regions (light blue) are indicated. c, Cumulative primate species coverage of the human genome in the 239-way primate MSA. d, Per-base mismatch rate between newly generated short-read contigs and species with previously published high-quality reference assemblies. A linear regression fit with a corresponding 95% confidence interval ribbon is shown. e, Enrichment of primate phastCons elements for coding and noncoding genomic elements. The size of the circle represents the fraction of the human genome. The dashed grey line indicates an odds ratio (OR) of 1. CDS, coding sequence; TF, transcription factor; UTR, untranslated region. f, Codon periodicity in the mean primate phyloP scores across 482 protein-coding exons exactly 130 nucleotides in length. Coding sequences are shown in dark blue and flanking intronic sequences in beige.