Figure 28: Proportion of the human genome under selection and the probability of a genomic window to be under selection on the basis of conservation score.
From: Initial sequencing and comparative analysis of the mouse genome

a, The genome-wide density of conservation scores, Sgenome (dark blue), was decomposed into a mixture of two component densities: Sneutral (red) and Sselected (light blue and grey). Sgenome is derived from the conservation scores S(R) for all 50-bp windows in the human genome with at least 45 bases aligning to mouse. The red curve is a scaled version of the Sneutral density (the blue curve in Fig. 23) for 50-bp windows in ancestral repeats, which represent neutrally evolving DNA. The light blue and grey component is the difference between the dark blue density and the red component, and is therefore a scaled version of Sselected, the predicted density of conservation scores for 50-bp windows in the human genome that are evolving under selection. The scaling factors are the estimated mixture coefficients: p0 = 0.792 for Sneutral and 1 - p0 = 0.208 for Sselected. The coefficient p0 is calculated as the minimum, over all values of S, of the ratio Sgenome(S)/Sneutral(S), giving a conservative estimate that maximizes the share of the mixture attributed to Sneutral. b, The probability, Pselected(S), that a 50-bp window is under selection, as a function of its conservation score S = S(R). This function is derived from the mixture decomposition by setting Pselected(S) = 1 - p0 × Sneutral(S)/Sgenome(S).
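The decomposition described in this legend can be illustrated with a minimal Python sketch. It assumes both densities are tabulated on the same set of conservation-score bins. The function name `mixture_decomposition` and the toy five-bin densities are hypothetical and chosen only for illustration; they are not the data or the code behind the figure.

```python
def mixture_decomposition(s_genome, s_neutral):
    """Estimate p0 and Pselected from two densities on shared score bins.

    s_genome  -- genome-wide density of conservation scores, one value per bin
    s_neutral -- density for neutrally evolving windows, same bins
    """
    if len(s_genome) != len(s_neutral):
        raise ValueError("densities must be tabulated on the same bins")

    # p0 is the minimum of Sgenome(S) / Sneutral(S), taken here over the
    # bins where the neutral density is positive (an assumption of this sketch).
    ratios = [g / n for g, n in zip(s_genome, s_neutral) if n > 0]
    if not ratios:
        raise ValueError("neutral density is zero in every bin")
    p0 = min(ratios)

    # Pselected(S) = 1 - p0 * Sneutral(S) / Sgenome(S), clamped to [0, 1]
    # to absorb floating-point rounding. Bins with no genome-wide mass
    # have no defined probability and are reported as None.
    p_selected = []
    for g, n in zip(s_genome, s_neutral):
        if g > 0:
            p_selected.append(min(1.0, max(0.0, 1.0 - p0 * n / g)))
        else:
            p_selected.append(None)
    return p0, p_selected


# Toy densities (hypothetical): the selected component has no mass in the
# two lowest-score bins, so the minimum ratio recovers the true coefficient.
s_neutral = [0.1, 0.4, 0.4, 0.1, 0.0]
s_selected = [0.0, 0.0, 0.2, 0.4, 0.4]
true_p0 = 0.8
s_genome = [true_p0 * n + (1 - true_p0) * s
            for n, s in zip(s_neutral, s_selected)]

p0, p_selected = mixture_decomposition(s_genome, s_neutral)
print("p0 =", round(p0, 3))
print("Pselected =", [round(p, 3) for p in p_selected])
```

In this toy case the estimate of p0 comes out at approximately 0.8, and Pselected rises from 0 in the lowest-score bin to 1 in the bin where the neutral density vanishes. With real, noisy histograms the minimum ratio is sensitive to sparsely populated bins, so some smoothing or tail trimming would likely be needed; how the original analysis handled this is not stated in the legend.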