Extended Data Fig. 2: Features and application of the SGREELI algorithm.
From: Single-step discovery of high-affinity RNA ligands by UltraSelex

a, Number of HTS reads per RNA species in the UltraSelex input library. Three independent replicates. b, Decreasing sequence diversity with increasing group number e1–11 in the SiR-B UltraSelex library. The x axis indicates the log2 of the number of HTS reads per species, and the y axis denotes the number of RNA species with the specified abundance. The partitioning eluate (e) number is denoted individually. c, Percentage of RNA species with ≥2 HTS reads in the different partitioning groups (e1–11) of UltraSelex library SiR-B. The dashed gray line represents 1% as a reference. d, Formula and simulation of γ functions, assigning γi values to different partitioning groups gi for i ≥ 2. c = 0.5; eluate_nums, number of partitioning eluates (here 11). Colors in the line plot represent different exponents θ from 0.1 to 10. e, Illustration of the SGREELI calculation protocol, comprising seven steps. After sequencing and quality control, fold-change (FC) values are calculated for each eluate (FC2–FC11 for e2–11) by normalizing read counts to their abundance in eluate e1. In each eluate group, the RNA species that ranks at exactly the top 1% by FC value is defined as the reference RNA (ref2–ref11), and its FC value is denoted FCref. The FC values of these reference RNAs are log2-transformed (log2 FCref). Each eluate group is then assigned a specific γ value, here based on a linear function with exponent θ = 1, reflecting the expected elution of strong binders in later fractions. The log2 FCref value is then divided by γ to derive a normalization factor for each eluate group. The FC values of all RNA species in a given eluate group are log2-transformed and adjusted by this normalization factor, yielding intra- and inter-group adjusted fold-change (aFC) values. Finally, the area under the curve (AUC) is computed for each species by integrating its aFC values, using the corresponding γ values as the x-axis coordinates.
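
The calculation in panel e can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the published implementation. The γ formula appears only in the figure, so `gamma_values` is a hypothetical stand-in that rises from c at eluate 2 to 1 at the last eluate. The pseudocount, the library-size normalization, the reading of "adjusted by" as division by the normalization factor, and the trapezoidal integration are likewise assumptions, and the function names are illustrative.

```python
import numpy as np

def gamma_values(eluate_nums=11, c=0.5, theta=1.0):
    """Hypothetical gamma function (an assumption, not the published formula):
    one value per eluate i = 2..eluate_nums, rising from c to 1."""
    i = np.arange(2, eluate_nums + 1)
    return c + (1 - c) * ((i - 2) / (eluate_nums - 2)) ** theta

def sgreeli_auc(counts, gammas, top_fraction=0.01, pseudocount=1.0):
    """counts: (n_species, n_eluates) read counts, column 0 being eluate e1.
    Returns the adjusted fold changes (aFC) and the per-species AUC."""
    counts = np.asarray(counts, dtype=float) + pseudocount  # assumed pseudocount
    freq = counts / counts.sum(axis=0)        # assumed library-size normalization
    log_fc = np.log2(freq[:, 1:] / freq[:, [0]])  # log2 FC of e2.. relative to e1

    # Reference RNA per eluate: the species ranked at the top 1% by FC.
    n_species = log_fc.shape[0]
    k = max(int(np.ceil(top_fraction * n_species)) - 1, 0)
    log_fc_ref = np.sort(log_fc, axis=0)[::-1][k]

    # Normalization factor per eluate; assumes log2 FCref is positive.
    norm = log_fc_ref / gammas
    afc = log_fc / norm  # assumed form of the adjustment (division)

    # Trapezoidal integration of aFC over the gamma values.
    dx = np.diff(gammas)
    auc = ((afc[:, 1:] + afc[:, :-1]) / 2 * dx).sum(axis=1)
    return afc, auc

# Toy input: 200 species, 11 eluates, 5 species enriched in e2-e11.
counts = np.full((200, 11), 10.0)
counts[:5, 1:] = 1000.0
afc, auc = sgreeli_auc(counts, gamma_values())
```

Under these assumptions, the reference RNA of each eluate receives an aFC equal to that eluate's γ, so species that track the reference across all eluates accumulate the largest AUC.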