Fig. 4: Adenine base editing of FXN GAA repeats in vitro.

a, Illustration of adenine base editing at GAA repeats (top) and schematic of the editing strategy (bottom). A smaller cartoon illustrates multiple binding opportunities for the Cas9-sgGAA complex at GAA repeats, with a magnified view showing a single binding event. b, Optimization of adenine base editing in HEK293T cells. Sequences below the bar plot indicate the NNNN PAM sequences compatible with the sgGAA spacer. Data are mean ± s.d. of biological triplicates. c, Comparison of AGAA PAM-targeting ABE8e strategies in FXN-mESCs, evaluated across 30 (FXN-30GAA-mES) and 50 (FXN-60-GAA-mES) GAA repeats. Data are mean ± s.d. of biological triplicates. d,e, CIRCLE-seq off-target hits in the human genome classified by the identity of the targeted region annotated with HOMER (d) and the number of mismatches with the sgGAA (e). f,g, Alternative target and off-target editing at CIRCLE-seq sites in HEK293T cells, confirmed by WGS (>0.5% editing) and classified based on the number (f) and the location (g) of mismatches relative to the sgGAA spacer. Horizontal lines in f mark the median and quartiles calculated for all loci in each group. Editing for each locus is the mean of triplicates. Mean editing (%) in g represents the base editing frequency across genomic sites meeting the specified mismatch criteria. Mismatch category A includes the five nucleotides proximal to the PAM (positions 1–5), category B represents positions 6–10 and category C spans the last ten, PAM-distal nucleotides (positions 11–20) of the protospacer. A0–A5, B0–B5 and C0–C10 indicate the number of mismatches (0–5) between the sgGAA and a target site in categories A, B and C. Each square shows the number of loci with >0.5% editing in each mismatch subgroup. h, Editing frequencies at CIRCLE-seq sites with 0–4 mismatches between the sgGAA and a target site in HEK293T cells, measured by HTS. Each dot shows mean editing at a unique locus; diamonds indicate protein-coding sites. Data are mean ± s.d. of all loci in each category. Data in f–h represent biological triplicates. Illustration in a was created using BioRender.com. TSS, transcription start site; UTR, untranslated region.
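The mismatch categories used in g can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the figure. It assumes that the spacer and the candidate site are both written 5'→3', that the PAM lies 3' of the protospacer, and that position 1 is therefore the 3'-most protospacer nucleotide. The function name `mismatch_categories` and the example sequences are hypothetical placeholders, and the spacer shown is not the actual sgGAA sequence.

```python
def mismatch_categories(spacer: str, site: str) -> str:
    """Label a site by its mismatch counts in windows A, B and C.

    Assumes 20-nt sequences written 5'->3' with the PAM on the 3' side,
    so position 1 (PAM-proximal) is the last character of each string.
    """
    if len(spacer) != 20 or len(site) != 20:
        raise ValueError("expected a 20-nt spacer and a 20-nt protospacer")

    # Reverse both sequences so that index 0 corresponds to position 1.
    pairs = list(zip(reversed(spacer.upper()), reversed(site.upper())))

    windows = {
        "A": pairs[0:5],    # positions 1-5, PAM-proximal
        "B": pairs[5:10],   # positions 6-10
        "C": pairs[10:20],  # positions 11-20, PAM-distal
    }
    return "".join(
        f"{name}{sum(s != t for s, t in window)}"
        for name, window in windows.items()
    )


# Hypothetical 20-nt GAA-repeat spacer, for illustration only.
example_spacer = "GAAGAAGAAGAAGAAGAAGA"
# Same sequence with one substitution at the PAM-adjacent (3'-most) base.
example_site = "GAAGAAGAAGAAGAAGAAGT"
label = mismatch_categories(example_spacer, example_site)  # "A1B0C0"
```

Under these assumptions, a label such as A1B0C0 denotes a site with one mismatch in positions 1–5 and none elsewhere, which corresponds to one square in the grid of mismatch subgroups.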