Extended Data Fig. 5: Features of the chromosomal integration sites of intact proviruses from elite controllers after counting clonal sequences individually.
From: Distinct viral reservoirs in individuals with spontaneous control of HIV-1

a, Heat map indicating the relative proportion of proviral integration sites of intact proviruses in each chromosome in elite controllers, relative to corresponding data from long-term ART-treated individuals (ref. 14). Proviral integration site data from previous publications (refs. 9, 15, 17) are shown for comparison; integration sites from intact and defective proviruses were not distinguished in these studies. Contributions of each chromosome to the total number of genes (first row) and to the total size of the human genome (second row) are included as references. b, c, Proportion of near-full-length intact proviruses located in the indicated genomic regions. Data from near-full-length intact proviral sequences in long-term ART-treated individuals (ref. 14) are shown as a reference; chromosomal integration sites from unselected (intact and defective) proviral sequences in elite controllers (ref. 9) and in ART-treated individuals (refs. 15, 17) are also shown for comparison. d, SPICE diagrams (ref. 59) showing the proportion of intact proviruses with the indicated chromosomal integration site features in elite controllers and ART-treated individuals. e, f, Chromosomal distance from integration sites of intact proviruses to the most proximal transcriptional start site (determined by RNA-seq) (e) or to the most proximal ATAC-seq peak (f) in autologous total, central memory and effector memory CD4+ T cells and in the Genome Browser (GB). Horizontal lines show the geometric mean. g, Proportions of proviral sequences located in structural compartments A and B, as determined using previously published Hi-C-seq data (ref. 29). Chromosomal integration regions not covered in the previous study (ref. 29) were excluded from the analysis. f, g, Sequences in genomic regions included in the blacklist for functional genomics analysis identified by the ENCODE and modENCODE consortia (ref. 28) were excluded owing to the absence of reliable ATAC-seq and Hi-C-seq reads in such repetitive regions.
a–g, All members of clonal clusters were included as individual sequences. ****P < 0.0001, ***P < 0.001, **P < 0.01, *P < 0.05; FDR-adjusted two-sided Fisher’s exact tests were used for data shown in b and c; two-sided Fisher’s exact tests were used for data shown in d and g; FDR-adjusted two-tailed Mann–Whitney U-tests were used for data shown in e and f; all comparisons were made between elite controllers and reference groups.
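As an illustration of the statistics named for panels b and c, the following is a minimal sketch of a two-sided Fisher's exact test on a 2×2 table, followed by FDR adjustment across several comparisons. The legend does not state which FDR procedure was used; the Benjamini-Hochberg method is assumed here. The function names and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (in region, not in region) for elite controllers
    # versus one reference group, for three genomic regions.
    tables = {
        "region_1": (12, 38, 30, 20),
        "region_2": (9, 41, 11, 39),
        "region_3": (20, 30, 8, 42),
    }
    raw = [fisher_exact_two_sided(*t) for t in tables.values()]
    for name, p, q in zip(tables, raw, benjamini_hochberg(raw)):
        print(f"{name}: P = {p:.4g}, FDR-adjusted P = {q:.4g}")
```

Each table compares the proportion of proviruses in one genomic region between elite controllers and a reference group; the adjustment is then applied jointly to all comparisons within a panel.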