Abstract
Identifying individuals at high risk of type 1 diabetes (T1D) is crucial as disease-delaying medications are available. Here we report a microRNA (miRNA)-based dynamic (responsive to the environment) risk score developed using multicenter, multiethnic and multicountry (‘multicontext’) cohorts for T1D risk stratification. Discovery (wet and dry lab) analysis identified 50 miRNAs associated with functional β cell loss, which is a hallmark of T1D. These miRNAs measured across n = 2,204 individuals from four contexts (4C: Australia, Denmark, Hong Kong SAR People’s Republic of China, India) led to a four-context, miRNA-based dynamic risk score (DRS) that effectively stratified individuals with and without T1D. Generative artificial intelligence was used to create an enhanced four-context, miRNA-based DRS, which offered good predictive power (area under the curve = 0.84) for T1D stratification in a separate multicontext validation dataset (n = 662), and accurately predicted future exogenous insulin requirement at 1 hour of islet transplantation. In a clinical trial assessing the imatinib drug therapy, baseline miRNA signature, rather than clinical characteristics, distinguished drug responders from nonresponders at 1 year. This study harnessed machine learning/generative artificial intelligence approaches, identifying and validating a miRNA-based DRS for T1D discrimination and treatment efficacy prediction.
Similar content being viewed by others
Main
Functional loss of islet β cells is a common feature of progression to clinical diagnosis of type 1 diabetes (T1D), in type 2 diabetes (T2D) and after islet cell transplantation1. Islet cell death commences many years before clinical diagnosis of T1D, which usually occurs when more than 60% of islet β cells have been lost2,3 to autoreactive immune cells. Autoantibodies4 and human leukocyte antigen (HLA) genotypes5,6 are well-recognized biomarkers in T1D. However, autoantibodies already mark the start of progression to T1D, reminding us of ‘the streetlight effect’7. Although a strong genetic link to the development of T1D is known8,9, 80% of individuals who develop T1D have no first-degree or second-degree relatives with T1D10. Discordance for T1D between monozygotic twins11,12 implies that nongenetic (that is, variable or dynamic) factors influence the outcome of genetic predisposition to T1D. The identification of new drugs13,14,15,16,17,18, such as the anti-CD3 monoclonal antibody teplizumab13, and other immunotherapies (including rituximab, abatacept, alefacept, antithymocyte globulin, imatinib14, golimumab15 and baricitinib18) demonstrate that the course of T1D progression can be changed, necessitating the development and validation of biomarkers that capture dynamic changes in T1D risk19.
The Global Platform for the Prevention of Autoimmune Diabetes-Primary Oral Insulin Trial (GPPAD POInT) study20 demonstrated that children progressing to stage 1 T1D showed a sharp and sustained rise in postprandial blood glucose ~2 months before seroconversion. Recent studies also underscored the link between HbA1c and progression to autoimmunity in T1D21,22. These studies revealed that dynamic biomarkers of β cell function may be important candidates for developing a score that captures T1D risk relative to a changing environment19. MicroRNAs (miRNAs) are functional23,24, dynamic biomarkers25,26, and regulators27,28,29 of islet cell function. They are differentially expressed in immune cell subtypes30,31 and in the plasma and serum of individuals with and without T1D (Supplementary Table 1). Mature miRNAs are short, noncoding, single-stranded RNA molecules that regulate gene expression posttranscriptionally29. MiRNAs are valuable biomarkers that can offer readouts of β cell function over time24,32, and can be reliably measured33 from stored clinical samples.
In recent years, many biomarkers have been tested for their potential to stratify individuals with and without T1D19,34,35. In this study, we demonstrate the development of a dynamic risk score (DRS)19 based on circulating miRNA levels of individuals from a multicontext PREDICT T1D (Plasma RNA Evaluation for Diagnosing Incident risk of Clinical Type 1 Diabetes) study cohort. We present the discovery, multicontext validation, and usability of this miRNA-based DRS in T1D risk stratification, T1D misdiagnosis, and in predicting the efficacy of existing and emerging T1D therapies.
Results
Study populations and strategy
This study includes miRNA analysis of participants’ plasma and tissue samples from four continents across seven countries: Australia (AUS); Canada (CAN); Denmark (DNK); Hong Kong SAR China (HKG); India (IND); New Zealand (NZL); and the United States of America (USA). Processing times were similar (≤1 freeze–thaw) across all samples. The demographic characteristics (age at sample collection and sex) of the study participants are provided in Supplementary Table 2. Age at T1D diagnosis, T1D duration, genetic risk score (GRS) and autoantibody data were available for a subset of participants from DNK and HKG.
Study participants were from diverse backgrounds (countries), influenced by factors such as ethnicity, geography, culture, beliefs, socioeconomic status and urbanization. While understanding ethnic diversity is important to identify differences in biomarkers, focusing on country level offers a more practical and inclusive way to develop context-specific health policies. Therefore, we defined participants from a country as a ‘context’, using the three-letter country code to represent diversity in age, ethnicity and socioeconomic status.
Discovering a set of 50 PREDICT T1D miRNAs
A stepwise discovery analysis was carried out by profiling 754 known and validated miRNAs from the developing human pancreas (n = 55)23, human islet and nonpancreatic tissues (total, n = 189)27, and in plasma samples from patients with recent-onset T1D (n = 5) and age-matched and sex-matched controls (n = 5) (Fig. 1a). We identified miRNAs whose expression was significantly different between islet and nonpancreatic tissues, during β cell development, and in T1D versus control plasma samples (Fig. 1b and Supplementary Table 3). We also included five miRNAs that were not statistically significant in our analysis but were in previous studies (Fig. 1b). This set of 50 miRNAs is henceforth referred to as PREDICT T1D miRNAs.
a, Schematic flow chart of the four elements of this study (shown using block arrows at the top), details of the discovery cohorts, DRS generation and validation datasets, and their application in T1D therapy datasets. GAI-aided eDRS4C methods are detailed in the text. b, Significant miRNAs identified through our wet lab discovery analyses across n = 254 human samples (including plasma from n = 5 controls and n = 5 participants with newly diagnosed T1D) are shown at the top. Additionally, miRNAs from the literature that were reported to be significant but not reaching significance in our wet lab discovery analyses were also included in this PREDICT T1D miRNA panel. Details of miRNA spike-in controls, internal and positive controls, and negative controls are also provided. The Sankey plot shows the miRNA categories based on our discovery analyses (top) and all published reports (Supplementary Table 1) wherein these miRNAs have been associated with HLA, autoantibody, early diagnosis or T1D versus control (indicated by the filled blue circles), emphasizing the reason for these miRNAs to be included in the panel.
We also confirmed that 21 studies, published until 2020 (Supplementary Table 1), reported a majority of PREDICT T1D miRNAs to be significantly and differentially expressed in the plasma and serum of n = 2,583 individuals (1,609 T1D and 974 Controls). Continued analyses of PREDICT T1D miRNAs in articles published subsequently (from 2021 to 2024) confirmed that more than 90% miRNAs in our wet lab discovery dataset are recognized across other studies and laboratories to be associated with HLA (genetic risk for T1D), autoantibody expression, T1D status or early onset (<15 years of age; Supplementary Table 1). Our miRNA panel also included negative controls, internal controls (known cellular and housekeeping miRNAs) and stage-specific spike-in controls (shown in Fig. 1b and Supplementary Table 4). We also confirmed (Extended Data Fig. 1) that these miRNAs were detected in the serum-free culture supernatants of human islets exposed to increasing concentrations of a nitric oxide donor (sodium nitroprusside). Islet cell death was maximum at a concentration of 10 mM sodium nitroprusside, which resulted in the highest abundance of PREDICT T1D miRNAs in the supernatant.
Development of a miRNA-based DRS for T1D
We profiled PREDICT T1D miRNAs in the plasma of n = 2,204 study participants (controls without T1D n = 752, participants with T1D n = 1,452) from four contexts (AUS, DNK, HKG and IND). Each of the PREDICT T1D miRNAs, except for the three negative controls (miR-558, miR-188-3p and miR-220c), were significantly different between participants with T1D and controls either within a single context or across all four contexts (4C) together (Extended Data Fig. 2). Genomic mapping of PREDICT T1D miRNAs indicated their localization to several chromosomes, with only one miRNA (miR-30a-5p) located on chromosome 6, albeit not within the HLA locus (Extended Data Fig. 3 and Supplementary Table 4). None of the PREDICT T1D miRNAs were located on the Y chromosome. Our feature selector algorithm did not identify participants’ sex as a covariate in discriminating participants with T1D from controls.
Some of these miRNAs correlated (P < 0.05, r > 0.9) with each other within individual contexts (Extended Data Fig. 3). In DNK siblings, almost all miRNAs correlated negatively with the autoantibody measurements (IA-2A, GADA, ZnT8R and ZnT8W), whereas in DNK participants with T1D, most of the miRNAs demonstrated positive and significant correlation with three of the measured autoantibodies (IA-2A, ZnT8R and ZnT8W) (Extended Data Fig. 4). In another subanalysis, age at T1D onset had an inverse significant correlation with most of the miRNAs in the DNK and HKG contexts, whereas duration of diabetes was positively and significantly correlated with a subset of miRNAs in the HKG T1D group only (Extended Data Fig. 5).
A four-context, miRNA-based risk score (DRS4C) was generated using a machine learning approach employing random forest algorithm with data randomized to the training and testing datasets (Fig. 2a). DRS4C yielded a good receiver operating characteristic (ROC) curve (Fig. 2b) on the testing dataset (n = 661). Model development assessed participants’ age at sampling (in years) and all 50 PREDICT T1D miRNAs. The model performance metrics are provided in Fig. 2c while the top ten model variables are presented in Fig. 2d.
a, Details of the study samples, their T1D status across the four contexts and the number of samples randomized for a training dataset (to generate the DRS4C model) and a testing dataset (to assess the performance of this model). b, ROC curve demonstrating the performance of DRS4C on the multicontext testing dataset (n = 661). The dashed black diagonal line is the discriminatory line; the solid blue and green lines show the ROC for class 0 (control) and 1 (T1D), respectively, while the dashed green lines represent the micro-average and macro-average ROCs (0.78–0.81). c, The key performance measures for DRS4C. d, A SHAP value beeswarm plot indicating the most important variables contributing to the performance of the DRS4C. Each point on the plot represents the SHAP value for study participants, with the position along the x axis representing the magnitude of the SHAP value and the color representing the weight for the feature value.
We also developed single-context-specific risk models to identify the key variables (between age and the 50 miRNAs) that contribute to efficient discrimination of participants with T1D from controls/siblings in each context (Supplementary Table 5). We used the percentage Mean Decrease Gini (MDG), an estimate of feature importance in random forest workflows to rank these variables according to their relative importance (weights). As expected, certain miRNAs showed greater importance in specific contexts, capturing their capacity to identify distinct ancestries, ethnicities and environments (Extended Data Fig. 6). Context-specific (AUS, DNK, HKG and IND) miRNA-based risk scores performed best when tested on the same context (area under the curve (AUC) range = 0.70–1.00; Supplementary Table 5) but not on a combined testing dataset from all four contexts together (AUC range = 0.48–0.56; Supplementary Table 6).
Generative artificial intelligence for model enhancement
Although the DRS4C model was generated on a reasonably large dataset (n = 1,543), the levels of expression for each variable followed a distribution leaving several probabilities missing for miRNA expression or age (Fig. 3a). Generative artificial intelligence (GAI) was used to create synthetic datasets (Fig. 3b) that augmented the original training dataset, ensuring comprehensive coverage of miRNA expression level and age probabilities across all contexts. A total of n = 1,000, n = 10,000 and n = 100,000 synthetic control participants were created using age and miRNA expression levels across four-context control samples in the training dataset (Fig. 3b). These 1,000, 10,000 and 100,000 synthetic control samples were added to the training (control + T1D) samples to create augmented (training + synthetic) datasets for generating three enhanced (e)DRS4C models (1,000, 10,000 and 100,000). The potential of eDRS4C to discriminate participants with T1D from controls was assessed in an independent validation dataset (n = 662 individuals) from AUS, CAN, IND, NZL and the USA. All three eDRS4C models offered a ROC AUC of more than 0.797 (Fig. 3c), with the eDRS4C-100,000 offering the highest accuracy (0.779), F1 score (0.770) and AUC (0.841).
a, Rationale for using GAI. Left, A theoretical distribution of a single variable across individuals from four contexts is shown. GAI allows the creation of synthetic samples that can maximize all probabilities of variable expression (gray filled plots on the right), while preserving the original data distribution. b, We used the Gaussian copula Synthetic Data Vault (SDV) workflow to create 1,000, 10,000 or 100,000 synthetic control samples. The principal component analysis (PCA) plots present the distribution of real (control and T1D) and synthetic (gray) datasets. These augmented (real + synthetic) datasets containing 1,000, 10,000 or 100,000 synthetic samples were used to develop the eDRS4C model. c, Performance characteristics of the eDRS4C models developed from the augmented datasets containing 1,000, 10,000 and 100,000 synthetic samples on 662 samples (controls n = 364 and T1D n = 298) in an independent validation cohort. a.u., arbitrary unit.
The approach to minimize the number of variables for T1D discrimination was also assessed. Based on the SHapley Additive exPlanations (SHAP) values (Fig. 2d), we developed a risk score that assessed the potential of the top ten important variables in discriminating participants with T1D from controls (validation dataset, n = 662). This eDRS4C10 (based on the top ten variables) showed a comparable accuracy (0.675), F1 score (0.711) and AUC (0.810) to those of the eDRS4C-1,000 model.
Application of the eDRS4C in predicting insulin independence
Next, we assessed the application of this risk score in human clinical islet transplant settings (Fig. 4a), a recognized T1D therapy. Many transplanted islets die in the first 24 h of islet transplantation in recipients with T1D36. We calculated the eDRS4C-100,000 in plasma samples from a longitudinal cohort of n = 31 individuals before transplantation and 1 h and 24 h after transplantation (Fig. 4b) to predict insulin independence at 1 month after transplantation (Fig. 4c). Participants were considered as responders if their 1-month exogenous insulin requirement was less than 0.10 U kg−1 d−1 (n = 13), while those with higher insulin requirement (more than 0.12 U kg−1 d−1, n = 18) were nonresponders. The performance of the eDRS4C model (Fig. 4d) was superior at 1 h after transplantation than before transplantation or 24 h after transplantation, indicating that the best prediction of responders to T1D therapy can be made at an hour from transplantation. As our signature was derived from and is representative of functional β cells (Fig. 1b), the 1-h measurement captures the miRNAs released from stressed, dying or dead β cells after islet isolation and transplantation. Dead or stressed β cells also demonstrated increased release of PREDICT T1D miRNAs as seen during in vitro sodium nitroprusside-induced islet cell death (Extended Data Fig. 1). The predictive accuracy of eDRS4C was not matched by baseline (before transplantation) clinical and biochemical variables, individually (Extended Data Fig. 7a–h) or when combined (Extended Data Fig. 7i).
a, A cohort of Canadian (CAN) participants, who had their plasma samples before transplantation and 1 h and 24 h after transplantation available for the biomarker analyses, was included. b, The eDRS4C model based on 100,000 synthetic data (Fig. 3) was assessed before transplantation, and 1 h and 24 h after transplantation, and used to predict diabetes status (insulin dependence versus no or low insulin requirement). c, Of the n = 31 participants, n = 18 were on exogenous insulin (≥0.12 U kg−1 d−1) and n = 13 were on no or low exogenous insulin (<0.10 U kg−1 d−1) 1 month after transplantation. d, The performance characteristics of the eDRS4C model are provided to compare the changes in prediction during the 24 h after transplantation.
Application of eDRS4C in T1D drug therapy
Recently, a multicenter, double-blind, placebo-controlled, phase 2 randomized clinical trial (registration no. NCT01781975) assessed the safety and efficacy of imatinib14, a tyrosine kinase inhibitor, to preserve β cell function in participants with recent-onset T1D (n = 64; imatinib arm n = 43 and placebo arm n = 21) after a 26-week intervention. Drug intervention recipients were stratified into the upper quartile (UQ) (n = 11) and lower quartile (LQ) (n = 11) of response to therapy (C-peptide measurement) as identified at the end of 12 months (Fig. 5a). Uniform manifold approximation and projection (UMAP) dimensionality reduction analyses segregated UQ participants from LQ participants based on miRNA expression at the study baseline (Fig. 5b). The four miRNAs that contributed to UQ and LQ segregation (Fig. 5c) were significantly different at study baseline (Fig. 5d). However, each of the baseline clinical and biochemical parameters, including baseline C-peptide, were not different between UQ and LQ participants (Extended Data Fig. 8a–h), and could not segregate drug responders when used together (Extended Data Fig. 8i). However, 17 PREDICT T1D miRNAs were significantly and highly abundant in plasma from UQ than LQ participants at the study baseline (Extended Data Fig. 8j).
a, PREDICT T1D miRNAs were profiled in the imatinib study participants from the drug intervention arm at the study baseline. Individuals were stratified to the UQ and LQ of response to therapy (C-peptide levels assessed at 1 year). b, The UMAP dimensionality reduction algorithm was applied to the expression profile of all 50 miRNAs across UQ and LQ study participants, leading to two broad clusters representing drug response at 1 year. c, Decision tree for key variables from our eDRS4C that aid in the segregation of UQ and LQ responders to imatinib therapy. d, The expression of the top four miRNAs in the decision tree was assessed between UQ and LQ participants using a one-sided Welch’s t-test. e, Key pathways along with their Gene Ontology (GO) IDs (y axis), which are targeted by the top four miRNAs (x axis) in the decision tree, are shown using a bubble plot. The size of the bubble indicates the number of target genes (written next to each bubble), while the color of the bubble denotes the significance of the miRNA target pathway interaction as obtained using the miRPathDB tool. A color scale for the adjusted P value is also provided. A hypergeometric test with Benjamini–Hochberg adjustment was used for the enrichment analyses on miRPathDB. f, Venn diagram presenting 36 genes in the tyrosine kinase pathway that are targeted by hsa-miR-27b-3p (GO:0004712) and nine genes involved in the tyrosine kinase pathway inhibited by imatinib.
The four miRNAs identified in the UMAP decision tree analysis (Fig. 5c,d) target several genes across the tyrosine kinase, immune, apoptosis, oxidative stress and cell cycle pathways (Fig. 5e). Indeed, the eDRS4C-100,000 based on these key miRNAs alone offered a discriminatory AUC of 0.711 with an F1 score of 0.741 and a recall of 0.909. If age at sampling was included, then an AUC of 0.769 (F1 score = 0.769, recall = 0.909) was achieved. Intriguingly, the most important variable (hsa-miR-27b-3p) in stratifying UQ from LQ responders, targets 36 genes in the tyrosine kinase pathway (Fig. 5f), which are distinct from nine other genes targeted by imatinib37,38,39,40,41,42,43,44,45.
Comparison of eDRS4C with other T1D risk factors
Autoantibodies and GRS are currently used for T1D risk stratification. In TEDDY (The Environmental Determinants of Diabetes in the Young) study9, T1D stratification with autoantibodies (AUC = 0.75), family history (AUC = 0.51–0.56) and GRS2 (AUC = 0.73) has been reported. Although we did not have GRS and autoantibody data for all the PREDICT T1D participants, we present examples of the usability of miRNA-based eDRS4C in predicting T1D progression. In an individual who was clinically diagnosed with T1D when aged 5 years, eDRS4C was high (>0.60) at 14 months, increased to more than 0.75 by 18 months and remained high at all time points until stage 3 T1D diagnosis by 60 months. Autoantibodies showed a rapid increase at 39 months, signifying an earlier risk stratification with eDRS4C (Extended Data Fig. 9a,b). We also analyzed the eDRS4C in the DNK sibling cohort where four of the siblings progressed to stage 3 T1D within ~12 years of sample collection. Here, the eDRS4C provided comparable predictive potential to existing (GRS, autoantibody, and age) biomarkers of T1D progression (Extended Data Fig. 9c,d). GRS also discriminates T1D from T2D46. As eDRS4C was trained on individuals without, with or at risk of T1D, it did not offer a good stratification to separate T1D from T2D (AUC = 0.636). However, PREDICT T1D miRNAs (and age) were significantly different between participants with T1D and T2D (Supplementary Fig. 1a and Supplementary Table 7). Therefore, we trained new models to discriminate participants with T2D from participants with T1D or controls (Supplementary Fig. 1b). The miRNAs that are important in discriminating T2D are different to those important in identifying T1D from controls (Supplementary Fig. 1c–e).
Discussion
This study compiles, trains, tests, validates and applies a miRNA-based DRS generated through the analysis of a large (n = 5,983), multicontext (seven countries, four continents), cross-sectional and longitudinal cohort of individuals without, at risk of or with T1D. The study uses GAI-driven approaches to enhance risk stratification and prediction of cell (islet transplantation) and drug (imatinib) therapy efficacy in T1D.
Our discovery analysis (Fig. 1) was designed to include miRNAs associated with human β cell development, adult human islet function and those that are differentially expressed in the plasma of participants with new-onset T1D. As miRNAs are produced within cells, having pancreatic islet cells in the discovery cohort captured tissue-enriched miRNA signatures. Since miRNAs are required for pancreatic development47, and as regenerative processes after islet destruction are known to recapitulate pancreas development48,49, inclusion of the developing human fetal pancreas was important for miRNA discovery. This heterogeneous discovery sample dataset provides a comprehensive view of the miRNAs associated with human islet β cell function. Compared to other studies (Extended Data Fig. 10) only 4% (n = 10) of our wet lab discovery dataset were plasma samples from study participants newly diagnosed with T1D. However, we also confirmed that our selected set of PREDICT T1D study miRNAs were significantly and differentially expressed across plasma and serum from n = 2,583 individuals (1,609 with T1D and 974 controls) across 21 previously published studies (Supplementary Table 1). This approach provides a holistic view of miRNAs that are (1) enriched in human islets, (2) found in developing β cells, (3) released in the circulation after islet β cell damage or death and (4) identified to be significantly dysregulated across several other previously published study samples. Studies in recent years also reported several of the PREDICT T1D miRNAs to be important in other larger cohorts of participants with T1D50, targeting different biological and functional pathways, including apoptosis, insulin synthesis, inflammation, NF-κB and the JAK–STAT pathway31,51, inflammasome components52, immune cell development and function, and central and peripheral tolerance30,53.
Apart from miRNAs, age at sample collection was identified as an important variable in the development of DRS. Early onset of T1D is more aggressive54 and children with T1D are less likely to retain detectable residual C-peptide than adults with T1D55. Several studies56,57,58 demonstrated the role of age in the clinical progression and severity of T1D. In our study, age at sample collection was a key feature that was selected in the top ten variables ranked in the DRS4C (Fig. 2d) and single-context-based risk scores (Extended Data Fig. 6). As both age and miRNA expression can change over time or in response to the environment, we refer to our signature as a ‘DRS’.
The miRNA-based risk score is dynamic and therefore capable of predicting future changes in diabetes status. Static risk scores (for example, GRS or polygenic risk score (PRS)) are hardwired to our genome (that is, they remain constant over a lifetime) and may not, on their own, be able to predict change in the clinical progression of T1D. The capacity of PREDICT T1D miRNAs to prioritize islet transplantation recipients for a subsequent transplant, and to stratify drug responders to T1D intervention at the study baseline, implies an exclusive opportunity for T1D therapy and trial participants. Indeed, similar dynamic biomarkers, such as those assessed using the circulating proteome, have been recently shown to outperform PRS and HbA1c for T2D prediction59. We also found that a different subset of PREDICT T1D miRNAs efficiently stratified individuals with T2D from controls and individuals with T1D (Supplementary Fig. 1). This underscores the capacity of the PREDICT T1D miRNAs to assess β cell function in T1D and T2D.
In our hands, the random forest workflow was the most suitable approach (Supplementary Table 9); however, other machine learning workflows may be effective for some clinical datasets60. Other techniques, such as recursive feature elimination, could also be used for variable minimization and selection, to improve feature reduction without the loss of predictive power. Although our model training data from four contexts demonstrate the diversity across ethnically and geographically diverse cohorts, the use of GAI allows for a cost-efficient approach to mimic larger sampling that follows the same distribution as real-life samples.
The PREDICT T1D signature of miRNAs and the eDRS4C risk score have much to offer in the clinical management of T1D. In our study, 1 h after transplantation miRNAs offered a better prediction of future exogenous insulin independence than before transplantation or 24 h after transplantation. A previous report36 also identified a 1-h time point after transplantation with a distinctive increase in β cell-specific insulin cell-free DNA (a marker of β cell death) in circulation. As islets die in the immediate period after infusion because of hypoxia or through other mechanisms, including instant blood-mediated inflammatory response61,62, dynamic biomarkers, including β cell-enriched miRNAs or cell-free DNA, are likely to be the most abundant 1 h after transplantation. With several centers involved in islet transplantation and efforts to develop new engineered islet therapies (registration no. NCT06239636), early prediction of transplant outcome is highly desired. Similarly, considering many existing and emerging T1D drug therapies, the capacity to identify the best responders to a drug would pave the way to personalized medicine for T1D drug therapies. In the imatinib clinical trial (registration no. NCT01781975), we identified that the best responders to imatinib had significantly higher levels of 17 different PREDICT T1D miRNAs (Extended Data Fig. 8j). The key miRNA (miR-27b-3p) identified to stratify responders and nonresponders targeted a separate set of 36 different genes in the tyrosine kinase pathway (Fig. 5f) than those targeted by imatinib. These findings suggest a synergistic mechanism between circulating miRNAs and drug targets leading to better drug response. As the miRNAs in the PREDICT T1D signature target several other pathways (for example, the JAK–STAT pathway), we aim to extend our findings from the imatinib trial to other T1D drug intervention trials. These studies will lead to identifying personalized therapies for T1D.
Our study limitations include the lack of other contexts (for example, Africa, South America, the Middle East), which we plan to build in the coming years. However, as GAI enables capturing other probable expression levels and ages of participants, one could anticipate that the eDRS4C models would work equally well in newer contexts. As demonstrated, the eDRS4C, which is based on the AUS, DNK, HKG and IND contexts efficiently stratified participants with T1D in the validation dataset (n = 662), which also included individuals from CAN, NZL and the USA. Although clinical (for example, age at diagnosis) and biochemical (for example, autoantibody) data were unavailable for all study participants, our PREDICT T1D miRNA signature includes miRNAs that were associated with HLA, early-onset, autoantibody and T1D status (Fig. 1b). We also find that available baseline clinical and biochemical measurements alone, or together, did not stratify the responsiveness to islet transplantation (Extended Data Fig. 7a–i) or drug intervention (Extended Data Fig. 8a–i) therapies in T1D. Future studies should assess the importance of autoantibodies, GRS or other clinical, biochemical and molecular parameters along with miRNAs to generate combined multiomic risk scores to stratify T1D. This study was not designed to profile T1D endotypes63, which is an ongoing interest. We also plan to extend our T1D and T2D stratification studies in other cohorts, given the increasing recognition of T1D presentation in adulthood and potential T1D misdiagnosis64. We included all clinical trial participants from the imatinib trial14, but a larger number of multicenter participants is highly desirable. However, in the imatinib trial, we demonstrate the use of a miRNA-based DRS to predict treatment efficacies for T1D therapies, and its usability at a single-patient or cohort-level risk stratification (Extended Data Fig. 9).
Although our studies include individuals at risk of T1D (DNK siblings), we lacked longitudinal data on individuals progressing to autoimmunity in T1D. Recently, in TEDDY study65, longitudinal analysis using whole-transcriptome sequencing in a nested case-control cohort was carried out to identify differentially expressed gene transcripts associated with autoantibody development. Future studies in longitudinal cohorts (for example, the Environmental Determinants of Islet Autoimmunity study), and screening programs, (for example, the European action for the Diagnosis of Early Non-clinical Type 1 diabetes For disease Interception) aimed at identifying individuals at risk of T1D, might also benefit from miRNA-based risk scores. In the future, we aim to incorporate other dynamic biomarkers (for example, cell-free DNA, metabolites, proteins, cytokines, lipids), to systematically evaluate their contribution, paving the way for personalized medicine in T1D.
This study sets a benchmark in biomarker research for T1D: (1) it combines the discoveries from tissue, plasma and the literature to identify a set of miRNAs that not only aid in T1D discrimination but also T1D misdiagnosis; (2) it presents a miRNA biomarker panel across four contexts, generating a multiethnic miRNA-based DRS for T1D; (3) it integrates GAI workflows to enhance risk predictions; (4) it validates the GAI/machine learning-based DRS in an independent validation set (n = 662) of individuals across newer contexts; (5) it presents the capacity to apply these autoantibody-associated, HLA-associated and early-onset-associated miRNAs in predicting treatment outcomes for cell and drug therapy in T1D; and (6) it demonstrates comparable performance and correlation to existing biomarkers (autoantibodies, GRS, family history, age at onset, where available) in predicting T1D progression. While the applicability of PREDICT T1D miRNAs needs to be assessed in larger cohorts, this study presents a panel of miRNAs (Fig. 1) that are highly relevant to β cell function, and a miRNA-based DRS, for use in clinical risk stratification and drug efficacy testing for T1D.
For individuals living with T1D, application of this miRNA-based DRS in assessing diabetes status or treatment efficacy in T1D therapies implies that treatments can be tailored to individual needs. Access to personalized risk scores and the ability to track changes over time allows people with T1D to take an active role in managing their condition.
Methods
Study population
The methods for plasma collection across several collaborative centers were checked to be as uniform as possible, and all plasma samples were freeze-thawed ≤1 time before this analysis. Details of the samples (number, age, sex) included in different datasets of this study are provided in Supplementary Table 2.
The wet lab discovery set of samples included developing (fetal) human pancreatic tissues (n = 55), adult human islets and nonpancreatic tissues (total, n = 189), plasma from individuals within 96 h of acute presentation with T1D (n = 5) and age, as well as sex-matched healthy controls (n = 5). The miRNAs identified from the literature are provided in Fig. 1b and Supplementary Table 1.
The plasma samples for model development were from individuals without T1D (controls), or siblings (genetic risk) of individuals with T1D (siblings) or individuals with T1D (T1D) from four contexts: AUS (total n = 728, T1D n = 519, controls n = 209); DNK (total n = 608, T1D n = 316, siblings n = 292), HKG (total n = 238, T1D n = 120, controls n = 118) and IND (total n = 630, T1D n = 497, controls n = 133), as presented in Fig. 2a. None of the siblings had stage 3 T1D. Most of these were stage 1 T1D and ~2% of the siblings developed T1D at ~12 years from sample collection.
Diabetes diagnosis was carried out pragmatically for all study participants through their teams of clinicians. The diagnostic criteria for T1D varied across the four geographical contexts but adhered to established standards: AUS: diagnosis was based on the presence of autoantibodies and clinical presentation, including diabetic ketoacidosis, with genetic analyses conducted if the patient presented under 1 year of age; DNK: the diagnostic criteria were similar to those used in AUS, with the inclusion of autoantibody data for patients with T1D and their siblings; HKG: diagnosis was based on the presence of diabetic ketoacidosis or the need for continuous insulin use within 1 year of diagnosis. Because of the low prevalence of anti-GAD and anti-islet cell antigen antibodies among Chinese patients with T1D66, autoantibodies were not a prerequisite for defining T1D; IND: diagnosis involved testing for glucose and C-peptide levels as well as islet autoantibodies, including GAD65, IA2 and ZNT8.
The performance of the risk score in a separate case-control validation dataset (total n = 662, with T1D n = 298, without T1D n = 364) from AUS, CAN, IND, NZL and the USA was assessed; 4% of participants without T1D in this validation dataset had gestational diabetes and none had T2D.
The eDRS4C models (1,000, 10,000 and 100,000) developed through this study were tested on plasma samples from CAN islet transplant recipients (31 recipients before transplantation, 1 h after transplantation and 24 h after transplantation samples) to assess future T1D status. PREDICT T1D miRNAs were also assessed for their capacity to stratify drug efficacy in the imatinib intervention study14, where individuals with recent-onset T1D were treated with imatinib (n = 43) or placebo (n = 21) for 26 weeks14, where outcomes were assessed at the end of 1 year.
Subanalyses for benchmarking the PREDICT T1D signature were carried out on the DNK cohort with T1D (n = 96 of 316) and siblings where data for GRS (n = 163 of 292) and autoantibodies (n = 168 of 292) were available. Duration of diabetes and age at onset in the HKG cohort of individuals with T1D (n = 101 of 120), and the DNK cohort of participants with T1D (n = 237 of 316), were analyzed. Subanalyses for discriminating T1D from T2D were carried out on participants with T2D from HKG (n = 119) and AUS (n = 12) with the participants with T1D as described in the ‘T1D versus T2D discrimination’ section below.
Ethical approval
Each study population was recruited and consented (oral and written consent) by the lead collaborators at each site according to the Declaration of Helsinki and approved by their respective institutional human ethics committees. Since samples have been collected at different times from 2000 until before 2024, the relevant version of the Declaration of Helsinki at the time of approvals for collection was followed; for most samples, this was version 2013. Ethical approvals for the use of developing human pancreas, adult islets and nonpancreatic tissues were obtained from the human ethics committees of the National Centre for Cell Science (India) and Sydney Local Health District, as detailed in our earlier publications27,67,68. Profiling of circulating miRNAs was approved through the human ethics committees at St Vincent’s Hospital Melbourne (HREC A052-10), Sydney Local Health District University of Sydney (HREC/13/RPAH/83) and Western Sydney University (H14213).
miRNA profiling
Tissue and plasma samples were collected using standardized procedures, received coded (re-identifiable) and stored frozen at −80 °C until use. Plasma samples were thawed and a 100-µl aliquot was used for RNA isolation using an automated QIAcube HT platform (QIAGEN)26,69. Samples with extensive (visible) hemolysis were excluded. Initially, 500 μl of TRIzol reagent (Thermo Fisher Scientific), 10 µg of glycogen, spike-in miRNA control (ath-miR-172a) and 100 µl chloroform (all from Merck Millipore) were added to each sample. After centrifugation, the aqueous phase was transferred to an RNeasy column (RNeasy HT Kit, QIAGEN) for total RNA isolation. For the solid tissue samples (islet, non-islet pancreas and developing pancreas) a TRIzol-based manual RNA isolation workflow was implemented23,27,68. RNA was quantitated on a NanoDrop spectrophotometer (Thermo Fisher Scientific); 10 ng or 100 ng of RNA was used for reverse transcription using the microRNA RT kit (Thermo Fisher Scientific), spike-in miRNA control (ath-miR-159a) and Megaplex RT primer pools A and B (discovery samples profiled for 754 miRNAs). Custom reverse transcription primers were used for all samples profiled for the signature of 50 selected miRNAs and six (negative, internal and spike-in) control miRNAs. Preamplification and TaqMan-based miRNA quantitative PCR (qPCR) were performed using the low-sample input protocol recommended by the manufacturer and described earlier26 with TaqMan OpenArray Human miRNA inventoried (754 miRNAs) or custom panels (specifically designed for the 56-miRNA panel) on the QuantStudio 12K Flex platform (Thermo Fisher Scientific). MiRNAs for hemolysis and repeats were assessed independently. Raw data files obtained after the qPCR were checked for sample loading and uploaded to the ThermoConnect tool for global normalization (for the discovery set). MiRNA data that did not pass quality control (with an amplification score of less than 1.24 or a cycle value confidence interval (CI) of less than 0.6) were converted to undetectable (Ct = 39). Other samples assessed on custom panels (training-testing, validation, T1D therapy) were normalized using spike-in controls added during RNA isolation and reverse transcription. Normalized Ct values were converted to fold-over-detectable (FOD) or abundance values70 and were used in further analyses.
Data analysis
Discovery set
The relative abundance of the miRNAs across two groups (islet versus non-islet; T1D versus controls) in the discovery samples was calculated using \({2}^{{\Delta\Delta}{\rm{C}}_{\rm{t}}}\) (FOD difference) and significantly different miRNAs (P ≤ 0.05) were identified using a Welch’s t-test. Although lenient, this analytical approach selected a set of 50 most significant miRNAs from our discovery cohort (n = 254 wet lab discovery samples and n = 2,583 participants from previous studies) that were tested and validated in a multicontext set of samples using Random Forest machine learning workflow. Random forest algorithm (used in developing the DRS4C) is inherently robust in feature selection. Additionally, testing these selected features (miRNAs) in the test dataset (n = 661), a separate multicontext validation set (n = 662) and in two longitudinal T1D clinical studies and trials further endorsed the validity of our selected miRNA signature. Methods to address class imbalance, for example, using the synthetic minority over-sampling technique (SMOTE) and sampling bias (for example, bootstrapping) were used to retain the desired analytical rigor.
Training and testing datasets
Samples from four different contexts
The miRNA abundance of the 50 PREDICT T1D miRNAs was compared between participants with T1D and controls (for AUS, HKG and IND), or participants with T1D and siblings (for DNK) and for all contexts together, using a Kruskal–Wallis test followed by an uncorrected Dunn’s test comparing the mean ranks.
Model training and testing
This analysis was performed on samples from the four different contexts using the 50 PREDICT T1D miRNA signature data along with the participant’s age at sample collection as continuous independent variables. Participants’ T1D status was used as a categorical dependent variable, where ‘controls/siblings’ was set as the reference/control group and ‘T1D’ was set as a case group. Datasets from each context were individually randomized, aggregating to 70% training (n = 1,543) and 30% testing (n = 661) samples (Fig. 2a) using the sample.split() command in the caTools (v.1.18.2) R package71. The samples selected in the training and testing datasets were kept the same throughout the analyses in R and Python and for independent validation (see below). Classification models were built on the training datasets of four contexts together using the random forest workflow, using the packages randomForest (v.4.7-1.1) and Caret (v.6.0-94)72 in R (v.4.2.2)73 and RStudio (v.2023.12.1-402)74. To settle class imbalances in the training datasets, four different sampling methods, that is, up, down, SMOTE and random over-sampling examples, were assessed with three different cross-validations (fivefold, tenfold or 50-fold). Fifty-fold cross-validation, using SMOTE sampling and 2,000 decision trees, was selected as the optimally tuned workflow for eDRS4C-100,000, based on accuracy, the AUC, sensitivity, specificity, and F1 score (scripts made available via GitHub/islet biology). Predicted probabilities were calculated for individuals in the training datasets using the predict() command from the package stats in R. The ROC curve analyses, AUCs and CIs were computed using the roc() and ci.auc() commands from the pROC package in R to obtain threshold values. The best threshold value for classification was determined using the Youden index method, using the coords() command from the pROC package in R. Individuals with a predicted probability greater than the Youden index threshold were classified as T1D, while those with lower probabilities were classified as non-T1D. The importance of each feature and variable used in the prediction model was assessed using MDG, which was computed using the importance() command in randomForest72,75. Model performance was also independently validated in Python. The PyCaret (https://pycaret.org/) framework in the Jupyter Notebook environment or Visual Studio release 15.9.67 was used for model training, tuning and evaluation. The trained model was evaluated on the testing dataset using several performance metrics provided by the PyCaret and scikit-learn libraries such as accuracy, precision, recall, F1 score and AUC. AUC plots and detailed classification metrics were generated using the evaluate_model() function in PyCaret and scikit-learn. The validation was performed using StratifiedKFold cross-validation with tenfold cross-validation. To visualize the impact of each feature on the predictions made by the model, SHAP values were computed and visualized. A SHAP explainer object was created for the trained random forest model using the explainer class of the shap package and any outliers were removed. The AUC and model accuracies were not notably different with the outliers removed or retained. SHAP values were visualized using the shap.plots.beeswarm function.
GAI-based workflows
T1D-free (control) participants from the training datasets of four contexts (AUS, DNK, HKG, IND) were used to generate three different synthetic datasets of age and miRNA expression profiles for 1,000, 10,000 or 100,000 synthetic individuals using the SingleTableMetadata module from the SDV library. A Gaussian Copula Synthesizer model from the SDV library was used. The model was then trained on the original control dataset to capture the underlying dependencies and distributions of the data. Synthetic data for the control dataset were generated using the trained model, ensuring compliance with the specified constraints, including age limitations between 2 and 64 years. The three negative control miRNAs were confirmed and set to be undetectable in the synthetic dataset. The suitability of synthetic data for downstream tasks such as model training and evaluation was confirmed with TableEvaluator. These synthetic controls were pooled with the original real-life training samples from the four contexts (AUS, DNK, HKG, IND) to create an augmented training dataset for the generation of eDRS4C models. Random forest models were trained on these three augmented datasets using the same tuning parameters mentioned above.
Parallelization and independent data validation
All analyses were independently validated by collaborating data scientists using different platforms. To reduce computational time in model training, we used the doParallel package (v.1.0.17) and its dependencies in R to parallelize model training across 20 central processing units (CPUs) on an Ubuntu server (total memory = 510 GB, CPUs = 62). A cluster of 20 CPUs (except for the eDRS4C-100,000 model, where the cluster was of six CPUs) was created using makePSOCKCluster(), which was registered using the registerDoParallel() function. Upon completion of model training, the connections to these CPUs were closed using the stopCluster() function. Output agreement was validated across at least three users, who worked on separate workstations using R or Python.
Validation dataset
The different models of eDRS4C (from the 1,000, 10,000 and 100,000 synthetic sample sets) were used to classify individuals as T1D and non-T1D in an independent validation dataset (n = 662). Predicted probabilities and the class for this dataset were calculated using the methodologies described above.
T1D therapy dataset
Islet transplant samples
The eDRS4C risk score was used to predict T1D management (need for exogenous insulin) based on the eDRS4C (100,000) before transplantation, and 1 h and 24 h after transplantation. Based on exogenous insulin requirement at 1-month after transplantation, baseline clinical parameters were compared using a Welch’s t-test. Dimensionality reduction was performed on all available baseline clinical parameters in R using the uwot package (v.0.2.2) (UMAP = 20 threads, seed = 1), and data were plotted using the ggplot2 package (v.3.5.1).
Imatinib trial
Baseline miRNA expression of the 50 PREDICT T1D miRNAs was used in a dimensionality reduction UMAP algorithm using the Entropy criterion and Minkowski metric for a best-fit method with the BioVinci data visualization package (BioTuring). We also assessed the performance of an eDRS4C-100,000 model (with or without age) based on the key miRNAs associated with imatinib therapy (Fig. 5c) at the study baseline to predict responders (UQ) and nonresponders (LQ) to therapy. Baseline clinical parameters and miRNAs were compared across UQ and LQ participants using a Welch’s t-test. Dimensionality reduction was performed on all available baseline clinical parameters in R using the uwot package (v.0.2.2) (UMAP = 20 threads, seed = 1), and data were plotted using the ggplot2 package (v.3.5.1).
T1D versus T2D discrimination
The FOD values of all 50 miRNAs and age (in years) for individuals with T1D (n = 1,314) or with T2D (n = 131) were imported into RStudio using the package openxlsx. The distributions of each miRNA were compared between individuals using a Wilcoxon rank-sum test; the adjusted P values were generated using Bonferroni correction (package rstatix). Data were visualized as sina plots using the packages ggforce (v.0.4.2) and ggprism (v.1.0.5). The eDRS4C-100,000 risk score was used to stratify participants with T1D from those with T2D. Two additional risk scores were derived using these datasets wherein a control versus T2D (training dataset n = 527 controls, n = 92 T2D; testing dataset n = 364 controls and n = 39 T2D) or a T1D versus T2D (training dataset n = 1,016 T1D and n = 92 T2D; testing dataset n = 298 T1D and n = 39 T2D) risk score was derived using all 50 miRNAs and age at sampling using the random forest algorithm; SHAP values were plotted as mentioned above.
Genomic ___location and collinearity
The genomic ___location spanning from the position of the starting base pair for each miRNA (based on assay IDs from Thermo Fisher Scientific) and verified through the NCBI Gene database (Supplementary Table 4) was used to mark the start of the gene encoding the corresponding miRNAs. The shinyCircos76 tool (https://venyao.xyz/shinycircos/) was used to visualize the genomic locations for the 50 PREDICT T1D miRNAs. Links connecting these miRNAs represent their context-specific coexpression (Spearman rank correlation coefficient greater than 0.9 and P < 0.05), which is colored separately for each context. The width of each connecting line is arbitrarily set for visual enhancement.
Autoantibody measurements
Autoantibodies were measured using labeled recombinant proteins according to standard radioimmunoassay protocols described elsewhere77.
Genotyping and GRS
Participants were genotyped on the Illumina Infinium Global Screening Array (v.24 and v2.0). Before quality control, the Infinium Global Screening Array had 722,659 single-nucleotide polymorphisms (SNPs). After quality control and frequency pruning (--geno 0.02 --maf 0.05 --hwe 0.001 --mind 0.02) using PLINK v.1.90, the genotyping rate was 0.99 and included 298,994 SNPs. The T1D GRS (GRS1) was calculated using 29 of 30 T1D-associated SNPs, as identified earlier78. For the missing SNP, rs11594656, a proxy SNP analysis was performed using linkage disequilibrium patterns in the European population through LDlink79. However, no suitable proxies with r2 > 0.8 were identified, leading to the exclusion of this SNP from the analysis. The GRS for the remaining 27 non-HLA SNPs was calculated using the score function in PLINK. For the two HLA SNPs, rs2187668 and rs7454108, which capture the DR3 and DR4–DQ8 haplotypes, genotype weights for the highest-risk heterozygous combinations were assigned based on evidence from a previous study5. These weights were incorporated into the final aggregate GRS.
Anomaly detection
The isolation forest anomaly detection machine learning approach was used to further stratify individuals at high risk for T1D into progressors and non-progressors. Isolation forests were generated for existing risk factors (including age, autoantibody data and GRS) or for the top ten features (age and top nine miRNAs; Fig. 2d). MiRNA FOD data and other values (age in years, autoantibody data and GRS) were used as obtained. The entire pipeline was implemented in R. Dimensionality reduction was carried out using the uwot package (v.0.2.2). Isolation forests were built using the isotree package (v.0.6.1-1); the anomaly cutoff was set at 0.9 (borderline 10% of anomalous data points). Data were plotted using the ggplot2 package (v.3.5.1) and forests were visualized as contour plots.
PCA, correlation and Sankey plots
The PCA was carried out in R using the prcomp() command in the stats package. PCA plots were generated using the ggplot2 and ggfortify packages in R. Correlation plots were created using Prism v.10.4.1 (GraphPad Software). Sankey plots were created using SankeyMATIC (https://sankeymatic.com/).
Sodium nitroprusside exposure
Human cadaveric islets (n = 6 different biological preparations) were resuspended in serum-free defined Connaught Medical Research Laboratories medium containing 2% fatty-acid-free BSA within 24–60 h of isolation. Islets were then exposed to different concentrations (0 mM, 1 mM and 10 mM) of freshly prepared sodium nitroprusside for 24 h. Experimental culture supernatants (without contamination from islet cells) were collected for miRNA analysis. RNA isolation and miRNA qPCR with reverse transcription were performed as described above. Normalized Ct values were used to perform supervised clustering analyses with the ComplexHeatmap package in R based on Euclidean distance.
Pathway analysis
MiRNAs that distinguished the UQ of drug responders from the LQ of drug responders (Fig. 5c) in the imatinib trial were assessed for targeting pathways using miRBase, AmiGO2 and miRPathDB. We evaluated the four key miRNAs (miR-24-3p, miR-25-3p, miR-27b-3p and miR-92a-3p) using miRPathDB 2.0, which offers comprehensive resources for miRNA target pathway analyses. These four miRNAs targeted distinct sets of genes with as little as 2.7% overlap between miR-27b-3p and miR-24-3p targeting pathways (Jaccard coefficient = 0.027), to a higher overlap (Jaccard index = 0.252) between miR-25-3p and miR-92a-3p. As the drug used in the imatinib trial targets the tyrosine kinase pathway, the analysis focused on tyrosine kinase and some of the other relevant pathways, as illustrated in Fig. 5e. All computationally identified genes targeted by these miRNAs were compared to the known imatinib targets37,38,39,40,41,42,43,44,45.
Further details of the products and software package versions are provided in Supplementary Tables 10 and 11, respectively.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The datasets presented in this study are not available publicly at present because of ongoing and unrelated (to this manuscript) analyses. Data from this study will be made available for collaborative research to bona fide academic researchers upon reasonable written request with an official research proposal and analysis plan. Deidentified individual-participant-level data will be provided upon request, after approval from the PREDICT T1D review committee. Related study documents, including laboratory standard operating procedures and ethics documents, will be available through the study’s principal investigator (A.A.H.). Requests should be emailed to A.A.H. ([email protected]).
Code availability
Python and R codes will be available on our GitHub/Islet Biology page or through the study’s principal investigator (A.A.H.). Requests should be emailed to A.A.H. ([email protected]). A copy of the code, and the training and validation datasets, are available via Code Ocean at https://doi.org/10.24433/CO.4476520.v1 (ref. 80).
References
Mathis, D., Vence, L. & Benoist, C. β-Cell death during progression to diabetes. Nature 414, 792–798 (2001).
Borchers, A. T., Uibo, R. & Gershwin, M. E. The geoepidemiology of type 1 diabetes. Autoimmun. Rev. 9, A355–A365 (2010).
Atkinson, M. A. & Gianani, R. The pancreas in human type 1 diabetes: providing new answers to age-old questions. Curr. Opin. Endocrinol. Diabetes Obes. 16, 279–285 (2009).
Ziegler, A. G. et al. Seroconversion to multiple islet autoantibodies and risk of progression to diabetes in children. JAMA 309, 2473–2479 (2013).
Barker, J. M. et al. Two single nucleotide polymorphisms identify the highest-risk diabetes HLA genotype: potential for rapid screening. Diabetes 57, 3152–3155 (2008).
Sharp, S. A. et al. Development and standardization of an improved type 1 diabetes genetic risk score for use in newborn screening and incident diagnosis. Diabetes Care 42, 200–207 (2019).
Battaglia, M. & Atkinson, M. A. The streetlight effect in type 1 diabetes. Diabetes 64, 1081–1090 (2015).
Pociot, F. & Lernmark, Å. Genetic risk factors for type 1 diabetes. Lancet 387, 2331–2339 (2016).
Ferrat, L. A. et al. A combined risk score enhances prediction of type 1 diabetes among susceptible children. Nat. Med. 26, 1247–1255 (2020).
Steck, A. K. & Rewers, M. J. Genetics of type 1 diabetes. Clin. Chem. 57, 176–185 (2011).
Redondo, M. J. et al. Genetic determination of islet cell autoimmunity in monozygotic twin, dizygotic twin, and non-twin siblings of patients with type 1 diabetes: prospective twin study. BMJ 318, 698–702 (1999).
Hyttinen, V., Kaprio, J., Kinnunen, L., Koskenvuo, M. & Tuomilehto, J. Genetic liability of type 1 diabetes and the onset age among 22,650 young Finnish twin pairs: a nationwide follow-up study. Diabetes 52, 1052–1055 (2003).
Herold, K. C. et al. An anti-CD3 antibody, teplizumab, in relatives at risk for type 1 diabetes. N. Engl. J. Med. 381, 603–613 (2019).
Gitelman, S. E. et al. Imatinib therapy for patients with recent-onset type 1 diabetes: a multicentre, randomised, double-blind, placebo-controlled, phase 2 trial. Lancet Diabetes Endocrinol. 9, 502–514 (2021).
Quattrin, T. et al. Golimumab and beta-cell function in youth with new-onset type 1 diabetes. N. Engl. J. Med. 383, 2007–2017 (2020).
Rigby, M. R. et al. Alefacept provides sustained clinical and immunological effects in new-onset type 1 diabetes patients. J. Clin. Invest. 125, 3285–3296 (2015).
Russell, W. E. et al. Abatacept for delay of type 1 diabetes progression in stage 1 relatives at risk: a randomized, double-masked, controlled trial. Diabetes Care 46, 1005–1013 (2023).
Waibel, M. et al. Baricitinib and β-cell function in patients with new-onset type 1 diabetes. N. Engl. J. Med. 389, 2140–2150 (2023).
Joglekar, M. V., Kaur, S., Pociot, F. & Hardikar, A. A. Prediction of progression to type 1 diabetes with dynamic biomarkers and risk scores. Lancet Diabetes Endocrinol. 12, 483–492 (2024).
Warncke, K. et al. Elevations in blood glucose before and after the appearance of islet autoantibodies in children. J. Clin. Invest. 132, e162123 (2022).
Salami, F. et al. HbA1c as a time predictive biomarker for an additional islet autoantibody and type 1 diabetes in seroconverted TEDDY children. Pediatr. Diabetes 23, 1586–1593 (2022).
Helminen, O. et al. HbA1c predicts time to diagnosis of type 1 diabetes in children at risk. Diabetes 64, 1719–1727 (2015).
Joglekar, M. V., Joglekar, V. M. & Hardikar, A. A. Expression of islet-specific microRNAs during human pancreatic development. Gene Expr. Patterns 9, 109–113 (2009).
Poy, M. N. et al. A pancreatic islet-specific microRNA regulates insulin secretion. Nature 432, 226–230 (2004).
Wei, R. et al. Dynamic expression of microRNAs during the differentiation of human embryonic stem cells into insulin-producing cells. Gene 518, 246–255 (2013).
Joglekar, M. V. et al. Circulating microRNAs from early childhood and adolescence are associated with pre-diabetes at 18 years of age in women from the PMNS cohort. J. Dev. Orig. Health Dis. 13, 806–811 (2022).
Wong, W. K. M. et al. Machine learning workflows identify a microRNA signature of insulin transcription in human tissues. iScience 24, 102379 (2021).
Asirvatham, A. J., Gregorie, C. J., Hu, Z., Magner, W. J. & Tomasi, T. B. MicroRNA targets in immune genes and the Dicer/Argonaute and ARE machinery components. Mol. Immunol. 45, 1995–2006 (2008).
Joglekar, M. V., Parekh, V. S. & Hardikar, A. A. New pancreas from old: microregulators of pancreas regeneration. Trends Endocrinol. Metab. 18, 393–400 (2007).
Scherm, M. G., Serr, I., Kaestner, K. H. & Daniel, C. The role of T cell miRNAs for regulatory T cell induction in islet autoimmunity. Mol. Metab. 27S, S122–S128 (2019).
Dotta, F., Ventriglia, G., Snowhite, I. V. & Pugliese, A. MicroRNAs: markers of β-cell stress and autoimmunity. Curr. Opin. Endocrinol. Diabetes Obes. 25, 237–245 (2018).
Liu, R. et al. MicroRNA-21 promotes pancreatic β cell function through modulating glucose uptake. Nat. Commun. 13, 3545 (2022).
Wong, W., Farr, R., Joglekar, M., Januszewski, A. & Hardikar, A. Probe-based real-time PCR approaches for quantitative measurement of microRNAs. J. Vis. Exp. 98, e52586 (2015).
Sims, E. K., Evans-Molina, C., Tersey, S. A., Eizirik, D. L. & Mirmira, R. G. Biomarkers of islet beta cell stress and death in type 1 diabetes. Diabetologia 61, 2259–2265 (2018).
Brenu, E. W., Harris, M. & Hamilton-Williams, E. E. Circulating biomarkers during progression to type 1 diabetes: a systematic review. Front. Endocrinol. 14, 1117076 (2023).
Gala-Lopez, B. L. et al. Beta cell death by cell-free DNA and outcome after clinical islet transplantation. Transplantation 102, 978–985 (2018).
Dewar, A. L. et al. Macrophage colony-stimulating factor receptor c-fms is a novel target of imatinib. Blood 105, 3127–3132 (2005).
Druker, B. J. et al. Effects of a selective inhibitor of the Abl tyrosine kinase on the growth of Bcr–Abl positive cells. Nat. Med. 2, 561–566 (1996).
Hoemberger, M., Pitsawong, W. & Kern, D. Cumulative mechanism of several major imatinib-resistant mutations in Abl kinase. Proc. Natl Acad. Sci. USA 117, 19221–19227 (2020).
Hu, X. et al. Imatinib inhibits CSF1R that stimulates proliferation of rheumatoid arthritis fibroblast-like synoviocytes. Clin. Exp. Immunol. 195, 237–250 (2019).
Krause, D. S. & Van Etten, R. A. Tyrosine kinases as targets for cancer therapy. N. Engl. J. Med. 353, 172–187 (2005).
Lee, K. et al. Lck is a key target of imatinib and dasatinib in T-cell activation. Leukemia 24, 896–900 (2010).
O’Brien, S. G. et al. Imatinib compared with interferon and low-dose cytarabine for newly diagnosed chronic-phase chronic myeloid leukemia. N. Engl. J. Med. 348, 994–1004 (2003).
Salah, E. et al. Crystal structures of ABL-related gene (ABL2) in complex with imatinib, tozasertib (VX-680), and a type I inhibitor of the triazole carbothioamide class. J. Med. Chem. 54, 2359–2367 (2011).
Seeliger, M. A. et al. c-Src binds to the cancer drug imatinib with an inactive Abl/c-Kit conformation and a distributed thermodynamic penalty. Structure 15, 299–311 (2007).
Padilla-Martínez, F., Collin, F., Kwasniewski, M. & Kretowski, A. Systematic review of polygenic risk scores for type 1 and type 2 diabetes. Int. J. Mol. Sci. 21, 1703 (2020).
Lynn, F. C. et al. MicroRNA expression is required for pancreatic islet cell genesis in the mouse. Diabetes 56, 2938–2945 (2007).
Joglekar, M. V., Parekh, V. S., Mehta, S., Bhonde, R. R. & Hardikar, A. A. MicroRNA profiling of developing and regenerating pancreas reveal post-transcriptional regulation of neurogenin3. Dev. Biol. 311, 603–612 (2007).
Bonner-Weir, S., Baxter, L. A., Schuppin, G. T. & Smith, F. E. A second pathway for regeneration of adult exocrine and endocrine pancreas. A possible recapitulation of embryonic development. Diabetes 42, 1715–1720 (1993).
Sebastiani, G. et al. A set of circulating microRNAs belonging to the 14q32 chromosome locus identifies two subgroups of individuals with recent-onset type 1 diabetes. Cell Rep. Med. 5, 101591 (2024).
Assmann, T. S., Recamonde-Mendoza, M., De Souza, B. M. & Crispim, D. MicroRNA expression profiles and type 1 diabetes mellitus: systematic review and bioinformatic analysis. Endocr. Connect. 6, 773–790 (2017).
Boxberger, N., Hecker, M. & Zettl, U. K. Dysregulation of inflammasome priming and activation by microRNAs in human immune-mediated diseases. J. Immunol. 202, 2177–2187 (2019).
Simpson, L. J. & Ansel, K. M. MicroRNA regulation of lymphocyte tolerance and autoimmunity. J. Clin. Invest. 125, 2242–2249 (2015).
Rawshani, A. et al. Excess mortality and cardiovascular disease in young adults with type 1 diabetes in relation to age at onset: a nationwide, register-based cohort study. Lancet 392, 477–486 (2018).
Januszewski, A. S. et al. Insulin micro-secretion in type 1 diabetes and related microRNA profiles. Sci. Rep. 11, 11727 (2021).
So, M. et al. Characterising the age-dependent effects of risk factors on type 1 diabetes progression. Diabetologia 65, 684–694 (2022).
Leete, P. et al. The effect of age on the progression and severity of type 1 diabetes: potential effects on disease mechanisms. Curr. Diab. Rep. 18, 115 (2018).
Marcovecchio, M. L. et al. The INNODIA Type 1 Diabetes Natural History Study: a European cohort of newly diagnosed children, adolescents and adults. Diabetologia 67, 995–1008 (2024).
Gadd, D. A. et al. Blood protein assessment of leading incident diseases and mortality in the UK Biobank. Nat. Aging 4, 939–948 (2024).
Lynam, A. L. et al. Logistic regression has similar performance to optimised machine learning algorithms in a clinical setting: application to the discrimination between type 1 and type 2 diabetes in young adults. Diagn. Progn. Res. 4, 6 (2020).
Harlan, D. M., Kenyon, N. S., Korsgren, O., Roep, B. O. & Immunology of Diabetes Society Current advances and travails in islet transplantation. Diabetes 58, 2175–2184 (2009).
Kale, A. & Rogers, N. M. No time to die—how islets meet their demise in transplantation. Cells 12, 796 (2023).
Redondo, M. J. & Morgan, N. G. Heterogeneity and endotypes in type 1 diabetes mellitus. Nat. Rev. Endocrinol. 19, 542–554 (2023).
Harding, J. L. et al. The incidence of adult-onset type 1 diabetes: a systematic review from 32 countries and regions. Diabetes Care 45, 994–1006 (2022).
Lin, J. et al. Distinct transcriptomic profiles in children prior to the appearance of type 1 diabetes-linked islet autoantibodies and following enterovirus infection. Nat. Commun. 14, 7630 (2023).
Kelly, M. A. et al. HLA typing and immunological characterization of young-onset diabetes mellitus in a Hong Kong Chinese population. Diabet. Med. 18, 22–28 (2001).
Joglekar, M. V. et al. A pro-endocrine pancreatic islet transcriptional program established during development is retained in human gallbladder epithelial cells. Cell. Mol. Gastroenterol. Hepatol. 13, 1530–1553 (2022).
Wong, W. K. et al. The long noncoding RNA MALAT1 predicts human pancreatic islet isolation quality. JCI Insight 5, e129299 (2019).
Joglekar, M. V. et al. Postpartum circulating microRNA enhances prediction of future type 2 diabetes in women with previous gestational diabetes. Diabetologia 64, 1516–1526 (2021).
Hardikar, A. A., Farr, R. J. & Joglekar, M. V. Circulating microRNAs: understanding the limits for quantitative measurement by real-time PCR. J. Am. Heart Assoc. 3, e000792 (2014).
Tuszynski, J. & Dietze, M. caTools: Tools: Moving Window Statistics, GIF, Base64, ROC AUC, etc. R package version 1.18.3 (2024).
Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 28, 1–26 (2008).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2022).
Posit team. RStudio: Integrated Development Environment for R. RStudio version 2023.12.1-402 http://www.posit.co/ (Posit Software, 2023).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Wang, Y. et al. shinyCircos-V2.0: leveraging the creation of Circos plot with enhanced usability and advanced features. Imeta 2, e109 (2023).
Brorsson, C. A. et al. Novel association between immune-mediated susceptibility loci and persistent autoantibody positivity in type 1. Diabetes 64, 3017–3027 (2015).
Oram, R. A. et al. A type 1 diabetes genetic risk score can aid discrimination between type 1 and type 2 diabetes in young adults. Diabetes Care 39, 337–344 (2016).
Machiela, M. J. & Chanock, S. J. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics 31, 3555–3557 (2015).
Hardikar, H. P. et. al. Applicability of a microRNA-based Dynamic Risk Score (DRS) for type 1 diabetes. Code Ocean https://doi.org/10.24433/CO.4476520.v1 (2025).
Acknowledgements
We acknowledge the contribution from cadaveric tissue donors and thank the study participants, their families and all support staff across multiple study sites in Australia, Canada, Denmark, Hong Kong SAR People’s Republic of China, India, New Zealand and the United States. We acknowledge the approvals through the Health Ministry Screening Committee (KEM Hospital Research Centre ID: 1510), Government of India, for the study samples from India, the Physiology Department, DY Patil University and the KEM Hospital Research Centre for their infrastructure through the Hinduja Foundation, the Mukul Madhav Foundation, and the Nityasha Foundation. R.C.W.M. acknowledges support from the Research Grants Council of the Hong Kong Special Administrative Region (ref. no: T12-402/13N and CU R4012-18). A.A.H. acknowledges infrastructure support through the University of Sydney, Ingham Institute, the Translational Health Research Institute and the School of Medicine, Western Sydney University, and from all the study team and collaborators on this project. M.V.J. has received research funding from Breakthrough T1D (advanced postdoctoral fellowship no. 3-APF-2016-178-A-N and transition award no. 1-FAC-2021-1063-A-N) and an Australia-India early and mid-career fellowship from the Australian Academy of Science. W.K.M.W. is supported through a Breakthrough T1D postdoctoral fellowship (no. 3-PDF-2023-1324-A-N). F.P. has received funding from INNODIA (IMI2-JU 115797) and INNODIA HARVEST (IMI2-JU 948268). S.E.G. has received funding from the Juvenile Diabetes Research Foundation (JDRF) (Gleevec Study). A.A.H. has received funding from the JDRF Australia and the Commonwealth of Australia through a special research initiative of the Australian Research Council (nos. 4-CDA-2016-228-M-B and 1-CDA-2021-1084-M-B); JDRF Australia in collaboration with The Leona M. and Harry B. Helmsley Charitable Trust (nos. 3-SRA-2019-694-M-B and 1-SRA2021-1067-MB) to Western Sydney University; Breakthrough T1D (no. 3-SRA-2022-1263-S-B); a National Health and Medical Research Council Ideas grant no. 2021/GNT2011557; Danish Diabetes & Endocrine Academy (NNF grant #NNF17SA0031406) and Western Sydney University support through the Ainsworth Medical Research Funding. None of the funders had any role in study design, data collection and analysis, or writing of the manuscript. Other members of the PREDICT T1D study group are listed at the end of the article as consortium authors.
Author information
Authors and Affiliations
Consortia
Contributions
A.A.H. is the study chair and wrote the first draft of the manuscript. A.A.H. and M.V.J. conceptualized and planned the study. M.V.J., R.J.F., R.H., C.L.M., W.K.M.W. and A.A.H. carried out the laboratory analyses. S.K., A.A.A., K.C.D., A.J.J., M.E.C., M.K.P., J.C.N.C., A.M.J.S., W.M.H., C.S.Y., R.C.W.M., E.A.D., T.W.J., S.E.G. and F.P. collected the clinical data. M.V.J., W.K.M.W., S.K., P.S.K., H.P.H., R.A.K., V.T., I.A., A.P., N.H.T.P., M.C. and A.A.H. analyzed the data. A.A.H., M.V.J. and S.E.G. acquired the funding. A.A.H., M.V.J. and W.K.M.W. managed the project. A.A.H., M.V.J., W.K.M.W., P.S.K. and H.P.H. were the core committee and finalized the manuscript. All authors critically reviewed and contributed sections of the manuscript. R.H. and R.J.F. provided the consumer perspective. All authors were involved in the conduct of the study and the collection and review of the study data. A.A.H., M.V.J., W.K.M.W., H.P.H., P.S.K., R.A.K., N.H.T.P. and M.C. verified the data. The writing group had full access to all data and decided to submit the manuscript for publication. H.P.H. and A.A.H. organized and finalized the code shared on the Code Ocean platform with the writing group. All authors reviewed and commented on several versions of the manuscript. The core writing group made appropriate revisions in consultation with the relevant authors to submit this revised manuscript.
Corresponding authors
Ethics declarations
Competing interests
S.E.G. has served on advisory boards for Abata, Genentech, GentiBio, Provention Bio, SAB Biotherapeutics, Sanofi and Shoreline Biosciences. He has received support from Provention Bio, Sanofi and the National Institutes of Health for his roles in conducting clinical trials. He serves on data and safety monitoring boards for Diamyd Medical, Breakthrough T1D and INNODIA. F.P. has received advisory and lecture fees from Sanofi Aventis. A.A.H. has served on the advisory boards of Abbott and Mylan, and has received grants through Breakthrough T1D and the Novo Nordisk Foundation to identify the biomarkers and regulators of diabetes progression. He has been funded through The Leona M. and Harry B. Helmsley Charitable Trust to develop a nanotechnology-based method for miRNA detection. The other authors declare no competing interests.
Peer review
Peer review information
Nature Medicine thanks Xiantong Zou and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Liam Messin and Sonia Muliyil, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 PREDICT T1D microRNA profiling following in vitro human islet cell death.
Freshly isolated human islet preparations (n = 6, identified by the site and/donor ID) were exposed to different concentrations (0 mM, 1 mM, 10 mM) of a nitric oxide (NO)-donor (sodium nitroprusside; SNP). Bidirectional hierarchical clustering identified increased levels of PREDICT T1D microRNAs in supernatant with increasing concentrations of sodium nitroprusside. Data are presented as normalised Ct (cycle threshold) values, as described in methods, and represent microRNA abundance as measured by TaqMan qRT-PCR. A red colour denotes higher abundance (lower Ct-value) of microRNA in the supernatant, while white colour indicates no detectable expression (Ct-value > 39). Replicates from different experiments for the same donor are identified by the same donor ID.
Extended Data Fig. 2 Context-wise expression of PREDICT T1D microRNAs.
Relative expression of the 50 PREDICT T1D microRNAs in the circulation of study participants from four contexts (AUS, HKG, IND, DNK). AUS Control n = 209, T1D n = 519; HKG Control n = 118, T1D n = 120; IND Control n = 133, T1D n = 497; DNK siblings of individuals with T1D n = 292, T1D n = 316. Further details of study participants are provided in Supplementary Table 2. The Y-axis presents microRNA transcript abundance (fold-over-detectable) with data for each of the four contexts and all four contexts together (“All Four”) on the X-axis. Each dot in the scatter plot denotes the microRNA expression level for a single individual, with a green colour for Control samples, while a dark red colour for individuals with T1D. For DNK, siblings of individuals with T1D are presented with a lighter red colour. Data presents the geometric mean with 95% CI (solid lines). Significance is calculated using Kruskal-Wallis test with uncorrected Dunn’s multiple comparison. NS= not significant; *=p < 0.05; **=p < 0.01; ***=p < 0.001, ****=p < 0.0001.
Extended Data Fig. 3 Genomic ___location and collinearity of PREDICT T1D microRNAs across four contexts.
Circos plot providing the relative genomic ___location for each of the PREDICT T1D microRNAs. Links connecting different microRNAs represent their collinearity (two- sided spearman correlation coefficient >0.9 and p < 0.05) within the four contexts (AUS n = 728, DNK n = 608, HKG n = 238, IND n = 630), each presented with a different colour as indicated below the plot.
Extended Data Fig. 4 Association between PREDICT T1D microRNAs and autoantibody expression.
A sub-analysis was carried out to assess correlation between autoantibody levels (IA2A, GADA, ZnT8R and ZnT8W) and PREDICT T1D microRNA abundance for participants from DNK, wherein autoantibody measurements for a subset of siblings and cases (T1D) were available. All statistically significant two-sided spearman correlations (P < 0.05) are indicated with a red (positive correlation) or blue (negative correlation) fill colour; a white fill represents no significant correlation. The correlation coefficient is presented within the correlation matrix as per the colour shade legend (right). Data represent siblings (n = 168) who had a first-degree relative (parent and/or sibling) with T1D and n = 96 individuals who were clinically diagnosed with T1D.
Extended Data Fig. 5 Associations between age at onset, age at sampling and/or diabetes duration with PREDICT T1D microRNAs.
A sub-analysis was carried out to assess the correlation between age at sampling, age at onset, diabetes duration and the PREDICT T1D microRNA abundance for a subset of the T1D study participants from DNK (n = 237) and HKG (n = 101). All statistically significant two-sided spearman correlations (P < 0.05) are indicated with a red (positive correlation) or blue (negative correlation) fill colour; a white fill represents no significant correlation. The correlation coefficient is presented within the correlation matrix as per the colour shade legend (right).
Extended Data Fig. 6 The relative importance of PREDICT T1D microRNAs across four contexts.
The relative importance of all PREDICT T1D microRNAs and age at sample collection is presented as percent contribution for each of the four contexts: Australia (Red), Denmark (Blue), Hong Kong (Yellow) and India (Green). Mean Decrease Gini (MDG) index was estimated through random forest ML-based algorithm to rank the variables by their relative importance.
Extended Data Fig. 7 Capacity of pre- transplant (pre-Tx) baseline clinical and/or biochemical parameters in discriminating participant’s responsiveness to therapy.
(a-h) Comparison of available baseline clinical/biochemical features to assess their capacity to discriminate study participants into two treatment response groups – those that showed no/low (0– 0.10U/Kg/day, n = 8, (panel h), 10 (panel c, e, f), 12 (panel g) or 13 (panel a, b, d) study participants) exogenous insulin requirement at 1-month post-Tx (blue) vs those that required higher (>0.12U/Kg/day, n = 10 (panel h), 15 (panel c, e, f), 16 (panel b) or 18 (panel a, d, g) individuals) exogenous insulin at 1-month post-Tx (orange). The number of participants in each comparison varied based on data availability. Statistical significance is presented based on two-sided t-test between the two groups. Data presents the minimum, 25th percentile, median, 75th percentile and maximum values in the box and whiskers graphs. (i) A dimensionality reduction algorithm was used to see if all clinical variables together offered reliable stratification of the no/low (n = 6) vs high exogenous insulin requirement (n = 9).
Extended Data Fig. 8 Capacity of baseline variables in discriminating participant’s responsiveness to imatinib therapy.
(a-h) Comparison of available baseline clinical/biochemical features to assess their potential in stratifying clinical trial participants to two treatment response groups – those that were in the Lower Quartile (LQ) of drug response (blue, n = 11/group, panel d with n = 10) vs those in the Upper Quartile (UQ) of the drug response (orange, n = 11). Statistical significance is presented based on two-sided Welch’s t-test between the two groups. Data presents the minimum, 25th percentile, median, 75th percentile and maximum values in the box and whiskers graphs. (i) A dimensionality reduction algorithm was used to see if all available baseline clinical variables together could stratify the LQ (n = 11) and UQ (n = 11) participants. DV=dependant variables. (j) The expression of 17 of the PREDICT T1D microRNAs (indicated on the Y-axis) was significantly different across the UQ (n = 11) and LQ (n = 11) of imatinib clinical trial participants. All microRNAs (most targeting tyrosine kinase) were significantly higher in participants who best responded to imatinib therapy. Statistical significance across the two groups is presented for each microRNA comparison based on a one-sided Welch’s t-test. Data presents the median (red solid line), quartiles (blue dotted line) and distribution in the violin plots.
Extended Data Fig. 9 Usability of the eDRS4C risk score in different scenarios.
Plasma of a single individual was assessed for the PREDICT T1D microRNAs and autoantibodies from 14 months of age to 60 months of age (Clinical diagnosis of T1D). a) shows the PREDICT T1D microRNA-based eDRS4C at different timepoints during progression to T1D. The eDRS4C was already high (>60% T1D probability) at the first measurement (14 months), increasing to >75% T1D probability at 18 months of age, and remaining high (>60% probability) thereafter. Islet autoantibodies (b) increased in circulation at later time points (from week 39 onwards). Similarly, the PREDICT T1D microRNAs can be used in an anomaly detection algorithm to identify individuals within a cohort (first-degree T1D relatives from DNK) who could be further risk-stratified to T1D progressors and non-progressors. c) An isolation forest (anomaly detection) plot using existing biomarkers of T1D risk (GRS, autoantibodies and age), T1D progressors (n = 4), and non-progressor (n = 159) d) An isolation forest (anomaly detection) plot using the top 10 features (see Fig. 2d) of this microRNA-based T1D risk score. The red dots indicate siblings predicted to be at the highest risk of progression to T1D, while the blue dots represent those at lower risk of T1D. Four of these individuals (S1137, S1213, S1338, S3210) within this cohort progressed to T1D in 12 years from sample collection (T1D progressors n = 4, and non-progressors n = 288). Those labelled in a red-coloured font are correctly identified as progressors using the existing (GRS, autoantibodies, and age; c) or the top-10 features of the microRNA-based (d) risk scores.
Extended Data Fig. 10 A comparison of existing studies evaluating plasma/serum microRNAs in T1D.
A comparison of studies that assessed the expression of microRNAs in plasma/serum from healthy controls and individuals with type 1 diabetes. The red bubbles indicate candidate microRNA studies, whilst the Discovery- Validation study designs are marked with blue and green bubbles respectively, with a connecting line indicating validation of specific (discovery) miRNAs within that study. The X-axis indicates the year of publication, the Y- axis indicates the number of microRNAs assessed in the study, the size of the bubble represents the number of study participants, while the colour of the bubble indicates the study type (Candidate, Discovery, Validation, Replication, Application). The microRNAs measured and presented in this study are placed on the rightmost end (based on this preprint’s submission date for 2025). Studies included in this bubble plot are listed in Supplementary Table 8.
Supplementary information
Supplementary Information
Supplementary Fig. 1, Tables 1–11 and references.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Joglekar, M.V., Wong, W.K.M., Kunte, P.S. et al. A microRNA-based dynamic risk score for type 1 diabetes. Nat Med (2025). https://doi.org/10.1038/s41591-025-03730-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41591-025-03730-7