Fig. 6: STageR - cluster-based epigenetic stage classifier. | Nature Communications

From: Nonlinear DNA methylation trajectories in aging male mice

Fig. 6

(a) Workflow of epigenetic stage-of-life prediction using STageR. First, the overlap between the query dataset and the nonlinear CpG clusters is identified. The median DNA methylation of the overlapping CpGs is then calculated for each cluster, reducing the feature space to three dimensions. STageR performs a multinomial logistic elastic net regression and assigns a probability to each of the three epigenetic stages of life. The stage with the highest probability is highlighted in red in the bar plots. (b) Mean β-coefficients of the STageR multinomial logistic regression for each cluster in the model, per life stage. The bottom panel shows the associated Z-score trajectories. (c) Misclassification error of tenfold cross-validation models fitted with a subsampled number of cytosines in each cluster (x-axis); n = 100 for each box plot. In the box plots, the center line shows the median, the box limits show the first and third quartiles, and the upper and lower whiskers extend from the hinge to the largest or smallest value no further than 1.5× the interquartile range (IQR) from the hinge. (d) STageR prediction for the validation dataset. Left: confusion matrix of predicted life stages (y-axis) for validation samples from four distinct age groups (x-axis). Right: predicted probabilities for life stages (y-axis) for all validation samples (x-axis). Red boxes are drawn around the life stage with the maximum probability. (e) STageR prediction for subsampled data. Confusion matrices and predicted probabilities when using 75, 50, 25, and 10% (from top to bottom) of randomly sampled cytosines from each cluster. (f) STageR predictions for publicly available datasets (refs. 45, 46). Source data are provided as a Source Data file.
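The workflow in panel a can be illustrated with a minimal Python sketch. It is not the STageR implementation: the cluster membership, CpG identifiers, stage labels, and coefficients below are all hypothetical placeholders, and the elastic net fitting step is omitted. The sketch only shows how per-cluster medians of overlapping CpGs give three features, and how a multinomial logistic model of that form turns them into one probability per stage.

```python
import math
from statistics import median

# Hypothetical cluster membership (CpG identifiers are invented for illustration).
CLUSTERS = {
    "cluster_1": {"cpg_a", "cpg_b", "cpg_c"},
    "cluster_2": {"cpg_d", "cpg_e"},
    "cluster_3": {"cpg_f", "cpg_g", "cpg_h"},
}

# Made-up coefficients per stage: (intercept, weight for cluster 1, 2, 3).
# A fitted elastic net model would supply these; the values here are arbitrary.
COEFS = {
    "stage_1": (0.5, -2.0, 1.0, 0.5),
    "stage_2": (0.0, 0.5, -1.0, 0.5),
    "stage_3": (-0.5, 2.0, 0.5, -1.0),
}


def cluster_medians(sample):
    """Reduce a {CpG: methylation value} sample to one median per cluster."""
    features = []
    for name in sorted(CLUSTERS):
        # Keep only the CpGs present in both the query sample and the cluster.
        overlap = [sample[cpg] for cpg in CLUSTERS[name] if cpg in sample]
        if not overlap:
            raise ValueError(f"no overlapping CpGs for {name}")
        features.append(median(overlap))
    return features


def stage_probabilities(features):
    """Softmax over per-stage linear scores (multinomial logistic form)."""
    scores = {
        stage: coef[0] + sum(w * x for w, x in zip(coef[1:], features))
        for stage, coef in COEFS.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {stage: math.exp(score - top) for stage, score in scores.items()}
    total = sum(exps.values())
    return {stage: value / total for stage, value in exps.items()}


def predict_stage(sample):
    """Return the most probable stage and the full probability table."""
    probs = stage_probabilities(cluster_medians(sample))
    return max(probs, key=probs.get), probs
```

A query sample is passed as a dictionary of methylation values, for example `predict_stage({"cpg_a": 0.2, "cpg_d": 0.5, "cpg_f": 0.1})`. CpGs that belong to no cluster are ignored, which mirrors the overlap step described in the caption.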
