Fig. 2: Scenario 1: Deployment to a new site—breast screening task. | Nature Communications

From: Automatic correction of performance drift under acquisition shift in medical image classification

Left column: specificity as a function of sensitivity, before and after prediction alignment. For this analysis, we sample an evaluation set (2500 cases) and a disjoint alignment set (1000 cases) from all available cases; this sampling is repeated 500 times with replacement. Sensitivity, specificity and ROC-AUC are measured on each of the 500 bootstrap samples. We report the average over the bootstrap samples, and error bars depict the 95% bootstrap confidence interval for each metric. Right column: the difference between sensitivity and specificity before and after alignment. Boxplots are constructed from 500 repeated samplings of the evaluation and alignment sets; each box shows the 25th, 50th and 75th percentiles of the bootstrap distribution, whiskers denote the 5th and 95th percentiles, and any point outside this range is shown as an outlier. UPA is effective at recovering the desired sensitivity/specificity balance across all out-of-distribution datasets. Source data are provided as a Source Data file.
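
The resampling protocol in the caption can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes binary labels and model scores held in NumPy arrays, a fixed decision threshold, and one reading of the caption in which each repetition draws a fresh disjoint evaluation/alignment split from the full pool. The function names (`sample_disjoint_sets`, `sensitivity_specificity`, `bootstrap_metrics`) are hypothetical, and the alignment step itself is left out.

```python
import numpy as np


def sample_disjoint_sets(rng, n_cases, n_eval=2500, n_align=1000):
    """Draw disjoint evaluation and alignment index sets from n_cases cases."""
    perm = rng.permutation(n_cases)
    return perm[:n_eval], perm[n_eval:n_eval + n_align]


def sensitivity_specificity(y_true, y_score, threshold):
    """Sensitivity and specificity of thresholded scores against binary labels."""
    y_pred = y_score >= threshold
    pos = y_true == 1
    neg = ~pos
    sens = (y_pred & pos).sum() / max(pos.sum(), 1)
    spec = (~y_pred & neg).sum() / max(neg.sum(), 1)
    return float(sens), float(spec)


def bootstrap_metrics(y_true, y_score, threshold=0.5,
                      n_eval=2500, n_align=1000, n_boot=500, seed=0):
    """Mean and 95% percentile interval of (sensitivity, specificity).

    Assumption: every repetition redraws the evaluation/alignment split.
    The alignment indices are sampled but unused, because the alignment
    method is not part of this sketch.
    """
    rng = np.random.default_rng(seed)
    results = np.empty((n_boot, 2))
    for b in range(n_boot):
        eval_idx, _align_idx = sample_disjoint_sets(
            rng, len(y_true), n_eval, n_align)
        results[b] = sensitivity_specificity(
            y_true[eval_idx], y_score[eval_idx], threshold)
    mean = results.mean(axis=0)
    lower, upper = np.percentile(results, [2.5, 97.5], axis=0)
    return mean, lower, upper
```

Calling `bootstrap_metrics(y_true, y_score)` returns three length-2 arrays (mean, lower bound, upper bound), each ordered as sensitivity then specificity. To reproduce the boxplot convention described in the caption, the same per-repetition results could instead be summarised with the 5th, 25th, 50th, 75th and 95th percentiles.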