Table 2 Summary of results for all experiments. Models are trained on one batch and tested either on the same batch or on the other batch (cross-batch). We test the null hypothesis that the model's testing accuracy with color normalization equals the accuracy obtained with the original H&E images, against the alternative hypothesis that the accuracy with color normalization is higher. p-values are given in parentheses.
From: Impact of stain variation and color normalization for prognostic predictions in pathology
Testing set | Batch A | Batch B |
---|---|---|
Train on Batch A | ||
 Original H&E | 0.81 | 0.53 |
 Traditional method | 0.96 (p = 0.010) | 0.60 |
 Generative method | 0.93 (p = 0.069) | 0.61 |
Train on Batch B | ||
 Original H&E | 0.52 | 0.74 |
 Traditional method | 0.58 | 0.88 (p = 0.033) |
 Generative method | 0.52 | 0.87 (p = 0.033) |
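
The one-sided comparison described in the caption can be illustrated with a short sketch. The caption does not say which statistical test was used or how many test samples each accuracy is based on, so the following is only one plausible choice: a pooled two-proportion z-test, written with the Python standard library. The function name `one_sided_accuracy_test` and the test-set size of 100 are assumptions made for illustration, so the resulting p-value is not expected to match the values in the table.

```python
import math


def one_sided_accuracy_test(correct_base, n_base, correct_norm, n_norm):
    """Pooled two-proportion z-test (illustrative; not necessarily the paper's test).

    H0: accuracy with color normalization == accuracy with original H&E
    H1: accuracy with color normalization >  accuracy with original H&E
    Returns (z, p_value), where p_value is the upper-tail normal probability.
    """
    acc_base = correct_base / n_base
    acc_norm = correct_norm / n_norm
    # Pooled accuracy under H0 (both conditions share one true accuracy).
    pooled = (correct_base + correct_norm) / (n_base + n_norm)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_base + 1.0 / n_norm))
    if se == 0.0:
        # Both accuracies are exactly 0 or exactly 1: no evidence either way.
        return 0.0, 0.5
    z = (acc_norm - acc_base) / se
    # P(Z >= z) for a standard normal variable.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 100 test samples per condition (assumed, not from
    # the source), with 81 and 96 correct predictions respectively.
    z, p = one_sided_accuracy_test(81, 100, 96, 100)
    print(f"z = {z:.3f}, one-sided p = {p:.5f}")
```

A small p-value here would favor the alternative that color normalization improves testing accuracy. If the same test samples are scored under both conditions, a paired test such as McNemar's would likely be more appropriate than this unpaired sketch.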