Table 2 Diagnostic performance of the transformer-based model on a pixel level after 250 training epochs and additional finetuning.

From: Detection and localization of caries and hypomineralization on dental photographs with a vision transformer model

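Diagnostic performance after 250 training epochs (baseline training)

| Diagnostic category | Subcategory | Total pixel number (N × 10⁶) | F1 | IoU | Average precision | Accuracy |
|---|---|---|---|---|---|---|
| Caries | Non-cavitation | 6.096 | 0.595 | 0.423 | 0.683 | 0.983 |
| Caries | Grayish translucency | 0.260 | 0.347 | 0.210 | 0.420 | 0.999 |
| Caries | Enamel breakdown | 0.139 | 0.161 | 0.088 | 0.468 | 0.999 |
| Caries | Dentin cavity | 2.713 | 0.763 | 0.617 | 0.751 | 0.995 |
| Caries | Fully destructed tooth | 0.922 | 0.630 | 0.460 | 0.542 | 0.997 |
| Molar–incisor hypomineralization (chalky teeth) | Demarcated opacity | 3.715 | 0.586 | 0.423 | 0.657 | 0.990 |
| Molar–incisor hypomineralization (chalky teeth) | Enamel disintegration | 0.688 | 0.604 | 0.433 | 0.674 | 0.998 |
| Molar–incisor hypomineralization (chalky teeth) | Atypical restoration | 1.552 | 0.669 | 0.503 | 0.704 | 0.996 |
| None | | 246.057 | 0.984 | 0.969 | 0.980 | 0.970 |
| Total | | 262.142 | 0.962 | 0.937 | 0.961 | 0.964 |

Diagnostic performance after 250 training epochs + finetuning

| Diagnostic category | Subcategory | Total pixel number (N × 10⁶) | F1 | IoU | Average precision | Accuracy |
|---|---|---|---|---|---|---|
| Caries | Non-cavitation | 6.386 | 0.773 | 0.630 | 0.813 | 0.990 |
| Caries | Grayish translucency | 0.292 | 0.746 | 0.595 | 0.743 | 0.999 |
| Caries | Enamel breakdown | 0.136 | 0.521 | 0.352 | 0.588 | 0.999 |
| Caries | Dentin cavity | 2.471 | 0.818 | 0.692 | 0.830 | 0.997 |
| Caries | Fully destructed tooth | 1.674 | 0.881 | 0.787 | 0.882 | 0.999 |
| Molar–incisor hypomineralization (chalky teeth) | Demarcated opacity | 4.758 | 0.804 | 0.672 | 0.827 | 0.993 |
| Molar–incisor hypomineralization (chalky teeth) | Enamel disintegration | 0.322 | 0.673 | 0.507 | 0.669 | 0.999 |
| Molar–incisor hypomineralization (chalky teeth) | Atypical restoration | 1.566 | 0.906 | 0.829 | 0.902 | 0.999 |
| None | | 244.539 | 0.990 | 0.979 | 0.988 | 0.981 |
| Total | | 262.144 | 0.977 | 0.959 | 0.977 | 0.978 |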
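
The table does not say how the pixel-level metrics were computed. As a minimal sketch, assuming the standard one-vs-rest definitions over pooled pixel counts, F1, IoU and accuracy for one diagnostic category could be derived as below. The function names (`confusion_counts`, `pixel_metrics`, `iou_from_f1`) and the integer label encoding are hypothetical and not taken from the paper. Average precision is left out because it needs per-pixel confidence scores rather than hard labels.

```python
from typing import Dict, Iterable, Tuple


def confusion_counts(pred: Iterable[int], truth: Iterable[int],
                     label: int) -> Tuple[int, int, int, int]:
    """Count TP, FP, FN, TN for one class over flattened label masks."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == label and t == label:
            tp += 1          # pixel correctly assigned to the class
        elif p == label:
            fp += 1          # predicted as the class, but annotated otherwise
        elif t == label:
            fn += 1          # annotated as the class, but missed
        else:
            tn += 1          # correctly left out of the class
    return tp, fp, fn, tn


def pixel_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard definitions (assumed, not confirmed by the source)."""
    denom_f1 = 2 * tp + fp + fn
    denom_iou = tp + fp + fn
    total = tp + fp + fn + tn
    return {
        "f1": 2 * tp / denom_f1 if denom_f1 else 0.0,
        "iou": tp / denom_iou if denom_iou else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


def iou_from_f1(f1: float) -> float:
    """Identity between F1 (Dice) and IoU computed from the same counts."""
    return f1 / (2.0 - f1)


if __name__ == "__main__":
    # Toy masks: label 1 is the category of interest, 0 is everything else.
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    counts = confusion_counts(pred, truth, label=1)
    print(counts, pixel_metrics(*counts))
    print(round(iou_from_f1(0.763), 3))
```

Under these definitions, `iou_from_f1(0.763)` rounds to 0.617, the IoU reported for dentin cavity after baseline training. The identity holds per class only when both metrics come from the same pooled counts, so aggregated rows need not satisfy it. The sketch also shows why accuracy stays near 1 for rare categories: true negatives dominate the count when most pixels belong to "None", so F1 and IoU are the more informative columns for small classes.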