Table 1 Comparison of Macformer with different augmentation numbers and MacLS on the ChEMBL test dataset

From: Macrocyclization of linear molecules by deep learning to facilitate macrocyclic drug candidates discovery

| Method | aug.^a | Recovery (%) | Validity (%) | Uniqueness (%) | Novelty_mol (%) | Novelty_linker (%) | Macrocyclization (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Macformer^b | None | 54.85 ± 14.28 | 66.74 ± 2.29 | 63.18 ± 6.38 | 89.30 ± 1.94 | 40.56 ± 2.33 | 95.00 ± 0.74 |
| | ×2 | 96.09 ± 0.61 | 80.34 ± 1.38 | 64.43 ± 0.23 | 91.58 ± 0.15 | 58.91 ± 0.36 | 98.62 ± 0.17 |
| | ×5 | 97.54 ± 0.16 | 81.94 ± 1.42 | 65.36 ± 0.13 | 91.79 ± 0.16 | 62.11 ± 0.65 | 98.80 ± 0.11 |
| | ×10 | 97.02 ± 0.05 | 82.59 ± 1.57 | 64.44 ± 0.46 | 91.76 ± 0.22 | 60.27 ± 0.96 | 98.46 ± 0.04 |
| MacLS_self^c | / | 0.01 ± 0.01 | 17.05 ± 0.29 | 95.33 ± 0.01 | 100 ± 0.00 | 0.00 ± 0.00 | 100 ± 0.00 |
| MacLS_extra^c | / | 4.16 ± 0.20 | 89.65 ± 0.03 | 96.32 ± 0.06 | 99.65 ± 0.02 | 0.00 ± 0.00 | 100 ± 0.00 |

  1. a The fold of augmentation of the ChEMBL training dataset.
  2. b Data are mean ± SD, n = 10 independent experiments using different source SMILES strings. Source data are provided as a Source Data file.
  3. c Data are mean ± SD, n = 3 independent experiments using top 3 low-energy conformations. Source data are provided as a Source Data file.
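
The footnotes report each entry as mean ± SD over independent runs. As a minimal sketch of how such an entry could be produced, the Python snippet below formats a list of replicate scores in that style. The function name `mean_sd` and the replicate values are hypothetical placeholders, not data from the paper, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the source does not state which estimator was used.

```python
from statistics import mean, stdev


def mean_sd(values, digits=2):
    """Format replicate values as 'mean ± SD'.

    Assumption: SD is the sample standard deviation (n - 1 denominator);
    the source table does not say which estimator was used.
    """
    if len(values) < 2:
        raise ValueError("need at least two replicates to compute an SD")
    return f"{mean(values):.{digits}f} ± {stdev(values):.{digits}f}"


# Hypothetical replicate scores in percent (placeholders, not values from the paper).
replicates = [90.0, 92.0, 94.0]
print(mean_sd(replicates))
```

With only n = 3 replicates, as in the MacLS rows, the choice between sample and population SD changes the reported spread noticeably, so the estimator is worth stating alongside any such table.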