Fig. 2: Comparison of chemical space between novel macrocycles generated by Macformer trained with five-fold data augmentation and MacLS_extra, respectively, on ChEMBL test and ZINC datasets. | Nature Communications

From: Macrocyclization of linear molecules by deep learning to facilitate macrocyclic drug candidates discovery

a Distribution of the average Tanimoto coefficient (Tc) between generated novel macrocycles and ground-truth target macrocycles. ChEMBL, Macformer, ×5, n = 23772; ChEMBL, MacLS_extra, n = 23765; ZINC, Macformer, ×5, n = 5514; ZINC, MacLS_extra, n = 5551. b UMAP plot of the 1024-bit Morgan fingerprints of the linkers in the ChEMBL training dataset (n = 9243 linkers) and of the novel linkers generated by Macformer on the ChEMBL test (n = 9039 linkers) and ZINC (n = 2082 linkers) datasets. c Retrospective macrocyclization of a Checkpoint Kinase 1 (CHK1) inhibitor (ref. 64) by Macformer. The Tc values between the generated novel compounds and the target compound are labeled. Source data are provided as a Source Data file.
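
For readers unfamiliar with the metric in panel a, the following is a minimal sketch of how a Tanimoto coefficient, and its average over a set of generated molecules, can be computed from bit fingerprints. It is an illustration only, not the authors' pipeline: the fingerprints are represented here as plain Python sets of on-bit indices, the function names `tanimoto` and `mean_tanimoto` are hypothetical, and the example bit sets are made up rather than derived from real molecules.

```python
from typing import Iterable, Set


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        # Convention assumed for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(fp_a & fp_b) / union


def mean_tanimoto(target: Set[int], generated: Iterable[Set[int]]) -> float:
    """Average Tc between one target fingerprint and several generated ones."""
    scores = [tanimoto(target, fp) for fp in generated]
    if not scores:
        raise ValueError("at least one generated fingerprint is required")
    return sum(scores) / len(scores)


# Made-up on-bit indices standing in for Morgan fingerprint bits.
target_fp = {1, 5, 9, 42}
generated_fps = [{1, 5, 9, 42}, {1, 5, 100}, {7, 8}]

print(mean_tanimoto(target_fp, generated_fps))
```

A value of 1 means the two fingerprints set exactly the same bits, and 0 means they share none. In practice the bit sets would come from a cheminformatics toolkit rather than being written by hand.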
