Fig. 4: Analyses of the evolution of wake-abstraction-dreaming cycles.
From: A data-driven group retrosynthesis planning model inspired by neurosymbolic programming

a Comparison of exact match accuracy between our single-step model and the baselines (G2Gs [38], GraphRetro [39], Retrosim [33], Transformer [36], Megan [37], Neuralsym [34], and GLN [35]) when making template selection. A prediction counts as an exact match when its set of reactants, written in canonical SMILES, is identical to the ground truth reactants. Top-K accuracy is the fraction of test cases for which a correct prediction appears among the model's top K predictions. Experiments are run over 10 random seeds and the average results are reported; the same applies to the other panels in this figure. The shaded area spans the range between the minimum and maximum values; the same applies to panel c. b Growth of the library size during the abstraction phase across the evolution cycles, with growth patterns shown in different colors according to occurrence frequency. The lower and upper whiskers represent the minimum and maximum of the ten data points, respectively; the same applies to panels d–h. c Accuracy of our single-step model across the evolution cycles. d Successfully solved retrosynthetic tasks across the evolution cycles, plotted by the distribution of synthesis route lengths. e, f Ablation study on Retro-190 and on the larger and more challenging dataset, respectively, comparing (1) the wake module only (W), (2) the wake and abstraction modules (WA), (3) the wake and dreaming modules (WD), and (4) all modules (WAD). g, h Performance of the abstraction module at different frequency thresholds (ζ) on Retro-190 and on the larger and more challenging dataset, respectively. Source data are provided as a Source Data file.
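The exact match and top-K definitions used in panel a can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names `reactant_set` and `top_k_accuracy` are hypothetical, and the sketch assumes every SMILES string has already been canonicalized upstream (for example with a cheminformatics toolkit), so that string equality is a valid test of molecular identity.

```python
from typing import FrozenSet, Sequence


def reactant_set(smiles: str) -> FrozenSet[str]:
    """Split a dot-separated SMILES string into an order-independent set.

    Assumption: each reactant is already in canonical SMILES form.
    """
    return frozenset(part for part in smiles.split(".") if part)


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    ground_truths: Sequence[str],
    k: int,
) -> float:
    """Fraction of test cases whose ground truth reactant set appears
    among the first k ranked predictions (exact set match)."""
    if len(ranked_predictions) != len(ground_truths):
        raise ValueError("need one ranked prediction list per ground truth")
    if not ground_truths:
        return 0.0
    hits = 0
    for predictions, truth in zip(ranked_predictions, ground_truths):
        truth_set = reactant_set(truth)
        # Only the k highest-ranked candidates are considered.
        if any(reactant_set(p) == truth_set for p in predictions[:k]):
            hits += 1
    return hits / len(ground_truths)


# Toy data (made up for illustration): two test cases, two candidates each.
predictions = [
    ["CCO.CC(=O)O", "CCO"],       # first candidate matches, reactants reordered
    ["CCN", "CC(=O)Cl.CCN"],      # only the second candidate matches
]
truths = ["CC(=O)O.CCO", "CCN.CC(=O)Cl"]

top1 = top_k_accuracy(predictions, truths, k=1)
top2 = top_k_accuracy(predictions, truths, k=2)
```

Comparing reactants as sets makes the match independent of the order in which they are written, which is one reading of "the set of reactants" in the caption. A set also collapses repeated reactants, so a multiset would be needed if stoichiometric duplicates matter.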