Fig. 2: Information of datasets and performance of FastEI. | Nature Communications

From: Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with million-scale in-silico library

a Molecular classes of the test-set compounds, as predicted by ClassyFire. b UMAP visualization of the extended-connectivity fingerprints (ECFPs) of 240,000 molecules randomly selected from f-CHEMBL and 232,826 molecules from the training set. c Spectrum matching time of FastEI and WCS on libraries of different sizes. d Contributions of the Word2vec embeddings and HNSW to FastEI, compared across four configurations: WCS (weighted binning + cosine similarity), EC (embeddings + cosine similarity), BH (weighted binning + HNSW), and FastEI (embeddings + HNSW). e Recall rates of FastEI and WCS on the test set at different top-x levels. Source data are provided as a Source Data file.
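
For readers unfamiliar with the metric in panel e, the following is a minimal sketch of a recall-at-top-x computation. It assumes that recall at level x means the fraction of query spectra whose correct compound appears among the first x ranked candidates. The function name `recall_at_top_x` and the toy identifiers are hypothetical and are not taken from the FastEI code.

```python
from typing import Sequence


def recall_at_top_x(
    ranked_candidates: Sequence[Sequence[str]],
    true_ids: Sequence[str],
    x: int,
) -> float:
    """Fraction of queries whose true compound is among the top x candidates.

    ranked_candidates[i] is the candidate list for query i, best match first.
    true_ids[i] is the identifier of the correct compound for query i.
    (Hypothetical helper for illustration; not part of FastEI.)
    """
    if x < 1:
        raise ValueError("x must be at least 1")
    if len(ranked_candidates) != len(true_ids):
        raise ValueError("need exactly one true identifier per query")
    if not true_ids:
        return 0.0
    # A query counts as a hit when its true identifier is in the first x entries.
    hits = sum(
        truth in ranked[:x]
        for ranked, truth in zip(ranked_candidates, true_ids)
    )
    return hits / len(true_ids)


# Toy data: three queries with three ranked candidates each (made-up identifiers).
ranked = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"]]
truth = ["A", "E", "Z"]

for level in (1, 2, 3):
    print(level, recall_at_top_x(ranked, truth, level))
```

In the toy data the third query never retrieves its true compound, so recall stops increasing once the second query is recovered at x = 2.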
