Fig. 2: Database evaluation based on short-read amplicon data from this study. | Nature Communications

From: MiDAS 5: Global diversity of bacteria and archaea in anaerobic digesters

Before the analyses, the ASVs in each sample were filtered by relative abundance, keeping only ASVs with ≥0.01% relative abundance. Across samples, the retained ASVs represented 95.44% ± 2.23% (mean ± SD) of the microbial community for V1-V3 amplicons (bacteria only), 99.65% ± 0.17% for V3-V5 amplicons (mainly archaea), and 97.34% ± 2.01% for V4 amplicons (bacteria and archaea). High-identity (≥99%) hits were determined by stringent mapping of ASVs to each reference database. ASVs were classified using the SINTAX classifier. The violin and box plots show the distribution of the percentage of ASVs with high-identity hits or genus/species-level classifications for each database, analyzed across 570 biologically independent samples, including two biological replicates for each digester. Box plots indicate the median (middle line), the 25th and 75th percentiles (box), and the minimum and maximum values after removing outliers based on 1.5× the interquartile range (whiskers). Outliers have been removed from the box plots to ease visualization. Colors distinguish the databases: GTDB_bac120_ssu_reps_r214, GTDB_ssu_all_r214, GreenGenes2_2022_10 (backbone and complete database), SILVA 138.1 SSURef NR99, MiDAS 4.8.1, and MiDAS 5.2.
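
The per-sample relative-abundance filter described in the caption can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `filter_asvs`, the dictionary input format, and the example read counts are all hypothetical, and the sketch assumes that relative abundance is computed against the total read count of each sample.

```python
from typing import Dict, Tuple


def filter_asvs(
    counts: Dict[str, int], min_rel_abundance_pct: float = 0.01
) -> Tuple[Dict[str, int], float]:
    """Filter one sample's ASVs by relative abundance (hypothetical helper).

    Returns the ASVs at or above the threshold (in percent) and the
    percentage of the sample's reads that those ASVs represent.
    """
    total = sum(counts.values())
    if total == 0:
        # An empty sample has nothing to keep.
        return {}, 0.0
    kept = {
        asv: n
        for asv, n in counts.items()
        if 100.0 * n / total >= min_rel_abundance_pct
    }
    retained_pct = 100.0 * sum(kept.values()) / total
    return kept, retained_pct


# Made-up read counts for a single sample (100,000 reads in total).
sample = {"ASV1": 90000, "ASV2": 9990, "ASV3": 6, "ASV4": 4}
kept, retained_pct = filter_asvs(sample)
print(sorted(kept), round(retained_pct, 2))
```

In this made-up sample, ASV3 and ASV4 fall below the 0.01% threshold and are dropped. Averaging `retained_pct` over all samples of an amplicon type would give a summary of the kind reported in the caption (mean ± SD).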
