Microbiome–metabolite linkages drive greenhouse gas dynamics over a permafrost thaw gradient

Freire-Zapata, Viviana; Holland-Moritz, Hannah; Cronin, Dylan R.; Aroney, Sam; Smith, Derek A.; Wilson, Rachel M.; Ernakovich, Jessica G.; Woodcroft, Ben J.; Bagby, Sarah C.; Rich, Virginia I.; Sullivan, Matthew B.; Stegen, James C.; Tfaily, Malak M.

doi:10.1038/s41564-024-01800-z

Article
Open access
Published: 01 October 2024

Nature Microbiology volume 9, pages 2892–2908 (2024)

Abstract

Interactions between microbiomes and metabolites play crucial roles in the environment, yet how these interactions drive greenhouse gas emissions during ecosystem changes remains unclear. Here we analysed microbial and metabolite composition across a permafrost thaw gradient in Stordalen Mire, Sweden, using paired genome-resolved metagenomics and high-resolution Fourier transform ion cyclotron resonance mass spectrometry guided by principles from community assembly theory to test whether microorganisms and metabolites show concordant responses to changing drivers. Our analysis revealed divergence between the inferred microbial versus metabolite assembly processes, suggesting distinct responses to the same selective pressures. This contradicts common assumptions in trait-based microbial models and highlights the limitations of measuring microbial community-level data alone. Furthermore, feature-scale analysis revealed connections between microbial taxa, metabolites and observed CO₂ and CH₄ porewater variations. Our study showcases insights gained by using feature-level data and microorganism–metabolite interactions to better understand metabolic processes that drive greenhouse gas emissions during ecosystem changes.

Main

Permafrost peatlands are critical carbon (C) reservoirs¹, and as climate warming accelerates thaw, substantial carbon releases are expected. The magnitude of climate feedback depends on both the amount of C released through microbial decomposition and the composition of greenhouse gases (GHG) released (for example, CO₂ versus CH₄), influenced by microbial decomposition of soil organic matter (SOM)^2,3. However, fine-scale interactions affecting ecosystem emissions remain poorly understood. Advances in metabolomics now enable precise characterization of environmental metabolites⁴, providing insights into metabolic mechanisms from microbial community–environment interactions⁵ that structure ecosystem response to environmental changes⁶. Metabolites, produced by microbial reactions in response to physiological or environmental changes⁷, provide direct insight into in situ microbial function⁸. However, the full suite of environmental metabolites^9,10 (the metabolome) results from biotic sources beyond microorganisms, including plant exudates and detritus¹¹ as well as abiotic by-products¹². This complexity can lead to distinct controls shaping metabolome and microbiome assembly, potentially resulting in divergent and asynchronous responses to environmental changes^13,14. To understand these drivers, we applied community assembly theory jointly to microbiomes and metabolomes, treating metabolite ‘communities’ as ecologically analogous to microbial ones¹³. This approach helps disentangle metabolome and microbiome functions, enhancing our grasp of ecosystem biogeochemistry and improving climate change predictions^13,15. We leveraged shotgun metagenomics and high-resolution metabolomics, guided by community assembly principles, to decipher the composition and structure of microbial communities and metabolomes, respectively, across a thaw gradient. This approach is particularly relevant for transitioning ecosystems, such as thawing permafrost, where microbial communities and metabolomes reassemble as conditions change. Community assembly involves deterministic and stochastic factors^16,17 (Table 1), which can be disentangled using phylogenetic null models^18,19. Furthermore, we performed random matrix theory (RMT) network analysis to assess the interactions within these reassembling communities and metabolomes.

Table 1 Terms used in the Article and their definitions

Our focus is on the Stordalen Mire, a model permafrost ecosystem in northern Sweden spanning three thaw-stage habitats: dry permafrost-underlain palsas (with a seasonally thawed active layer), partially thawed and inundated ombrotrophic bogs, and fully thawed and inundated minerotrophic fens. This thaw progression substantially alters microbial communities^3,20, vegetation^21,22 and SOM composition^23,24,25,26, resulting in increased GHG emissions²⁷.

We hypothesized that the primary drivers shaping microbial community and metabolome assembly shifted in response to the unique environmental signatures of each habitat. In palsa environments, we expected selection pressures to dominate assembly, leading to homogenous microbial communities adapted to seasonal thawing events^28,29. The high chemical lability of litter and presence of oxygen probably promoted rapid microbial degradation compared with the anoxic conditions of bogs and fens²⁶. Steep environmental gradients suggested the dominance of deterministic factors, particularly variable selection, in shaping the palsa metabolome. Limited dispersal, encouraged by dry ombrotrophic conditions, probably contributed to microbial and metabolome homogeneity. Bog ecosystems presented a contrasting scenario. Although exposed to a strong selective filter from Sphagnum metabolites and acidity, the bog microbiota probably experienced some local dispersal owing to fluctuating water table levels, potentially mitigating homogenous selection. Sphagnum moss strongly influenced the bog metabolome through litter inputs and bioactive metabolites. We predicted homogenization of the metabolome due to accumulation of recalcitrant metabolites, altered enzymatic activity and anoxic conditions²⁴. For fen environments, we hypothesized that minerotrophic conditions and full inundation promoted microbial dispersal, outweighing selective pressure^28,30. Runoff from surrounding areas and water mixing might have facilitated metabolite dispersal³⁰. We predicted a more variable metabolome shaped by the diverse and active microbial community present supported by the presence of microbially derived compounds with low aromaticity^23,24,25.

We examined the congruence between microbial communities and metabolomes at community and feature scale (microbial lineages and metabolite formulas). We hypothesized that synchrony in ecological assembly processes at the community level signified a stable, coordinated ecosystem response while feature-scale alignment suggested specialized, finely tuned microbial activities and metabolite patterns.

Results

Divergent assembly of metabolites and microorganisms in permafrost

We used paired genome-resolved metagenomics and high-resolution Fourier transform ion cyclotron resonance (FTICR) mass spectrometry to characterize peatland microbial communities and metabolomes. Samples were collected from three habitats (palsa, bog and fen) at different depths (shallow, middle and deep) at the Stordalen Mire between June and August 2012 (Supplementary Data 1). Although we used a metagenome-assembled genome (MAG) database compiled over a longer time frame (2011–2017), containing 13,290 MAGs^31,32, we analysed 1,402 MAGs from 2012 samples with paired metabolome data, detecting 14,432 metabolites (6,763 with assigned molecular formulas). By investigating ecological processes shaping microbiome and metabolome assembly at various scales, our analysis revealed a decoupling between drivers of microbiome and metabolome assembly. Microbial assembly shifted along the permafrost thaw gradient from environmental filtering and biotic interactions towards increased stochasticity, resulting in low compositional turnover within habitats (Fig. 1a,b). Microbial assembly was driven by homogeneous selection in the palsa and homogenizing dispersal in the bog and fen (Fig. 1e). Variable selection contributed somewhat to the fen. In the palsa, limited spatial turnover probably resulted from consistent environmental filters such as dry, ombrotrophic conditions and strong seasonal freezing–thawing cycles. Increased water availability promotes dispersal in the bog and fen. Various ecological factors may shape bog communities, including Sphagnum presence, low pH, nutrient levels³³ and water table fluctuations, creating distinct niches³⁴. The diverse fen SOM composition²² provided various avenues for microbial communities to thrive^3,24,25. In contrast to microbial assembly, metabolome compositions were primarily driven by variable selection and homogenizing dispersal (Fig. 1b–e). The bog had the highest variable selection, followed by the palsa and the fen. Homogenizing dispersal had the most impact on the fen, then the palsa and, lastly, the bog. Contrary to our hypothesis, Sphagnum-derived metabolites did not create a uniform bog metabolome (variable selection was dominant), suggesting that microorganisms can degrade complex polyphenolic compounds³⁵. Abiotic reactions also facilitate Sphagnum leachate breakdown³⁶. Divergent palsa and fen metabolomes may arise from factors systematically affecting production or transformation, akin to ecological selection^13,37. Differing metabolite patterns probably stem from niche specialization, nutrient dynamics and habitat physicochemical properties^23,38. As distinct processes govern microbial versus metabolite assembly, we examined their coordination using Mantel tests within each habitat, finding no significant correlations. However, Spearman correlations revealed strong microbial and metabolite beta nearest taxon index (βNTI) associations in the palsa at monthly and depth scales (Supplementary Data 2), indicating the need for finer-scale analysis to elucidate microorganism–metabolite interactions.
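To make the βNTI metric concrete, the following is a minimal Python sketch of a phylogenetic null model for one pair of samples. It is an illustration under stated assumptions, not the pipeline used in this study: the function names `beta_mntd` and `beta_nti`, the tip-shuffling null and the dictionary-based inputs are all hypothetical choices. By the usual convention, βNTI below −2 is read as homogeneous selection and above +2 as variable selection, with values in between treated as stochastic.

```python
import random
import statistics


def beta_mntd(comm_a, comm_b, dist):
    """Abundance-weighted beta mean nearest taxon distance between two samples.

    comm_a, comm_b: dict mapping taxon -> abundance (> 0, present taxa only).
    dist: nested dict of pairwise distances, with dist[t][t] == 0.
    """
    def one_way(src, dst):
        total = sum(src.values())
        # For each taxon in src, distance to its closest relative in dst.
        return sum((ab / total) * min(dist[t][u] for u in dst)
                   for t, ab in src.items())

    return 0.5 * (one_way(comm_a, comm_b) + one_way(comm_b, comm_a))


def beta_nti(comm_a, comm_b, dist, n_null=999, seed=0):
    """Standardized deviation of observed beta-MNTD from a tip-shuffling null."""
    rng = random.Random(seed)
    taxa = sorted(dist)
    observed = beta_mntd(comm_a, comm_b, dist)
    null_values = []
    for _ in range(n_null):
        shuffled = taxa[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(taxa, shuffled))  # random bijection over all tips
        null_a = {relabel[t]: ab for t, ab in comm_a.items()}
        null_b = {relabel[t]: ab for t, ab in comm_b.items()}
        null_values.append(beta_mntd(null_a, null_b, dist))
    sd = statistics.pstdev(null_values)
    if sd == 0:
        return float("nan")  # null has no spread; the index is undefined
    return (observed - statistics.mean(null_values)) / sd


# Toy example (hypothetical data): taxa placed on a line, distance = gap.
positions = {"a": 0, "b": 1, "c": 10, "d": 11, "e": 20, "f": 21}
toy_dist = {t: {u: abs(positions[t] - positions[u]) for u in positions}
            for t in positions}
sample_1 = {"a": 0.5, "c": 0.5}
sample_2 = {"b": 0.5, "d": 0.5}
print(beta_nti(sample_1, sample_2, toy_dist))
```

In the toy example each taxon in one sample has a very close relative in the other, so the observed distance sits at the low end of the null distribution and the index is negative. For metabolites, a dendrogram built from molecular properties or transformations would stand in for the phylogeny; that substitution is not shown here.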

**Fig. 1: Ecological assembly processes driving active-layer permafrost microbial communities and metabolites.**

Next, we examined the environmental drivers underlying the observed divergent assembly patterns between microbiomes and metabolomes. Depth, C:N ratios and precipitation significantly contributed to shaping palsa and fen metabolomes (Fig. 1f). This highlights the influence of oxygen availability, microbial activity and peat stratification on metabolome assembly^24,26,39. Notably, bog metabolite assemblages showed no significant correlations with environmental factors, suggesting the involvement of unexplored variables. Even though this habitat experienced a higher influence of variable selection (Fig. 1e), the strongest connection with environmental variables was observed in the other habitats suggesting that the bog is potentially organized at the feature scale and driven by the connections and actions of individual features. This highlights the importance of feature-scale microorganism–metabolite connections in the bog and the need for a multidimensional approach to understand the ecological processes shaping permafrost ecosystems.

Microbial and metabolite dynamics at the feature scale

Building on previous work¹⁵, we used a feature-specific beta diversity null modelling approach (βNTI_feature) to identify individual microbial taxa and metabolites influencing ecological variation or similarity within communities. This method pinpoints members with unique responses to ecological pressures^15,40. Although most individual microorganisms minimally affected divergence or convergence (ecological variation or similarity, respectively) (Supplementary Data 3 and 4), a greater proportion of metabolites contributed to metabolome variation within each habitat compared with the microbiome (Supplementary Data 3 and 4, and Supplementary Note 1). This aligns with findings¹⁵ suggesting that the transient nature of metabolomes drives this difference.

Furthermore, our analysis of correlations between microbial genome abundances and metabolite-βNTI_feature-derived clusters unveiled groups influencing ecological chemical variation, similarity and stochasticity (Fig. 2c). More microbial–metabolite correlations were observed in the bog, supporting our findings that this habitat is organized at the feature-scale level (previous section). This suggests that strong microorganism–metabolite interactions are more important in the bog for organic matter diversification. In the bog, cluster 1 metabolites contributed to metabolome divergence with a higher relative abundance of S-containing compounds, while cluster 2 exhibited a prevalence of N-containing compounds contributing to divergence and stochasticity (Fig. 2c). Meanwhile, in the palsa and fen, we identified three metabolite clusters with different contributions to metabolome assembly and significantly different metabolite properties (Supplementary Note 2). Across habitats, clusters contributing towards metabolome convergence or having insignificant contributions had higher abundance of CHO compounds, primarily carbohydrates, aligning with previous observations¹⁵ and suggesting common processes driving chemical similarity across permafrost gradients (Fig. 2d).

**Fig. 2: Metabolite-βNTI_feature-derived clusters correlate with specific microbial taxa and environmental factors.**

Strong positive and negative correlations were evident between genome abundances and metabolite clusters in each habitat (Fig. 2a and Supplementary Data 5), probably arising from the peat spatial complexity, and dynamic C distribution creating microbial niches⁴¹ leading to specialization in the utilization of distinct metabolite pools. This was especially evident in bog cluster 1, in which S-containing metabolite clusters (Fig. 2d), with CHOS elemental composition, were linked to metabolome divergence. Cluster 1 metabolites exhibited characteristics of recalcitrance, potentially Sphagnum-moss-derived compounds. They had a high proportion of aromatic rings (modified aromaticity index (AI_mod) > 0.5 and AI_mod > 0.67); were highly unsaturated (double-bond equivalent minus oxygen (DBE_O) > 0); had a low nominal oxidation state of carbon (NOSC < 0), suggesting low bioavailability⁴²; and were enriched in S-containing condensed hydrocarbon and lignin-like compounds (Fig. 2d and Extended Data Fig. 1c). Previous studies have proposed that Sphagnum litter is the primary source of organic S in peat, serving as the main reservoir for recycling^43,44. These S-containing metabolites probably play a key role in SOM cycling within bogs. Comparing metabolites across plant species and peat soil from the three habitats showed that (1) Sphagnum had a lower NOSC than other plants, (2) S-containing compounds were highly consumed in bogs and (3) newly observed lignin-like compounds (compounds not observed in the plant extracts or the peat) were abundant in the bog and fen²⁶. These findings suggest that bog microbial communities co-evolved to efficiently use these metabolites, consistent with previous substrate addition experiments⁴⁵. Moreover, given low S and N levels in bogs, these metabolites may control SOM decomposition rates in the Stordalen Mire.
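The molecular indices used above (AI_mod, DBE_O and NOSC) can all be computed from the element counts of an assigned formula. The sketch below uses the commonly cited definitions from the FTICR-MS literature and assumes a neutral molecule; whether the study applied exactly these expressions is an assumption, and the function name `molecular_indices` is hypothetical.

```python
def molecular_indices(c, h, o=0, n=0, s=0, p=0):
    """Return DBE, DBE-O, NOSC and AI_mod for a neutral C/H/O/N/S/P formula."""
    # Double-bond equivalents: rings plus double bonds.
    dbe = 1 + 0.5 * (2 * c - h + n + p)
    # DBE minus oxygen: unsaturation not attributable to C=O groups.
    dbe_o = dbe - o
    # Nominal oxidation state of carbon (neutral molecule assumed).
    nosc = 4 - (4 * c + h - 3 * n - 2 * o + 5 * p - 2 * s) / c
    # Modified aromaticity index; non-positive values are set to zero.
    numerator = 1 + c - 0.5 * o - s - 0.5 * (n + p + h)
    denominator = c - 0.5 * o - n - s - p
    ai_mod = max(numerator / denominator, 0.0) if denominator > 0 else 0.0
    return {"DBE": dbe, "DBE_O": dbe_o, "NOSC": nosc, "AI_mod": ai_mod}


# Glucose (C6H12O6): a carbohydrate, not aromatic, NOSC of zero.
print(molecular_indices(c=6, h=12, o=6))
# Benzene (C6H6): a single aromatic ring.
print(molecular_indices(c=6, h=6))
```

Under these definitions, the cluster 1 criteria in the text correspond to `AI_mod > 0.5` (aromatic) or `AI_mod > 0.67` (condensed aromatic), `DBE_O > 0` and `NOSC < 0`.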

Organic S compounds, such as those in cluster 1, may serve as reservoirs for microbial sulfate mobilization⁴⁶ via assimilatory or dissimilatory sulfate reduction. Supporting this mechanism in the bog, 55 MAGs across 9 phyla, mostly Acidobacteria, Verrucomicrobia and Proteobacteria, strongly correlated with cluster 1 metabolites and encoded and expressed genes for S metabolism pathways (Extended Data Fig. 2 and Supplementary Note 2).

Sulfur-containing compounds can also result from sulfide (H₂S) incorporation into organic matter by sulfate-reducing microorganisms (SRM)⁴⁷ during dissimilatory sulfate reduction. SOM degradation in peatlands depends on available terminal electron acceptors, forming a redox gradient⁴⁸. As sulfate reduction is more thermodynamically favourable than fermentation and methanogenesis, SRM can outcompete methanogens for substrates, mitigating methane flux^3,23. SRM have versatile substrates from H₂, fatty acids, alcohols to aromatics⁴⁹. Syntrophic SRM–methanogen interactions occur via H₂, formate, acetate⁵⁰, methyl sulfides⁵¹ or direct electron transfer⁵², further influencing GHG production.

Therefore, bacteria involved in organosulfur transformations and methanogen–methanotroph interactions may be key functional species in the bog. Our analysis identifies these potentially crucial bacteria and suggests their role in driving metabolome divergence. Correlations between bacterial genomes and S-containing metabolite clusters hint at their possible use as terminal electron acceptors, aligning with previous proposals²³. Going beyond simply quantifying inorganic S species, our study identifies specific organosulfur metabolites and associated taxa and elucidates complex bog biogeochemical cycling.

Metabolite clusters further correlated with environmental factors (Fig. 2a and Supplementary Note 2). This analysis links individual microbial and metabolic features to assembly processes and environmental parameters, offering insights into the interplay and scale dependencies governing permafrost microbiome and metabolome dynamics⁵³.

Microbial and metabolite assembly linked to greenhouse gas

To explore the connection between ecological assembly and biogeochemical function at the feature scale, we investigated whether metabolically important genomes (those significantly associated with metabolite clusters) were also associated with varying CO₂ levels in the palsa peat or with dissolved CO₂ and CH₄ concentrations in the bog and fen porewater. Even though the measured GHG concentrations do not directly reflect in situ production rates or aboveground fluxes, their trends can help us understand how microscale processes may influence systemic outputs⁵⁴. Significant correlations between genome abundances and GHG levels were observed only in the bog, with 45 MAGs correlating with CO₂ and 29 MAGs correlating with CH₄ (Supplementary Data 5). Here we focus the discussion only on five of those genomes that were also found to significantly contribute to community assembly (Fig. 3a,b). The lack of significant correlations in other habitats suggests that the relationship between ecological processes and biogeochemical functions may occur at different scales along the thaw gradient. In the bog, where there are strong ecological filters such as low pH and the presence of recalcitrant compounds^24,33, specific features (that is, microorganisms) seem to contribute to GHG production (Fig. 3a,b). Meanwhile, in the palsa and fen, this function may depend on the action of many members of the microbial community, making their contributions less important at the feature level.
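As an illustration of the kind of rank-based association described above, the sketch below computes a Spearman correlation between a genome's abundance across samples and a dissolved gas concentration. The data are hypothetical, the function names are illustrative, and no significance testing or multiple-testing correction is shown; the correlation method and thresholds actually applied to the GHG data are not restated here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical MAG relative abundances and porewater CH4 values per sample.
mag_abundance = [0.02, 0.05, 0.04, 0.11, 0.09, 0.15]
ch4_porewater = [10.0, 22.0, 18.0, 40.0, 45.0, 60.0]
print(spearman_rho(mag_abundance, ch4_porewater))
```

Because the statistic depends only on ranks, it is insensitive to the skewed distributions typical of abundance data, which is one reason it is often preferred over Pearson correlation for this kind of comparison.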

**Fig. 3: Correlation of genomes that significantly correlated with βNTI_feature metabolite clusters with CO₂ and CH₄ in the bog porewater.**

Of the five bacterial genomes that correlated with GHG levels in the bog porewater, two genomes, Terracidiphilus (Acidobacteria) and Fen-455 (Actinobacteria), had positive correlations with dissolved CO₂ and CH₄ concentration. Holophaga (Acidobacteria) correlated solely with CO₂, while RAAP-2 (Actinobacteria) correlated negatively with CO₂ and CH₄ (Supplementary Data 5).

Members of Acidobacteria have been described as the primary degraders of Sphagnum polysaccharides in peat bogs³. These Terracidiphilus genomes expressed a wide set of enzymes for the degradation and utilization of polysaccharides, highlighting their role in breaking down biopolymers⁵⁵ and contributing to core biogeochemical processes. Genomic and gene expression analysis revealed that Terracidiphilus genomes encoded several genes related to central carbon metabolism and S cycling (Fig. 3, Supplementary Note 3 and Extended Data Fig. 3). The Terracidiphilus genome contained genes for assimilatory sulfate reduction.

Importantly, Acidobacteria genomes expressed genes for the production and/or utilization of common methanogenic substrates (that is, acetate, formate). Some expressed Wood–Ljungdahl pathway genes. The correlation of these genomes with porewater CH₄ levels suggests they are key community members providing or competing for methanogen substrates. Our results highlight their versatile metabolism, linking the presence of these microorganisms to GHG production and offering insights relevant for environmental modelling and management strategies.

Microbial–metabolite networks in a permafrost thaw gradient

In our systematic exploration of microbial–metabolite interactions, we applied abundance-based co-occurrence networks using RMT-based network analysis^56,57,58. This method helped us study how metabolites influencing metabolome convergence and divergence interacted with the microbial community within each habitat. The networks showed characteristics typical of complex networks, including scale-free node connectivity, small-world properties and high modularity (Supplementary Data 6). A total of 177 MAGs (palsa = 30, bog = 66, fen = 84) constituted the networks, termed networked microbial communities. Node numbers increased with thaw progression, with distinct interaction patterns within each habitat.

Notably, the bog exhibited the most microorganism–metabolite links, followed by the fen and the palsa. This corroborates our previous analysis that the bog has a strong feature-scale organization. Most microorganism–metabolite interactions were negative (Extended Data Fig. 4b), probably reflecting microbial transformation or utilization of metabolites^23,25,59. Limited microorganism–microorganism connections in the palsa were attributed to its dry ombrotrophic conditions. Conversely, increased water content in the bog and fen was associated with greater microbial and metabolite dispersal, probably contributing to increased microorganism–microorganism, microorganism–metabolite and metabolite–metabolite interactions (Extended Data Fig. 4b). Across all three habitats, we identified nodes that played important roles in shaping network structure and stability, including module hubs, which are highly connected nodes within a network module, and connector nodes, which are extensively linked to multiple modules⁵⁶. Interestingly, the bog exhibited a prevalence of connector nodes. These nodes in all three habitats were predominantly lignin-like compounds with CHO and CHNOP elemental composition, characterized by low NOSC values. CHNOP and CHOSP module hubs and connectors contributed to metabolome divergence (Supplementary Data 6). Lignin-like compounds probably serve as primary nutrient sources for microorganisms. Their degradation can produce polyphenols, which are considered recalcitrant and potentially inhibitory to microbial activity under anaerobic conditions⁴⁸, despite evidence of anaerobic degradation³⁵. Consequently, these compounds and their derived polyphenols may influence GHG emissions in peatlands.
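Module hubs and connectors are typically identified from two per-node statistics: the within-module degree z-score and the participation coefficient P. The sketch below classifies nodes from an edge list and a module assignment. The cut-offs (z > 2.5, P > 0.62) are the values commonly used in the network ecology literature and are assumptions here, as are the function name `node_roles` and the input layout; module detection itself is not shown.

```python
import statistics
from collections import defaultdict


def node_roles(edges, module_of, z_cut=2.5, p_cut=0.62):
    """Classify nodes by within-module degree (z) and participation (P).

    edges: iterable of (u, v) undirected links.
    module_of: dict mapping every node to its module label.
    Returns dict node -> (z, P, role).
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Number of links each node has inside its own module.
    within = {node: sum(1 for m in neighbours[node]
                        if module_of[m] == module_of[node])
              for node in module_of}
    by_module = defaultdict(list)
    for node, mod in module_of.items():
        by_module[mod].append(within[node])

    roles = {}
    for node, mod in module_of.items():
        mean = statistics.mean(by_module[mod])
        sd = statistics.pstdev(by_module[mod])
        z = (within[node] - mean) / sd if sd > 0 else 0.0

        # Participation: how evenly a node's links spread over modules.
        k = len(neighbours[node])
        links_per_module = defaultdict(int)
        for m in neighbours[node]:
            links_per_module[module_of[m]] += 1
        p = 1 - sum((c / k) ** 2 for c in links_per_module.values()) if k else 0.0

        if z > z_cut and p > p_cut:
            role = "network hub"
        elif z > z_cut:
            role = "module hub"
        elif p > p_cut:
            role = "connector"
        else:
            role = "peripheral"
        roles[node] = (z, p, role)
    return roles
```

A node with many links inside its own module scores a high z and is labelled a module hub, whereas a node whose links are split across several modules scores a high P and is labelled a connector, matching the definitions given in the text.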

Using the METABOLIC pipeline—community option⁶⁰, we examined the functional profiles of the 177 networked MAGs. These communities, primarily Acidobacteria, Actinobacteria, Proteobacteria and Verrucomicrobia phyla, had only three shared MAGs among different habitats. In addition, the bog and fen networks included three archaeal phyla: Halobacteria, Methanobacteriota and Thermoproteota. Alluvial plots (Fig. 4d–f) elucidated microbial community contributions to metabolic and biogeochemical processes. Notably, in the palsa and bog, Acidobacteria played a substantial role in various steps of the C, N and S cycles, evident from numerous negative links with metabolite nodes (Extended Data Fig. 4d,e).

**Fig. 4: Microbial–metabolite co-occurrence networks within each habitat across the thaw gradient and their functional potential.**

Furthermore, metatranscriptomics analyses were conducted to investigate the potential microbial activity associated with the microbial–metabolite networks. The analyses revealed networked Acidobacteria are key degraders of palsa and bog SOM, highly expressing diverse carbohydrate-active enzymes (CAZymes) for plant-derived polysaccharides (for example, those in the glycoside hydrolase and polysaccharide lyase families) and polyphenolic compound degradation (for example, auxiliary activity family with ligninolytic activity⁶¹) (Extended Data Fig. 5), consistent with previous meta-omics data at this site³. The latter finding further supports the correlation of the networked bacteria with lignin-like compound nodes, especially in the bog (Extended Data Fig. 4e), highlighting the use of microbial–metabolite networks for surveying potential microbial activity.

Metatranscriptomics revealed genes expressed for fatty acid (that is, butanoate, propionate) and fermentation end products (that is, glycerol, ethanol, lactate) utilization. Transcripts for propionate fermentation were dominated by Acidobacteria in the palsa and bog. For butanoate, Acidobacteria led in the palsa while sharing with Actinobacteria in the bog. Glycerol and ethanol fermentation transcripts were mostly Acidobacteria in the palsa. In the bog, networked Acidobacteria highly expressed lactate and ethanol fermentation genes. Bog networked Acidobacteria, Verrucomicrobia and Halobacteria expressed acetogenesis genes compared with a broader range of genomes expressing them in the fen (Extended Data Fig. 6).

Metatranscriptomics also revealed the involvement of networked genomes in core redox reactions. Bog and fen Halobacteria and Methanobacteriota expressed hydrogenotrophic methanogenesis genes. Bog Methanobacteriota also expressed methylotrophic methanogenesis genes, while in the fen, only Methanobacteriota genomes were responsible for this process. We observed expression of dissimilatory sulfur oxidation and assimilatory and dissimilatory sulfur reduction in the bog and fen networked bacteria, but only assimilatory sulfur reduction in palsa. These expression patterns in C degradation, fermentation, methanogenesis and S cycling linked to network topology shifts emphasize the importance of metatranscriptomics to measure active processes across the different thaw environments. Unlike metagenomics, which does not ensure gene expression, integrating metatranscriptomics into the correlation networks offers a direct measure of active processes.

Discussion

We used an integrative metagenomic and metabolomic approach to study permafrost peatland ecosystems. Contrary to our initial hypothesis of congruence between microbial communities and metabolites, we observed divergent assembly patterns across the thaw gradient, highlighting the need for a nuanced framework capturing the complex microorganism–metabolite interplay. We integrated multi-omics data using a null modelling framework to quantify ecological processes governing microbial and metabolome assembly. This approach uses phylogenetic metrics to identify signatures of processes like selection and dispersal limitation, providing quantitative estimates of different ecological drivers’ influences on assembly patterns^13,18,19,62,63.

While null modelling elucidated key assembly processes, correlation-based analyses complemented this by identifying potential microorganism–metabolite interactions across datasets. Notably, organosulfur compounds, potentially originating from Sphagnum mosses, were identified as important factors influencing both metabolome composition and microbial community assembly, thereby impacting GHG concentrations in porewater. This highlights the importance of S and C cycling dynamics in peatlands as S cycling exerts an important control on organic C degradation and GHG emissions^47,49,64 and the need for further exploration of organosulfur compounds in Sphagnum-dominated peatlands.

Our integrated multi-omics approach sheds light on metabolic processes driving GHG emissions from thawing permafrost. Future work aims to elucidate causal mechanisms through complementary methods such as mapping multi-omics data into metabolic pathways using chromatographic techniques (for example, liquid chromatography tandem mass spectrometry (LC-MS/MS)) and microbial isolation studies. We acknowledge the limitations of relying solely on direct infusion FTICR mass spectrometry (DI-FTICR-MS) for metabolite identification, including the inability to differentiate isomers, or determine molecular structures, as well as issues with signal suppression or enhancement⁶⁵. Moving forward, implementing this technique alongside chromatographic techniques could provide a better understanding of the complex processes in natural ecosystems.

Furthermore, incorporating longer timescales and expanded spatial representation will further refine our understanding of the most relevant microorganism–metabolite interactions. By integrating microbial lineages and metabolite formulas into future mechanistic models of GHG emissions such as those described here⁶⁶, we can gain a focused picture of the key interactions and processes driving emissions across Arctic permafrost ecosystems, particularly given the importance of plant-derived organosulfur compounds and Sphagnum’s role in shaping microbial and metabolic dynamics.

Methods

Site description

The Stordalen Mire, a peat plateau characterized by discontinuous permafrost, is located southeast of the Abisko Scientific Research Station in northern Sweden (68.35° N, 19.05° E; 351 m above sea level). Altered climate drivers, specifically permafrost thaw in response to rising temperatures (2.5 °C between 1913 and 2006)⁶⁷, have caused topographic changes in the mire, which have affected hydrological patterns together with moisture levels and nutrient availability⁶⁸. As a result, three habitats along the thaw gradient have been formed in the mire with different vegetation²¹, microbial communities³, SOM composition^23,24,25 and rates of greenhouse gas production^27,39. Palsa sites are elevated dry hummocks with intact permafrost (0.5–2.0 m above their surroundings)⁶⁹, while bog and fen sites are wet depressions with partially thawed or thinning permafrost, and totally thawed permafrost, respectively⁶⁸.

Palsa sites are underlain by a thick permafrost layer (10–20 m)⁷⁰. These sites have no measurable water table, with thin peat and active layer depths (0.4–0.7 m)⁶⁹. Palsa vegetation is dominated by dwarf shrubs, feather mosses and lichens^68,69,71. Bog sites receive rainfall and runoff from nearby palsa sites⁶⁹. They are ombrotrophic and have thicker peat (0.5 to >1 m) and active layer (>1 m depth) than the palsa. The bog’s vegetation is characterized by Sphagnum species and small sedges (for example, Sphagnum spp. and Eriophorum vaginatum, respectively)⁷². Fen sites are completely thawed ecosystems. They are minerotrophic and receive surface and groundwater inputs. The water table at the fen sites is at or near the peat surface, providing nutrients to support vegetation such as tall sedges (Carex spp. and Eriophorum angustifolium) and Sphagnum mosses⁷².

Sample collection

Peat soil samples were collected between June and August 2012 at different depths (shallow, middle and deep) from the three habitats: palsa, Sphagnum-dominated bog and Eriophorum-dominated fen (Supplementary Data 1). Sets of triplicate cores were collected using an 11-cm-diameter homemade circular push corer. To avoid cross-contamination, 1 cm from each core’s edge was not included in the sampling and the corer was rinsed with distilled water between cores. After collection, the cores were divided directly in the field into sections as follows: surface (1–5 cm), middle (10–14 cm) and deep (20–24 cm) and then placed on dry ice until transport to the laboratory where they were stored at −20 °C until further analysis. Cores for microbial analysis were placed in cryotubes, mixed with a LifeGuard solution (MoBio Laboratories) and stored at −80 °C until processing.

Data collection of peat temperature, depth of each core, percentage of carbon and nitrogen content in the solid phase peat and dissolved CO₂ and CH₄ concentrations used in this study was previously described⁷³. Geochemistry data generated in this study are available at the EMERGE Database at https://emerge-db.asc.ohio-state.edu/datasources/0001_Coring2012, https://emerge-db.asc.ohio-state.edu/datasources/0006_GeochemPorewater20102012 and https://emerge-db.asc.ohio-state.edu/datasources/0005_GeochemSolid20102012. Details about sampling date, time of sample collection, global positioning system (GPS) location, air and soil temperature, active layer depth and notes related to the sampling can be found at the EMERGE Database at https://emerge-db.asc.ohio-state.edu/datasources/12. For further clarity, we included this information in Supplementary Data 1 (sample metadata table). We believe that these data are in concordance with the requirements described in ref. ⁷⁴. Meteorological data used in this study, including precipitation and temperature measurements, were retrieved from the Abisko Scientific Research Station, 2012, and the Sweden Meteorological and Hydrological Institute for Stordalen Station 188790 (https://www.smhi.se/data/meteorologi/ladda-ner-meteorologiska-observationer#param=airtemperatureInstant,stations=all,stationid=188790).

FTICR-MS sample preparation and data preprocessing

We implemented DI-FTICR-MS to gain a comprehensive overview of the soil metabolome across the thawing permafrost gradient. This technique leverages the high resolving power, ultrahigh mass accuracy and sensitivity of FTICR-MS⁷⁵, along with rapid data acquisition from direct infusion into the mass spectrometer ion source. As SOM comprises a complex and dynamic mixture of metabolites, DI-FTICR-MS detects and resolves a wide range of individual molecules for characterization of molecular composition and transformation. While chromatography-based methods such as LC-MS/MS enable more accurate structural elucidation, they can be biased towards highly abundant compounds unless appropriate care is taken, which increases the analysis time. As the objective here was to capture assembly processes across the soil metabolome, we favoured a high-throughput approach providing wider detection of low-abundance metabolites without focusing extensively on structural annotation beyond biochemical classification.

In this study, we used only water as an extractant to focus on the bioavailable compounds actively cycling with microorganisms in this dynamic thawing environment. Our goal was to mimic the natural thaw conditions as closely as possible. Ultrahigh-resolution characterization of the water-soluble metabolites that were extracted from peat was achieved using a 12 T Bruker FTICR mass spectrometer (Bruker, SolariX) located at the Pacific Northwest National Laboratory using a methodology previously described²³. Briefly, 100 mg of frozen peat soil was combined with 1 ml of Milli-Q water and shaken for 2 h. Then, the water and peat soil mix was centrifuged and the resultant supernatant was mixed with high-performance liquid chromatography (HPLC)-grade methanol (1:2 water to methanol ratio). The resulting solution was injected directly into the 12 T Bruker FTICR mass spectrometer in which a Bruker electrospray ionization source was used to generate negatively charged molecular ions. The negative ionization mode was used for all metabolomics analyses in this study, as previous work has shown that organic matter, which makes up a large proportion of the sample matrix⁷⁶, is predominantly composed of oxygen-containing compounds such as carboxylic acids that ionize best in negative mode. While positive ionization could provide additional coverage of some nitrogen-containing compounds, the focus of this analysis was presence and absence and relative comparisons between samples, rather than capturing all possible metabolites. As all samples were analysed under the same parameters, comparisons should still be valid. However, the use of only the negative ion mode may preclude detection of some metabolites that ionize exclusively in positive mode.

To ensure instrument stability, a Suwannee River fulvic acid standard obtained from the International Humic Substance Society was injected at the beginning of the run. To monitor potential carryover between samples, HPLC-grade methanol blanks were injected throughout the process. In addition, the instrument was flushed between samples with a solution of Milli-Q water and HPLC-grade methanol. Variations in carbon concentration from different samples were controlled by modulating the ion accumulation time, which was adjusted for each sample and ranged between 0.1 s and 0.3 s. A total of 144 scans were collected for each sample. Scans were averaged and calibrated using an organic homologous series separated by 14 Da (CH₂). The mass accuracy was <1 ppm for singly charged ions across a 100–1,200 m/z range, the mass resolution was ~240,000 at 341 m/z and the transient was 0.8 s.

Raw spectra collected from each sample were converted to a list of m/z values using the BrukerDaltonik version 4.2 FT-MS peak picking module using a signal-to-noise ratio of 7 and an absolute intensity threshold of 100 (default). Out of the 85 samples analysed, 17 were technical replicates in which the mass spectrometry data were collected twice through independent sample injections. These replicate samples allowed assessment of technical variability in the metabolomics data pipeline. The remaining 68 samples corresponded to the biological experimental conditions of interest across different sites, depths and time points; these samples constitute 3 replicates per peat core sampled within each habitat and month, unless otherwise specified (Supplementary Data 1). Chemical formula assignment was performed with Formularity⁷⁷ using parameters described in ref. ⁷⁸. Compounds with m/z values outside of 200 to 900 m/z and isotopic peaks (¹³C-peaks) were filtered out for downstream analysis.

Log-transformed FTICR-MS intensities for the extracts of the solid peat used in this study were previously published²⁶ and are available from the EMERGE Database at https://emerge-db.asc.ohio-state.edu/datasources/0141_Wilson-etal-2022-STOTEN_ICR-plants. We eliminated eight samples from this dataset because the number of m/z values detected in them was very low compared with that of the rest of the samples: fewer than 200 m/z values (and in some cases close to zero) compared with the more than 2,000 m/z values detected in most of the samples. These differences had the potential to introduce bias in the analysis. Furthermore, the number of m/z values detected in the other replicates and technical replicates (when available) of these samples was similar to that of the rest of the dataset, suggesting that the eliminated samples might have suffered from issues during the m/z collection process. The eliminated samples were Aug_E_3_D_2012, Aug_P_3_D_2012, Aug_S_2_S_2012, Aug_S_3_D_2012, Aug_S_3_M_2012, July_P_3_D_2012, July_S_3_D_2012 and June_E_1_M_2012.

Standard indices inferred from FTICR molecular formulae (that is, Kendrick defect, double-bond equivalence (DBE), aromaticity index (AI), NOSC and the standard Gibbs free energy (ΔG°C-ox)) were calculated using the ‘ftmsRanalysis’ R package version 1.1.0 (ref. ⁷⁹).
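
As an illustration of two of these indices, the sketch below computes DBE and NOSC from elemental counts using commonly used textbook definitions. It is not the ftmsRanalysis implementation, whose exact formulas and rounding may differ; the function names and arguments are hypothetical.

```python
def dbe(c, h, n=0, p=0):
    """Double-bond equivalence of a neutral CHNOSP formula (assumed definition)."""
    return 1 + c - h / 2 + n / 2 + p / 2


def nosc(c, h, n=0, o=0, s=0, p=0, charge=0):
    """Nominal oxidation state of carbon (assumed, commonly used definition)."""
    if c == 0:
        raise ValueError("NOSC is undefined for formulas without carbon")
    return 4 - (4 * c + h - 3 * n - 2 * o + 5 * p - 2 * s - charge) / c


# Glucose, C6H12O6: one ring (DBE = 1) and carbon at oxidation state 0.
print(dbe(6, 12), nosc(6, 12, o=6))
```

Under these definitions methane gives a NOSC of −4 and carbon dioxide +4, which brackets the range expected for organic carbon.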

Potential biochemical transformations occurring between metabolites identified with FTICR-MS were estimated as follows: pairwise differences between chemical masses (m/z values) were calculated and mapped to a database containing 1,255 known chemical transformations, retrieved from the Kyoto Encyclopedia of Genes and Genomes (KEGG) compound database^80,81, and previously described in ref. ¹³. Mass differences within 1 ppm from the known transformations were considered in the analysis. This approach allows identifying potential relationships among metabolites, which are represented as transformation networks, in which chemical masses (m/z values) are represented as nodes and pairwise mass differences as edges.
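
The matching step can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the five-entry transformation table stands in for the 1,255-entry database, and expressing the 1 ppm tolerance relative to the heavier peak of each pair is an assumption.

```python
from itertools import combinations

# Hypothetical mini-database: transformation name -> exact mass difference (Da).
TRANSFORMATIONS = {
    "H2": 2.015650,
    "CH2": 14.015650,
    "O": 15.994915,
    "H2O": 18.010565,
    "CO2": 43.989829,
}


def match_transformations(masses, database=TRANSFORMATIONS, ppm=1.0):
    """Return (lighter peak, heavier peak, transformation) edges of the network."""
    edges = []
    for low, high in combinations(sorted(masses), 2):
        diff = high - low
        tolerance = high * ppm / 1e6  # assumption: ppm taken on the heavier peak
        for name, delta in database.items():
            if abs(diff - delta) <= tolerance:
                edges.append((low, high, name))
    return edges


# Three illustrative peaks: the first two differ by CH2, the last two by H2O.
for edge in match_transformations([200.0, 214.01565, 232.026215]):
    print(edge)
```

Each returned tuple corresponds to one edge of the transformation network, with the two masses as nodes.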

Metabolite dendrogram construction

In this study, we applied community ecology metrics (for example, beta diversity) to understand metabolite assembly¹³, using three relational dendrograms built based on (1) metabolite molecular characteristics (molecular characteristic dendrogram (MCD)), (2) metabolite potential biochemical transformations (transformation dendrogram (TD)) and (3) metabolite transformation-weighted characteristics (transformation-weighted characteristic dendrogram (TWCD)). Moreover, metabolite dendrograms were built using binary presence and absence values instead of peak intensities to avoid biases in abundance estimates due to charge competition.

Briefly, the MCD was constructed using an unweighted pair group method with arithmetic mean (UPGMA) hierarchical clustering analysis of the Euclidean distance matrix derived from between-metabolite similarities calculated based on their molecular properties (that is, elemental composition, double-bond equivalence, modified aromaticity index and Kendrick’s mass defect). The TD was constructed based on a transformation distance matrix derived from transformation networks that represent putative biochemical reactions occurring between different metabolites. Finally, the TWCD was created by combining the previously mentioned matrices using a UPGMA hierarchical clustering analysis¹³.

The results shown in this study are derived from the TWCD, which is a combination of the molecular characteristic and transformation-based dendrograms. The full suite of metabolites represented 14,432 peaks. Of those, 6,763 were assigned a molecular formula (MCD), 13,177 peaks were part of potential biochemical transformations (TD) and 6,526 peaks were included in the combined dendrogram.

DNA extraction, metagenome sequencing, assembly and binning

DNA extraction and sequencing for these samples were performed using the PowerMax Total Nucleic Acid extraction kit (MoBio) and sequenced with a combination of HiSeq (2 × 100 bp) and NextSeq (2 × 150 bp) platforms, protocols described in ref. ³. For the analysis in this paper, we leveraged an existing MAG database³¹ built from MAGs derived from field samples collected from 2010 to 2017 at the Stordalen Mire, MAGs from a previously published study using the 2011–2012 samples³ and MAGs from a stable isotope probing experiment performed on field peat with labelled litter added (‘SIP study’). Full details about the database construction can be found in ref. ³² and are briefly summarized below.

Reads from both the field and SIP studies were cleaned via Trimmomatic⁸². In both studies, assembly and binning were performed on all samples independent of each other (no co-assembly). Field reads were assembled with SPAdes (v3.12, with -meta option) with default kmer sets⁸³. For the field assemblies, initial bin sets for each sample were generated using the UniteM (v0.0.18) workflow⁸⁴, with the following options: mb_sensitive, mb_verysensitive, mb_specific, mb_veryspecific, mb2, max40, max107, bs and gm2. Then, two rounds of ensemble binning were performed with DAS Tool (v1.1.1)⁸⁵, MetaWRAP (v1.0.6)⁸⁶ and UniteM (v0.0.18)⁸⁴, with the output from the first round of ensemble binning being used as input for the second round. Following the second round of ensemble binning, completeness and contamination statistics of the resulting ensemble bins were assessed via CheckM (v1.0.12)⁸⁷ lineage workflow. For each of the three ensemble bin sets, bins with 70% completion and less than 10% contamination were used to calculate a quality score: completeness − (5 × contamination). A candidate bin set for each sample was chosen based on the ensemble binning tool that had yielded the bin set with the highest quality score. The bins in each candidate bin set were then further refined with RefineM (v0.0.24)⁸⁸ and manually examined in anvi’o (v.5.2)⁸⁹. The SIP study reads were assembled with SPAdes (-meta option enabled) and MEGAHIT (v.1.1.3, default kmer set)⁹⁰, and bins were generated with MetaBAT2 (v.2.12.1)⁹¹. Additional MAGs from all studies were generated independently via the Department of Energy Joint Genome Institute (DOE JGI) metagenome annotation pipeline⁹², and downloaded in December 2020.
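
The selection of a candidate bin set can be sketched as below. The text does not state how per-bin scores were aggregated into a score for a bin set, so summing over qualifying bins is an assumption, as is reading the 70% threshold as "at least 70%"; the completeness and contamination values are invented for illustration.

```python
def quality_score(completeness, contamination):
    """Quality score as defined in the text: completeness - 5 x contamination."""
    return completeness - 5 * contamination


def best_bin_set(bin_sets, min_completeness=70.0, max_contamination=10.0):
    """Pick the ensemble tool whose qualifying bins give the highest total score.

    bin_sets maps a tool name to a list of (completeness, contamination) pairs.
    Aggregating by sum is an assumption, not a detail given in the text.
    """
    totals = {}
    for tool, bins in bin_sets.items():
        totals[tool] = sum(
            quality_score(comp, cont)
            for comp, cont in bins
            if comp >= min_completeness and cont < max_contamination
        )
    return max(totals, key=totals.get), totals


# Invented CheckM-style statistics for one sample.
example = {
    "DAS Tool": [(95.0, 2.0), (80.0, 4.0), (60.0, 1.0)],
    "MetaWRAP": [(90.0, 1.0), (75.0, 9.0)],
}
print(best_bin_set(example))
```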

Together, the database comprises 13,290 MAGs with at least 70% completeness and less than 10% contamination (determined via CheckM (ref. ⁸⁷)). The ~13,000 MAGs were reduced to 1,806 genomes when dereplicated at the 95% similarity (species) level via galah (https://github.com/wwood/galah (ref. ⁹³)). The relative abundance of each MAG was assessed by read mapping with CoverM (ref. ⁹⁴; https://github.com/wwood/CoverM), using the same relative abundance calculation strategy as in ref. ³ (that is, using the CoverM genome option, with the following parameters: --min-read-percent-identity 0.95, --min-read-aligned-percent 0.75, --coupled and -m trimmed_mean (Supplementary Data 1)). For all statistical analyses, the relative abundance of each species in each sample was calculated by dividing its coverage by the total coverage of all species in the dereplicated set. MAGs were annotated using DRAM (1.4.0)⁹⁵ and can be found at https://doi.org/10.5281/zenodo.7587534.
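
The relative abundance calculation described above amounts to a simple normalization per sample. The sketch below assumes a mapping from genome name to trimmed-mean coverage; the genome names and values are hypothetical.

```python
def relative_abundance(coverage):
    """Divide each genome's coverage by the total coverage in the sample."""
    total = sum(coverage.values())
    if total == 0:
        return {genome: 0.0 for genome in coverage}
    return {genome: value / total for genome, value in coverage.items()}


# Hypothetical trimmed-mean coverages for one sample.
print(relative_abundance({"MAG_a": 30.0, "MAG_b": 10.0}))
```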

This study focused on 2012 as it provided matching metagenomic and metabolomic data, for 67 samples, and the desired temporal resolution. To maximize genome-resolved inferences from these data, we queried the 2012 metagenomes with the site-specific MAG database spanning samples from 2011 to 2017 of 13,290 MAGs (1,806 when dereplicated to species level; BioProject PRJNA386568 (ref. ³²)). Read mapping with CoverM to the dereplicated dataset resulted in an abundance table of 1,402 species-dereplicated MAGs that were present in 2012 samples and were used for subsequent analyses.

To quantify the portion of the microbial community diversity present in the 2012 samples that was represented by the 1,402 MAGs, we applied the ‘appraise’ tool from SingleM (version 0.15.1)⁹⁶ setting the sequence identity threshold to 0.86 (genus-level divergence). Our analysis revealed that the 1,402 MAGs collectively accounted for 71.8% of the bacterial community and 62.9% of the archaeal community at the genus level within the 2012 samples.

Metatranscriptomics

RNA extraction and sequencing of 2012 samples were described in ref. ³. Briefly, 240 ng of RNA extracted from peat material was further cleaned using DNase I (Roche) to remove residual DNA, then library preparation was performed using ScriptSeq Complete (Bacterial) low-input library kits (Epicentre). Agilent 2100 Bioanalyzer and Agilent 2200 Tapestation (Agilent Technologies) were used to test the quality of the RNA and libraries. RNA quantity was measured using Qubit (ThermoFisher Scientific). Samples were sequenced on 1/8th of a NextSeq (Illumina) lane, with initial shallow runs conducted on 1/11th of a HiSeq (Illumina) and MiSeq (Illumina) lanes³.

Forward-stranded metatranscriptome reads were processed using TranscriptM (ref. ⁹⁷) v0.3.1, including QC by KneadData (https://github.com/biobakery/biobakery/wiki/kneaddata), genomic-DNA decontamination and read counting to produce transcripts per million (TPM) per gene per genome (Supplementary Data 7). DRAM⁹⁵ was used to perform gene calling and assign KEGG^80,81 and CAZyme annotations⁹⁸ (http://www.cazy.org/). GraftM (v.0.15.0)⁹⁹ packages were generated for homologous genes forward and reverse dissimilatory sulfite reductase (dsrAB/rdsrAB) grouped under the same KEGG IDs (K11180, K11181), as follows. Seed sequences (with experimentally confirmed functioning where available) were searched against Uniref90 (ref. ¹⁰⁰) r2022_01 using MMseqs2 (ref. ¹⁰¹) easy-search. A total of 300 sequences per seed were then combined with matching sequences from our MAGs to create hidden Markov models (HMMs) and phylogenetic trees using GraftM create. The trees were manually annotated in ARB (ref. ¹⁰²) to label each clade with the function of the seed sequences. These trees were used to classify the matching sequences from our MAGs and encode their annotation as a specific homologue. Then, manually curated KEGG-based definitions (Supplementary Data 8) for carbon degradation, fermentation and fixation and methanogenesis and sulfur redox cycling pathways were used to transform TPM values for genes to a pathway-average count per pathway. For methanogenesis, curation was performed following the methods described in ref. ¹⁰³. Briefly, TPM was averaged across reactions in a pathway and summed across alternative pathways.
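
The final aggregation step (averaging TPM across the reactions of a pathway and summing across alternative pathways) can be sketched as follows. The reaction identifiers are hypothetical, each reaction is represented by a single TPM value for simplicity, and treating an undetected reaction as zero is an assumption.

```python
from statistics import mean


def pathway_expression(tpm, alternatives):
    """Average TPM within each alternative route, then sum across routes.

    tpm maps a reaction identifier to its TPM value; alternatives is a list of
    routes, each a list of reaction identifiers. Missing reactions count as 0.
    """
    return sum(
        mean([tpm.get(reaction, 0.0) for reaction in route])
        for route in alternatives
    )


# Two hypothetical alternative routes to the same product.
tpm = {"rxn_A": 10.0, "rxn_B": 20.0, "rxn_C": 6.0}
routes = [["rxn_A", "rxn_B"], ["rxn_C", "rxn_D"]]
print(pathway_expression(tpm, routes))
```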

Microbial tree construction

Bacteria and archaea phylogenetic trees were built from MAGs stored at the EMERGE Database³¹ using the de novo workflow of the GTDB toolkit version 1.5.1 (ref. ¹⁰⁴) (reference data version r202). Bacterial and archaeal trees were initially built separately, as each group has different marker genes. The bacteria tree was rooted with Patescibacteria as the outgroup, and the archaeal tree was rooted with Altarchaeota as the outgroup. After the trees were built, they were joined at the archaeal–bacterial split proposed in ref. ¹⁰⁵. Tree tips that were not present in any sample (for example, those from GTDB and those from the EMERGE MAG Database deriving from different studies) were dropped and not used for the analysis.

Microbial phylogenetic signal

The microbial community assembly from 2012 peat samples was investigated using ecological null modelling^18,19, which assumes phylogenetic signal is correlated with environmental optima. The phylogenetic signal explains the tendency of closely related species to resemble each other ecologically more than they do the other randomly selected species in a tree¹⁰⁶. To evaluate the phylogenetic signal, we estimated the environmental optima of each microbial operational taxonomic unit (OTU) for the following variables: peat temperature, average depth of the core, carbon and nitrogen ratio, and precipitation. Briefly, OTU environmental optima were calculated based on the abundance-weighted mean values for each environmental variable (function optima, ‘analogue’ package version 0.17.6)¹⁰⁷. The Euclidean distance matrix was calculated from the optima estimates, which represents differences in ecological niches across OTUs. A matrix of between-OTU phylogenetic distance was calculated from the microbial phylogenetic tree (function cophenetic.phylo, ‘picante’ package version 1.8.2)¹⁰⁸. Finally, the phylogenetic signal was evaluated by quantifying the relationship between these matrices via a Mantel correlogram as in ref. ¹⁹ (Supplementary Fig. 1).
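
The two ingredients of the niche-distance matrix can be sketched as below: an abundance-weighted mean optimum per OTU and a Euclidean distance between optima vectors. This is a simplified stand-in for the ‘analogue’ optima function, and the numbers are invented.

```python
import math


def weighted_optimum(abundances, env_values):
    """Abundance-weighted mean of one environmental variable across samples."""
    total = sum(abundances)
    if total == 0:
        raise ValueError("OTU is absent from all samples")
    return sum(a * e for a, e in zip(abundances, env_values)) / total


def niche_distance(optima_a, optima_b):
    """Euclidean distance between two OTUs' vectors of environmental optima."""
    return math.dist(optima_a, optima_b)


# One OTU observed in two samples at 4 and 8 degrees C (invented values).
print(weighted_optimum([1.0, 3.0], [4.0, 8.0]))
print(niche_distance([0.0, 0.0], [3.0, 4.0]))
```

In practice the environmental variables are on different scales, so whether and how they are standardized before computing distances matters; that choice is not shown here.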

β-diversity analysis and ecological null modelling

To estimate and compare the ecological processes driving microbial (n = 67, with paired metabolomics data) and metabolite (n = 85) assemblages, the βNTI and Raup–Crick Bray–Curtis (RC_BC) index were calculated for the microbial phylogenetic tree and metabolite dendrogram following the methodology described in ref. ¹³. Briefly, the observed microbial and metabolite β-mean nearest taxon distance (βMNTD) was estimated using the comdistnt function from the package ‘picante’ version 1.8.2 (ref. ¹⁰⁸) and was compared with the null model (obtained from 1,000 randomizations). Then, the βNTI was calculated by normalizing the observed βMNTD with the null expectation following ref. ¹⁸. Using this approach, the influence of ecological processes can be differentiated as follows: for |βNTI| > 2 deterministic processes are assumed to predominantly shape metabolite and microbial assemblages, suggesting that environmental abiotic factors and biotic interactions determine (or impose selection on) changes in species diversity and composition (in the case of microbial communities)^16,19 while biotic and abiotic transformations control fluctuations of metabolites within metabolite assemblages (for example, differences in production and degradation rates). If |βNTI| < 2, then it is assumed that stochasticity drives changes in species diversity, relative abundance and composition owing to random (unpredicted) disturbances. Similarly, for metabolites, dispersal can be explained by physical forces or vector movements that cause changes in the metabolome composition of the system¹³. Moreover, when βNTI > 2, variable selection explains how divergent environmental factors cause high compositional turnover between a pair of communities analysed, and when βNTI < −2, homogeneous selection describes how steady selective pressures originating from persistent environmental conditions are the main cause of low compositional turnover between a pair of local communities¹⁸.
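
A minimal sketch of the two quantities follows, assuming the abundance-weighted form of βMNTD in which each direction is a weighted mean of nearest-taxon distances and the two directions are averaged. It is not the picante implementation, and the null values are supplied by the caller rather than generated by randomization; all names and numbers are hypothetical.

```python
from statistics import mean, stdev


def make_dist(pairs):
    """Build a symmetric distance lookup from {frozenset({a, b}): distance}."""
    def dist(a, b):
        return 0.0 if a == b else pairs[frozenset((a, b))]
    return dist


def beta_mntd(comm_a, comm_b, dist):
    """Abundance-weighted beta-MNTD between two communities (assumed form).

    comm_a and comm_b map taxon -> relative abundance. Conspecifics are kept,
    so a shared taxon has a nearest-taxon distance of zero.
    """
    def one_way(source, target):
        present = [t for t, f in target.items() if f > 0]
        return sum(
            f * min(dist(t, u) for u in present)
            for t, f in source.items() if f > 0
        )
    return 0.5 * (one_way(comm_a, comm_b) + one_way(comm_b, comm_a))


def beta_nti(observed, null_values):
    """Standardized deviation of the observed value from the null distribution."""
    return (observed - mean(null_values)) / stdev(null_values)


dist = make_dist({
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 4.0,
})
observed = beta_mntd({"A": 0.5, "B": 0.5}, {"B": 0.5, "C": 0.5}, dist)
print(observed, beta_nti(observed, [1.0, 2.0, 3.0]))
```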

Significant differences between βNTI (microbial and metabolite) within each habitat were determined using a two-sided Mann–Whitney U test using the package ‘rstatix’ version 0.7.2 (ref. ¹⁰⁹), and multiple testing was corrected with the Bonferroni method.

Stochastic ecological processes were further investigated using the RC_BC turnover index. Briefly, the observed presence-and-absence-based Bray–Curtis values derived from pairwise comparisons were estimated and compared with the null expectation (generated after 1,000 randomizations). Then, deviations of the observed values from the null comparisons were normalized between +1 and −1 (RC_BC metric). If RC_BC > 0.95 and |βNTI| < 2, then higher-than-expected compositional differences between a pair of communities (or metabolomes) are primarily due to dispersal limitation enabling ecological drift. However, if RC_BC < −0.95 and |βNTI| < 2, then lower-than-expected compositional differences between a pair of communities (or metabolomes) are primarily due to homogenizing dispersal. Finally, if |RC_BC| < 0.95 and |βNTI| < 2, then the compositional turnover between a pair of local communities (or metabolomes) is not dominantly driven by selection, dispersal or ecological drift and this scenario is referred to as being undominated¹⁹.
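
The decision rules for a single pairwise comparison can be written compactly as below. The thresholds are those given in the text; how values falling exactly on a threshold are handled is not stated, so the boundary behaviour here is an assumption.

```python
def assembly_process(bnti, rc_bc):
    """Classify one pairwise comparison from its betaNTI and RC_BC values."""
    if bnti > 2:
        return "variable selection"
    if bnti < -2:
        return "homogeneous selection"
    # |betaNTI| <= 2 from here on: stochastic processes, resolved with RC_BC.
    if rc_bc > 0.95:
        return "dispersal limitation"
    if rc_bc < -0.95:
        return "homogenizing dispersal"
    return "undominated"


for pair in [(3.1, 0.2), (-2.6, 0.99), (0.4, 0.97), (-1.2, -0.99), (1.5, 0.1)]:
    print(pair, assembly_process(*pair))
```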

Correlations with environmental variables

To further understand drivers that control deterministic or stochastic processes influencing metabolite assembly, calculated βNTI values were correlated with environmental data including peat temperature, average depth of the core, carbon and nitrogen ratio, and precipitation. The precipitation value used represents a 3 day accumulation before the sampling day (Supplementary Data 9). Mantel tests (Mantel function, ‘vegan’ package v2.5-7)¹¹⁰ using a Pearson correlation and 9,999 permutations were used to estimate correlations between peat and microbial samples’ βNTI (n = 67, where the sample number was filtered to include only those that have matching microbiome and metabolome data), and the pairwise differences of each environmental variable between samples. Mantel statistics were calculated within each habitat and multiple testing was corrected with the Bonferroni correction method.
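
For readers unfamiliar with the Mantel test, the sketch below shows the idea: correlate the lower triangles of two distance matrices, then permute the sample order of one matrix to build a null distribution. It is a simplified illustration rather than the vegan implementation, and the one-sided P value convention used here is an assumption.

```python
import math
import random


def _lower_triangle(matrix, order):
    """Lower-triangle entries of a square matrix after reordering samples."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(1, n) for j in range(i)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(m1, m2, permutations=9999, seed=0):
    """Return (Pearson r, one-sided permutation P value) for two distance matrices."""
    identity = list(range(len(m1)))
    y = _lower_triangle(m2, identity)
    r_obs = _pearson(_lower_triangle(m1, identity), y)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of m1 together
        if _pearson(_lower_triangle(m1, perm), y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Invented example: the second matrix is a rescaled copy of the first.
points = [0.0, 1.0, 3.0, 7.0]
d1 = [[abs(a - b) for b in points] for a in points]
d2 = [[2 * v for v in row] for row in d1]
print(mantel(d1, d2, permutations=99))
```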

Feature-specific βNTI estimation

We further used a recently developed approach¹⁵ that performs null modelling at the feature level (βNTI_feature); a feature within this Article is a microbial community member or metabolite (FTICR-MS molecular formula) that forms part of a phylogenetic tree or relational dendrogram, respectively. This approach allows us to investigate how ecological pressures differentially affect specific community members and metabolites. The βNTI_feature for microbial and metabolite data was estimated within the three habitats following ref. ¹⁵ (microbial: palsa, n = 21; bog, n = 22; fen, n = 24; metabolome: palsa, n = 29; bog, n = 27; fen, n = 29). Briefly, the MAG-derived phylogenetic tree and the relative abundance OTU matrix were used for the microbial analysis, while the TWCD dendrogram and the FTICR peak intensity matrix (transformed to presence and absence) were used for the metabolites. The βNTI_feature was estimated in a similar way to the community βNTI; the βMNTD index of an individual feature (for example, the OTU or molecular formula) is calculated using the formula below:

$${\upbeta \mathrm{MNTD}}_{\mathrm{feat}}=\,\frac{1}{n}\mathop{\sum }\limits_{j=1}^{n}{f}_{{a}_{i}}\,\min \left({d}_{{a}_{i}{b}_{i}}\right)$$

In this formula, ${f}_{{a}_{i}}$ represents the relative abundance of the feature $a$ in relation to the community $i$, while $n$ represents the number of samples and $\min ({d}_{{a}_{i}{b}_{i}})$ is the average minimum relational distance of fixed feature $a$ in relation to the fixed community $i$ to any other feature $b$ in the other communities $j$. Conspecifics were not removed from either dataset. The term ‘fixed’ indicates that this calculation is performed on one feature at a time; in other words, one microbial member or FTICR molecular formula is compared with the rest of the microbial members and metabolites at a single time (see further details in ref. ¹⁵). Then, in a similar way to the calculation at the community level, βMNTD_feat was estimated using the comdistnt function from the package ‘picante’ version 1.8.2 (ref. ¹⁰⁸) and was compared with the null model (obtained from 999 randomizations). Finally, βNTI_feature is calculated as the difference between the observed βMNTD_feat and the average of the null results ($\scriptstyle{\underline{\upbeta \mathrm{MNTD}}}_\mathrm{feat}^\mathrm{null}$), divided by the standard deviation of the null values, using the following formula:

$$\upbeta \mathrm{NT{I}_{feat}}=\,\frac{\upbeta \mathrm{MNTD}_\mathrm{feat}^\mathrm{obs}-{\underline{\upbeta \mathrm{MNTD}}}_\mathrm{feat}^\mathrm{null}}{\upbeta \mathrm{MNT{D}_{feat}^{sd}}}$$

To understand how a specific feature, either a specific taxon or metabolite, contributes towards the community variation at a specific scale, we used the following rules: if |βNTI_feature| < 1, the contribution is considered insignificant, and if 1 < |βNTI_feature| < 2, the contribution is considered moderate, whereas if |βNTI_feature| > 2, then the contribution of a specific taxon or metabolite is considered significant. Moreover, if the βNTI_feature was negative, then the feature was assumed to contribute to community convergence (ecological and functional similarities) whereas positive values represented contributions to divergence (ecological and functional differences)¹⁵.
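
These interpretation rules can be expressed as a small helper. The thresholds come from the text; the treatment of values exactly equal to 0, 1 or 2 is an assumption.

```python
def feature_contribution(bnti_feature):
    """Return (strength, direction) for one feature-level betaNTI value."""
    magnitude = abs(bnti_feature)
    if magnitude > 2:
        strength = "significant"
    elif magnitude > 1:
        strength = "moderate"
    else:
        strength = "insignificant"
    if bnti_feature < 0:
        direction = "convergence"
    elif bnti_feature > 0:
        direction = "divergence"
    else:
        direction = "none"
    return strength, direction


for value in (-2.5, 1.4, 0.3):
    print(value, feature_contribution(value))
```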

Feature-specific βNTI-derived metabolite clusters

A hierarchical cluster analysis of metabolites (palsa, n = 29; bog, n = 27; fen, n = 29) was performed in R using the package ‘cluster’ version 2.1.3, using a modification of a previously described approach¹¹¹ but using the βNTI_feature value estimated in a previous section instead of metabolite abundances. Only metabolite features that were in 50% of the samples were kept. A distance matrix of the βNTI_feature values within each habitat was calculated using Manhattan distance using the function daisy. The resulting dissimilarity matrix was used for clustering the metabolites using the function pam. The optimal number of clusters (k) was determined by calculating the silhouette value of every metabolite, which represents the ratio of the distances to members of its own cluster to distances to members of the nearest neighbour cluster. For each k, the average silhouette value (ASV), also called silhouette coefficient¹¹², was calculated, to find which k maximizes the ASV.
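
The silhouette calculation used to choose k can be sketched as follows, using the standard definition s = (b − a)/max(a, b), where a is the mean distance to members of a feature's own cluster and b is the mean distance to the nearest other cluster. This is an illustration with Manhattan distance, not the ‘cluster’ package code, and the cluster assignments are supplied by hand instead of by pam.

```python
def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def silhouette_values(points, labels):
    """Silhouette value of every point for a given cluster assignment."""
    values = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(manhattan(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            values.append(0.0)  # convention for singleton clusters
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        values.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return values


# Invented one-dimensional data with two obvious groups.
data = [[0.0], [1.0], [10.0], [11.0]]
good = silhouette_values(data, [0, 0, 1, 1])
bad = silhouette_values(data, [0, 1, 0, 1])
print(sum(good) / len(good), sum(bad) / len(bad))
```

Repeating this for each candidate k and keeping the k with the highest average silhouette value mirrors the selection step described above.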

A consensus βNTI_feature matrix was calculated as the median of the βNTI_feature value of the representative features of each cluster in each sample. Representative features (metabolites) were determined as those whose silhouette value was higher than the ASV of their respective cluster. This approach provided a better representation of the patterns of βNTI_feature values among the clusters¹¹¹.

Differences between the biochemical indexes AI_mod, DBE_O and NOSC of the representative features were determined using a Wilcoxon test.

Correlations between the consensus βNTI_feature values and environmental factors and individual microbial abundances were calculated using a Spearman’s rank test with the Hmisc package version 5.0.1 (ref. ¹¹³). Correlation P values were adjusted using the false discovery rate (FDR) method. Correlation networks were visualized using the igraph package version 1.4.1 (refs. ¹¹⁴,¹¹⁵), ggraph package version 2.1.0 (ref. ¹¹⁶) and tidygraph version 1.2.3 (ref. ¹¹⁷). The abundances of genomes that significantly correlated with the consensus βNTI_feature metabolite clusters were correlated with CO₂ and CH₄ levels in palsa peat, and bog and fen porewater using Spearman’s rank test; the FDR method was used for multiple testing correction.
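
The multiple-testing step can be illustrated with the sketch below, which assumes that "the FDR method" refers to the Benjamini–Hochberg procedure; the text does not name the procedure explicitly, and the P values are invented.

```python
def fdr_bh(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # rank of this P value in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


print(fdr_bh([0.01, 0.04, 0.03, 0.20]))
```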

Microbial–metabolite co-occurrence networks

Co-occurrence networks of metabolite and microbial abundance data from each habitat (n = 67, analysis within each habitat included palsa, n = 21; bog, n = 22; and fen, n = 24) were constructed using the Molecular Ecological Network Analyses (MENAP) pipeline (http://ieg4.rccc.ou.edu/mena/main.cgi)^56,57,58. Only metabolites identified to significantly contribute towards convergence or divergence were included. Metabolite and microbial abundance were transformed separately using the centred log ratio transformation, then combined into a single matrix and uploaded to the MENAP web server. Data were filtered to include only features that were present in at least 50% of the samples to ensure the accuracy and reliability of our correlation calculations^57,118. The similarity matrix was built using Spearman’s correlation, and random matrix theory (RMT) (ref. ⁵⁷) was applied to objectively determine the association threshold, thus preventing the use of arbitrary cut-offs that can introduce uncertainties during the process of building the networks. Further advantages of using this approach can be reviewed in refs. ^56,57,58,118. Networks were divided into modules using the greedy modularity optimization algorithm¹¹⁹. Nodes were assigned to one of four nodal topological roles based on their within-module connectivity (Zi) and intermodule connectivity (Pi)⁵⁶. Finally, networks were visualized using igraph R package version 1.4.1 (ref. ¹¹⁴) using a Fruchterman–Reingold layout algorithm, and metabolite nodes were coloured based on the elemental composition. Topological roles of nodes were classified into peripherals, module hubs and connectors based on the within-module connectivity (Zi) and participation coefficient (Pi)⁵⁶.
Also, the MENAP pipeline calculates different network topological parameters including network size (n), number of links (L), power-law fitting of node degrees, average connectivity or degree, average cluster coefficient, average path distance (GD), geodesic efficiency (E), harmonic geodesic distance (HD), maximal degree, centralization of degree (CD), maximal betweenness, centralization of betweenness (CB), maximal stress centrality, centralization of stress centrality (CS), maximal eigenvector centrality, centralization of eigenvector centrality (CE), density (D), reciprocity, transitivity (Trans), connectedness (Con), efficiency, hierarchy, lubness, number of modules and modularity. Finally, to test the significance of the networks, a total of 100 random networks were constructed by rewiring the links (edges) among nodes while constraining n and L. The network properties of the randomized networks were calculated together with their means and standard deviations, which were compared with those of the original (empirical) network. This process was performed in the MENAP pipeline.
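
The centred log ratio transformation applied before network construction can be sketched as below. The text does not say how zeros were handled, so the optional pseudocount is an assumption included only to make the function usable on sparse data.

```python
import math


def clr(values, pseudocount=0.0):
    """Centred log ratio: log of each value minus the mean log of the sample."""
    shifted = [v + pseudocount for v in values]
    if any(v <= 0 for v in shifted):
        raise ValueError("CLR needs strictly positive values; add a pseudocount")
    logs = [math.log(v) for v in shifted]
    centre = sum(logs) / len(logs)
    return [value - centre for value in logs]


# Invented abundances for one sample; the transformed values sum to zero.
print(clr([1.0, math.e, math.e ** 2]))
```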

Functional annotation of networked microbial communities

The genomes of networked microbial communities from each habitat were annotated using the Metabolic and Biogeochemistry Analyses in Microbes (METABOLIC) pipeline in community mode (METABOLIC-C.pl)⁶⁰. The METABOLIC pipeline was run with default parameters, using a total of 30 MAGs as an input for the palsa, 66 MAGs for the bog and 84 MAGs for the fen. Briefly, this pipeline uses Prodigal for gene calling and annotates the called genes using three sets of HMM-based databases: KOfam (ref. ¹²⁰), TIGRfam (ref. ¹²¹) and Pfam (ref. ¹²²) as well as custom metabolic HMM profiles. In addition, CAZymes are annotated using dbCAN2 (ref. ¹²³).

Alluvial plots generated by the METABOLIC pipeline were used to represent the contribution of individual metagenome-assembled genomes (MAGs) to metabolic and biogeochemical processes within the carbon, nitrogen, sulfur and other cycles.

For MAGs that showed the strongest correlations with specific metabolites, we further analysed their corresponding metatranscriptomic data. This analysis validated the expression of genes involved in the biosynthesis pathways of these metabolites, reinforcing the functional connection between the identified microbial populations and the observed metabolite profiles.

Statistics

All statistical analyses and visualizations were performed in the R statistical language versions 4.1.0 and 4.2.1 (ref. ¹²⁴) with the packages ggplot2 (v3.4.2)¹²⁵, ftmsRanalysis (v1.1.0)⁷⁹, picante (v1.8.2)¹⁰⁸, rstatix (v0.7.2)¹⁰⁹, vegan (v2.5-7)¹¹⁰, cluster (v2.1.3)¹²⁶, igraph (v1.4.1)¹¹⁴, ggraph (v2.1.0)¹¹⁶, tidygraph (v1.3.1)¹¹⁷, Hmisc (v5.0.1)¹¹³, patchwork (v1.1.2)¹²⁷ and ggpubr (v0.6.0)¹²⁸.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The metagenomes and metatranscriptomes used in this paper are available via Zenodo at https://doi.org/10.5281/zenodo.10426238 (BioProject PRJNA386568)³¹. The annotation of the MAGs is available via Zenodo at https://doi.org/10.5281/zenodo.7587534. FTICR-MS intensities for the extracts of the solid peat used in this study were described in ref. ²⁶ and are available from the EMERGE Database at https://emerge-db.asc.ohio-state.edu/datasources/0141_Wilson-etal-2022-STOTEN_ICR-plants. Data on peat temperature, depth of each core, percentage of carbon and nitrogen content in the solid-phase peat, and dissolved CO₂ and CH₄ concentrations used in this study were described in ref. ⁷³. The data are also available in the EMERGE Database at https://emerge-db.asc.ohio-state.edu/datasources/0001_Coring2012, https://emerge-db.asc.ohio-state.edu/datasources/0006_GeochemPorewater20102012 and https://emerge-db.asc.ohio-state.edu/datasources/0005_GeochemSolid20102012. Meteorological data used in this study, including precipitation and temperature measurements, were retrieved from the Abisko Scientific Research Station, 2012, and the Sweden Meteorological and Hydrological Institute for Stordalen Station 188790 (https://www.smhi.se/data/meteorologi/ladda-ner-meteorologiska-observationer#param=airtemperatureInstant,stations=all,stationid=188790). The KOfam database (https://www.genome.jp/ftp/db/kofam/) was used for annotation as part of the DRAM pipeline⁹⁵. Source data are provided with this paper.

Code availability

All scripts for data processing and visualization are available via GitHub at https://github.com/tfaily-lab/Metabolome_permafrost and Zenodo at https://doi.org/10.5281/zenodo.12571699 (ref. ¹²⁹).

References

Ma, L. et al. A globally robust relationship between water table decline, subsidence rate, and carbon release from peatlands. Commun. Earth Environ. 3, 254 (2022).
Tanentzap, A. J. et al. Chemical and microbial diversity covary in fresh water to influence ecosystem functioning. Proc. Natl Acad. Sci. USA 116, 24689–24695 (2019).
Woodcroft, B. J. et al. Genome-centric view of carbon processing in thawing permafrost. Nature 560, 49–54 (2018).
Dettmer, K., Aronov, P. A. & Hammock, B. D. Mass spectrometry-based metabolomics. Mass Spectrom. Rev. 26, 51–78 (2007).
Shah, R. M. et al. Omics-based ecosurveillance uncovers the influence of estuarine macrophytes on sediment microbial function and metabolic redundancy in a tropical ecosystem. Sci. Total Environ. 809, 151175 (2022).
Jansson, J. K. & Hofmockel, K. S. Soil microbiomes and climate change. Nat. Rev. Microbiol. 18, 35–46 (2020).
Turnbaugh, P. J. & Gordon, J. I. An invitation to the marriage of metagenomics and metabolomics. Cell 134, 708–713 (2008).
Bauermeister, A., Mannochio-Russo, H., Costa-Lotufo, L. V., Jarmusch, A. K. & Dorrestein, P. C. Mass spectrometry-based metabolomics in microbiome investigations. Nat. Rev. Microbiol. 20, 143–160 (2022).
Peñuelas, J. & Sardans, J. Ecological metabolomics. Chem. Ecol. 25, 305–309 (2009).
Jones, O. A. H. et al. Metabolomic analysis of soil communities can be used for pollution assessment. Environ. Toxicol. Chem. 33, 61–64 (2014).
Sokol, N. W., Sanderman, J. & Bradford, M. A. Pathways of mineral‐associated soil organic matter formation: integrating the role of plant carbon source, chemistry, and point of entry. Glob. Change Biol. 25, 12–24 (2019).
Tang, J. & Riley, W. J. Weaker soil carbon–climate feedbacks resulting from microbial and abiotic interactions. Nat. Clim. Change 5, 56–60 (2015).
Danczak, R. E. et al. Using metacommunity ecology to understand environmental metabolomes. Nat. Commun. 11, 6369 (2020).
Graham, E. B. et al. Multi’omics comparison reveals metabolome biochemistry, not microbiome composition or gene expression, corresponds to elevated biogeochemical function in the hyporheic zone. Sci. Total Environ. 642, 742–753 (2018).
Danczak, R. E. et al. Inferring the contribution of microbial taxa and organic matter molecular formulas to ecological assembly. Front. Microbiol. 13, 803420 (2022).
Chase, J. M. & Myers, J. A. Disentangling the importance of ecological niches from stochastic processes across scales. Philos. Trans. R. Soc. B 366, 2351–2363 (2011).
Leibold, M. A. The niche concept revisited: mechanistic models and community context. Ecology 76, 1371–1382 (1995).
Stegen, J. C., Lin, X., Fredrickson, J. K. & Konopka, A. E. Estimating and mapping ecological processes influencing microbial community assembly. Front. Microbiol. 6, 370 (2015).
Stegen, J. C., Lin, X., Konopka, A. E. & Fredrickson, J. K. Stochastic and deterministic assembly processes in subsurface microbial communities. ISME J. 6, 1653–1664 (2012).
Emerson, J. B. et al. Host-linked soil viral ecology along a permafrost thaw gradient. Nat. Microbiol. 3, 870–880 (2018).
Johansson, T. et al. Decadal vegetation changes in a northern peatland, greenhouse gas fluxes and net radiative forcing. Glob. Change Biol. 12, 2352–2369 (2006).
Hough, M. et al. Coupling plant litter quantity to a novel metric for litter quality explains C storage changes in a thawing permafrost peatland. Glob. Change Biol. 28, 950–968 (2022).
AminiTabrizi, R. et al. Controls on soil organic matter degradation and subsequent greenhouse gas emissions across a permafrost thaw gradient in Northern Sweden. Front. Earth Sci. 8, 557961 (2020).
Hodgkins, S. B. et al. Changes in peat chemistry associated with permafrost thaw increase greenhouse gas production. Proc. Natl Acad. Sci. USA 111, 5819–5824 (2014).
Hodgkins, S. B. et al. Elemental composition and optical properties reveal changes in dissolved organic matter along a permafrost thaw chronosequence in a subarctic peatland. Geochim. Cosmochim. Acta 187, 123–140 (2016).
Wilson, R. M. et al. Plant organic matter inputs exert a strong control on soil organic matter decomposition in a thawing permafrost peatland. Sci. Total Environ. 820, 152757 (2022).
Varner, R. K. et al. Permafrost thaw driven changes in hydrology and vegetation cover increase trace gas emissions and climate forcing in Stordalen Mire from 1970 to 2014. Philos. Trans. R. Soc. A 380, 20210022 (2022).
Dini-Andreote, F., Stegen, J. C., Van Elsas, J. D. & Salles, J. F. Disentangling mechanisms that mediate the balance between stochastic and deterministic processes in microbial succession. Proc. Natl Acad. Sci. USA 112, E1326–E1332 (2015).
Doherty, S. J. et al. The transition from stochastic to deterministic bacterial community assembly during permafrost thaw succession. Front. Microbiol. 11, 596589 (2020).
Mondav, R. et al. Microbial network, phylogenetic diversity and community membership in the active layer across a permafrost thaw gradient. Environ. Microbiol. 19, 3201–3218 (2017).
Cronin, D. & NSF EMERGE Biology Integration Institute. Metagenome-assembled genomes (MAGs) from Stordalen Mire, Sweden (0.0.1-beta). Zenodo https://doi.org/10.5281/zenodo.10426238 (2023).
McGivern, B. B. et al. Microbial polyphenol metabolism is part of the thawing permafrost carbon cycle. Nat. Microbiol. 9, 1454–1466 (2024).
Fudyma, J. D. et al. Untargeted metabolomic profiling of Sphagnum fallax reveals novel antimicrobial metabolites. Plant Direct 3, e00179 (2019).
Andersen, R., Chapman, S. & Artz, R. Microbial communities in natural and disturbed peatlands: a review. Soil Biol. Biochem. 57, 979–994 (2013).
McGivern, B. B. et al. Decrypting bacterial polyphenol metabolism in an anoxic wetland soil. Nat. Commun. 12, 2466 (2021).
Fudyma, J. D., Chu, R. K., Graf Grachet, N., Stegen, J. C. & Tfaily, M. M. Coupled biotic–abiotic processes control biogeochemical cycling of dissolved organic matter in the Columbia River hyporheic zone. Front. Water 2, 574692 (2021).
Danczak, R. E. et al. Ecological theory applied to environmental metabolomes reveals compositional divergence despite conserved molecular properties. Sci. Total Environ. 788, 147409 (2021).
Holmes, M. E. et al. Carbon accumulation, flux, and fate in Stordalen Mire, a permafrost peatland in transition. Glob. Biogeochem. Cycles 36, e2021GB007113 (2022).
McCalley, C. K. et al. Methane dynamics regulated by microbial community response to permafrost thaw. Nature 514, 478–481 (2014).
Ning, D. et al. A quantitative framework reveals ecological drivers of grassland microbial community assembly in response to warming. Nat. Commun. 11, 4717 (2020).
Jansson, J. K. & Hofmockel, K. S. The soil microbiome—from metagenomics to metaphenomics. Curr. Opin. Microbiol. 43, 162–168 (2018).
Wilson, R. M. & Tfaily, M. M. Advanced molecular techniques provide new rigorous tools for characterizing organic matter quality in complex systems. J. Geophys. Res. Biogeosci. 123, 1790–1795 (2018).
Urban, N. R., Eisenreich, S. J. & Grigal, D. F. Sulfur cycling in a forested Sphagnum bog in northern Minnesota. Biogeochemistry 7, 81–109 (1989).
Herndon, E., Richardson, J., Carrell, A. A., Pierce, E. & Weston, D. Sulfur speciation in Sphagnum peat moss modified by mutualistic interactions with cyanobacteria. New Phytol. 241, 1998–2008 (2024).
Fofana, A. et al. Mapping substrate use across a permafrost thaw gradient. Soil Biol. Biochem. 108809 (2022).
Fakhraee, M., Li, J. & Katsev, S. Significant role of organic sulfur in supporting sedimentary sulfate reduction in low-sulfate environments. Geochim. Cosmochim. Acta 213, 502–516 (2017).
Blodau, C., Mayer, B., Peiffer, S. & Moore, T. R. Support for an anaerobic sulfur cycle in two Canadian peatland soils. J. Geophys. Res. Biogeosci. 112, G02004 (2007).
Candry, P., Abrahamson, B., Stahl, D. A. & Winkler, M.-K. H. Microbially mediated climate feedbacks from wetland ecosystems. Glob. Change Biol. 29, 5169–5183 (2023).
Pester, M., Knorr, K.-H., Friedrich, M. W., Wagner, M. & Loy, A. Sulfate-reducing microorganisms in wetlands—fameless actors in carbon cycling and climate change. Front. Microbiol. 3, 72 (2012).
Alperin, M. J. & Hoehler, T. M. Anaerobic methane oxidation by archaea/sulfate-reducing bacteria aggregates: 1. Thermodynamic and physical constraints. Am. J. Sci. 309, 869–957 (2009).
Moran, J. J. et al. Methyl sulfides as intermediates in the anaerobic oxidation of methane. Environ. Microbiol. 10, 162–173 (2008).
McGlynn, S. E., Chadwick, G. L., Kempes, C. P. & Orphan, V. J. Single cell activity reveals direct electron transfer in methanotrophic consortia. Nature 526, 531–535 (2015).
Graham, E. B. et al. Coupling spatiotemporal community assembly processes to changes in microbial metabolism. Front. Microbiol. 7, 1949 (2016).
Singleton, C. M. et al. Methanotrophy across a natural permafrost thaw environment. ISME J. 12, 2544–2558 (2018).
Reji, L. & Zhang, X. Genome-resolved metagenomics informs the functional ecology of uncultured Acidobacteria in redox oscillated Sphagnum peat. mSystems 7, e0005522 (2022).
Deng, Y. et al. Molecular ecological network analyses. BMC Bioinformatics 13, 113 (2012).
Zhou, J. et al. Functional molecular ecological networks. mBio 1, e00169–10 (2010).
Zhou, J., Deng, Y., Luo, F., He, Z. & Yang, Y. Phylogenetic molecular ecological network of soil microbial communities in response to elevated CO₂. mBio 2, e00122–00111 (2011).
Tfaily, M. M. et al. Organic matter transformation in the peat column at Marcell Experimental Forest: humification and vertical stratification. J. Geophys. Res. Biogeosci. 119, 661–675 (2014).
Zhou, Z. et al. METABOLIC: high-throughput profiling of microbial genomes for functional traits, metabolism, biogeochemistry, and community-scale functional networks. Microbiome 10, 33 (2022).
Levasseur, A., Drula, E., Lombard, V., Coutinho, P. M. & Henrissat, B. Expansion of the enzymatic repertoire of the CAZy database to integrate auxiliary redox enzymes. Biotechnol. Biofuels 6, 41 (2013).
Zhou, J. et al. Stochasticity, succession, and environmental perturbations in a fluidic ecosystem. Proc. Natl Acad. Sci. USA 111, E836–E845 (2014).
Stegen, J. C. et al. Quantifying community assembly processes and identifying features that impose them. ISME J. 7, 2069–2079 (2013).
Bottrell, S. H. et al. Concentrations, sulfur isotopic compositions and origin of organosulfur compounds in pore waters of a highly polluted raised peatland. Org. Geochem. 41, 55–62 (2010).
Koch, B. P., Dittmar, T., Witt, M. & Kattner, G. Fundamentals of molecular formula assignment to ultrahigh resolution mass data of natural organic matter. Anal. Chem. 79, 1758–1763 (2007).
Graham, E. B. & Hofmockel, K. S. Ecological stoichiometry as a foundation for omics-enabled biogeochemical models of soil organic matter decomposition. Biogeochemistry 157, 31–50 (2022).
Callaghan, T. V. et al. A new climate era in the sub‐Arctic: accelerating climate changes and multiple impacts. Geophys. Res. Lett. 37, L14705 (2010).
Bäckstrand, K., Crill, P. M., Mastepanov, M., Christensen, T. R. & Bastviken, D. Non‐methane volatile organic compound flux from a subarctic mire in northern Sweden. Tellus B 60, 226–237 (2008).
Olefeldt, D. & Roulet, N. T. Effects of permafrost and hydrology on the composition and transport of dissolved organic carbon in a subarctic peatland complex. J. Geophys. Res. Biogeosci. https://doi.org/10.1029/2011JG001819 (2012).
Åkerman, H. J. & Johansson, M. Thawing permafrost and thicker active layers in sub‐arctic Sweden. Permafr. Periglac. Process. 19, 279–292 (2008).
Malmer, N., Johansson, T., Olsrud, M. & Christensen, T. R. Vegetation, climatic changes and net carbon sequestration in a North‐Scandinavian subarctic mire over 30 years. Glob. Change Biol. 11, 1895–1909 (2005).
Hough, M. et al. Biotic and environmental drivers of plant microbiomes across a permafrost thaw gradient. Front. Microbiol. 11, 796 (2020).
Hodgkins, S. B. Changes in Organic Matter Chemistry and Methanogenesis Due to Permafrost Thaw in a Subarctic Peatland (Florida State Univ., 2016).
Morrison, N. et al. Standard reporting requirements for biological samples in metabolomics experiments: environmental context. Metabolomics 3, 203–210 (2007).
Barrow, M. P., Burkitt, W. I. & Derrick, P. J. Principles of Fourier transform ion cyclotron resonance mass spectrometry and its application in structural biology. Analyst 130, 18–28 (2005).
Stenson, A. C., Landing, W. M., Marshall, A. G. & Cooper, W. T. Ionization and fragmentation of humic substances in electrospray ionization Fourier transform–ion cyclotron resonance mass spectrometry. Anal. Chem. 74, 4397–4409 (2002).
Tolić, N. et al. Formularity: software for automated formula assignment of natural and other organic matter from ultrahigh-resolution mass spectra. Anal. Chem. 89, 12659–12665 (2017).
Tfaily, M. M., Hess, N. J., Koyama, A. & Evans, R. D. Elevated [CO₂] changes soil organic matter composition and substrate diversity in an arid ecosystem. Geoderma 330, 1–8 (2018).
Bramer, L. M. et al. ftmsRanalysis: an R package for exploratory data analysis and interactive visualization of FT-MS data. PLoS Comput. Biol. 16, e1007654 (2020).
Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Kanehisa, M., Goto, S., Sato, Y., Furumichi, M. & Tanabe, M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 40, D109–D114 (2011).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834 (2017).
Parks, D. UniteM. GitHub https://github.com/dparks1134/UniteM (2017).
Sieber, C. M. K. et al. Recovery of genomes from metagenomes via a dereplication aggregation and scoring strategy. Nat. Microbiol. 3, 836–843 (2018).
Uritskiy, G. V., DiRuggiero, J. & Taylor, J. MetaWRAP—a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 158 (2018).
Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).
Parks, D. H. et al. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nat. Microbiol. 2, 1533–1542 (2017).
Eren, A. M. et al. Community-led, integrated, reproducible multi-omics with anvi’o. Nat. Microbiol. 6, 3–6 (2021).
Li, D., Liu, C.-M., Luo, R., Sadakane, K. & Lam, T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).
Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).
Huntemann, M. et al. The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4). Stand. Genom. Sci. 11, 17 (2016).
Aroney, S. T. N., Camargo, A. P., Tyson, G. W. & Woodcroft, B. J. Galah: more scalable dereplication for metagenome assembled genomes (v0.4.0). Zenodo https://doi.org/10.5281/zenodo.10526086 (2024).
Aroney, S. T. N. et al. CoverM: read coverage calculator for metagenomics (v0.7.0). Zenodo https://doi.org/10.5281/zenodo.10531254 (2024).
Shaffer, M. et al. DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Res. 48, 8883–8900 (2020).
Woodcroft, B. J. et al. SingleM and Sandpiper: robust microbial taxonomic profiles from metagenomic data. Preprint at bioRxiv https://doi.org/10.1101/2024.01.30.578060 (2024).
sternp/transcriptm: public release (v0.3.1). Zenodo https://doi.org/10.5281/zenodo.11090118 (2024).
Drula, E. et al. The carbohydrate-active enzyme database: functions and literature. Nucleic Acids Res. 50, D571–D577 (2021).
Boyd, J. A., Woodcroft, B. J. & Tyson, G. W. GraftM: a tool for scalable, phylogenetically informed classification of genes within metagenomes. Nucleic Acids Res. 46, e59 (2018).
Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926–932 (2014).
Steinegger, M. & Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).
Ludwig, W. et al. ARB: a software environment for sequence data. Nucleic Acids Res. 32, 1363–1371 (2004).
Ellenbogen, J. B. et al. Methylotrophy in the Mire: direct and indirect routes for methane production in thawing permafrost. mSystems 9, e00698–23 (2023).
Chaumeil, P. A. et al. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics 36, 1925–1927 (2020).
Zhu, Q. et al. Phylogenomics of 10,575 genomes reveals evolutionary proximity between domains Bacteria and Archaea. Nat. Commun. 10, 5477 (2019).
Blomberg, S. P. & Garland, T. Jr Tempo and mode in evolution: phylogenetic inertia, adaptation and comparative methods. J. Evol. Biol. 15, 899–910 (2002).
Simpson, G. L. Analogue methods in palaeoecology: using the analogue package. J. Stat. Softw. 22, 1–29 (2007).
Kembel, S. W. et al. Picante: R tools for integrating phylogenies and ecology. Bioinformatics 26, 1463–1464 (2010).
Kassambara, A. rstatix: pipe-friendly framework for basic statistical tests. R package version 0.7.2 https://cran.r-project.org/web/packages/rstatix/index.html (2023).
Dixon, P. VEGAN, a package of R functions for community ecology. J. Veg. Sci. 14, 927–930 (2003).
Merder, J. et al. Dissolved organic compounds with synchronous dynamics share chemical properties and origin. Limnol. Oceanogr. 66, 4001–4016 (2021).
Legendre, P. & Legendre, L. Numerical Ecology (Elsevier, 2012).
Harrell, F. Jr. Hmisc: Harrell miscellaneous. R package version 5.0.1 https://CRAN.R-project.org/package=Hmisc (2023).
Csardi, G. & Nepusz, T. The igraph software package for complex network research. Interjournal Complex Syst. 1695, 1–9 (2006).
Csárdi, G. et al. igraph: network analysis and visualization in R. R package version 1.4.1 https://CRAN.R-project.org/package=igraph (2023).
Pedersen, T. ggraph: an implementation of grammar of graphics for graphs and networks. R package version 2.1.0 https://cran.r-project.org/web/packages/ggraph/index.html (2022).
Pedersen, T. tidygraph: a tidy API for graph manipulation. R package version 1.3.1 https://cran.r-project.org/web/packages/tidygraph/index.html (2023).
Yuan, M. M. et al. Climate warming enhances microbial network complexity and stability. Nat. Clim. Change 11, 343–348 (2021).
Newman, M. E. Modularity and community structure in networks. Proc. Natl Acad. Sci. USA 103, 8577–8582 (2006).
Aramaki, T. et al. KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics 36, 2251–2252 (2019).
Selengut, J. D. et al. TIGRFAMs and genome properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 35, D260–D264 (2007).
Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).
Zhang, H. et al. dbCAN2: a meta server for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 46, W95–W101 (2018).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2019).
Wickham, H. ggplot2. Wiley Interdiscip. Rev. Comput. Stat. 3, 180–185 (2011).
Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M. & Hornik, K. Cluster: cluster analysis basics and extensions. R package version 2.1.3 https://CRAN.R-project.org/package=cluster (2022).
Pedersen, T. patchwork: the composer of plots. R package version 1.1.2 https://github.com/thomasp85/patchwork, https://patchwork.data-imaginist.com (2022).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, 2016).
Freire-Zapata, V. & Aroney, S. tfaily-lab/Metabolome_permafrost: metabolome paper Nature Microbiology. Zenodo https://doi.org/10.5281/zenodo.12588199 (2024).


Acknowledgements

We thank the EMERGE Biology Integration Coordinators (members listed in alphabetical order at the end of the paper) for project guidance and management, and the EMERGE 2012 Field Team (members listed in alphabetical order at the end of the paper) for fieldwork and sample collection. This study was supported, in part, by the Department of Energy Office of Science Biological and Environmental Research Grant DE-SC0021349, awarded to M.M.T. This research is a contribution of the EMERGE Biology Integration Institute, funded by the NSF Biology Integration Institutes Program (award number 2022070), which also supports V.F.-Z., H.H.-M., D.R.C., R.M.W., B.J.W., J.G.E., M.B.S., V.I.R. and M.M.T. Portions of this work were supported by the DOE JGI and the Environmental Molecular Sciences Laboratory (EMSL; https://ror.org/04rc0xn13), DOE Office of Science User Facilities sponsored by the Office of Biological and Environmental Research (BER) and operated under contract numbers DE-AC02-05CH11231 (JGI) and DE-AC05-76RL01830 (EMSL). We would also like to thank the Swedish Polar Research Secretariat and Swedish Infrastructure for Ecosystem Science (SITES) for the support of the work done at the Abisko Scientific Research Station. This work has been made possible by data provided by Abisko Scientific Research Station and SITES. SITES is supported by the Swedish Research Council.

Author information

A list of authors and their affiliations appears at the end of the paper.

Authors and Affiliations

Department of Environmental Science, The University of Arizona, Tucson, AZ, USA
Viviana Freire-Zapata & Malak M. Tfaily
Department of Natural Resources and the Environment, University of New Hampshire, Durham, NH, USA
Hannah Holland-Moritz, Jessica G. Ernakovich & Maria Florencia Fahnestock
Center for Soil Biogeochemistry and Microbial Ecology, University of New Hampshire, Durham, NH, USA
Hannah Holland-Moritz
Department of Microbiology, The Ohio State University, Columbus, OH, USA
Dylan R. Cronin, Suzanne B. Hodgkins, Ahmed A. Zayed, Virginia I. Rich & Matthew B. Sullivan
Center of Microbiome Science, The Ohio State University, Columbus, OH, USA
Dylan R. Cronin & Matthew B. Sullivan
Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology (QUT), Translational Research Institute, Woolloongabba, QLD, Australia
Sam Aroney & Ben J. Woodcroft
Department of Biology, Case Western Reserve University, Cleveland, OH, USA
Derek A. Smith & Sarah C. Bagby
Department of Earth Ocean and Atmospheric Sciences, Florida State University, Tallahassee, FL, USA
Rachel M. Wilson
Department of Civil, Environmental, and Geodetic Engineering, The Ohio State University, Columbus, OH, USA
Matthew B. Sullivan
Terrestrial and Aquatic Integration Team, Pacific Northwest National Laboratory, Richland, WA, USA
James C. Stegen
School of the Environment, Washington State University, Pullman, WA, USA
James C. Stegen
Bio5 Institute, The University of Arizona, Tucson, AZ, USA
Malak M. Tfaily
Centre for Environmental and Climate Science, Lund University, Lund, Sweden
Rhiannon Mondav
Department of Sociology, Colorado State University, Fort Collins, CO, USA
Jennifer E. Cross
Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
Regis Ferriere & Scott R. Saleska
Schmid College of Science and Technology, Chapman University, Orange, CA, USA
Michael Ibba
Department of Earth Sciences and Earth Systems Research Center, University of New Hampshire, Durham, NH, USA
Ruth K. Varner

Authors

Viviana Freire-Zapata
Hannah Holland-Moritz
Dylan R. Cronin
Sam Aroney
Derek A. Smith
Rachel M. Wilson
Jessica G. Ernakovich
Ben J. Woodcroft
Sarah C. Bagby
Virginia I. Rich
Matthew B. Sullivan
James C. Stegen
Malak M. Tfaily

Consortia

EMERGE 2012 Field Team

Suzanne B. Hodgkins & Rhiannon Mondav

EMERGE Biology Integration Coordinators

Jennifer E. Cross, Maria Florencia Fahnestock, Regis Ferriere, Suzanne B. Hodgkins, Michael Ibba, Scott R. Saleska, Ruth K. Varner & Ahmed A. Zayed

Contributions

V.F.-Z. and M.M.T. conceptualized, designed and supervised the study. V.I.R. coordinated sampling efforts. V.F.-Z. and M.M.T. conceptualized, designed and supervised the metabolome study. The EMERGE Biology Integration Coordinators, V.I.R., M.M.T., M.B.S., J.G.E., B.J.W., S.R.S. and R.K.V. conceptualized and designed the associated long-term field campaign and project. V.I.R. coordinated the 2012 sampling efforts, and the EMERGE 2012 Field Team collected samples. M.M.T. and R.M.W. extracted and collected metabolomics data. V.F.-Z., H.H.-M., D.R.C., S.A. and D.A.S. analysed the data, and V.F.-Z. visualized the data. V.F.-Z. and M.M.T. drafted the paper with contributions from H.H.-M., S.A., D.A.S., M.B.S. and J.C.S. All authors provided comments and edits and approved the final draft.

Corresponding author

Correspondence to Malak M. Tfaily.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Microbiology thanks Lucas Braga and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Metabolite assembly.

A. βNTI feature contribution and the proportion of each elemental composition across and within habitats. B. Relative abundance of different elemental compositions in the three habitats (metabolome: palsa n = 29, bog n = 27, fen n = 29), where differences between habitats were calculated using a two-sided Wilcoxon test. Boxes represent the upper and lower quartiles, the line in the box represents the median value and the whiskers represent the maximum and minimum values, extending no further than 1.5 times the interquartile range; values beyond the whiskers are outliers and are plotted individually. P values were adjusted using the Bonferroni method. C. Relative abundance of metabolite cluster class composition within each habitat.

Source data

Extended Data Fig. 2 Expression in transcripts per million (TPM) of metagenome-assembled genomes (MAGs) that correlated with metabolite-βNTI-feature-derived cluster 1 in the bog.

Heatmap representing the expression in TPM of MAGs that correlated with metabolite-βNTI-feature-derived cluster 1 in the bog. The heatmap is divided into functional categories (right) with different functions within each category.

Source data

Extended Data Fig. 3 Expression in transcripts per million (TPM) of metagenome assembled genomes (MAGs) that correlated with greenhouse gases in the bog.

A. Heatmap representing the expression in TPM of MAGs correlated with greenhouse gases in the bog. The heatmap is divided into KEGG functional categories (right). B. Heatmap representing the expression in TPM of carbohydrate-active enzymes (CAZymes) expressed by these MAGs. CAZymes are divided into different families (right).

Source data

Extended Data Fig. 4 Microbial–metabolite co-occurrence network parameters.

A. Bar plots showing the total number of nodes in the network of each habitat and how many of those were either connectors or module hubs. Blue bars represent metabolite nodes and orange bars represent microbial nodes. B. Bar plots showing the number of links in the network of each habitat. Bars are colored to show whether the interactions were between pairs of metabolites, pairs of microbes or a microbe and a metabolite. C. Bar plot representing the number of modules identified using the greedy modularity optimization algorithm. Colors show whether the modules contain only metabolites, only microbes or nodes from both types of data. D. Bar plot showing the total number of correlations per networked phylum within each habitat. Positive correlations are shown in red and negative correlations in blue. E. Heatmap showing the number of interactions of each networked phylum with different classes of metabolites; a darker color means a higher number of interactions.

Source data

Extended Data Fig. 5 Number of Carbohydrate-Active enZYmes (CAZymes) annotated and expressed in the networked taxa.

Phylogenetic tree of the networked MAGs. Colored tiles show the habitat network (Bog - green, Fen - blue, Palsa - brown) to which each MAG belongs. The height of the bar plots in the outer ring indicates the number of CAZyme families that were annotated in each MAG. The color of the bar indicates how many genes encoding CAZymes were found in each MAG. The external bar indicates the number of expressed CAZymes.

Source data

Extended Data Fig. 6 Expression in transcripts per million (TPM) of metagenome assembled genomes (MAGs) that form part of the microbe-metabolite networks in the three habitats.

Heatmap representing the expression in TPM of MAGs that form part of the microbial-metabolite networks in the three habitats, palsa, bog and fen. The heatmap is divided into functional categories (right) with different functions within each category.

Source data

Supplementary information

Supplementary Information

Supplementary Fig. 1, Notes 1–3 and References.

Reporting Summary

Peer Review File

Supplementary Data 1

Sample metadata: information regarding the location, depth and date of sampling for all the samples used in the analysis presented in this paper. SampleID refers to the ID of the samples in the EMERGE Database. DepthMin and DepthMax indicate the shallowest and deepest depths, respectively, of the core section used for the sample. DepthCode is used to classify the samples based on the average depth (DepthAvg) into surface (S), medium (M) and deep (D). Sample replication metabolomics: table showing the number of replicates per peat core that were sampled and for which metabolomics analysis was performed. FTICR matrix: log-transformed FTICR-MS report. MAGs: table showing information about metagenome-assembled genomes included in this study: MAG accession number, taxonomy, completeness, contamination and strain heterogeneity. Abundance_table: abundance table of the metagenome-assembled genomes.

Supplementary Data 2

Spearman correlations (two sided) between microbial and metabolite (microbial: palsa, n = 21; bog, n = 22; fen, n = 24; metabolome: palsa, n = 29; bog, n = 27; fen, n = 29) βNTI_feature within each habitat. Correlations were performed with all the data (‘bulk’) and within samples of the same month (June, July and August) and depth (surface, middle and deep). The table shows Spearman rho values and their respective adjusted P values.

Supplementary Data 3

Bacterial OTU contribution towards the bacterial assemblages in the palsa, bog and fen (sheet 1, 2 and 3, respectively) and their assigned taxonomy. Contribution of the OTUs towards the assemblages was classified based on their βNTI_feature value as follows: significant contribution towards divergence (Sig. Divergence, βNTI_feature > 2), contribution towards divergence (Divergence, βNTI_feature > 1), significant contribution towards convergence (Sig. Convergence, βNTI_feature < −2), contribution towards convergence (Convergence, βNTI_feature < −1) and insignificant contribution (Insignificant, −1 < βNTI_feature < 1).

Supplementary Data 4

Individual metabolite contribution towards the metabolite assemblages in the palsa, bog and fen (sheet 1, 2 and 3, respectively) including their molecular formula, elemental composition, NOSC and assigned molecular class. Contribution of the metabolites towards the assemblages was classified based on their βNTI_feature value as follows: significant contribution towards divergence (Sig. Divergence, βNTI_feature > 2), contribution towards divergence (Divergence, βNTI_feature > 1), significant contribution towards convergence (Sig. Convergence, βNTI_feature < −2), contribution towards convergence (Convergence, βNTI_feature < −1) and insignificant contribution (Insignificant, −1 < βNTI_feature < 1).
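The βNTI_feature classification used in Supplementary Data 3 and 4 can be illustrated with a minimal Python sketch. The function name is hypothetical and this is not the authors' code. The thresholds come from the descriptions above; checking the more extreme categories first and labelling boundary values (exactly ±1) as Insignificant are assumptions, since the descriptions do not specify either.

```python
def classify_bnti_feature(value: float) -> str:
    """Map a βNTI_feature value to the contribution labels listed above.

    Assumption: the more extreme category is tested first, so a value
    above 2 is 'Sig. Divergence' rather than 'Divergence'.
    Assumption: values exactly equal to +1 or -1 fall through to
    'Insignificant'; the source does not state how boundaries are handled.
    """
    if value > 2:
        return "Sig. Divergence"
    if value > 1:
        return "Divergence"
    if value < -2:
        return "Sig. Convergence"
    if value < -1:
        return "Convergence"
    return "Insignificant"


if __name__ == "__main__":
    # Illustrative values only, not taken from the supplementary tables.
    for v in (2.7, 1.4, 0.2, -1.3, -3.1):
        print(v, classify_bnti_feature(v))
```

Applied to a column of βNTI_feature values, this would reproduce the five labels used in the supplementary sheets under the stated assumptions.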

Supplementary Data 5

Palsa, bog and fen sheets, respectively: Spearman correlations between βNTI_feature-derived metabolite clusters and MAG abundances within each habitat. The P values of the correlations were estimated with the rcorr function of the Hmisc R package (two-sided estimation) and adjusted using the FDR method. The tables show MAG ID, metabolite cluster number, Spearman rho value, P value and FDR-adjusted P value. Sheet 4 includes the MAGs that correlated with greenhouse gas fluxes in the bog. These tables include MAG ID, the metabolite cluster each MAG correlated with and its respective Spearman rho value, P value and FDR-adjusted P value. They also include taxonomy assignment, the contribution of those MAGs to microbial βNTI_feature and which gas those MAGs correlated with. Bog greenhouse correlation: five MAGs that significantly correlated with CO₂ and CH₄ and are contributors to community assembly. Bog CO₂ correlation: 45 MAGs that correlated with CO₂ in the bog. Bog CH₄ correlation: 29 MAGs that correlated with CH₄ in the bog.
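To make the P value adjustment step concrete, the following is a minimal Python sketch of a false discovery rate correction. It assumes that the adjustment corresponds to the Benjamini–Hochberg procedure (in R, p.adjust treats "fdr" as an alias for "BH"); the description above does not name the exact procedure, the function name is hypothetical and the input values are invented for illustration. The Spearman correlations themselves are not recomputed here.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Assumption: this mirrors the 'FDR' adjustment described above;
    it is an illustration, not the authors' pipeline.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # invented example values
    print(bh_adjust(raw))
```

Each raw P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.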

Supplementary Data 6

Network_indices: microbial–metabolite topological network parameters for the empirical networks of each habitat and the parameters for the randomized networks. Networks were constructed using the MENAP pipeline. Topological parameters included similarity threshold, number of total nodes, number of metabolite nodes, number of microbial nodes, total links, R square of power law, average degree (avgK), average clustering coefficient (avgCC), average path distance (GD), geodesic efficiency (E), harmonic geodesic distance (HD), maximal degree, nodes with max degree, centralization of degree (CD), maximal betweenness, nodes with max betweenness, centralization of betweenness (CB), maximal stress centrality, nodes with max stress centrality, centralization of stress centrality (CS), maximal eigenvector centrality, nodes with max eigenvector centrality, centralization of eigenvector centrality (CE), density (D), reciprocity, transitivity (Trans), connectedness (Con), efficiency, hierarchy, lubness, number of modules and modularity (sheet 1). Network module hub connectors: metabolite features identified as network module hubs and connectors based on their within-module connectivity (Zi) and intermodule connectivity (Pi)⁵⁶. The table includes m/z value, elemental composition, class, NOSC, contribution of each feature to metabolome assembly and βNTI_feature value (sheet 2). Networked microbes: includes the ID of the networked MAGs in the three habitats (sheet 3).
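The assignment of nodes to module hubs and connectors from Zi and Pi can be sketched as below. The cutoffs (Zi of 2.5 and Pi of 0.62) are commonly used defaults in Zi–Pi analyses and are an assumption here, because the description above does not state the values applied in this study. The function name and the example values are hypothetical.

```python
def classify_node_role(zi: float, pi: float,
                       zi_cut: float = 2.5, pi_cut: float = 0.62) -> str:
    """Assign a topological role from within-module (Zi) and
    intermodule (Pi) connectivity.

    Assumption: zi_cut and pi_cut are conventional defaults, not values
    confirmed by the source; pass other cutoffs if the study used them.
    """
    if zi > zi_cut and pi > pi_cut:
        return "network hub"   # highly connected within and between modules
    if zi > zi_cut:
        return "module hub"    # highly connected within its own module
    if pi > pi_cut:
        return "connector"     # links several modules together
    return "peripheral"        # few links, mostly inside one module


if __name__ == "__main__":
    # Invented (Zi, Pi) pairs for illustration.
    for zi, pi in [(3.1, 0.10), (0.4, 0.70), (2.9, 0.80), (0.2, 0.05)]:
        print(zi, pi, classify_node_role(zi, pi))
```

Keeping the cutoffs as parameters makes it straightforward to substitute whatever thresholds the cited method prescribes.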

Supplementary Data 7

metaT accession: metatranscriptomics accession number of 2012 samples. metaT tpm_pathways: TPM of MAGs in 2012 samples, TPM averaged across reactions in specific pathways. metaT_2012: TPM and raw counts of 2012.

Supplementary Data 8

Manually curated KEGG-based definitions for carbon degradation, fermentation, and fixation, methanogenesis, and sulfur redox cycling pathways.

Supplementary Data 9

Environmental variables measured for each of the samples.

Source data

Source Data Fig. 1

Source data.

Source Data Fig. 2

Source data.

Source Data Fig. 3

Source data.

Source Data Fig. 4

Source data.

Source Data Extended Data Fig. 1

Source data.

Source Data Extended Data Fig. 2

Source data.

Source Data Extended Data Fig. 3

Source data.

Source Data Extended Data Fig. 4

Source data.

Source Data Extended Data Fig. 5

Source data.

Source Data Extended Data Fig. 6

Source data.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Freire-Zapata, V., Holland-Moritz, H., Cronin, D.R. et al. Microbiome–metabolite linkages drive greenhouse gas dynamics over a permafrost thaw gradient. Nat Microbiol 9, 2892–2908 (2024). https://doi.org/10.1038/s41564-024-01800-z


Received: 21 December 2023
Accepted: 30 July 2024
Published: 01 October 2024
Issue Date: November 2024
DOI: https://doi.org/10.1038/s41564-024-01800-z

This article is cited by

Weakened priming effect along soil profile in alpine grasslands on the Tibetan Plateau
- Mei He
- Kai Fang
- Yuanhe Yang
Science China Life Sciences (2025)
