Introduction

The high mountain ranges of the Tibetan Plateau are global biodiversity hotspots with a rich endemic diversity in alpine flora1,2,3, which constitute a high proportion of regional diversity4,5. It is known that local endemic alpine species richness (alpha diversity), which typically have small effective population sizes and limited dispersal capacities6,7, is threatened by climate change8,9 and the upward movement of trees10,11,12,13. The upward migration of tree species threatens alpine ecosystems through habitat loss and fragmentation, as well as alterations in microclimate (e.g., increased sheltering effect). To understand how best to conserve endemic alpine species diversity across spatial scales, it is imperative to understand the driving mechanisms of beta-diversity, as it is the component of regional gamma-diversity that accumulates as a result of compositional differences between alpha diversity14,15,16,17,18.

The pattern of contemporary beta-diversity, which quantified the variation in species composition within a region, is generally interpreted from past climate regimes and their oscillations19,20, but the role of past changes in the upper range limit of trees remains unknown. Specifically, in areas with stable climate conditions, low fluctuation in the position of the upper range limit of trees would result in reduced habitat loss and fragmentation, thereby creating stable environmental conditions that are more conducive to the survival of endemic alpine species, compared to areas where the position of upper range limit of trees fluctuates more significantly. We should therefore expand our analysis of the drivers behind the current spatial patterns of beta-diversity, shifting focus from climate change to include a more immediate factor—range limit dynamics. This broader understanding could offer valuable insights for more accurately predicting future shifts in alpine endemism diversity under a rapidly warming world.

Here the upper range limit of trees, which is defined as a connection between the uppermost group of upright trees of a certain height12,21, could theoretically reach their physiological limits set by temperature (referred to as the treeline position)21,22 but it is often confined to lower elevations due to environmental stresses and stochastic effects such as human disturbance12,21,22,23,24. Our previous understandings of paleo changes in the upper range limit of trees are mostly derived from analysis of fossil plant data including pollen, macrofossils, and charcoal. For example, classical paleoecological records and ancient DNA in lake sediments indicate that, the upper range limit of trees at some locations may have reached 3500 m elevation during the LGM25, and 400–600 m higher than today in the Holocene Climate Maximum26,27. While, relying solely on fossil data at the site level to reconstruct the spatiotemporal evolution of the upper range limits of trees remains highly uncertain, especially on the Tibetan plateau where a handful of sites were available for a few time periods since the Last Glacial Maximum (LGM, ~22 kyr BP).

Evidence from paleoecological records show that modern trees survived the extremely harsh climatic conditions during the most recent glacial period25. This knowledge opens up the possibility of reconstructing the upper paleo-range limit of trees from paleoclimate data based on the modern climatic responses of the upper-range limit of trees. Notably, the existence of cryptic forest refugia, which provide long-term shelters for trees, could facilitate the application of a climate-driven modeling approach for paleo reconstruction28. These refugia, found in heterogeneous topographies within high mountain areas and intersecting valleys of large rivers on the Tibetan Plateau25,29, could result in high dispersal rates and enable tree species to respond within decades to deglacial climate change30.

Here we derived past dynamics in upper range limits of trees using paleoclimate data at a spatial resolution of ~1 km at 100-year intervals since LGM, using the modern relationship between the satellite-derived upper range limit of trees and climate. The period since the LGM was used because its significant paleoclimatic changes led to notable vegetation and floristic shifts, as evidenced by fossil records preserved in sediments19,31,32,33, potentially driving legacies in modern diversity patterns. In addition, our paleo-reconstruction of the upper range limit of trees was based on the assumption that the relationship between the upper range limit of trees and climate, extracted from modern observations, has remained constant through time since LGM21,22,34. This assumption, which is well known as the uniformitarian principle, has been widely used in previous pollen-based paleoclimate reconstructions35,36. We further estimated how paleo fluctuation in the upper range limit of trees affects the current spatial pattern of beta-diversity in alpine endemic flora over the Tibetan Plateau.

Results and discussions

Spatial pattern of current upper range limit of trees and its drivers

We delineated the spatial distribution of the current upper range limit of trees using satellite-derived forest extent data at a high spatial resolution of 30 m12 (see Methods). To validate the robustness of this satellite-derived upper range limit of trees, we generated 602,879 manually interpreted samples using Google Earth high-resolution (<5 m) images as an independent validation dataset, with 5.1% (N = 30,746), 84.6% (N = 510,035), and 10.3% (N = 62,098) located in the northern, southern and inner Tibetan Plateau, respectively12,37. Across all samples, the satellite-derived upper range limit of trees showed a highly accurate alignment with manual interpretations of Google Earth high-resolution images (R2 = 0.98, Slope = 0.98, ME = 17 m, N = 602,879) (Fig. 1b).

Fig. 1: Elevation distribution of the multi-satellite-derived upper range limit of trees across the Tibetan Plateau.
figure 1

a Spatial distribution of the upper range limit of trees at 1 km2 resolution (left), and longitudinal/latitudinal distribution of treeline elevation and upper range limit of trees elevation (top/right). The solid lines represent the median values for each elevational category, calculated at 1 km intervals longitudinally. The curves have been smoothed using a Gaussian function, with the shaded regions indicating one standard deviation, and the background terrain obtained from ArcGIS basemap dataset. b The relationships between the multi-satellite-derived upper range limit of trees and visual interpretation from Google Earth images, with the 1:1 line (dashed) and the regression line (continuous). R2 and ME indicate the coefficient of determination and the mean error, respectively.

Our analysis of the satellite-derived upper range limit of trees reveals that almost 94.0% (N = 11,284,982 pixels at a 30-m resolution) of pixels depicting the upper range limit of trees are distributed in the northeastern and southeastern margins of the Tibetan Plateau, including the southern Qilian (3294 ± 314 m, mean elevation ± SD), eastern Himalaya (4206 ± 342 m) and southern Hengduan (3995 ± 355 m) mountain ranges. The remaining 6.0% (N = 721,452) of pixels depicting the upper range limit of trees are scattered throughout the headwater region of the three rivers (Yangtze, Yellow, and Lancang River) (3971 ± 353 m) and Tanggula mountain range (4211 ± 222 m) (Fig. 1a). The upper range limit of tree elevation displays a latitudinal gradient, with elevation in the southern region (4039 ± 359 m) higher than that in the northern (3295 ± 315 m) and inner Tibetan Plateau (3859 ± 387 m). The mean elevation of the upper range limit of trees over the Tibetan Plateau has reached 4009 ± 383 m, which is much higher than that of the Himalayas12 (3633 ± 475 m), and 17 locations in four counties (e.g., Bayi, Gongbu Jiangda, Milin and Lang) have higher elevations than highest records according to field observations38 (4900 m).

Furthermore, we explored the drivers of spatial changes in the current upper range limit elevation over the Tibetan Plateau. Trees do not always reach their thermal treeline positions approximated by an isotherm of mean growing-season temperature18,21,22,24 (see Methods), and they often remain confined to lower elevations due to human activities12,21,39 (such as grazing, logging and/or burning) or other disturbances, or critical water shortage. Our results show that nearly 85.2% of pixels have agreement between the upper range limit of the tree and the thermal treeline21,37 positions over the Tibetan Plateau, suggesting a predominant role of temperature in determining upper range limits of trees. In the southern and inner Tibetan Plateau, nearly 87.5% and 51.8% of pixels have their upper range limit of trees reaching thermal treeline positions, respectively, while only 27.9% of pixels in the northern Tibetan Plateau reached their thermal positions. We then explored which factors could explain the remaining 14.8% of pixels that fall below the thermal treeline, especially in the northern Tibetan Plateau (Supplementary Fig. 1 and Supplementary Table 1). To do so, we calculated the elevational difference between the satellite-derived upper range limit of trees and thermal treeline12,21,22, denoted as DTreeline, and used a random-forest algorithm40 to rank the importance of several variables (N = 19) belonging to categories of environmental stresses and disturbances on DTreeline (see Methods). Our results showed that DTreeline has a low value in the southern (54 ± 162 m) and inner Tibetan Plateau (186 ± 223 m), but a high one in the northern Tibetan Plateau (271 ± 217 m). Both cumulative climatic water deficit (CWD) during spring (March to May) and vapor pressure deficit (VPD) were the most important variables driving the high DTreeline values in the northern Tibetan Plateau, suggesting a regional drought-induced absence of trees from the thermal treeline position12,41 (Supplementary Fig. 1). Furthermore, we assessed how spatial changes of the leading factors (CWD and VPD) modulated the spatial distribution of DTreeline. For example, the abrupt DTreeline transition occurring at the CWD threshold of ~100 mm yr−1 (Supplementary Fig. 1b) separated the southern Tibetan Plateau, with low CWD and DTreeline values, from northern Tibetan Plateau with high ones at ~35°N (Fig. 1a). The transition of DTreeline along the VPD gradient further sharpens the observed south–north contrast, with DTreeline exhibiting a pronounced increase when the VPD exceeded 0.4 hPa (Supplementary Fig. 1c).

Developing a predictive model of the upper range limit of trees

We developed a predictive model to reconstruct and project changes in the upper range limit of trees. As there are time lags in seed dispersal and tree establishment, the shift in the upper range limit of trees may not occur in synchrony with climate change21,34,42,43,44. We included both climate change and migration lag in our predictive model. We used growing season temperature along with climatic stressors (especially spring CWD and VPD) to parameterize the effect of climate change on temporal shifts in the upper range limit of trees, and developed a climate-based predictive model using a training dataset (N = 116,337) (see Methods). Our climate-driven predictive model could explain nearly 84% of spatial variation of the upper range limit of trees using an independent validation dataset (N = 11,634) (Supplementary Fig. 2).

We then incorporated the impact of migration lag on shifts in the upper range limit of trees into the climate-driven predictive model. To determine the optimal lag, we tested time lags from 10 years to 100 years in our climate-driven predictive model, by minimizing the difference between simulated and in situ shifts in the upper range limit of trees over the past century across 20 sites of the Tibetan Plateau. These range limit elevation dynamics were reconstructed from tree ring data using the dendrochronological method39,45,46,47,48,49,50,51,52 (Supplementary Fig. 3 and Supplementary Table 2). Among all-time lags tested, incorporating a 50-year time lag into the climate-driven predictive model maximizes its explanatory power in representing in situ shifts in the upper range limit of trees over the last century. Specifically, the predictive model explains nearly 70% of the spatial variation in range limit shifts across 20 sites on the Tibetan Plateau (R2 = 0.68, P < 0.01) (Supplementary Fig. 4). Furthermore, after including the time lag, the trees across all sites are predicted to shift upward at a rate of 20  ± 21 m decade−1, which becomes closer to in situ shifts (18  ± 23 m decades−1). This confirms the robustness of our constructed model in the prediction of the upper range limit of trees at the century timescale (Supplementary Fig. 3b).

Prediction of upper range limit of trees changes from the LGM to the end of this century

We used this predictive model to reconstruct spatial-temporal changes in the upper range limit of trees at a temporal resolution of 100 years from the LGM to the present, based on a paleoclimate dataset53. We then projected changes to the end of this century (2080–2099) under the three emissions scenarios (see Methods). This model mapped upper range limit of trees at a spatial resolution of ~1 km over time.

According to the predictive model, the reconstruction of upper range limit elevation spanning the past 22 kyr BP to the end of the century varied by more than 850 m, which is around one-fifth of the present-day range limit elevation (4009 ± 383 m) (Fig. 2). Across the past 22 kyr BP, the elevation of the upper range limit of trees reached the lowest position during the LGM (22–19 kyr BP), gradually peaking in the warmest period of the Holocene (HCO, roughly 10–6.5 kyr BP), and then gradually decreasing towards the present day (4009 ± 383 m). By the end of this century, trees are predicted to migrate to values comparable to those during HCO (4185 ± 362 m) under the SSP2–4.5 scenario, while under the SSP5–8.5 scenario, they are expected to reach elevations ~240 m higher than the present day.

Fig. 2: Paleochanges in upper range limit of trees at a 100-year interval.
figure 2

Paleoreconstruction of upper range limit of trees since 22 kyr to 1950 CE using CHELSA TraCE-21k (left) and projection to the end of this century (right) using CMIP6 multi-model ensemble mean under the SSP1–2.6 (yellow), SSP2–4.5 (blue) and SSP5–8.5 (red) scenario, respectively. The solid line represents the mean elevation of the upper range limit of trees (N = 116,337) on the left, with the dotted line and shaded regions represent the 5th, 25th, 75th, and 95th percentiles. The gradients of red and blue represent the four distinct phases. The gray box on the right represents satellite-based observation. The whiskers and boxes represent the 5th, 25th, 75th, and 95th percentiles and the white lines show mean values.

Our model predicted that elevational changes in the upper range limit of trees reveal a distinct four-phase pattern, generally corresponding to changes in growing-season temperature. The first phase covers the LGM period (22–19 kyr BP), when the mean growing-season temperature was 2.7 °C lower than the present (Supplementary Fig. 5). The upper range limit of trees then reached the lowest elevation (3426 ± 347 m), which is ~583 m lower than the present day (4009 m). During the second phase from the Last Deglaciation period (19–12 kyr BP) to the middle Holocene (9–5 kyr BP), the climate transitioned from a glacial cold to a Holocene warm period, and the upper trees gradually increased towards the highest predicted elevations (4187 ± 377 m), which is 178 m higher than the present (4009 m). Note that this gradual increase in the upper range limit of trees was interrupted by the two millennial-scale events, the Bølling-Allerød warm event (14.7–12.9 kyr BP) and Younger Dryas cold event (12.9–11.7 kyr BP), which caused a marked increase and decline in elevations, respectively.

Throughout the third phase from Late Holocene (5–0 kyr BP) until the preindustrial era, the climate underwent a period of steady cooling and fluctuating warm and cold phases, resulting in a gradual decrease of the upper range limit of trees that eventually stabilized at a value ~61 m lower than the present (4009 m). The fourth phase, referred to the anthropogenic warming period from the pre-industrial era to the end of this century, has and will continue to experience a steady upslope of the upper range limit of trees. Relative to the present, the mean elevational increase in the range limit over the Tibetan Plateau are 102 m, 176 m, 243 m for SSP1–2.6, SSP2–4.5, and SSP5–8.5, respectively (Fig. 2 and Supplementary Table 3). We also calculated the alpine land area above the mean reconstruction of upper range limit of trees (see Methods). Our results show that the alpine land area reached a minimum extent of 503,035 km2 during HCO and a maximum extent of 849,574 km2 during LGM, 75% and 150% of the present-day area of alpine habitat (597,788 km2), respectively (Supplementary Fig. 6).

To verify the reliability of our model in reconstructing historical changes in the upper range limit of trees, we used fossil pollen abundance records from lake sediments to determine the presence and absence of trees at sampling sites throughout the Holocene. By integrating data from six lakes situated below, at and above the present upper range limit of trees26,35,54,55,56,57 (Supplementary Fig. 7 and Supplementary Table 4), this elevational transect of fossil pollen records provides observational evidence of changes in the upper range limit of trees over the Holocene. Specifically, for a given paleo time period, we obtained the elevational range of the upper range limit of trees by identifying the presence and absence of trees from fossil pollen records in lake sediments at different elevations. For example, during the early to middle Holocene (~10–4 kyr BP), trees were detected in Wuxu Lake (3710 m), Muge Co Lake (3780 m), and Yidun Lake (4470 m), but were absent in Koucha Lake (4540 m) and Saiyong Co Lake (4545 m) (Supplementary Fig. 7b). We then obtained that the elevation of paleo-upper range limit was situated between 4470 m (Yidun Lake) and 4540 m (Koucha Lake) during ~10–4 kyr BP. By further comparing elevational changes between different paleo time periods, we could obtain paleochanges in upper range limit of trees using fossil pollen records. To compare such pollen-derived data with our model reconstruction, the same grid cells covered by the pollen-derived data are selected from our high-spatial-resolution reconstructed data (Supplementary Fig. 7). The pollen-derived data are consistent with our model reconstruction showing that the elevation of upper range limit of trees increased in the early Holocene, held a consistent elevation between ~10 kry BP and ~4 kry BP during the middle Holocene and then decreased in the late Holocene (Supplementary Fig. 7b). The pollen-derived data further reveals a descent of upper range limits of trees by nearly 270 m between the early Holocene and the present over the Tibetan Plateau, which is also consistent with our model results (239 m). Interestingly, this range shift is also consistent with that of the European Alps during the Holocene (200–250 m) using macrofossil data24,58. However, determining precise elevational shifts in the upper range limit of trees remains a challenge due to the relatively coarse elevational resolution of fossil pollen records along our constructed elevation transect. Thus, it becomes necessary to either increase the elevational and temporal resolution of fossil pollen records or include analysis of terrestrial plant macrofossil and macroscopic charcoal data at and above the upper range limit of trees.

Impact of paleo-range limit fluctuation on current spatial pattern of beta-diversity

Next, we assessed the legacies of paleochanges in the upper range limit of trees on the spatial pattern of beta diversity over the Tibetan Plateau. For this purpose, we collated elevational and spatial distributions for 2111 endemic alpine species in around 100 counties over the Tibetan Plateau (Supplementary Data 1), with the upper limit of species distributions extending past the current upper range limit of trees (see “Methods”). We used Sørensen-based multiple-county dissimilarity59 to calculate the beta-diversity (β-diversity) of alpine endemic flora within a region that is defined to consist of a target county and its surrounding counties (see “Methods”) (Supplementary Fig. 8). To understand mechanisms underlying the spatial pattern of β-diversity, we obtained the regional-mean value of potential drivers (e.g., paleo-range limit fluctuation and paleoclimatic fluctuation), by averaging them across the target county and its surrounding ones. The paleo-range limit fluctuation, represented by the standard deviation of the high-frequency component of time series of the upper range limit of trees (see Methods), was negatively correlated with β-diversity (P < 0.01), with a lower β-diversity found in regions with greater temporal variability in paleo range limits. Paleo-range limit fluctuation (R = −0.53, P < 0.01) is more important than both paleo-growing-season temperature (R = −0.33, P < 0.01) and precipitation (R = −0.09, P > 0.1) fluctuation in explaining the current spatial pattern of β-diversity (Fig. 3a–c). By contrast, β-diversity did not vary significantly across gradients in mean climate and seasonality (Supplementary Fig. 9). Our results suggest the important role of both paleo-range limit and paleo-temperature fluctuations in determining the current spatial pattern of β-diversity of alpine endemic flora across regions. Specifically, both increased paleo-range limit and paleo-temperature fluctuations lead to spatial homogenization of endemic species composition.

Fig. 3: Relationships of spatial dissimilarity of endemic alpine species with paleo fluctuation in upper range limit of trees, paleotemperature, and paleo precipitation since the Last Glacial Period (LGM) across all regions over the Tibetan Plateau.
figure 3

The relationships of (ac) beta-diversity (β-diversity), (df) species replacement and (gi) nestedness to paleo fluctuation in upper range limit of trees (gray circles), growing-season temperature (red circles), and precipitation (blue circles) since LGM across all regions over the Tibetan Plateau. Here the region is defined to be consisted of a target county and its surrounding counties. The σ represents the paleo fluctuation for each variable and is calculated as the standard deviation of its high-frequency component based on the Empirical Mode Decomposition method. The black lines were fitted with linear regressions. The asterisks indicate statistical significance by two-tailed Student’s t-test (P < 0.01).

To understand why β-diversity reduced as paleo fluctuations intensified, we partitioned β-diversity into two components: dissimilarity due to species replacement (species present at one county are absent at another, but are replaced by other species absent from the first) and dissimilarity due to nestedness (species present at one country are absent at another, but are not replaced by additional species) (Supplementary Fig. 8) (see Methods). The species replacement component predominated in regions with low paleo-range limit fluctuations, while the nestedness component predominated in those with higher fluctuations (Supplementary Fig. 8 and Supplementary Fig. 10). Consistent with β-diversity patterns, the species replacement component decreased with increasing fluctuation, while the nestedness component showed the opposite trend (Fig. 3d, g).

This pattern is expected because gradients in paleo-range limit fluctuation could not only drive species replacement but also species loss and gain. Specifically, decreasing paleo-range limit fluctuations suggest increasingly stable alpine environmental conditions. This increased stability could enhance the capacity of alpine environments to promote niche differentiation and allow diverse assemblages with more specialized habitat preferences19,60,61,62,63,64, thereby increasing β-diversity through species replacement. In contrast, increasing paleo-range limit fluctuation could lead to increased fragmentation (e.g., reduced fragment area and increased isolation) and the contraction of suitable habitats for the survival of endemic alpine species. Increased fragmentation leaves narrowly-ranged endemic alpine species, which typically have small effective population sizes and limited dispersal capacities1,4,5,6,7, particularly vulnerable to extirpation and recolonizations, thereby increasing β-diversity through nestedness65,66,67. Therefore, as paleo fluctuation intensified, the decline in β-diversity resulted from a larger decrease in the species replacement component offsetting an increase in the nestedness component.

In addition, we observed a stronger response of β-diversity to paleo-range limit fluctuation than paleo-temperature fluctuation across regions (Fig. 3a, b). This result is expected because shifts in the upper range limits of trees could induce fragmentation and expansion of suitable habitats for alpine endemism to a larger extent than temperature changes alone. Tree expansion into alpine areas could directly lead to the loss of suitable habitats, and also cause habitat displacement by altering microclimate conditions through increased sheltering effects12,13. Although warming could potentially reduce the availability of suitable habitats for endemic alpine species, topographically diverse environments in alpine areas may provide suitable microhabitats68. These microhabitats allow endemic alpine species to track suitable habitats over short distances69. The presence of these microhabitats could then potentially buffer large-scale atmospheric warming impacts on the survival of alpine endemism. In the coming few decades, projected rates of the upper range limit of trees are several orders of magnitude larger than those occurring over the last millennia. In addition to drastic warming, accelerated tree expansion70 would exacerbate alpine habitat fragmentation and spatial isolation of remaining microhabitats12. Consequently, alpine endemics are not equipped to respond to the dual force from both accelerated tree expansion and drastic warming. This likely would accelerate spatial homogenization of endemic alpine species diversity, threatening the total endemic species pool on the Tibetan Plateau.

Summary

In summary, we used satellite-derived upper range limits of trees to reconstruct a spatio-temporal distribution of range limits since the LGM from a paleoclimate dataset. Our results suggest an important role of paleo fluctuations in the upper range limit of trees in shaping the spatial pattern of β-diversity across alpine regions. Our findings emphasize the need to include both past climate change and paleo fluctuations in upper range limit of trees to understand the current spatial patterns of β-diversity of alpine flora11. These fluctuations are expected to intensify in a warmer world, potentially leading to the homogenization of endemic alpine species diversity across regions, thereby threatening Tibetan endemic species pool. Thus, in regions with high paleo fluctuations, we should design corridors with optimal locations and orientations to enhance alpine habitat connectivity and facilitate the mixing of endemic species among regions. These corridors may initially reduce β-diversity in these regions; however, they would decrease the likelihood of endemic species extinctions and help maintain the Tibetan endemic species pool in the long term.

However, paleoreconstruction is still subject to uncertainties. The implicit assumption of paleoreconstruction is that vegetation and climate are in dynamic equilibrium, with a relatively fast response of vegetation to deglacial climate change. Although previously identified refugia in gorges suggest the possibility of high dispersal rates and rapid tree migration, further paleoecological data, in particular paleogenomic data (such as sedimentary ancient DNA), is needed to pinpoint actual refuge locations and assess the degree to which vegetation lags behind climate changes. By deploying a plateau network of collecting paleoecological data to infer vegetation response to past climate change, similar studies are likely to provide further value to this area of research.

Furthermore, the robustness of our paleoreconstruction model relies upon the accuracy of the paleoclimatic forcing inputs. For example, annual temperature changes during the Holocene are currently the subject of a broad debate, with proxy-based reconstructions indicating a cooling trend71,72,73 while climate models suggest warming74. Thus, reconstructing particular climate variables from paleoecological data is challenging over the Tibetan Plateau, due to difficulties in distinguishing moisture from thermal signals36. The tracking ability of upper trees with rising temperatures has given us the confidence to use paleoecological data located near their current upper range limits for reconstructing warm-season temperatures. After developing a robust seasonal temperature reconstruction, we could then separate the moisture signal from the temperature one in paleoecological data. This approach allows us to overcome the limitations of traditional paleoecological proxies and climate transfer functions that are notoriously complicated or distorted by the potential high collinearity between moisture and temperature75.

Methods

Application of multi-source satellite data in mapping upper range limit of trees

To map the spatial distribution of the upper range limit of trees over the Tibetan Plateau, we first developed a fusion method to map forested extent for the year 2020 (30 m spatial resolution) by integrating multiple-data sources including optical, microwave, and LiDAR data. Forested regions were defined as regions with tree canopy cover larger than 10% because the extracted upper range limit of trees using this value had the highest correlation with in situ observations12.

Generating the spatial distribution of forest extent over the Tibetan Plateau

Existing publicly-accessible datasets on forest extent systematically underestimate the presence of sparse tree cover76, leading to a direct underestimation of the upper range limit of trees. As such, we have constructed a tripartite upscaling method that integrates in situ identification of landcover types, visual interpretations from Google Earth’s high-resolution (<5 m) images, and data from Landsat multi-spectral images, Sentinel-1A microwave data, and GEDI-derived canopy height. These data sources were integrated using a Convolutional Neural Networks (CNN)77 classification algorithm to produce a high-precision map of forest extent for the year 2020 at a 30 m resolution.

We used space-borne LiDAR measurements of canopy height over 39,675,123 valid laser shots covering a resolution of 25 m on the Tibetan Plateau, collected by the GEDI instrument aboard the International Space Station78. We first randomly selected 47,658 sites and visually interpreted their corresponding vegetation classes using high-resolution (<5 m) Google Earth images produced between May and September 2020 and finally identified a total of 26,541 forested sites. In addition, we also surveyed 346 forested sites in the field during the autumn of 2020 and measured their tree heights.

We then built a CNN classification model linking the landcover types (presence of forest) to structural (GEDI-derived tree height), texture79 (Sentinel-1A microwave-derived co-polarization and cross-polarization data), and spectral attributes80 (Normalized Difference Vegetation Index, and seven spectral reflectance bands from Landsat images) during the period from June to September across GEDI sites. We retained 10% of the data (4765 sites) for independent validation of our constructed CNN model (R2 = 0.89, P < 0.01). We then use the remote sensing data (except GEDI data) as predictors in the CNN classification model to generate a high-resolution (30 m) map of forests for the year 2020.

Generating spatial distribution of upper range limit of trees over the Tibetan Plateau

We then used the developed forest extent map as the input for forest edge detection based on the Canny algorithm81, which extracted the forest boundary by finding abrupt changes in the forest image. We removed the lower forest boundary and only maintained the upper forest edges by defining a local adaptive elevation threshold, below which pixels were removed following Wang et al.12. Specifically, the process begins with constructing a histogram of forest edge elevation within a specified window size centered on a geographic point. The choice of window size is based on empirical analysis (ranges from 0.1 to 1°), balancing the inclusivity of upper and lower edges. We then assessed the unimodality of the distribution within the window size by smoothing the histogram using a Savitzky-Golay filter and determining the number of Gaussian functions in the data. If the number of Gaussian functions exceeds one, we found a threshold to separate the upper and lower ones, based on the position that contains a 99% signal in a Gaussian distribution. Subsequently, we eliminated the lower forest edge and iterated through these steps until a unimodal distribution was achieved (i.e., the number of Gaussian functions equals one).

Validating satellite-derived upper range limit of trees

We validate the satellite-derived upper range limit elevation of trees using manually interpreted samples from high-resolution (HR) (<5 m) satellite imagery provided by the Google Earth platform. To create an equal probability sample, the number of sample points is directly related to the number of pixels that have their upper range limit tree observed by the satellite. We generated a geographical 1 km grid over the Tibetan Plateau, and then randomly selected ten validation samples at a 30 m resolution for each 1 km grid to ensure even coverage across the region. Six well-trained workers visually interpreted and cross-validated all samples using Google Earth and the ArcGIS platform. Furthermore, a single quality controller checked all the results. We therefore obtained 602,879 validation samples evenly distributed across the Tibetan Plateau, with 30,746 (5.1%), 510,035 (84.6%), and 62,098 (10.3%) distributed in the northern, southern, and inner regions, respectively. Validation shows that the satellite-derived range limit has a high degree of consistency with manual interpretations from Google Earth images (R2 = 0.98, Slope = 0.98, ME = 17 m; Fig. 1b).

Identification of drivers of current spatial patterns in the upper range limit of trees

Not all upper range limits of trees reached the thermal treeline position determined by temperature12,21,22. In reality, the upper range limit of trees is often lower than the thermal treeline position due to environmental stresses12,41 (e.g., critical water shortage) and disturbances12 (e.g., human activities). We first compared the satellite-derived upper range limit of trees to the thermal treeline position, and then evaluated which factors were responsible for the deviation of satellite-derived range limit from thermal treeline positions.

To do so, we first calculated the thermal treeline elevation using growing-season temperature at a spatial resolution of 1 km. The thermal treeline is the cold edge of the fundamental niche of trees21, and was calculated as the isotherm of the mean growing-season air temperature at 2 m above the ground (Tair) of 6.4 ± 0.7 °C37 using the WorldClim dataset and the mean growing-season land surface temperature (Tsurface; the radiative temperature of the land derived from infrared radiation emitted from the surface) of 7.6 ± 1.0 °C12 using the Terra Moderate Resolution Imaging Spectroradiometer (MODIS). A global meta-analysis of more than 30 thermal treeline sites from different climate zones showed that thermal treeline positions generally occur at a mean growing-season ground temperature of 6.7 °C at 10 cm depth, with a narrow amplitude of 2.2 °C23,37. Since there was a strong correlation12 between growing-season mean Tsurface and ground temperature at a 10 cm depth across these global treeline sites21, the treeline was then estimated to occur at the growing-season mean Tsurface of 7.6 ± 1.0 °C. In addition, a meta-analysis of thermal treeline sites at the global scale also showed that the start and end of growing season was defined as date when the ground temperature at 10 cm depth exceeds 3.2 °C23. Such ground-temperature threshold corresponds to MODIS-derived land surface temperature threshold of 0.7 °C. Therefore, the start and end of the growing season were determined as the date when daily Tair rises above 0.9 °C, the date at which it falls below 0.9 °C, and the daily Tsurface passes 0.7 °C, respectively. Furthermore, the growing season length must be longer than 94 days37.

We then calculated the deviation of satellite-derived upper range limit of tree elevation from thermal treeline elevation as DTreeline (the difference between the edges of the realized and the fundamental niche)12,22. Before calculating DTreeline, we aggregated satellite-derived upper range limit of trees elevation from the original 30 m to 1 km resolution by calculating the mean value in each 1 km grid cell. To understand the drivers of DTreeline, we compiled several variables (N = 19) belonging to the four different categories (climate limitation, disturbance, soil, and topography factors) (Supplementary Table 1), and then used a random forest model to rank the importance of these variables and calculate the partial contribution of each variable on DTreeline. The entire dataset, including cases where DTreeline ≥ 0 (N = 116,337), was used in our analysis. We generated the partial-contribution plot using the “forestFloor” package in R statistical software (http://cran.r-project.org/), to visualize how each independent variable impacts the prediction while controlling changes in all other variables (Supplementary Fig. 1b to t).

Constructing a climate-based predictive model of upper range limit of trees

In constructing the model, we first used a growing-season temperature threshold to represent the thermal treeline position, and then included several variables including environmental stresses and disturbances, to account for the observed deviation of the satellite-derived upper range limit from the treeline position (DTreeline). Thus, the modeled upper range limit of trees is the sum of the thermal treeline position and DTreeline due to environmental stresses and stochastic effects. We separated temperature from other climatic variables when modeling DTreeline, as there was significant covariation between temperature and other climatic variables, such as indicators of critical water shortages (e.g., vapor pressure deficit and cumulative water deficit), both of which are influenced by temperature.

In modeling DTreeline, we employed a recursive feature elimination method to identify the most relevant variables12. The error decreased when using the six most important variables in determining the spatial variation of DTreeline (Supplementary Fig. 11) but decreased by less than 1% when additional variables were included. We then constructed a parsimonious climate-DTreeline model by selecting the following six variables: spring cumulative climatic water deficit (CWD), spring vapor pressure deficit (VPD), cloud cover, minimum temperature of the coldest month, summer dehydration, and surface curvature (Supplementary Fig. 1 and Supplementary Table 1). The mechanisms for the potential impact of these variables on tree growth and establishment at thermal treeline positions were formulated as follows. For example, the water availability (represented by CWD and VPD) could constrain tree growth and become decisive for seedling establishment, therefore affecting the tree establishment at the thermal treeline position. The CWD, which is calculated as potential evapotranspiration minus precipitation, indicates the quantity of water needed to meet evaporative demand and measures the drought stress on plants, and VPD, as a function of relative humidity and temperature, represents an independent moisture stress on plant photosynthesis. Furthermore, summer dehydration, which is calculated as a product of wind speed, temperature, and relative humidity82, generally occurs during daylight hours under conditions characterized by high temperatures, low relative humidity and strong winds, thereby leading to trees transpiring moisture at a faster rate than their roots take it up from the soil. Such summer dehydration could further aggravate the moisture stress (CWD and VPD) on tree growth and tree establishment at thermal treeline positions. The relationship between cloud cover and tree growth at thermal treeline position is complex. The high cloud cover could reduce the solar radiation, and potentially undermine the plant photosynthesis and tree growth at thermal treeline positions. While the cloudy days are generally associated with the relatively high-water availability and reduced transpiration during the growing season; and this could reduce the radiative cooling and trap heat during the night, potentially allowing trees to survive extreme cold temperatures at the relatively high elevation24. The minimum temperature of the coldest month is used as the freezing stressor that could directly control the absence of trees from the thermal treeline position21. In addition to these macroclimatic drivers, we also include local topographical factors (e.g., surface curvature) as a surrogate of microclimate conditions, because of the potential deviation of microclimate from macroclimate at the high elevation. After including these variables, the parsimonious model achieved a high prediction power (R2 = 0.84, P < 0.01; Supplementary Fig. 2), and is less susceptible to errors in model extrapolation than full models considering all potential drivers. We then applied our parsimonious climate-DTreeline model to predict changes in DTreeline at a spatial resolution of ~1 km at 100-year intervals from LGM to the end of this century (mean over the period 2080–2099). All of these predicators from LGM to the end of this century were derived from CHELSA-TraCE21k, WorldClim, and CMIP6 model projections (see “Assembling climate variables”).

Reconstruction and prediction of changes in the upper range limit of trees

We reconstructed spatiotemporal dynamics of the upper range limit of trees spanning from 22 kyr BP to the end of this century at a spatial resolution of 1 km for every 100 years. Specifically, we determined the thermal treeline elevation using the Tair of 6.4  ±  0.7 °C37 and the Tsurface of 7.6  ±  1.0 °C12, with the growing season length longer than 94 days37, and then predicted changes in DTreeline due to regional environmental stresses using the parsimonious DTreeline model.

Assembling climate variables

Climate data from 22 kyr BP to 2100 CE was compiled from three different data sources: (1) monthly climate data from CHELSA-TraCE21k (Climatologies at High resolution for the Earth’s Land Surface Areas) at a temporal and spatial resolution of 100 years and 1 km, respectively, for the last 22,000 years53; (2) current climatology (2020 CE) from the WorldClim dataset83; and (3) future projections from 9 Earth system models at the end of this century (mean over 2080–2099) participating in CMIP6 under low (SSP1–2.6), intermediate (SSP2–4.5) and high emission scenarios (SSP5–8.5) (Supplementary Table 3). Here, SSPxy represents an integrated scenario of future climate and societal change, with a peak in radiative forcing at y W m−2 before 2100.

We used monthly CHELSA-TraCE21k at a spatial resolution of ~1 km and a temporal resolution of 100 years. This CHELSA-TraCE21k dataset was downscaled from the transient climate simulations (CCSM3 TraCE-21k) from the LGM to present day using the CHELSA V1.2 algorithm53. CHELSA is a high-resolution climate data set for the time period 1979–201384, which uses statistical downscaling of the ERA-interim reanalysis with a bias correction of monthly precipitation from Global Precipitation Climatology Centre. The downscaled paleoclimate data have included the impact of paleo-orography on temperature and precipitation fields given that topography could have changed drastically due to the ice retreat along poles and high mountain areas through time85. The CHELSA TraCE21k was shown to produce a reasonable representation of high-spatial-resolution temperature and precipitation through time, and simulations using this dataset could be capable of detecting effective LGM species refugia53. Additionally, we analyzed changes in orography since LGM and found that a decrease in elevation is mainly located in the river valleys of the southeast Tibetan Plateau and over Hengduan mountain ranges (Supplementary Fig. 12). By contrast, orography changes within the 1 km buffer around the current range limit of trees are almost negligible.

Note that we used WorldClim data to map the present-day upper range limit of trees, and estimated the paleo-range limit by combining the current climatology from WorldClim with paleo-climatic changes derived from CHELSA. Furthermore, we generated future climate data at a spatial resolution of ~1 km by downscaling and bias-correcting CMIP6 model simulations using WorldClim data. Therefore, discrepancies among different data sources to estimate dynamics in the upper range limit of trees could be well resolved through bias correction.

Validating changes in the upper range limit of trees using fossil pollen data analysis at the millennium scale

We then used fossil pollen assemblages preserved in lake sediments over the Tibetan Plateau to determine if the trees were present in the vicinity of the sampling site throughout the Holocene. By incorporating data from an elevational transect, this approach can be used to delineate the upper range limit of trees and their elevational changes over time. We compiled a list of taxonomically harmonized and temporally standardized fossil pollen data from lake sediments over the Tibetan Plateau86 (Supplementary Fig. 7). We only included fossil pollen abundance data that falls within a 500 km radius of the present forest communities, to minimize the risk of misinterpreting the occurrence of trees due to long-distance transport of exogenous arboreal pollen into lake sediments87. This filtering resulted in a total of six lakes distributed across a range of elevation from 3700 to 4500 m26,35,54,55,56,57 that had data for at least 500 years (Supplementary Table 4).

We concentrated on the four most common high-elevational arboreal tree taxa—Betula, Abies, Picea, and Pinus. These tree species have different production and dispersal strategies, which could greatly affect the representation and abundance of tree taxa in fossil pollen records88,89,90. For example, Betula and Pinus have high pollen production rates, good wind transportability and good preservation characteristics88,91,92,93. Therefore, even a relatively high percentage at a given elevation might report a false presence. Since there were no available estimates of pollen production for the main tree taxa on the eastern Tibetan Plateau, we then used the reliable threshold of pollen abundance to determine if one of the tree taxa was present in the vicinity of the sampling site. Specifically, the pollen abundance thresholds with different probabilities were determined by performing a logistic regression using pollen abundance (0–100%) as the predictor and taxon classification (0 and 1 represent negative and positive classes, respectively) as the response variable using 2434 modern pollen samples of the 14 key arboreal taxa from China and satellite-based observation and the Vegetation Atlas of China of modern trees taxa distribution89. We used the pollen abundance threshold at a probability of 0.9 to determine the presence and absence of trees. The thresholds for Betula, Abies, Picea, and Pinus are 16%, 5%, 9%, and 37%, respectively89. Upon identifying any one of the four tree taxa, we assumed that the trees were present in the vicinity of the sampling site. Not all pollen grains will be equally well preserved throughout the time due to their differing sensitivities to environmental changes94,95, and thus, some tree taxa with very low pollen production rates and/or poor preservation could be missed. However, the probability of such underrepresentation is unlikely in this study. For example, common tree species including Corylus, Nitraria, Tamarix, Cyperaceae, Poaceae, and Fabaceae, which were widely reported to be greatly under-represented in China based on 898 modern pollen sampling sites and 2115 pollen data96, were not found at the current realized range limit of trees over the Tibetan Plateau.

Quantifying beta-diversity of endemic alpine species and its drivers

We aimed to estimate the role of paleo fluctuations in the upper range limit of trees on the spatial pattern of beta-diversity. We initially obtained geographical coordinates and elevational ranges for endemic alpine species at the county level, using the data from the Flora of China97 and the Flora of Tibet (Supplementary Data 1). The selection of endemic species is based on the Chinese Endemic Species list, and our focus was primarily on alpine species whose uppermost distribution surpassed the current upper range limit of trees. In this way, we have identified 2111 endemic alpine species in over 100 counties of the Tibetan Plateau.

First, we calculated the β-diversity of endemic alpine species using indices from the Sørensen-based multiple-county dissimilarity measures59 (Supplementary Fig. 8a). The Sørensen-based multiple-county dissimilarity indices were calculated using the following Eq. (1),

$$\beta -diversity=\frac{a+b}{2\times [{\sum }_{i}{S}_{i}-{S}_{T}]+a+b}$$
(1)
$$a={\sum }_{i < j}\min ({b}_{ij},{b}_{ji})\, and\,b={\sum }_{i < j}\max ({b}_{ij},{b}_{ji})$$

For a given focal county \(i\) with N neighbors, \({S}_{i}\) is defined as the total number of endemic alpine species in \(i\) (alpha diversity from endemic species) and \({S}_{T}\) is the total number of endemic alpine species in all neighbors (gamma diversity). \({b}_{ij},{b}_{ji}\) are the number of endemic alpine species only present in the county \(i\) and \(j\), respectively. We calculated β-diversity for all 100 counties selected.

We further partitioned β-diversity into two separate and antithetic components: dissimilarity due to species replacement and dissimilarity due to nestedness using Eqs. 2 and 359. The species replacement component refers to the replacement of some species with others among counties98, and there is no difference in species richness among counties. On the other hand, the nestedness component reflects differences in species richness, when species-poor assemblages are nested in species-rich assemblages. Different counties would then have different species compositions in the absence of species replacement.

$$species\,replacement=\frac{a}{[{\sum }_{i}{S}_{i}-{S}_{T}]+a}$$
(2)
$$nestedness=\frac{b-a}{2\times [{\sum }_{i}{S}_{i}-{S}_{T}]+a+b}\times \frac{{\sum }_{i}{S}_{i}-{S}_{T}}{[{\sum }_{i}{S}_{i}-{S}_{T}]+a}$$
(3)

The accuracy of the Sørensen index to estimate beta-diversity was recently suggested to be heavily dependent on species prevalence, with high species prevalence generally yielding misleading inferences about species associations99. We then quantified the extent to which species prevalence affected the Sørensen index in estimating β-diversity on the Tibetan Plateau, using a statistical mathematical metric generally used for evaluating pairwise similarities instead of assessing similarities among multiple (>2) species/sites. The prevalence of pair-wise species is estimated by calculating the log odds ratio of the conditional occurrence probability (the occurrence rate of pair-wise species in total counties)99. The endemic alpine species on the Tibetan Plateau have a generally low prevalence of 0.04 ± 0.05 (Supplementary Fig. 13a), and thus, it is unlikely that pairwise species would invalidate the interpretations of the traditional Sørensen-based metrics of species co-occurrence. This is confirmed by our analysis showing that there is a good correspondence (positive correlation) between the affinity metric of co-occurrence (Alpha MLE) that is not affected by species prevalence99 and the most popular co-occurrence metrics such as those due to Jaccard (R = 0.75, P < 0.01), Sørensen-Dice (R = 0.75, P < 0.01), and Simpson (R = 0.83, P < 0.01) (Supplementary Fig. 13b–d). We thus confirmed the robustness of using Sørensen-based multiple-county dissimilarity measures to estimate β-diversity.

Second, in terms of paleo-temporal fluctuations in the upper range limit of trees and climate drivers, we only retained the high-frequency component of the paleo time series. Specifically, we decomposed the original time series into a finite set of oscillatory components called Intrinsic Mode Functions (IMFs) and a residual trend component, using Empirical Mode Decomposition method100. IMF1 stands for the first IMF obtained during the decomposition process, which is the dominant oscillatory component extracted from the original time series of each variable. The standard deviation of the IMF1 for each county was used to represent paleo temporal fluctuations in the upper range limit of trees and climatic drivers (growing-season temperature and precipitation). In addition, to assess whether warming-induced upslope tree expansion would potentially lead to habitat loss, we also calculated the alpine land area by summing the surface area above the mean elevation of the reconstructed upper range limit of trees in each county, using a 30-m spatial resolution digital elevation model101. For each mountain defined in the GMBA mountain inventory v2.0 dataset102, we created a 100 km radius buffer around a target upper range limit pixel, and then summed up the surface area of all 30 m resolution pixels with elevations higher than that of the target pixel.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.