Background & Summary

Solar-induced chlorophyll fluorescence (SIF) has been demonstrated to be closely linked to vegetation photosynthesis in terrestrial ecosystems and to serve as a superior indicator for estimating gross primary productivity (GPP)1,2,3,4,5,6,7. Several global SIF datasets have been generated from available satellite observations8,9,10 and successfully used for global GPP estimation11,12.

From the perspective of vegetation photosynthesis monitoring, satellite SIF products with km-scale spatial resolution are essential for characterizing photosynthesis status at landscape scales. Currently, satellite-based SIF datasets can be categorized into two groups based on their spatial resolution: products with higher spatial resolutions (<2 km) and products with lower spatial resolutions (>10 km). The higher-resolution group comprises the Orbiting Carbon Observatory-2 (OCO-2) SIF product13 and the OCO-3 SIF product14, with spatial resolutions of 1.3 × 2.25 km and 1.6 × 2.2 km, respectively. In addition, the Chinese Carbon Dioxide Observation Satellite Mission (TanSat) SIF product15 and the Terrestrial Ecosystem Carbon Inventory Satellite (TECIS-1) SIF product16 have a 2 × 2 km spatial resolution. However, these fine-resolution SIF products are spatially discontinuous.

The Scanning Imaging Absorption spectrometer for Atmospheric Chartography (SCIAMACHY) SIF17, the Global Ozone Monitoring Experiment (GOME), and GOME-2 SIF products18,19,20 provide global coverage but have a low spatial resolution of 0.5° or coarser. Therefore, they are limited to monitoring vegetation photosynthesis at a coarse spatial resolution21 and are difficult to integrate with ground GPP measurements from flux towers22.

Because of these limitations of current remote sensing SIF products, research on acquiring spatially continuous SIF datasets at finer spatial resolution is constantly emerging. Among such studies, machine learning (ML) methods are commonly used to obtain global SIF products at a relatively fine resolution and high continuity, including the contiguous SIF dataset (CSIF), a new OCO-2 SIF dataset (GOSIF), a reconstructed TROPOMI SIF (RTSIF), and a continuous TanSat SIF product, with a 0.05° spatial resolution23,24,25,26. These products use ML methods to derive statistical relationships between SIF (the target variable) and explanatory variables using a subset of the original SIF (OCO-2 or TROPOMI mission) periods, and then extrapolate the derived relationship in space and in time27. In this way, ML-based algorithms link multiple explanatory variables to SIF without an explicit physical or radiative transfer relationship24, yielding a predicted SIF (PSIF) with higher spatial resolutions at km scales28. However, ML-based SIF products are simulated results derived from ML models and their explanatory variables rather than measured signals. In contrast, downscaling methods29,30 use the PSIF only as an intermediate variable: it serves as a weight coefficient to redistribute the original coarser SIF retrievals. Therefore, compared with ML methods, which take the PSIF as the final SIF estimate, downscaling methods better preserve information from the original SIF retrieval while enhancing the spatial resolution. That is to say, SIF obtained by downscaling remains a measured signal while gaining spatial resolution, which provides a good solution for obtaining SIF datasets with higher spatial resolutions derived from the original SIF retrieval. With a finer spatiotemporal resolution of 3.5 × 7.5 km (3.5 × 5.5 km in the nadir mode) and a 16-day revisit time, the TROPOspheric Monitoring Instrument (TROPOMI) SIF10,21,31,32 is well suited to producing an improved SIF dataset at a finer spatial resolution of 500 m using the downscaling method.

The objectives of this study are (1) to generate a new SIF product with a spatial resolution of 500 m derived from the original 0.05° TROPOMI SIF using a spatial downscaling method; (2) to comprehensively assess the downscaled SIF product using tower-based SIF observations, the original TROPOMI SIF (OSIF), and an ML-based SIF product (RTSIF); and (3) to investigate the enhanced performance of the downscaled SIF (TroDSIF) in estimating GPP.

This study produced a downscaled SIF dataset (TroDSIF) with a spatial resolution of 500 m that addresses the above issues. TroDSIF agrees better with OSIF than RTSIF does, with higher R2 and lower RMSE values across the main vegetation types. Furthermore, TroDSIF shows a stronger relationship with tower-based GPP and SIF measurements. TroDSIF will serve as a new data source for vegetation photosynthesis monitoring, carbon cycling, climate change and other terrestrial ecosystem-related studies.

Methods

Datasets

The Caltech TROPOMI SIF data (https://doi.org/10.1029/2018GL079031) established by Köhler et al.21,31 between March 2018 and July 2021 are derived from the Sentinel-5 Precursor satellite, which has a swath of 2600 km, a 16-day revisit time, and an overpass time of 13:30, providing nearly daily global coverage. These instantaneous SIF retrievals are derived from a data-driven approach based on a linear forward model fitting top-of-atmosphere (TOA) radiance within two spectral regions (743–758 and 735–758 nm). The gridded, daily corrected far-red SIF at 740 nm (SIFdc) is further obtained by accounting for the overpass time, length of the day, and solar zenith angle (SZA) based on Frankenberg’s method2. We selected SIFdc at a 0.05° spatial resolution and a 16-day revisit time as the input original SIF (OSIF) to obtain the downscaled SIF. To match the explanatory variables from the MODIS products, which use a sinusoidal projection, the original SIFdc was first reprojected to the same projection and resampled to a spatial resolution of 5 km using a linear interpolation method before the redistribution.

As RTSIF23 is a machine-learning SIF product derived from TROPOMI SIF, we selected it here for comparison with the downscaled SIF to explore the differences between the ML-based SIF and the downscaled SIF. Surface reflectance, photosynthetically active radiation (PAR), land surface temperature (LST), land cover, and C3/C4 fraction were used as explanatory variables for RTSIF modeling based on the extreme gradient boosting (XGBoost) approach. RTSIF has a fine spatial and temporal resolution of 0.05° and 8 days over the 2001–2020 period in clear-sky conditions.

The MCD43C4 (V006) product33 is the MODIS nadir bidirectional reflectance distribution function (BRDF)-adjusted reflectance (NBAR) product and was used to collect blue, green, red, and near-infrared (NIR) band reflectance as SIF explanatory variables for characterizing structural information on vegetation. NBAR from MCD43C4 has a moderate spatial resolution of 0.05° and a 16-day revisit, and is normalized to a nadir-viewing angle. We also calculated the normalized difference vegetation index (NDVI) by combining the red and NIR NBAR data. Additionally, the MCD43A4 (V061) NBAR product was used to obtain the 500 m reflectance and NDVI.

The MCD12Q1 product34 is the MODIS land cover type dataset, which provides yearly global land cover classification maps at a spatial resolution of 500 m. It includes multiple classification frameworks: the International Geosphere–Biosphere Programme (IGBP), FAO-Land Cover Classification System (LCCS1), University of Maryland (UMD), and Plant Functional Types (PFT). The MCD12Q1 product was generated from MODIS reflectance data35,36 using a decision tree method and a boosting approach37. The IGBP scheme was used for temporal pattern analysis of SIF across different vegetation types in this study.

Air temperature (AT) was considered a physiological vegetation signal and therefore served as an explanatory variable of SIF; it was obtained from the ERA5 dataset38. AT data from ERA5 have a spatial resolution of 0.1° at hourly intervals39. For training the random forest (RF) model, the original AT was resampled to 0.05° to match the other explanatory variables. To obtain the weight coefficient at 500 m, AT was reprojected to a sinusoidal projection and resampled to a spatial resolution of 500 m.

ChinaSpec is a network of tower-based continuous SIF measurements across mainland China40. Continuous spectral measurements within the wavelength range of 680–840 nm are collected by an automated SIF system with a QE65 Pro spectrometer, synchronously with flux observations. Six sites (XTS, DM, AR, HL, GC, and PYH) were selected to match the downscaled TROPOMI SIF for direct validation. Site information can be found in Supplementary Table S1.

AmeriFlux measures ecosystem CO2, water, and energy fluxes in most regions of North, Central, and South America, providing flux and meteorological observations at hourly scales41. The AmeriFlux network was established to provide critical flux measurements across different ecosystems and climate zones. Daily GPP estimates were calculated from hourly GPP observations, which were obtained by partitioning net ecosystem exchange (NEE) measurements, and were then aggregated to a 16-day scale. Overall, 67 AmeriFlux sites with measurements synchronous with TROPOMI SIF in 2019 were selected for this analysis (see Supplementary Table S1 for site information). Ten IGBP PFTs42 are included among these sites: CRO: croplands, DBF: deciduous broadleaf forests, ENF: evergreen needleleaf forests, GRA: grasslands, MF: mixed forests, OSH: open shrublands, SAV: savannas, WET: wetlands, WSA: woody savannas, and CSH: closed shrublands.
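The aggregation of hourly tower GPP to the 16-day scale can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, averaging (rather than summing) is assumed at both steps, and missing hours are assumed to be marked with NaN; the text does not specify these details.

```python
import numpy as np

def _nanmean_rows(a):
    """Row-wise mean that ignores NaN; rows without valid data stay NaN."""
    valid = ~np.isnan(a)
    count = valid.sum(axis=1)
    total = np.where(valid, a, 0.0).sum(axis=1)
    out = np.full(a.shape[0], np.nan)
    np.divide(total, count, out=out, where=count > 0)
    return out

def aggregate_gpp(hourly_gpp, hours_per_day=24, days_per_period=16):
    """Average hourly GPP to daily means, then daily means to 16-day means.

    Assumption: averaging is used at both steps; incomplete trailing hours
    are dropped and an incomplete last period uses the days it contains.
    """
    gpp = np.asarray(hourly_gpp, dtype=float)
    n_days = gpp.size // hours_per_day
    daily = _nanmean_rows(gpp[: n_days * hours_per_day].reshape(n_days, hours_per_day))
    n_periods = int(np.ceil(n_days / days_per_period))
    padded = np.full(n_periods * days_per_period, np.nan)
    padded[:n_days] = daily
    return _nanmean_rows(padded.reshape(n_periods, days_per_period))
```

For example, 20 days of hourly data yield two values: the mean of the first 16 days and the mean of the remaining 4 days.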

Downscaling approach to reproduce TROPOMI SIF at 500 m

Figure 1 depicts the flowchart of the scheme used to produce the spatially downscaled TROPOMI SIF (TroDSIF) with a spatial resolution of 500 m. The three main steps are (A) the establishment of the RF model, (B) the generation of the weight coefficient, and (C) the redistribution of the OSIF.

Fig. 1

Flowchart of the downscaling approach of TROPOMI SIF from 0.05° to 500 m using explanatory variables containing AT, surface reflectance, NDVI, and daily averaged cosine SZA values. Detailed procedures for the redistribution of the OSIF are depicted in Fig. 2.

The RF model to predict TROPOMI SIF

Recently, ML algorithms have been used in remote sensing studies, especially for carbon and water flux research43,44,45. For example, neural networks24,27,46,47 and tree-based methods25,26,30 have been successfully used to produce temporally continuous SIF datasets with finer spatial resolution. Among them, the random forest (RF) model was first proposed by Breiman48 and has been widely used in remote sensing applications, such as land cover classification and biomass estimation49,50. Unlike many other ML models, the RF model is not sensitive to unbalanced distributions or missing values in the input samples. In addition, owing to the randomness introduced when splitting tree nodes, the RF model is also less prone to overfitting48,49. Besides, it performs well with large, high-dimensional, and multicollinear datasets51,52, and is robust to noise and in feature selection50,53. Therefore, we selected the RF model to establish the relationship between the driving parameters and the TROPOMI daily corrected SIF (SIFdc, with a coarser spatial resolution of 0.05°).

The explanatory variables for SIF were chosen mainly following Ma et al.26,30. Based on the basic equation of SIF (Gu et al.54), SIF can be expressed as:

$${\rm{SIF}}={\rm{PAR}}\times {\rm{FAPAR}}\times {\rm{\varepsilon }}\times {\phi }_{{\rm{SIF}}}$$
(1)

where PAR is the photosynthetically active radiation, FAPAR is the fraction of PAR absorbed by green leaves, ε is the escape ratio of SIF photons from the canopy, and ϕSIF is the fluorescence quantum yield.

Two types of information are included in Eq. (1): canopy structure-related information (ε, FAPAR) and physiology-related information (ϕSIF). As ε and FAPAR are both structure-related factors, canopy bidirectional reflectance and its combinations (i.e., vegetation indices) can be used to estimate them41. Specifically, NIR reflectance is closely related to the SIF escape fraction ε, while the red band is associated with the absorption process55,56. Blue band reflectance was selected because of its tight relationship with chlorophyll and carotenoid absorption56. Green band reflectance may fluctuate strongly under high absorption conditions. Therefore, we used MODIS reflectance at the NIR, red, blue and green bands as SIF driving variables. Meanwhile, the physiological component of SIF, ϕSIF, depends on non-photochemical quenching (NPQ, i.e., heat dissipation) and the fraction of open PSII reaction centers (qL)54. Since both NPQ and qL are related to the illumination conditions under clear skies, the cosine of the solar zenith angle (cos(SZA)) was selected to characterize the illumination conditions2,54,57. In addition, the AT was included to provide auxiliary information on vegetation physiology.
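As an illustration of the illumination variable, cos(SZA) can be computed from latitude, day of year, and local solar time with standard solar geometry. The sketch below is not the authors' implementation: the declination formula is a common approximation, the function names are hypothetical, and averaging over all 24 hours (with night-time values set to zero) is an assumption about how a daily average might be formed.

```python
import math

def solar_declination(doy):
    """Approximate solar declination in radians for a day of year (1-365)."""
    return math.radians(-23.44) * math.cos(2.0 * math.pi * (doy + 10) / 365.0)

def cos_sza(lat_deg, doy, solar_hour):
    """Cosine of the solar zenith angle at a local solar time (hours, 12 = noon)."""
    lat = math.radians(lat_deg)
    dec = solar_declination(doy)
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))
    value = (math.sin(lat) * math.sin(dec)
             + math.cos(lat) * math.cos(dec) * math.cos(hour_angle))
    # Assumption: a sun below the horizon contributes no illumination.
    return max(value, 0.0)

def daily_mean_cos_sza(lat_deg, doy, steps_per_day=144):
    """Daily average of cos(SZA), sampled at the midpoints of equal time steps."""
    dt = 24.0 / steps_per_day
    total = sum(cos_sza(lat_deg, doy, (i + 0.5) * dt) for i in range(steps_per_day))
    return total / steps_per_day
```

At the equator near the equinox, the noon value is close to 1 and the 24-hour mean is close to 1/π, which is a quick sanity check on the geometry.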

To sum up, seven datasets in total were selected as explanatory variables: four BRDF-corrected reflectance datasets at the red, NIR, green and blue bands, NDVI, cos(SZA), and air temperature (denoted TA), as shown in Eq. (2). All samples for each year were divided into two datasets, 70% for training and 30% for validation, to provide a large training set while retaining enough samples for a reliable accuracy assessment58.

$$\text{SIF}\,=\,\text{RF}\left(\text{R}1,\text{R}2,\text{R}3,\text{R}4,\text{NDVI},\cos \left(\text{SZA}\right),\text{TA}\right)$$
(2)

where R1–R4 are the four NBAR values at the red, NIR, blue, and green bands derived from the MCD43C4 product, NDVI is calculated from the MCD43C4 red and NIR reflectance (NDVI = (NIR − Red)/(NIR + Red)), cos(SZA) is the cosine of the solar zenith angle, and TA is the air temperature.
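The preparation of the inputs to Eq. (2) can be sketched as follows: NDVI from the red and NIR bands, a seven-column predictor matrix, and a 70%/30% split. The function names, the random shuffling, and the fixed seed are illustrative assumptions, not details given in the text.

```python
import numpy as np

def ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red); pixels with a zero denominator become NaN."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

def build_features(red, nir, blue, green, cos_sza, ta):
    """Stack the seven predictors of Eq. (2) as columns: R1-R4, NDVI, cos(SZA), TA."""
    cols = [red, nir, blue, green, ndvi(red, nir), cos_sza, ta]
    return np.column_stack([np.asarray(c, dtype=float).ravel() for c in cols])

def split_70_30(features, target, seed=0):
    """Shuffle the samples (an assumption) and split them 70% / 30%."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(features.shape[0])
    n_train = int(round(0.7 * features.shape[0]))
    train, valid = order[:n_train], order[n_train:]
    return features[train], target[train], features[valid], target[valid]
```

The resulting training arrays could then be passed to any random forest regressor (for example, scikit-learn's `RandomForestRegressor`); which implementation and hyperparameters were actually used is not stated here.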

Redistribution of the original TROPOMI SIF

Explanatory variables with a 500 m spatial resolution were prepared for SIF prediction using the RF model established in Procedure A. The predicted 500 m SIF (PSIF500m) was subsequently used as a weight coefficient in the redistribution process, which assumes a robust linear relationship between OSIF and PSIF (Procedure C). Specifically, the downscaled 500 m TROPOMI SIF (TroDSIF) was redistributed from the original 5 km reprojected TROPOMI SIF (OSIF5km). The redistribution principle is illustrated in Fig. 2. Because some 500 m explanatory variables are missing, the PSIF500m obtained from the RF model may also contain gaps. Therefore, we first smoothed PSIF using a two-dimensional (2D) Gaussian function with a standard deviation (SD) of 25 km before it was used as a weight coefficient for OSIF redistribution. Each 500 m TroDSIF pixel within the central 5 km OSIF pixel was calculated from a 5 × 5 window of 5 km TROPOMI SIF (OSIF5km) pixels, a 50 × 50 window of 500 m predicted SIF (PSIF500m) pixels, and 2D Gaussian weights at the 5 km and 500 m scales (Weights5km and Weights500m). The formulas involved are as follows:

$${\rm{If}}\,{\rm{ratio}}=\frac{{\sum }_{{\rm{i}}=1}^{{{\rm{n}}}_{1}\le 5\times 5}{{\rm{OSIF}}}_{{\rm{5km}}}\times {{\rm{Weights}}}_{{\rm{5km}}}({{\rm{OSIF}}}_{{\rm{25km}}})}{{\sum }_{{\rm{j}}=1}^{{{\rm{n}}}_{2}\le 50\times 50}{{\rm{PSIF}}}_{{\rm{500m}}}\times {{\rm{Weights}}}_{{\rm{500m}}}({{\rm{PSIF}}}_{{\rm{25km}}})} > 0\,{\rm{(Condition\; 1)}}$$
(3)
$${{\rm{DSIF}}}_{{\rm{500m}}}={{\rm{PSIF}}}_{{\rm{500m}}}\times \frac{{\sum }_{{\rm{i}}=1}^{{{\rm{n}}}_{1}\le 5\times 5}({{\rm{OSIF}}}_{{\rm{5km}}}\times {{\rm{Weights}}}_{{\rm{5km}}})/{{\rm{n}}}_{1}}{{\sum }_{{\rm{j}}=1}^{{{\rm{n}}}_{2}\le 50\times 50}({{\rm{PSIF}}}_{{\rm{500m}}}\times {{\rm{Weights}}}_{{\rm{500m}}})/{{\rm{n}}}_{2}}$$
(4)

Else (Condition 2)

$${{\rm{DSIF}}}_{{\rm{500m}}}={\sum }_{{\rm{i}}=1}^{{{\rm{n}}}_{1}\le 5\times 5}({{\rm{OSIF}}}_{{\rm{5km}}}\times {{\rm{Weights}}}_{{\rm{5km}}})/{{\rm{n}}}_{1}-{\sum }_{{\rm{j}}=1}^{{{\rm{n}}}_{2}\le 50\times 50}({{\rm{PSIF}}}_{{\rm{500m}}}\times {{\rm{Weights}}}_{{\rm{500m}}})/{{\rm{n}}}_{2}+{{\rm{PSIF}}}_{{\rm{500m}}}$$
(5)

Fig. 2

The redistribution of the original 5 km TROPOMI SIF using a 500 m predicted SIF and Gaussian weights within the spatial range of 25 × 25 km2 (consisting of 5 × 5 pixels in the 5 km OSIF and 50 × 50 pixels in the 500 m PSIF).

The ratio of OSIF25km (computed from the 5 × 5 OSIF pixels) to PSIF25km (computed from the 50 × 50 PSIF pixels), given in Eq. (3), served as the condition for redistributing OSIF. When the ratio is greater than 0 (Condition 1), the TroDSIF is calculated using Eq. (4). However, because OSIF can take negative values, the ratio may be negative, which would invert the information from OSIF5km. Therefore, under this condition (Condition 2), we obtained the TroDSIF based on Eq. (5). In addition, PSIF may not exist everywhere, resulting in no PSIF25km in some areas; in this case, we used OSIF25km as a replacement. Likewise, if OSIF25km is absent, PSIF is used. When neither PSIF nor OSIF pixels are available, a 2D Gaussian function with an SD of 15 km was used for gap filling in a 3 × 3 coarse window22. To distinguish the different ways of obtaining TroDSIF, a label was assigned to each TroDSIF pixel (Table 1).
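The two redistribution conditions can be illustrated for a single 25 × 25 km window with the minimal sketch below. It is a reading of the formulas as written (weighted sums divided by the number of valid pixels), not the authors' code: the function names are hypothetical, NaN is assumed to mark missing pixels, and the fallbacks for missing PSIF or OSIF and the gap filling are not implemented.

```python
import numpy as np

def gaussian_weights(size, pixel_km, sd_km):
    """2D Gaussian weights centred on a size x size window of pixel_km pixels."""
    offsets = (np.arange(size) - (size - 1) / 2.0) * pixel_km
    dy, dx = np.meshgrid(offsets, offsets, indexing="ij")
    return np.exp(-(dx**2 + dy**2) / (2.0 * sd_km**2))

def weighted_mean(values, weights):
    """Sum of value x weight over valid (non-NaN) pixels, divided by their count n."""
    valid = ~np.isnan(values)
    n = int(valid.sum())
    if n == 0:
        return np.nan
    return float(np.sum(values[valid] * weights[valid]) / n)

def redistribute_window(osif_5km, psif_500m, sd_km=25.0):
    """Downscale the central 5 km pixel of a 5 x 5 OSIF window to 10 x 10 pixels.

    osif_5km: (5, 5) array; psif_500m: (50, 50) array over the same window.
    """
    osif_25km = weighted_mean(osif_5km, gaussian_weights(5, 5.0, sd_km))
    psif_25km = weighted_mean(psif_500m, gaussian_weights(50, 0.5, sd_km))
    centre = psif_500m[20:30, 20:30]  # 500 m pixels inside the central 5 km pixel
    # Dividing by the pixel counts does not change the sign of the ratio.
    if psif_25km != 0 and osif_25km / psif_25km > 0:
        return centre * (osif_25km / psif_25km)   # Condition 1: scale PSIF by the ratio
    return osif_25km - psif_25km + centre          # Condition 2: shift PSIF by the difference
```

A full implementation would slide this window over every 5 km pixel and record which branch produced each output pixel, in the spirit of the labels in Table 1.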

Table 1 Label of the TroDSIF dataset.

Data Records

Our improved spatially downscaled solar-induced chlorophyll fluorescence dataset (TroDSIF) is available at Zenodo59 (https://doi.org/10.5281/zenodo.10060550). The data record contains TroDSIF data covering the period from April 2018 to July 2021 at a 500 m, 16-day spatiotemporal resolution. The data are stored as h5-format files for each year. The unit is mW/m2/nm/sr. The file name SIF500m_corr_SIF_predict_<YYYYDDD>_<h**v**>.h5 provides the year, day of year, and location index of each file (e.g., SIF500m_corr_SIF_predict_2018091_h00v08).
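The naming convention can be parsed programmatically, as in the sketch below. It assumes that the location index consists of two-digit horizontal and vertical tile numbers, as in the example name; the function name and the returned keys are hypothetical.

```python
import re
from datetime import date, timedelta

# Assumption: tile indices are two digits each, as in "h00v08".
PATTERN = re.compile(
    r"SIF500m_corr_SIF_predict_(\d{4})(\d{3})_h(\d{2})v(\d{2})(?:\.h5)?$"
)

def parse_trodsif_name(name):
    """Extract year, day of year, tile indices, and start date from a file name."""
    match = PATTERN.search(name)
    if match is None:
        raise ValueError(f"unrecognised TroDSIF file name: {name}")
    year, doy, h, v = (int(group) for group in match.groups())
    start = date(year, 1, 1) + timedelta(days=doy - 1)
    return {"year": year, "doy": doy, "h": h, "v": v, "start_date": start}
```

For the example name above, the parsed day of year 091 in 2018 corresponds to 1 April 2018.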

Technical Validation

Performance of the RF model

The trained RF model was first evaluated on both the 70% training dataset and the 30% validation dataset. The coefficient of determination (R2), the regression slope, and the root mean square error (RMSE) were used as accuracy metrics. Figure 3 displays the performance of the RF model in 2019 for both datasets. The PSIF reproduced the OSIF well, with an R2 of 0.908 and an RMSE of 0.059 mW/m2/nm/sr for the training dataset, and an R2 of 0.893 and an RMSE of 0.064 mW/m2/nm/sr for the validation dataset. We also evaluated the importance of the explanatory variables used in the RF model (Fig. 4). The results show that cos(SZA) and NIR reflectance are the two most important variables, followed by the NDVI and green band reflectance.
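The accuracy metrics can be computed as in the following sketch. It assumes that R2 is the squared Pearson correlation and that the slope comes from an ordinary least-squares fit of predicted against observed values; neither definition is spelled out in the text. The mean absolute error (MAE), which is used later for the tower-based validation, is included for convenience, and the function name is hypothetical.

```python
import numpy as np

def regression_metrics(observed, predicted):
    """R2, slope, intercept, RMSE, and MAE between two arrays, ignoring NaN pairs."""
    obs = np.asarray(observed, dtype=float).ravel()
    pred = np.asarray(predicted, dtype=float).ravel()
    keep = ~(np.isnan(obs) | np.isnan(pred))
    obs, pred = obs[keep], pred[keep]
    slope, intercept = np.polyfit(obs, pred, 1)   # least-squares line pred = slope*obs + b
    r = np.corrcoef(obs, pred)[0, 1]              # Pearson correlation
    error = pred - obs
    return {
        "r2": float(r**2),
        "slope": float(slope),
        "intercept": float(intercept),
        "rmse": float(np.sqrt(np.mean(error**2))),
        "mae": float(np.mean(np.abs(error))),
    }
```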

Fig. 3

Density scatter plots of the PSIF obtained from the RF model against the OSIFdc for the (a) training and (b) validation datasets. The color bar denotes the density of the scatters on a logarithmic scale. Red lines represent the regression lines, and the black dotted lines represent the 1:1 line.

Fig. 4

Importance evaluation of different explanatory variables in the RF model. Blue, Green, NIR, and Red are reflectance data from the MODIS product. TA is the air temperature from the ERA5 dataset.

The spatiotemporal patterns of the TroDSIF dataset

As global land is mainly located in the Northern Hemisphere, vegetation generally thrives during the boreal summer. Therefore, day of year (DOY) 206 in 2019 was selected to display the global pattern of the downscaled 500 m TroDSIF (Fig. 5). Overall, TroDSIF successfully reproduced the spatial patterns of OSIF and is spatially continuous at the global scale. Moreover, higher values are concentrated in the Northern Hemisphere, as expected for this season. Enlarged maps of TroDSIF and OSIF over north Guinea-Sierra Leone are also shown to highlight the enhanced spatial detail after downscaling (Fig. 6): obvious patch effects are visible in OSIF, whereas TroDSIF is smoother.

Fig. 5

Global pattern of the downscaled 500 m TROPOMI SIF (TroDSIF) in 2019206. SIF unit is mW/m2/nm/sr.

Fig. 6

Enlarged maps of OSIF (left) and TroDSIF (right) over north Guinea-Sierra Leone.

Globally averaged SIF values of TroDSIF and OSIF are displayed in Fig. 7. The two SIF datasets follow similar patterns and show a clear seasonality over the data coverage period.

Fig. 7

Globally averaged SIF values for each 16-day revisit of TroDSIF and OSIF. The red line represents the TroDSIF, and the black line represents the OSIF.

Furthermore, SIF values of TroDSIF and OSIF were extracted across 10 major vegetation types (Fig. 8) based on MCD12Q1 land cover maps, including ENF, EBF, DNF, DBF, MF, SAV, GRA, CRO, OSH, and WSA. TroDSIF varies similarly to OSIF throughout each growing year in the selected vegetation types.

Fig. 8

The global average of SIF values (unit: mW/m2/nm/sr) for each 16-day period of TroDSIF and OSIF across 10 major vegetation types.

Comparison of TroDSIF with RTSIF and OSIF

The spatially downscaled 500 m TroDSIF dataset was re-aggregated to 0.05° and compared with the OSIF and RTSIF at a global scale. Scatterplots of the re-aggregated TroDSIF (with a 0.05° spatial resolution) and RTSIF against the original 0.05° TROPOMI SIF (OSIF) for two DOYs in 2019 (14 and 206) are shown in Fig. 9. TroDSIF is highly consistent with OSIF, with higher R2 values of 0.948 and 0.934 and lower RMSE values of 0.057 and 0.067 mW/m2/nm/sr in 2019014 and 2019206, and this consistency is independent of latitude. RTSIF is more weakly correlated with OSIF, with smaller R2 values of 0.886 and 0.857 and larger RMSE values of 0.086 and 0.109 mW/m2/nm/sr on the same DOYs. In addition, RTSIF remains nearly constant at near-zero values while OSIF varies over a wide range, and its agreement with OSIF seems to depend on latitude. The higher consistency between TroDSIF and OSIF at the global scale indicates better robustness than RTSIF, which supports the soundness and accuracy of the spatial downscaling method: it uses the RF-predicted SIF values only as coefficients to reallocate OSIF spatially. In contrast, ML-based SIF products use the model-predicted values as the final SIF retrievals, which leads to larger discrepancies between RTSIF and OSIF at the global scale.
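The re-aggregation step can be approximated by block averaging, as sketched below. This assumes that each coarse pixel corresponds exactly to a 10 × 10 block of 500 m pixels on a common grid and that NaN marks missing values; the actual regridding from the sinusoidal 500 m grid to 0.05° may differ, and the function name is hypothetical.

```python
import numpy as np

def block_mean(fine, factor=10):
    """Average non-overlapping factor x factor blocks, ignoring NaN pixels.

    Blocks without any valid pixel are returned as NaN.
    """
    fine = np.asarray(fine, dtype=float)
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    valid = ~np.isnan(blocks)
    count = valid.sum(axis=(1, 3))
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    out = np.full(count.shape, np.nan)
    np.divide(total, count, out=out, where=count > 0)
    return out
```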

Fig. 9

Comparison between the TroDSIF (re-aggregated to 0.05°), RTSIF, and the OSIF on DOY of 14 (a,c) and 206 (b,d) with a 0.05° spatial resolution over a 16-day temporal scale. The color bar denotes the density of the scatters on a logarithmic scale.

We also conducted biome-level comparisons of both TroDSIF and RTSIF with the OSIF on the same DOYs (14 and 206), using the biome definitions in the MCD12Q1 product (Fig. 10). Overall, TroDSIF showed a stronger relationship with OSIF in most selected biomes, including the ENF, EBF, MF, SAV, GRA, CRO, CSH, OSH, and WSA biomes, with the highest R2 and the lowest RMSE values. RTSIF has lower R2 values and higher RMSE values against OSIF than TroDSIF does.

Fig. 10

R2 and RMSE values between the re-aggregated 0.05° TroDSIF, RTSIF, and the OSIF on DOY of 14 and 206 across different vegetation types defined in the MCD12Q1 dataset.

In addition, to further assess the consistency of the downscaled dataset (TroDSIF) and the ML-based dataset (RTSIF) with OSIF, we calculated their residuals, defined as the difference between TroDSIF or RTSIF and OSIF. Global residuals of TroDSIF and RTSIF relative to OSIF in 2019 are shown in Fig. 11. Overall, the residuals of TroDSIF are nearly zero at all latitudes. In contrast, the ML-based RTSIF product has larger and spatially inhomogeneous residuals within the range of −0.2 to 0.2 mW/m2/nm/sr. The superior performance of TroDSIF is attributed to the principle of the spatial downscaling method: it uses the ML-based SIF values as weight coefficients to redistribute OSIF for each pixel, which removes the latitude-dependent discrepancies seen in RTSIF.
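A latitudinal summary of such residuals could be produced as in the sketch below. The 10° band width, the function name, and the use of a simple unweighted mean per band are illustrative assumptions; the averaging used for the figure is not described in the text.

```python
import numpy as np

def residuals_by_latitude(product, osif, lats, band_deg=10.0):
    """Mean residual (product - OSIF) per latitude band.

    product, osif: 2D grids with one row per latitude in `lats`.
    Returns (band_lower_edges, mean_residuals); bands without data are NaN.
    """
    residual = np.asarray(product, dtype=float) - np.asarray(osif, dtype=float)
    lats = np.asarray(lats, dtype=float)
    edges = np.arange(-90.0, 90.0, band_deg)
    means = np.full(edges.size, np.nan)
    for k, lower in enumerate(edges):
        rows = (lats >= lower) & (lats < lower + band_deg)
        values = residual[rows]
        values = values[~np.isnan(values)]
        if values.size:
            means[k] = values.mean()
    return edges, means
```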

Fig. 11

Residuals of the re-aggregated TroDSIF and RTSIF with OSIF at different latitudes in 2019 over a 16-day temporal scale. The right boxes represent the corresponding averaged residuals at different latitudes.

Validation of TroDSIF with tower-based SIF

Tower-based SIF measurements provided by six ChinaSpec sites were used to validate the TroDSIF dataset. Half-hourly SIF estimates were first aggregated to a 16-day scale to match the TroDSIF pixels. The direct validation of TroDSIF and OSIF against these tower-based SIF measurements is shown in Fig. 12. Overall, TroDSIF is more consistent with the tower-based SIF than OSIF is. The RMSE values varied from 0.104 to 0.223, and the mean absolute error (MAE) values ranged from 0.077 to 0.163. Specifically, TroDSIF has the closest relationship with tower-based SIF at XTS, with lower RMSE and MAE values than at the other sites. The improved spatial resolution of TroDSIF relative to the km-scale OSIF yields better agreement with tower-based SIF measurements by reducing the spatial heterogeneity within each remote sensing SIF pixel.

Fig. 12

Scatterplots of the matched tower-based SIF with the TroDSIF and OSIF over six ChinaSpec sites.

Improved performance of TroDSIF with tower-based GPP

To further test the performance of TroDSIF in improving the estimation of GPP, tower-based GPP estimates from the AmeriFlux network at 67 sites were used as references, and OSIF was selected for comparison. The results show a good relationship between the original SIF and the tower-based GPP, with an overall R2 value of 0.483 across the selected biomes (Fig. 13). The downscaled SIF performs better than the original TROPOMI retrievals, with an R2 value of 0.542 among all selected biomes (Fig. 13(k)), mainly because the spatial coverage of the finer-resolution pixels is closer to the footprint of the tower-based measurements. For individual biomes, TroDSIF improves the SIF-GPP relationships, with R2 values ranging from 0.193 to 0.894, whereas OSIF has lower R2 values with eddy covariance (EC) GPP, ranging from 0.125 to 0.866. The improved relationship of TroDSIF with tower-based GPP indicates an enhanced ability of SIF to capture vegetation GPP; TroDSIF can therefore better serve as a proxy for vegetation photosynthesis dynamics than a coarser SIF product.

Fig. 13

Scatterplots of the 500 m downscaled SIF (TroDSIF) and the original 0.05° TROPOMI SIF (OSIF) with tower-based GPP over a 16-day temporal scale across 10 biomes over the selected AmeriFlux sites (** denotes the significance level of the correlations, with p values < 0.001).