Abstract
Excessive use of synthetic nitrogen (N) in Chinese wheat production results in high losses of reactive N (Nr; all forms of N except N2) to the environment, causing serious environmental problems. Quantifying Nr loss and its spatial variation is vital for optimizing N management and mitigating loss. However, accurate, high-spatial-resolution estimates of Nr loss from wheat production are lacking because of limitations in data generation and estimation methods. Here, we applied the random forest (RF) algorithm to bottom-up N application rate data, obtained through a survey of millions of farmers, to estimate the Nr loss from wheat production in 2014. The results showed that the average total Nr loss was 52.5 kg N ha−1 (range: 4.6–157.8 kg N ha−1), which accounts for 26.1% of the total N applied. The hotspots of high Nr loss coincide with those of high N application, including northwestern Xinjiang, central-southern Hebei, Shandong, central-northern Jiangsu, and Hubei. Our database could guide regional N management and be used in conjunction with biogeochemical models.
Measurement(s) | reactive N loss |
Technology Type(s) | random forest model |
Sample Characteristic - Environment | cropland |
Sample Characteristic - Location | China |
Background & Summary
China is the largest producer and consumer of synthetic nitrogen (N) fertilizer in the world, and applied more than 28 Tg N of fertilizer to cropland in 20181. Furthermore, China applied 256 kg N ha−1 of fertilizer in 2016, which is 3.3 times the global average2, while China’s nitrogen use efficiency (NUE) is only 0.25, compared to 0.68 in North America and 0.42 worldwide3. A high N input with a low NUE indicates that a considerable amount of N has been lost to the environment, mainly in the form of reactive N (Nr; all forms of N except N2), including nitric oxide (NO), nitrous oxide (N2O), and ammonia (NH3) emissions, nitrate (NO3−) leaching, and Nr runoff4. This can cause substantial environmental problems, such as soil acidification5, air pollution6, and eutrophication7. The Chinese government has implemented several policies to reduce the environmental risks associated with Nr loss from cropland, such as the “zero increase action plan for fertilizer use” and the “action plan for organic fertilizer instead of synthetic fertilizer”. These measures are important to optimize N management, improve the NUE, and mitigate Nr loss in China. Understanding Chinese Nr loss at high spatial resolution is essential to address the variation in N management among crop systems and locations.
Previous studies that aimed to estimate Chinese Nr loss were partially successful8,9; however, they had four limitations that could be addressed. The first limitation concerned the method used to obtain information on N fertilizer inputs. Fertilizer is distributed to specific locations and crops by regional regulatory bodies based on the total fertilizer input of the entire country or an individual region10. Previous studies estimated Nr loss using information on N fertilizer inputs obtained from these regional regulatory bodies (top-down information). Although this method can provide rough spatial information on applied N and Nr loss, the application of N is highly location- and farmer-specific. Consequently, to improve spatial information on Nr loss, an N application rate survey should be used to obtain information from numerous farmers and locations (bottom-up information). The second limitation of previous studies was their focus on NO3− leaching and on N2O and NH3 emissions, without consideration of other Nr loss pathways; this led to underestimation of the potential risks of Nr loss11. For example, they did not consider NO, one of the most important potential precursors of nitric acid, which leads to acidification and eutrophication11. The third limitation of previous studies was that they adopted uniform emission factors (EFs), such as IPCC Tier 1, to estimate the Nr loss of entire countries or regions, rather than considering spatial variation within a country or region12,13. Nr loss is location-specific and strongly influenced by local environmental factors. Recent advances have improved the spatial estimation of Nr loss by incorporating more environmental factors. For example, Shang et al. estimated national cropland N2O emissions using a spatially referenced nonlinear model, with spatially variable model parameters depending on environmental factors and crop types14. Ying et al. applied the random forest (RF) algorithm to estimate the NO3− leaching associated with Chinese maize production according to climate and soil variables15. These studies indicated that incorporating spatial variation could reduce uncertainties in Nr loss estimations and facilitate management and mitigation decisions. The fourth limitation of previous studies was that they lacked high-resolution Nr emission inventories for specific crops. Such inventories are indispensable for optimal N management.
Wheat is one of the major crops in China, playing a vital role in food security. The regions used for wheat production range from humid regions in the southeast to arid regions in the northwest, and from warm regions in the south to cool regions in the northeast. China accounts for around 20% of the global synthetic N fertilizer consumption for wheat16. Given the substantial spatial variation and excessive N consumption associated with wheat production in China, it represents an excellent target for Nr loss estimation methods that aim to overcome the above-mentioned limitations of previous techniques. Our study provides a comprehensive and high-resolution Nr database based on applied synthetic N. First, we developed RF models to predict the EFs of five loss pathways (NO, N2O, NH3, NO3−, and Nr runoff) based on a literature review. Second, we used N application rates derived from surveys of 2.23 million farmers to calculate Nr loss. The results are presented on a 1 × 1 km grid, using high-resolution data on the distribution of wheat production in China17. Our results could help farmers optimize N application within safe boundaries, support the development of mitigation measures against Nr loss in specific locations, and help evaluate the environmental effects of Nr loss from Chinese wheat production.
Methods
Literature review
We conducted a comprehensive review of relevant literature published since 1995. Studies were extracted from the China National Knowledge Infrastructure and Web of Science using the following keywords: “N (nitrogen) loss OR NO (nitric oxide) emission OR N2O (nitrous oxide) emission OR NH3 (ammonia volatilization) emission OR NO3− (nitric leaching) OR N (nitrogen) runoff AND wheat AND China”. We excluded the following types of experiment: experiments not covering the entire wheat growing season, experiments conducted in greenhouses or laboratories, experiments without a zero-N control, and experiments including manure, controlled-release fertilizer, or inhibitors. In total, we extracted 941 observations from 138 articles, consisting of 121 observations of NO emission, 383 of N2O emission, 185 of NH3 emission, 188 of NO3− leaching, and 64 of Nr runoff. We also extracted data on N application rates, and climate and soil variables (Fig. 1). Missing climate data were obtained from the China Meteorological Data Network (https://data.cma.cn/), missing values of soil organic carbon (SOC) and total N content were obtained from the National Scientific Fertilizer Network (http://kxsf.soilbd.com/), and missing soil silt, clay, sand content, bulk density, cation exchange capacity (CEC), and pH data were obtained from the Harmonized World Soil Database (HWSD) v. 1.2 (http://www.fao.org/soils-portal/soil-survey/soilmaps-and-databases/harmonized-world-soildatabase-v12/en). Based on this dataset, the EFs of the Nr loss pathways were calculated by the following equation:
$$\mathrm{EF}_i = \frac{E_{\mathrm{treatment}} - E_{\mathrm{control}}}{N_{\mathrm{applied}}} \tag{1}$$
where i = 1–5 represents NO, N2O, NH3, NO3− leaching, and Nr runoff, respectively. Etreatment is the loss rate of experimental treatments with applied N fertilizer, Econtrol is the loss rate of the experimental control without applied N fertilizer, and N applied is the N application rate corresponding to Etreatment. The resulting data were used to develop RF models to predict the EFs of the five Nr loss pathways.
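As a minimal sketch of the EF definition above (treatment loss minus zero-N control loss, divided by the N application rate), the calculation for a single observation might look as follows in Python. The function name and the numbers in the example are hypothetical illustrations, not values from the literature dataset, and the EF is expressed here as a fraction rather than a percentage.

```python
def emission_factor(e_treatment, e_control, n_applied):
    """Return the EF as a fraction of applied N.

    All three arguments are in kg N ha-1: the loss from the fertilized
    treatment, the loss from the zero-N control, and the N application rate.
    """
    if n_applied <= 0:
        raise ValueError("n_applied must be positive; zero-N plots are controls")
    return (e_treatment - e_control) / n_applied


# Hypothetical observation (illustrative numbers only): 3.1 kg N ha-1 lost
# from a plot receiving 200 kg N ha-1, and 1.1 kg N ha-1 lost from the control.
ef_example = emission_factor(3.1, 1.1, 200.0)
print(f"EF = {ef_example:.4f}")
```

Subtracting the control loss removes the background loss that would occur without fertilizer, so the EF reflects only the fertilizer-induced share.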
RF models
RF models outperformed empirical models in previous studies15,18,19. We therefore employed RF models to predict the EFs of NO, N2O, NH3, NO3− leaching, and Nr runoff. Environmental factors were selected via redundancy analysis20. Redundancy analysis, a basic ordination technique for gradient analysis, produces an ordination summarizing the variation in several response variables that can be best explained by a matrix of explanatory variables based on multiple linear regression. We conducted redundancy analysis using Canoco 5 to analyze the effects of 10 environmental factors on the different EFs: 4 soil physical factors (bulk density, and silt, clay, and sand content), 4 soil chemical factors (pH, SOC, CEC, and total N content), and 2 weather factors (total rainfall and mean temperature during the wheat growing period). Ultimately, the dataset of each pathway contained a different set of environmental factors (Table 1).
When establishing the RF model, the first step was to select k features from a total of m (k < m) in the training dataset to generate root node d and daughter nodes; the second step was to repeat the first step to generate a forest with n decision trees. Lastly, the predictions of the n trees were aggregated into the final prediction, which was evaluated on the testing dataset21. We randomly split the dataset, consisting of paired environmental factors and EFs of each Nr loss pathway, into 10 parts of equal size. Of these parts, 7/10 were used to train the RF models for the different pathways and 3/10 were used to test the performance of the models. We used the “randomForest” R package (https://www.stat.berkeley.edu/~breiman/RandomForests/) to develop the RF models in R software (https://cran.r-project.org/). To reduce random error, we ran each model 500 times and determined the performance based on the average value (Fig. 2).
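The evaluation protocol described above (a random 7:3 split into training and testing data, repeated many times, with performance averaged over the runs) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code: the model is passed in as a `fit` callable, and a one-variable least-squares line stands in for the random forest so that the block needs only the standard library. The data are synthetic and all names are hypothetical.

```python
import random
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination of predictions against observations."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def repeated_holdout(rows, targets, fit, n_runs=500, train_fraction=0.7, seed=0):
    """Average test-set R2 over n_runs random train/test splits.

    fit(train_rows, train_targets) must return a predict(row) callable.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    n_train = round(train_fraction * len(rows))
    scores = []
    for _ in range(n_runs):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        predict = fit([rows[i] for i in train], [targets[i] for i in train])
        predicted = [predict(rows[i]) for i in test]
        scores.append(r_squared([targets[i] for i in test], predicted))
    return statistics.fmean(scores)


def fit_line(train_rows, train_targets):
    """Placeholder model (NOT a random forest): least squares on feature 0."""
    xs = [row[0] for row in train_rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(train_targets)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, train_targets))
    slope /= sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return lambda row: intercept + slope * row[0]


# Synthetic data for illustration only: one predictor and a noisy linear response.
data_rng = random.Random(1)
rows = [[data_rng.uniform(0.0, 10.0)] for _ in range(100)]
targets = [2.0 * row[0] + data_rng.gauss(0.0, 0.5) for row in rows]
mean_r2 = repeated_holdout(rows, targets, fit_line, n_runs=50)
print(f"mean test R2 over 50 runs: {mean_r2:.3f}")
```

Passing the model as a callable keeps the splitting and averaging logic independent of the learner, so a random forest implementation could be substituted for `fit_line` without changing `repeated_holdout`.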
Grid database
We categorized Chinese wheat production into four agroecological regions based on climate and soil variables: North China, North China Plain, South China, and Southwest China (Fig. S1)22. The grid layer of wheat distribution was derived from ChinaCropArea1 km (https://doi.org/10.17632/jbs44b2hrk.2), which provides a 1-km-grid crop-harvest dataset for wheat across China17. We selected the grid layer from 2014 and integrated nationwide climate and soil data, and N application rates derived via surveys of farmers, into the grid layer (Fig. 1). We obtained climate and soil data from the same sources used for missing data. Climate data are in the form of 10-year averages23. The climate and soil data were extracted into each grid and used as input variables for the RF models.
Predicting EFs and calculating Nr loss
The EF of each pathway was predicted in each grid by the corresponding RF model (Fig. 3). Nr loss was calculated by multiplying the predicted EFs by N applied’ using the following equation:
$$\mathrm{Nr\ loss}_{i,j} = \mathrm{EF}_{i,j} \times N'_{\mathrm{applied},j} \tag{2}$$
where i = 1–5 represents NO, N2O, NH3, NO3− leaching, and Nr runoff, respectively, and j = 1, 2, 3, … represents different grids. N applied’ was obtained through a nationwide survey of farmers in 2014. For the survey, 3–10 villages were chosen from each county, and 30–120 randomly selected farmers were surveyed. In total, 2.23 million farmers from 1,050 counties were surveyed22. From the surveyed N application rates, the average rate was determined for each county, interpolated by Kriging, and plotted on a map of China. Finally, average rates were extracted into the grid layer of Chinese wheat production (Fig. 4a). Total Nr loss (Fig. 4b) was summed from the five Nr loss pathways as in Eq. (3) (Fig. 5):
$$\mathrm{Total\ Nr\ loss}_{j} = \sum_{i=1}^{5} \mathrm{Nr\ loss}_{i,j} \tag{3}$$
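The grid-level calculation described above (the predicted EF of each pathway multiplied by the grid's N application rate, then summed over the five pathways) can be sketched in Python as follows. The pathway keys, function names, and the EF and N rate values are hypothetical placeholders chosen for illustration; they are not taken from the database.

```python
PATHWAYS = ("NO", "N2O", "NH3", "NO3_leaching", "Nr_runoff")


def nr_loss_by_pathway(efs, n_applied):
    """Loss of each pathway in one grid cell (kg N ha-1): EF x N applied."""
    return {pathway: efs[pathway] * n_applied for pathway in PATHWAYS}


def total_nr_loss(efs, n_applied):
    """Total Nr loss in one grid cell: sum of the five pathway losses."""
    return sum(nr_loss_by_pathway(efs, n_applied).values())


# Hypothetical grid cell: EFs as fractions of applied N, N rate in kg N ha-1.
cell_efs = {
    "NO": 0.003,
    "N2O": 0.005,
    "NH3": 0.10,
    "NO3_leaching": 0.12,
    "Nr_runoff": 0.02,
}
cell_n_applied = 200.0

cell_losses = nr_loss_by_pathway(cell_efs, cell_n_applied)
cell_total = total_nr_loss(cell_efs, cell_n_applied)
print(cell_losses)
print(f"total Nr loss: {cell_total:.1f} kg N ha-1")
```

Applying the two functions to every grid in the wheat layer would give the per-pathway and total loss maps; the loop over grids is omitted here because the storage format of the grid data is not specified in the text.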
Database structure
The Nr-wheat 1.0 database of Nr loss associated with Chinese wheat production consists of three files (Fig. 1). The ‘data file’ provides the N application rates, EFs, and Nr loss of the five loss pathways (NO, N2O, NH3, NO3−, and Nr runoff). The ‘source file’ contains the studies from which data were extracted to develop the RF models, the code of the RF models, and the subregions of Chinese wheat production. The ‘readme file’ explains the abbreviations used in the ‘data file’ and ‘source file’, and provides the units of all included variables (Fig. 1).
Data Records
Data records are provided in three files: the ‘source file’, the ‘readme file’, and the ‘data file’. The ‘source file’ can be found in the Supplementary Information and contains all references used in the database, including 138 relevant papers, the code for the RF models, and the four subregions of Chinese wheat cultivation. We divided the relevant papers into 5 subsets based on loss pathways. The ‘readme file’ explains the abbreviations and units. The synthetic N application rates surveyed from farmers, the estimated EFs, and Nr loss were integrated into a map and are provided in the ‘data file’. The map includes 229,366 grids of 1 × 1 km, which cover around 94% of the wheat crop area according to official statistics, of which approximately 70% are located in the North China Plain. For each pathway, average rates and ranges of EFs and Nr loss are summarized (Table 2). The data (‘readme file’ and ‘data file’) can be accessed from the National Tibetan Plateau Data Center and processed in ArcGIS, QGIS, R, or Python24.
Technical Validation
Our method and results can be validated in terms of (1) the data sources, including data extracted from the literature, nationwide climate and soil data, and N application rates derived through surveys of farmers; (2) the RF models; and (3) the estimated EFs and Nr loss. Regarding (1), all studies from which data were extracted were obtained from authoritative databases, namely the China National Knowledge Infrastructure and Web of Science. Each peer-reviewed study was checked by three researchers during the selection process. Nationwide climate and soil data were obtained from Chinese governmental observations and HWSD v1.2, which is widely accepted and used. The N application rates were obtained through surveys of millions of farmers across the entire country; the survey was supported by the Chinese government and many universities, and numerous professional teachers and students from universities were also involved. The data underwent multiple rounds of screening and extensive quality control, and have been published in high-quality international journals22,25. Regarding (2), we established RF models for each pathway to predict EFs. All models showed robust performance, with R2 values ranging from 0.66 to 0.80 and low root mean square errors (RMSE) for both training and testing sets (Fig. 2). Regarding (3), the Monte Carlo method was used to estimate the uncertainties of each pathway and of total Nr loss; the uncertainties stemmed primarily from the predicted EFs and the grid-level N application rates. A Monte Carlo simulation was performed to estimate the uncertainty of grid-level N application rates by randomly varying the county-level N application rates, following Zhou et al.10, and the results showed that the average coefficient of variation (CV) of grid-level N application rates was 25.8%. The RF models explained more than 60% of the variance in the EFs of Nr loss, and the CVs of Nr loss ranged from 20% to 34% (Table S1). Assuming normal distributions for grid-level N application rates and EFs, the uncertainties of the individual pathways and of total Nr loss were low (Table S2) compared to previous studies9,26. Overall, the Nr-Wheat 1.0 database constitutes a robust Nr loss inventory of Chinese wheat production.
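The Monte Carlo idea described above can be illustrated with a short Python sketch: the N application rate and the EF of one grid cell are drawn from normal distributions, multiplied, and the CV of the resulting losses is reported. This is a simplified illustration, not the authors' procedure: independence of the two inputs, truncation of negative draws at zero, the function name, and the EF mean and CV are all assumptions. Only the 25.8% CV of the N application rate comes from the text.

```python
import random
import statistics


def monte_carlo_cv(n_mean, n_cv, ef_mean, ef_cv, n_draws=10000, seed=0):
    """CV (%) of Nr loss = EF x N applied for one grid cell.

    n_cv and ef_cv are fractional CVs (e.g. 0.258 for 25.8%). The two inputs
    are sampled independently, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    losses = []
    for _ in range(n_draws):
        # Negative draws are physically meaningless, so truncate at zero.
        n_rate = max(rng.gauss(n_mean, n_cv * n_mean), 0.0)
        ef = max(rng.gauss(ef_mean, ef_cv * ef_mean), 0.0)
        losses.append(n_rate * ef)
    return 100.0 * statistics.stdev(losses) / statistics.fmean(losses)


# N rate CV of 25.8% is the average reported in the text; the mean N rate,
# the EF mean, and the EF CV below are hypothetical.
loss_cv = monte_carlo_cv(n_mean=200.0, n_cv=0.258, ef_mean=0.01, ef_cv=0.15)
print(f"CV of Nr loss: {loss_cv:.1f}%")
```

Fixing the seed makes the sketch reproducible; in practice the number of draws would be increased until the estimated CV stabilizes.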
Code availability
All the code used to develop the RF models is available in the ‘source file’.
References
Food and Agriculture Organization of the United Nations. FAOSTAT http://www.fao.org/faostat/en/#data/RFN (2021).
Liu, Y. et al. Space-time statistical analysis and modelling of nitrogen use efficiency indicators at provincial scale in China. Eur. J. Agron. 115, 126032 (2020).
Zhang, X. et al. Managing nitrogen for sustainable development. Nature 528, 51–59 (2015).
Gu, B. J. et al. Nitrogen Footprint in China: Food, Energy, and Nonfood Goods. Environ. Sci. Technol. 47, 9217–9224 (2013).
Guo, J. H. et al. Significant acidification in major Chinese croplands. Science 327, 1008–1010 (2010).
Zhai, S. et al. Control of particulate nitrate air pollution in China. Nat. Geosci. (2021).
Yu, C. et al. Managing nitrogen to restore water quality in China. Nature 567, 516–520 (2019).
Gu, B., Ju, X., Chang, J., Ge, Y. & Vitousek, P. M. Integrated reactive nitrogen budgets and future trends in China. Proc. Natl. Acad. Sci. USA 112, 8792–8797 (2015).
Yue, Q. et al. Deriving emission factors and estimating direct nitrous oxide emissions for crop cultivation in China. Environ. Sci. Technol. 53, 10246–10257 (2019).
Zhou, F. et al. Re-estimating NH3 Emissions from Chinese Cropland by a New Nonlinear Model. Environ. Sci. Technol. 48, 8538–8547 (2015).
Liu, S. et al. A meta-analysis of fertilizer-induced soil NO and combined NO+N2O emissions. Global Change Biol. 23, 2520–2532 (2017).
Stocker, T. F. et al. Climate change 2013: The physical science basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. 18, 95–123, http://www.ipcc.ch/publications_and_data/publications_ipcc_fourth_assessment_report_wg1_report_the_physical_science_basis.htm (2014).
He, W. et al. Estimating soil nitrogen balance at regional scale in China’s croplands from 1984 to 2014. Agr. Syst. 167, 125–135 (2018).
Shang, Z. et al. Weakened growth of cropland‐N2O emissions in China associated with nationwide policy interventions. Global Change Biol. 25, 3706–3719 (2019).
Ying, H. et al. Safeguarding food supply and groundwater safety for maize production in China. Environ. Sci. Technol. 54, 9939–9948 (2020).
International Fertilizer Association. IFASTAT https://www.ifastat.org/plant-nutrition (2017).
Luo, Y. et al. Identifying the spatiotemporal changes of annual harvesting areas for three staple crops in China by integrating multi-data sources. Environ. Res. Lett. 15, 74003 (2020).
Saha, D., Basso, B. & Robertson, G. P. Machine learning improves predictions of agricultural nitrous oxide (N2O) emissions from intensively managed cropping systems. Environ. Res. Lett. 16, 24004 (2021).
Hamrani, A., Akbarzadeh, A. & Madramootoo, C. A. Machine learning for predicting greenhouse gas emissions from agricultural soils. Sci. Total Environ. 741, 140338 (2020).
Šmilauer, P. & Lepš, J. Multivariate Analysis of Ecological Data Using CANOCO 5. (Cambridge University Press, 2014).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Cui, Z. et al. Pursuing sustainable productivity with millions of smallholder farmers. Nature 555, 363–366 (2018).
Yin, Y. et al. Calculating socially optimal nitrogen (N) fertilization rates for sustainable N management in China. Sci. Total Environ. 688, 1162–1171 (2019).
Tian, X. et al. Bottom-up estimates of reactive nitrogen loss from Chinese wheat production in 2014. National Tibetan Plateau Data Center, https://doi.org/10.11888/HumanNat.tpdc.272007 (2022).
Zhang, Q. et al. Outlook of China’s agriculture transforming from smallholder operation to sustainable production. Global Food Security 26, 100444 (2020).
Wu, S. et al. High-resolution ammonia emissions inventories in Fujian, China, 2009–2015. Atmos. Environ. 162, 100–114 (2017).
Acknowledgements
We thank the Taishan Scholarship Project of Shandong Province (No. TS201712082) for their financial support.
Author information
Contributions
Zhenling Cui, Xingshuai Tian and Minghao Zhuang designed the database; Xingshuai Tian, Yulong Yin, Jiahui Cong, Yiyan Chu, Ke He and Qingsong Zhang compiled the data; Xingshuai Tian, Yulong Yin, Minghao Zhuang and Zhenling Cui wrote and revised the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Tian, X., Yin, Y., Zhuang, M. et al. Bottom-up estimates of reactive nitrogen loss from Chinese wheat production in 2014. Sci Data 9, 233 (2022). https://doi.org/10.1038/s41597-022-01315-4
DOI: https://doi.org/10.1038/s41597-022-01315-4