Introduction

According to the World Cities Report 2022, released by UN-Habitat, urbanization remains a powerful mega-trend in the 21st century. The urban share of the world's population is predicted to rise to 68% by 2050 [1]. China is a major agricultural country with a large population, and its urbanization has faced many challenges. China's urbanization rate exceeded 50% in 2011 and reached 64.72% in 2021, with an average annual growth rate of 1.29% [2]. The expansion of cities has occupied large areas of rural land. In China, a small and mid-sized city (SMC) is defined as a city with a population of one million or less, mainly comprising counties and towns. As the key to current urban-rural integration, SMCs are expanding faster than metropolises [3]. Especially in central and western China, their development tends to be constrained by local environmental resources, population and socioeconomic level, and problems such as population loss, environmental contamination and industrial structural contradictions are prominent [4–6]. The urban-rural fringe is the transitional region at the periphery of the urban construction area. It departs from the spatial structure of the urban-rural dichotomy and presents the landscape and functional characteristics of both urban and rural areas. It is easily influenced by urbanization, yet its urbanization quality does not match that of the urban area [7, 8]. The urban-rural fringe of an SMC is currently regarded as the frontier of urban expansion and the most active area of land use and land cover change, where serious human-land conflict has arisen [9].

Soil is the sink for the vast majority of pollutants derived from human activities [10]. As major inorganic pollutants, heavy metals (HMs) are non-biodegradable and can accumulate in soil. They pose a significant threat to plant growth and human health when their concentrations exceed certain levels [11]. A national report pointed out that soil contamination of varying degrees existed in industrial and mining land, as well as in part of the farmland in China. Inorganic contaminants caused 82.8% of soil contamination, and the exceedance rates of Cd, Ni, As, Cu, Hg, Pb, Cr and Zn were 7.0%, 4.8%, 2.7%, 2.1%, 1.6%, 1.5%, 1.1% and 0.9%, respectively [12]. HMs contamination in soils has aroused wide public concern. Soil surveys at different scales, sponsored by governments and research institutions, have therefore been launched to ensure the safety of agricultural production and human health. A survey of HMs in soils normally includes soil sampling, testing and pollution evaluation. In addition, source apportionment has become necessary to understand the origin of HMs pollution and the amount each source contributes to the ambient background.

The distribution of HMs contamination in soil is spatially heterogeneous. Soils from industrial estates, major road areas, and derelict and unused land are more susceptible to HMs, with exceedance rates over 20% [12]. Previous studies have revealed that industrial production, agricultural activities and high natural background values are the main causes of soil HMs contamination [13–15]. Massive traffic emissions can also induce rapid HMs accumulation in soil and should not be ignored [16, 17]. Geographic information system (GIS) and remote sensing (RS) technologies have been increasingly used to explore the spatial heterogeneity of HMs pollution in soil, which supports the screening of high-risk regions and the layout of sampling points. The distribution and number of sampling points directly affect the accuracy of the evaluation. Random sampling [18, 19] and grid sampling [20, 21] are commonly used to investigate HMs contamination in soil. In addition, administrative divisions and land use classifications are used to lay out soil sampling points and to classify HMs test results spatially. GIS-based spatial interpolation can analyze and predict the overall pollution situation of a research area with limited data from a small number of sampling points [22]. However, it can over-smooth the data and lead to inaccurate assessment. Sampling error can be 20 times larger than analytical error in some cases [23]. To improve the accuracy of assessment and to reveal the driving forces that shape the spatial heterogeneity of HMs pollution, multiple factors including spatial structure [24], land use [25] and human activity intensity [26, 27] should be taken into consideration. Densifying sampling points is regarded as an efficient way to improve the accuracy of pollution identification [28, 29].

HMs pollution assessment characterizes adverse ecological and health risks. Current methods commonly use specific criterion values to determine the degree of pollution, and the background value is regarded as a significant threshold [30–33]. Soil background was originally used in geochemical analysis to describe a natural, normal state [34]. Because soil is disturbed by anthropogenic processes, derived concepts (e.g. anthropogenic background and area background) have been proposed as substitutes for the pristine background [35]. In this research, soil background represents the sum of natural geogenic concentrations and relatively low concentrations from non-point anthropogenic sources [36]. In many studies, the soil background value of HMs is a statistic taken from national or provincial soil surveys. It normally reflects the average level of a large region and cannot represent spatiotemporal heterogeneity at the county level or below [37, 38]. Consequently, measuring the soil background of an SMC can enhance the accuracy and reliability of HMs pollution assessment.

In recent years, the introduction of GIS, RS and portable X-ray fluorescence (pXRF) technologies has greatly reduced manpower and time costs compared with traditional soil sampling followed by laboratory digestion and analysis by inductively coupled plasma-mass spectrometry (ICP-MS) and atomic fluorescence spectrometry (AFS). These techniques have also been proven to be rapid, non-destructive and reliable [39–41]. On this basis, this research explored an innovative GIS-pXRF-based approach to identify ambient HMs concentrations in soil, to screen out high-risk regions and to provide a spatial interpretation of pollution sources. The research focused on Shangluo, a typical small city in western China, and aimed at (1) classifying urban, rural and urban-rural fringe areas with multisource data using GIS, (2) measuring soil background values with pXRF, (3) identifying HMs contamination in soil and assessing the ecological and health risks in regions with high soil background, and (4) apportioning sources in light of county-level urbanization and industrialization.

Materials and methods

Study area

Shangluo city is situated in the southeast of Shaanxi province, China (33° 02′ 30″–34° 24′ 40″ N, 108° 34′ 20″–111° 01′ 25″ E). It is adjacent to the Qinling Mountains and has hilly terrain. The total land area of Shangluo is 1.93 million hectares, of which forest land exceeds 85% while construction land is under 3%. Its urbanization rate is significantly lower than the national average and, more seriously, its population decreased by 14.80% from 2016 to 2021. Shangluo is an archetypal SMC with a relatively underdeveloped economy and a prominent human-land conflict, representing the current status of SMCs in central and western China. This study focused on the core municipal district of Shangluo city. According to the Shangluo Territorial Spatial Master Plan (2021–2035), the study area was primarily defined by the central urban construction area, which has a dense population and fast urban development. Manufacturing (i.e. non-ferrous metal metallurgy, electrical and equipment processing, concrete production), tourism and traditional farming have been developing in recent years. The central urban built-up area of the municipal district, covering approximately 80 km², lies along river valleys and has a distinctly linear form (Fig. 1).

Fig. 1

Experimental area location: (a) DEM of Shangzhou; (b) land use and land cover of Shangzhou; (c) remote sensing (RS) image of study area. Maps were prepared by the authors using ArcGIS software-version 10.3 (https://www.esri.com/) and basemaps were obtained from Geospatial Data Cloud (http://www.gscloud.cn/).

Multisource data fusion-based space identification

This study built a spatial identification model to classify the data first and then explore their differences and driving forces. Defining the urban-rural fringe is considered the key to classification [42]. An information entropy model was used to calculate the landscape disorder degree (LDD), which can reflect dynamic changes in land use [43]. Point of interest (POI) datasets contain a large number of geographical entities; as an important basis for dividing urban functional zones, they represent various urban activities [44]. Night-light intensity (NLI) can serve as a proxy for integrated socio-economic indicators to evaluate the balance of urban-rural development [45]. Spatial demographic data were introduced to assign weights to each criterion when combining the above multisource data. These criteria have been widely used in geospatial analysis to reflect landscape heterogeneity, the spatial distribution of specific anthropogenic activities and economic development levels [46–48]. In a previous study, this fusion model proved useful and accurate in urban-rural spatial classification [49]. Moreover, land-use type [50], spatial structure [51] and anthropogenic processes [52], which are the driving forces behind the variation of soil HMs, were taken into account.

As shown in Fig. 2, a high-resolution RS image (GF-2) was corrected, fused, mosaicked and clipped using ENVI 5.3. Land-use types were then classified with an object-oriented support vector machine (SVM), and LDD was calculated. POI kernel density was evaluated after the POI data were filtered and reprojected. The Suomi National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) dataset was used to analyze NLI. WorldPop data were processed and used to depict the spatial distribution of population. After stratified heterogeneity analysis with GeoDetector, the weights of LDD, POI and NLI for the overall evaluation were determined in ArcGIS 10.3. The urban area, urban-rural fringe and rural area, which were used to interpret the spatial heterogeneity of HMs contamination in soil, were then identified with the natural breaks method. Data sources are listed in Table S1 of the Supplementary Materials.
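The LDD step can be illustrated with a minimal sketch. The exact entropy formula is not given in this section, so the code below assumes a Shannon entropy of the land-use area shares inside one analysis grid, divided by the logarithm of the number of land-use classes so that the result falls between 0 and 1. The function name and the example areas are hypothetical.

```python
import math


def landscape_disorder_degree(areas, n_classes=None):
    """Normalized Shannon entropy of land-use areas in one grid cell.

    Assumed form of LDD: H = -sum(p_i * ln p_i) / ln(k), where p_i is the
    area share of land-use class i and k is the number of classes.
    """
    total = sum(areas)
    if total <= 0:
        raise ValueError("total area must be positive")
    # Classes with zero area contribute nothing to the entropy.
    shares = [a / total for a in areas if a > 0]
    k = n_classes if n_classes is not None else len(areas)
    if k < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(k)


# Hypothetical 100 m x 100 m cell with four land-use classes (areas in m2).
mixed_cell = landscape_disorder_degree([4000, 3000, 2000, 1000])
```

Under this assumption, a cell covered by a single land-use type scores 0 and a cell split evenly among all types scores 1, so mixed cells such as those expected in the fringe fall in between.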

Fig. 2

Space identification and interpretation workflow. Maps were prepared by the authors using ArcGIS software-version 10.3 (https://www.esri.com/) and basemaps were obtained from Geospatial Data Cloud (http://www.gscloud.cn/).

Soil background value calculation

With its continued improvement, pXRF technology has been widely used in environmental investigation and evaluation (e.g. of soil, water, sediment and dust) [40, 41, 53–55]. Following the Technical Specification for Soil Environmental Monitoring (HJ/T 166–2004), this study used a systematic pattern to establish detection grids at a confidence level of 95%. A total of 80 grids, each 1 km × 1 km, were measured in situ by pXRF. Each detection point was placed close to the geometric center of its grid. The points were distributed across different land-use types (e.g. industrial, residential, transportation and ecological land). Points in construction land were mainly taken from urban green space built more than 20 years ago, for instance community parks and green belts along roads; points in non-construction land were located in undeveloped land, forests and grasslands to avoid the influence of measurable pollution sources [56]. The detection points are shown in the Results section, after the spatial classification.

This study used a pXRF analyzer (Sci-Aps, X-200) to measure the concentrations of Cd, Cr, Cu, Ni, Pb, Zn, Hg and As in topsoil. Soil moisture is regarded as a key influencing factor: in most cases, the lower the soil moisture, the higher the measurement accuracy [57, 58]. In-situ pXRF testing was therefore conducted after prolonged dry weather. The analyzer used a 6–50 kV Au anode excitation source and a 20 mm² silicon drift detector with a resolution of 135 eV full width at half maximum. In Geo-Env soil mode, it was factory calibrated with fundamental parameters. In addition, this study confirmed the accuracy of pXRF by comparison with laboratory ICP-MS results. Each measurement was recorded with a 90-second exposure time comprising three X-ray beam periods for different detectable elements. The measurement fulfilled the criteria of United States Environmental Protection Agency (EPA) Method 6200. Stability and accuracy were evaluated by repeatedly measuring a national certified reference material (CRM). The geometric mean of the pXRF results for each element was taken as its soil background value.
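The background calculation described above reduces to a geometric mean per element. The following is a minimal sketch; the function name and the example readings are hypothetical, and how non-detects are handled is an assumption (here they must be removed beforehand, since the logarithm is undefined for zero or negative values).

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive concentrations (mg/kg)."""
    values = list(values)
    if not values:
        raise ValueError("at least one reading is required")
    if any(v <= 0 for v in values):
        # Assumption: readings below the detection limit are filtered out first.
        raise ValueError("readings must be strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical pXRF readings for one element, including one high outlier.
readings = [40.0, 45.0, 50.0, 55.0, 230.0]
background = geometric_mean(readings)
arithmetic = sum(readings) / len(readings)
```

With a right-skewed set of readings like this one, the geometric mean sits below the arithmetic mean, which is why it is less sensitive to isolated high values.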

Sampling and laboratory analysis at high risk regions

National technical specification HJ/T 166–2004 requires sampling points for soil quality monitoring in urban and peri-urban areas at county level to be 2 km apart. For this reason, 20 soil samples (25% of the total) were collected within the 80 detection grids, selected according to soil background values and their ecological risks. Each soil sample (> 1 kg) was obtained by mixing five subsamples taken in an X-shaped pattern and reducing them by the quartering method. Topsoil (0–20 cm) was collected with a wooden shovel, air dried and finely ground in an agate mortar to pass through a 100-mesh (0.15 mm) nylon sieve. To measure Cr, Ni, Cu, Zn, As, Cd and Pb concentrations, samples were microwave digested with HNO3-HCl-HF-H2O2 and analyzed by ICP-MS (Agilent, 7700e); to measure Hg, samples were digested with HNO3-HCl and analyzed by AFS (JSK, SK-2003AZ). CRM GSS-30 was used for quality assurance and control.

Indices of contamination evaluation

Index of geo-accumulation (Igeo)

The index of geo-accumulation (Igeo), proposed by Müller in 1969 [59], is widely used to evaluate HMs contamination in soil, water and sediment. The contamination level is determined by comparing the measured concentration with the soil background. To address the uncertainty of HMs distribution and to assess soil background more accurately, researchers have made various improvements to the Igeo calculation [60, 61]. In this study, Igeo was computed with the following equation:

$$I_{geo}=\log_{2}\left(\frac{C_{n}}{1.5\times B_{pxrf}}\right)$$
(1)

where Cn represents the measured concentration of element n (mg·kg⁻¹) and Bpxrf is the local soil background of element n detected by pXRF (mg·kg⁻¹). Igeo is classified as follows: < 0, unpolluted; 0–1, unpolluted to moderately polluted; 1–2, moderately polluted; 2–3, moderately polluted to heavily polluted; 3–4, heavily polluted; 4–5, heavily polluted to extremely polluted; and ≥ 5, extremely polluted.
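Equation (1) and its classification can be sketched in a few lines. The function names are hypothetical, and assigning a value that falls exactly on a class boundary (for example Igeo = 1) to the upper class is an assumption, since the ranges above share their end points.

```python
import math

# Upper bounds (exclusive) of the Igeo classes listed in the text.
IGEO_CLASSES = [
    (0, "unpolluted"),
    (1, "unpolluted to moderately polluted"),
    (2, "moderately polluted"),
    (3, "moderately polluted to heavily polluted"),
    (4, "heavily polluted"),
    (5, "heavily polluted to extremely polluted"),
]


def igeo(c_n, b_pxrf):
    """Index of geo-accumulation, Eq. (1): log2(Cn / (1.5 * Bpxrf))."""
    if c_n <= 0 or b_pxrf <= 0:
        raise ValueError("concentrations must be positive")
    return math.log2(c_n / (1.5 * b_pxrf))


def igeo_class(value):
    """Map an Igeo value to its pollution class (boundaries go to the upper class)."""
    for upper, label in IGEO_CLASSES:
        if value < upper:
            return label
    return "extremely polluted"
```

For instance, a measured concentration equal to three times the local background gives Igeo = log2(2) = 1.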

Potential ecological risk index (RI)

Hakanson established the potential ecological risk index (RI) by considering the concentration, category and toxicity of pollutants and the sensitivity of receptors [62]. It has since been introduced into the evaluation of contamination by multiple HMs and of accumulated ecological risks in soil. RI was computed with the following equation:

$$RI=\sum_{i=1}^{n}E_{r}^{i}=\sum_{i=1}^{n}T_{r}^{i}\times\frac{C_{n}^{i}}{C_{pxrf}^{i}}$$
(2)

where C_n^i is the measured concentration of a single HM i in soil (mg·kg⁻¹), C_pxrf^i is the soil background of HM i from pXRF analysis (mg·kg⁻¹), and T_r^i is the biological toxic response factor of HM i. The biological toxic response factors for Cr, Ni, Cu, Zn, As, Cd, Pb and Hg are 2, 5, 5, 1, 10, 30, 5 and 40, respectively [62]. E_r^i is the potential ecological risk of a single HM i. RI is the accumulated potential ecological risk of multiple HMs and is classified into four grades: < 150, low risk; 150–300, moderate risk; 300–600, high risk; and ≥ 600, very high risk.
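Equation (2) can be sketched as below, using the toxic response factors and grade limits stated above. The function names and the dictionary layout are hypothetical, and sending a value exactly on a grade boundary to the upper grade is an assumption.

```python
# Biological toxic response factors given in the text.
TOXIC_RESPONSE = {
    "Cr": 2, "Ni": 5, "Cu": 5, "Zn": 1,
    "As": 10, "Cd": 30, "Pb": 5, "Hg": 40,
}


def single_risk(element, c_n, c_pxrf):
    """Potential ecological risk of one HM: T_r * C_n / C_pxrf."""
    if c_pxrf <= 0:
        raise ValueError("background must be positive")
    return TOXIC_RESPONSE[element] * c_n / c_pxrf


def potential_ecological_risk(measured, background):
    """RI, Eq. (2): sum of single-element risks over all measured HMs."""
    return sum(single_risk(el, c, background[el]) for el, c in measured.items())


def ri_grade(ri):
    """Grade an RI value (boundary values go to the upper grade)."""
    if ri < 150:
        return "low risk"
    if ri < 300:
        return "moderate risk"
    if ri < 600:
        return "high risk"
    return "very high risk"
```

When every element sits exactly at its background, each ratio is 1 and RI equals the sum of the eight factors, 98, which is graded as low risk.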

Health risk assessment model

This study used the health risk assessment model of the United States EPA to assess the health risk that HMs in soil pose to adults and children through different exposure pathways [63]. The model was applied to the 20 areas with the highest HMs concentrations. The equations were as follows:

$$ADD_{ing}=\frac{C_{n}\times IngR\times EF\times ED}{BW\times AT}\times 10^{-6}$$
(3)
$$ADD_{inh}=\frac{C_{n}\times InhR\times EF\times ED}{PEF\times BW\times AT}$$
(4)
$$ADD_{derm}=\frac{C_{n}\times SA\times AF\times ABS\times EF\times ED}{BW\times AT}\times 10^{-6}$$
(5)

where Cn is the HM concentration in soil (mg·kg⁻¹), and ADDing, ADDinh and ADDderm are the average daily intakes of HMs via ingestion, inhalation and dermal contact, respectively (mg·kg⁻¹·d⁻¹). The other parameters (IngR, InhR, EF, ED, PEF, BW, AT, SA, AF, ABS) are listed in Supplementary Materials Table S2 [64, 65].
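Equations (3) to (5) can be sketched as three small functions. The exposure parameters used in the study are in Table S2 and are not reproduced here, so the `Exposure` class, its field names and every number in `PLACEHOLDER` are illustrative assumptions only, not the values of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    ing_r: float  # IngR, soil ingestion rate (mg/d)
    inh_r: float  # InhR, inhalation rate (m3/d)
    ef: float     # EF, exposure frequency (d/yr)
    ed: float     # ED, exposure duration (yr)
    bw: float     # BW, body weight (kg)
    at: float     # AT, averaging time (d)
    sa: float     # SA, exposed skin area (cm2)
    af: float     # AF, skin adherence factor (mg/cm2/d)
    abs_: float   # ABS, dermal absorption factor (unitless)
    pef: float    # PEF, particle emission factor (m3/kg)


def add_ingestion(c_n, p):
    """Eq. (3): average daily intake via ingestion (mg/kg/d)."""
    return c_n * p.ing_r * p.ef * p.ed / (p.bw * p.at) * 1e-6


def add_inhalation(c_n, p):
    """Eq. (4): average daily intake via inhalation (mg/kg/d)."""
    return c_n * p.inh_r * p.ef * p.ed / (p.pef * p.bw * p.at)


def add_dermal(c_n, p):
    """Eq. (5): average daily intake via dermal contact (mg/kg/d)."""
    return c_n * p.sa * p.af * p.abs_ * p.ef * p.ed / (p.bw * p.at) * 1e-6


# Placeholder parameter set, chosen only to make the arithmetic easy to follow.
PLACEHOLDER = Exposure(ing_r=100, inh_r=10, ef=365, ed=1, bw=50, at=365,
                       sa=1000, af=0.1, abs_=0.01, pef=1e9)
```

With these placeholders and Cn = 100 mg·kg⁻¹, the ingestion intake is 100 × 100 / 50 × 10⁻⁶ = 2 × 10⁻⁴ mg·kg⁻¹·d⁻¹, two orders of magnitude above the dermal term and four above the inhalation term.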

$$HI_{i}=\sum_{j}\frac{ADD_{ij}}{RfD_{ij}}$$
(6)
$$THI=\sum_{i=1}^{n}HI_{i}=\sum_{i=1}^{n}\left(HI_{ing}+HI_{inh}+HI_{derm}\right)$$
(7)
$$CR_{i}=\sum_{j}ADD_{ij}\times SF_{ij}$$
(8)
$$TCR=\sum_{i=1}^{n}CR_{i}=\sum_{i=1}^{n}\left(CR_{ing}+CR_{inh}+CR_{derm}\right)$$
(9)

where i denotes the HM and j the exposure pathway. RfDij (mg·kg⁻¹·d⁻¹) and SFij ((mg·kg⁻¹·d⁻¹)⁻¹) are the non-carcinogenic reference dose and the carcinogenic slope factor of HM i via exposure pathway j, listed in Supplementary Materials Table S3 [66–68]. HIi and CRi are the non-carcinogenic and carcinogenic health risk indices of an individual HM, respectively; THI and TCR are the total non-carcinogenic and carcinogenic health risk indices accumulated over all exposure routes and all HMs. The health risk increases as HIi and CRi rise. For non-carcinogenic risk, HIi or THI < 1 indicates no significant risk, and HIi or THI ≥ 1 indicates potential risk. For carcinogenic risk, CRi or TCR ≤ 10⁻⁶ indicates no risk; 10⁻⁶ < CRi or TCR ≤ 10⁻⁴ indicates low and acceptable risk; and CRi or TCR > 10⁻⁴ indicates high risk [64].
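The indices and thresholds above can be sketched as follows. The reference doses and slope factors of the study are in Table S3 and are not reproduced here, so the functions simply take them as arguments; the function names and the dictionary keys (`"ing"`, `"inh"`, `"derm"`) are hypothetical.

```python
def hazard_index(add_by_path, rfd_by_path):
    """HI of one HM, Eq. (6): sum over pathways of ADD / RfD."""
    return sum(add / rfd_by_path[path] for path, add in add_by_path.items())


def cancer_risk(add_by_path, sf_by_path):
    """CR of one HM, Eq. (8): sum over pathways of ADD * SF."""
    return sum(add * sf_by_path[path] for path, add in add_by_path.items())


def classify_noncarcinogenic(hi):
    """Threshold stated in the text: below 1 means no significant risk."""
    return "no significant risk" if hi < 1 else "potential risk"


def classify_carcinogenic(cr):
    """Thresholds stated in the text: 1e-6 and 1e-4."""
    if cr <= 1e-6:
        return "no risk"
    if cr <= 1e-4:
        return "low and acceptable risk"
    return "high risk"
```

THI and TCR in Eqs. (7) and (9) are then plain sums of `hazard_index` and `cancer_risk` over all HMs, and the same two classifiers apply to the totals.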

Statistical analysis

Basic descriptive statistics of HMs concentrations, including the mean, median, standard deviation (SD), skewness and kurtosis, were computed in IBM SPSS Statistics 26. The Spatial Analyst toolbox in ArcGIS 10.3 was used for spatial analysis and modeling, and to create distribution maps of HMs concentrations with the inverse distance weighted (IDW) interpolation method. The distribution characteristics of Cr, Ni, Cu, Zn, As, Cd, Pb and Hg were described from Igeo statistics without removing outliers. Laboratory test results of the 20 soil samples were used to calculate RI and human health risk. Pearson correlation analysis was used to evaluate pairwise correlations between HMs in soil, and a two-tailed p ≤ 0.01 was regarded as statistically significant. Principal component analysis (PCA) was adopted for pollution source apportionment of HMs, and principal components with eigenvalues > 1 were extracted.
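The idea behind IDW interpolation can be shown with a minimal sketch. This is not the ArcGIS implementation: it uses all sample points rather than a search neighbourhood, and the power of 2 is an assumption because the interpolation settings are not stated. The function name and the sample tuples are hypothetical.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: iterable of (sx, sy, value) tuples in the same planar units.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0:
            # The estimate at a sampled location is the measured value itself.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


# Two hypothetical detection points 1 km apart (coordinates in km, values in mg/kg).
points = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0)]
midpoint = idw(0.5, 0.0, points)
```

Because the estimate is a weighted average, it always stays between the lowest and highest sampled value, which is one reason interpolated maps can look smoother than the underlying contamination.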

Results

Space identification and detection points

In Shangzhou district, vegetation covered a total area of 595.39 km², accounting for 97.68%, while construction land accounted for only 1.84%. More than half a million people lived in the river valleys. The LDD of the urban-rural fringe was clearly higher than that of the urban and rural areas. Based on 100 m × 100 m analysis grids, spaces with 0.26 < LDD < 0.73 were preliminarily regarded as urban-rural fringe. The POI results were basically consistent with NLI, indicating a close correlation between anthropogenic activity intensity and urban-rural spatial structure. With a bandwidth of 1 km, the computed POI kernel density remained comparatively stable. The density decreased from the urban area to the urban-rural fringe to the rural area, where it tended toward 0. Taking population density into account, the weights of LDD, POI kernel density and NLI were 0.13, 0.46 and 0.41, respectively. A comprehensive evaluation index was established to define the urban area, rural area and urban-rural fringe. In total, there were 5.73 km² (0.22%) of urban area, 2610.73 km² (98.26%) of rural area and 40.57 km² (1.53%) of urban-rural fringe. To focus on the soil background of HMs influenced by urbanization, the research scope was narrowed by omitting extensive forest lands and mountainous regions. In total, 80 grids (1 km × 1 km), each containing a detection point, were defined, including 6 points in the urban area, 35 in the urban-rural fringe and 39 in the rural area (Fig. 3).
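The weighted combination of the three layers can be sketched as follows, using the weights reported above. How each layer was scaled before weighting is not stated, so the min-max normalization is an assumption, and the function names and the per-cell list layout are hypothetical. The natural breaks classification that follows in the workflow is not reproduced.

```python
# Weights reported in the text for the three criteria.
WEIGHTS = {"LDD": 0.13, "POI": 0.46, "NLI": 0.41}


def min_max(values):
    """Scale values to [0, 1]; a constant layer maps to all zeros (assumed convention)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)
    return [(v - low) / (high - low) for v in values]


def composite_index(layers):
    """Weighted sum of the normalized LDD, POI and NLI values for each grid cell.

    layers: dict mapping "LDD", "POI" and "NLI" to equally long lists of cell values.
    """
    normalized = {name: min_max(layers[name]) for name in WEIGHTS}
    n_cells = len(normalized["LDD"])
    if any(len(v) != n_cells for v in normalized.values()):
        raise ValueError("all layers must cover the same cells")
    return [
        sum(WEIGHTS[name] * normalized[name][i] for name in WEIGHTS)
        for i in range(n_cells)
    ]
```

Since the weights sum to 1, the index of a cell that is highest in all three layers is 1 and that of a cell lowest in all three is 0.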

Fig. 3

Space identification results of study area (a) general spatial range; (b) detection area based on RS image; (c) distribution of 80 detection points locations with identifiers. Maps were prepared by the authors using ArcGIS software-version 10.3 (https://www.esri.com/) and basemaps were obtained from Geospatial Data Cloud (http://www.gscloud.cn/).

Stability and accuracy of pXRF

According to the Technical Guideline for the Development of Environmental Monitoring Analytical Method Standards (HJ168-2020), the CRM GSS-30 sample was measured seven times. The limit of detection (LOD) was calculated as 3.143 × SD (n = 7). The pXRF instrument has comparatively low detection sensitivity and accuracy for HMs at low concentrations [69]. Cd and Hg could not be detected by pXRF because their concentrations in CRM GSS-30 are below 1 ppm. The LODs of Cr, Ni, Cu, Zn, As and Pb were 3.75, 3.15, 2.92, 5.00, 1.61 and 4.38 mg·kg⁻¹, respectively, which do not exceed the risk screening values or the risk intervention values for soil contamination in the national standard (GB15618-2018). The relative standard deviations (RSDs) for repeatability of Cr, Ni, Cu, Zn, As and Pb were 2.32%, 4.75%, 3.48%, 1.70%, 4.86% and 3.17%, respectively.
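The LOD and RSD calculations can be sketched with the standard library. The factor 3.143 is the one given above for seven replicates; using the sample standard deviation (n − 1 in the denominator) is an assumption, and the function names and replicate readings are hypothetical.

```python
import statistics

# Factor stated in the text for n = 7 replicate measurements.
LOD_FACTOR_N7 = 3.143


def limit_of_detection(replicates):
    """LOD = 3.143 * SD of seven replicate measurements (mg/kg)."""
    if len(replicates) != 7:
        raise ValueError("the factor 3.143 applies to exactly seven replicates")
    return LOD_FACTOR_N7 * statistics.stdev(replicates)


def relative_standard_deviation(replicates):
    """RSD in percent: 100 * SD / mean."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


# Hypothetical replicate readings of one element in a reference material (mg/kg).
example = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0]
example_lod = limit_of_detection(example)
example_rsd = relative_standard_deviation(example)
```

A different replicate count would need a different factor, which is why the sketch rejects inputs that do not contain exactly seven values.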

The pXRF results were compared with the ICP-MS values of the 20 soil samples, and linear regressions were performed to evaluate the validity of the pXRF measurements (Fig. 4). The pXRF and ICP-MS values agreed well, with R² ranging from 0.863 to 0.966 and slopes ranging from 1.058 to 1.169 for Cr, Ni, Cu, Zn and Pb. These results achieved the definitive level according to the criteria used to rate the data quality of a pXRF analyzer [70]. Lower correlation and accuracy were observed for As, whose regression line had an R² of 0.757 and a slope of 0.667. In most cases, the data quality achieved at least the quantitative screening level [57].
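The slope and R² of such a comparison come from an ordinary least-squares fit, sketched below. Which method is placed on which axis is not stated, so treating one data set as x and the other as y is an assumption; the function name and the example values are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = intercept + slope * x, with R squared."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must not be constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical paired concentrations from two methods (mg/kg).
slope, intercept, r_squared = linear_fit([10.0, 20.0, 30.0, 40.0], [12.0, 21.0, 35.0, 44.0])
```

A slope near 1 with a high R² indicates that the two methods track each other closely; a slope well below 1, as reported for As, points to a systematic difference between them.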

Fig. 4

The comparative analysis of Cr, Ni, Cu, Zn, As and Pb concentrations at 20 soil samples obtained by pXRF and ICP-MS.

pXRF-based soil background value and spatial interpretation

The overall detection rates of Cd and Hg were only 26.25% and 6.25% in this study. Accordingly, Cd and Hg were excluded from statistical analysis at the primary screening stage. As shown in Table 1, the average concentrations of Cr, Ni, Cu, Zn, As and Pb were 53.77, 40.07, 48.37, 146.87, 14.80 and 49.16 mg·kg⁻¹, respectively, which were 86%, 139%, 226%, 212%, 133% and 230% of the 1990 soil background values of Shaanxi province [71]. By and large, the coefficients of variation (CV) of Pb and Zn were higher than those of the other HMs. The geometric mean, which is less affected by outliers and better reflects the overall situation, was slightly lower than the arithmetic mean.

Table 1 Descriptive statistics of heavy metals (HMs) concentrations at 80 detection points in the study area analyzed by pXRF.

According to Fig. 5, the average soil concentrations of Cr, Ni, Cu, Zn and Pb in the urban-rural fringe were 56.64, 41.57, 53.94, 177.20 and 58.28 mg·kg⁻¹, respectively, generally higher than those in the urban and rural areas. Notably, the concentrations of Zn and Pb in the urban-rural fringe exceeded those in the urban area by 28.90% and 19.93%, respectively, and those in the rural area by 46.33% and 41.91%. The soil concentration of As showed a distinct spatial pattern: it was slightly higher in the rural area than in the urban-rural fringe (by 2.00%) and substantially higher than in the urban area (by 17.43%). Compared with the urban (n = 6) and rural areas (n = 39), soil samples from the urban-rural fringe (n = 35) showed greater spatial variability in HMs concentrations, particularly for Pb (SD = ± 54.41 mg·kg⁻¹, CV = 93.37%), As (SD = ± 6.66 mg·kg⁻¹, CV = 44.97%) and Cr (SD = ± 27.14 mg·kg⁻¹, CV = 47.92%). A Kruskal–Wallis test indicated statistically significant differences between the rural area and the urban-rural fringe for Cu (p < 0.05) and Zn (p < 0.01).

According to land-use type, the points were classified into construction land (residential, transportation, education, logistics, industrial and commercial land) and non-construction land (ecological land and farmland). The soil background values of construction land were 2.94–93.72% higher than those of non-construction land, depending on the HM, most notably for Pb. High-value zones of Cu, Zn and Pb agglomerated into larger clusters in the southeastern urban-rural fringe (Fig. 6). The concentrations of Cu and Pb peaked at 114.61 and 228.86 mg·kg⁻¹ at detection points SL-15-2 and SL-15-3, respectively. The maximum Zn concentration, 278.10 mg·kg⁻¹, appeared at point SL-6-3 in the urban core area. The major high-value zones of As were far from the urban built-up area; As peaked at 30.70 mg·kg⁻¹ at point SL-12-4 in the rural area.

Fig. 5

Comparative analysis of HMs concentrations across urban area (n = 6), urban-rural fringe (n = 35), and rural area (n = 39).

Fig. 6

Spatial distribution of HMs concentrations in the study area based on inverse distance weighted (IDW) interpolation.

Laboratory test-based contamination risk

Within the high soil background areas, 20 soil samples were collected in the 20 highest-risk regions ranked by RI (Fig. 7). Twelve sampling points were in the urban-rural fringe, seven in the rural area and one in the urban area. Each sample was taken from the primary and characteristic land-use type of its zone (e.g. industrial land) to indicate potential pollution sources. The average concentrations of Cr, Ni, Cu, Zn, As, Cd, Pb and Hg were 83.54, 42.45, 41.74, 173.87, 14.30, 0.55, 34.21 and 0.11 mg·kg⁻¹, respectively.

Fig. 7

Distribution of 20 soil sampling sites in high risk HMs contamination regions. Maps were prepared by the authors using ArcGIS software-version 10.3 (https://www.esri.com/) and basemap was obtained from Geospatial Data Cloud (http://www.gscloud.cn/).

As shown in Fig. 8, most soil samples were not polluted by HMs (Igeo < 0). The dispersion of Hg was the highest, followed by Zn and Cd. The average Igeo of Cd was 1.19, which falls in the moderately polluted class (1–2). Moreover, there were Cd outliers above the upper limit of the box plot, showing heavy pollution at certain points. The average RI was 106.36, and its maximum reached 346.16 at point L17, which is considered high risk. Four points (20%) had RI values between 150 and 300, considered moderate risk (Fig. 9). Cd and Hg were the top contributors to ecological risk, which indicates that inputs of Cd and Hg should be reduced at the source as a priority to control pollution risk.

Fig. 8

Statistics of index of geo-accumulation (Igeo) with box plot of 20 soil samples.

Fig. 9

Accumulated potential ecological risks index (RI) of HMs at 20 sampling points.

Human health risk assessment

Health risk was assessed in the 20 high-risk regions. For non-carcinogenic risk, ingestion (HIing) accounted for over 85% of the total, followed by dermal contact (HIderm) and inhalation (HIinh). Similarly, for carcinogenic risk, ingestion (CRing) accounted for over 90% of the total. For single HMs, the non-carcinogenic health risk HI followed the order As > Cr > Ni > Pb > Cu > Cd > Zn > Hg, and each was less than 1 for both adults and children. As, Cr and Ni were the major non-carcinogenic pollutants, accounting for 47.20% and 46.29%, 26.02% and 25.81%, and 16.74% and 17.51% for adults and children, respectively, and together exceeding 85% in both groups. The carcinogenic health risk CR of each HM did not exceed 1 × 10⁻⁴. Ni was the most significant, at 4.18 × 10⁻⁵ for adults and 7.47 × 10⁻⁵ for children, and both groups followed the order Ni > Cr > As > Cd > Pb (Table 2). This indicated that the overall condition of HMs in soil was safe. However, children, with a higher ingestion rate (IngR = 200), lower body weight and specific behaviors, are normally more susceptible than adults to non-carcinogenic and carcinogenic health risks in the same environment [64, 72]. In this study, the cumulative non-carcinogenic risk THI and carcinogenic risk TCR of children reached 1.46 (> 1) and 1.52 × 10⁻⁴ (> 1 × 10⁻⁴), which were 6.82 and 1.74 times higher than those of adults, respectively. The health risk to children should not be ignored.

Table 2 Evaluation of the carcinogenic and non-carcinogenic health risk of HMs at 20 sampling points.

HMs pollution source apportionment in SMC

This study explored the homology of multiple HMs by evaluating their pairwise correlations. The closer the correlation coefficient R is to 1, the more similar the sources of the HMs are [73]. In this research, As concentrations were relatively independent of the other HMs. The remaining elements showed some pairwise positive correlation. Cr-Ni, Cd-Zn and Pb-Zn in particular were significantly correlated, with R values of 0.889, 0.837 and 0.882, respectively. Cu was correlated with Cr, Ni, Zn and Pb simultaneously, with R values between 0.6 and 0.8 (Fig. 10). The results suggested multiple and diverse pollution sources. Similar conclusions have been reported in other studies [17, 74, 75].
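The pairwise coefficient discussed above is the Pearson correlation, sketched below for two elements measured at the same sampling points. The function name and the example concentrations are hypothetical, and the sketch does not compute the two-tailed p-value used for the significance test.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical concentrations of two elements at five sampling points (mg/kg).
element_a = [80.0, 95.0, 60.0, 120.0, 70.0]
element_b = [40.0, 46.0, 33.0, 55.0, 38.0]
r_value = pearson_r(element_a, element_b)
```

Applying the function to every pair of elements yields the symmetric matrix that a heatmap such as Fig. 10 displays.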

Fig. 10

Pairwise Pearson correlation analysis heatmap of HMs concentrations in 20 soil samples (*p ≤ 0.01).

Two principal components with eigenvalues > 1 were extracted. Their cumulative contribution rate reached 75.24%, explaining most of the information in the data (Supplementary Materials Table S4). Component 1 contributed 46.97%, with strong positive loadings on Cu, Zn, Pb and Cd (> 0.7) and moderate positive loadings on Ni and Cr (0.5–0.7), which was basically consistent with the correlation analysis. The accumulation of these HMs in soils probably derived from similar anthropogenic sources, such as metal processing and building material manufacturing. Component 1 therefore represented industrial activities, as confirmed in other studies of SMCs [76, 77]. Component 2 accounted for 28.27% of the total contribution rate. It showed moderate positive loadings on Hg and Cd, of 0.578 and 0.528, respectively (Supplementary Materials Table S5). Some researchers have pointed out that Cd and Hg are likely to come from fertilizers and pesticides in agricultural production, and that exhaust gas and wheel wear can also affect their accumulation in soil [78].

Discussion

Spatial interpretation of HMs pollution in SMC

This study prioritized the evaluation of soil background values of HMs in the region affected by urbanization at county level. Although urbanization in SMCs has not grown as fast as in China's megacities, it has more than doubled over the past 20 years. The soil background of HMs increased with obvious spatial heterogeneity among the urban area, urban-rural fringe and rural area. It has been confirmed that urbanization growth is the key factor raising the overall level of HMs concentrations in soils [79, 80]. The urban area is more densely populated and built up, yet its soil background value is lower than that of the urban-rural fringe in Shangzhou district. Similarly, other researchers found that, at town scale, suburbs within urban areas are much more susceptible to HMs pollution than urban cores [81, 82], while this difference is not obvious in megacities [83, 84]. Some studies concluded directly that the urban area (the entire urbanized area) is more polluted than the rural area [85, 86]. In those studies, the municipal boundary and the urban planning scope were used to define the urban area as a whole, which is rather simplistic and ignores urbanization gradients [87] as well as the type and intensity of land use [88, 89]. Besides urban construction land, there are normally large quantities of unused land, agricultural land and rural construction land within the boundary. Long-term changes in the use of these land resources cause wide variation in the anthropogenic input of HMs between urban cores and suburbs. Therefore, identifying the urban-rural fringe on top of the urban-rural dichotomy will help control HMs pollution in soils more precisely.

Zinc-based alloy manufacturing is the pillar industry of Shangluo, and its output increased by more than 12 times from 2003 to 2023 according to the local statistical yearbook. In addition, the production of cement and iron ore has risen substantially. As a result, industrialization has aggravated the accumulation of HMs (Zn, Pb, Cu, etc.) in soils. Restricted by terrain, urban development has elongated the space of Shangzhou district. This linear urban space concentrates traffic on the main roads, intensifying the impact on the surrounding environment and on everyday life. Gas and dust containing HMs generated by traffic can be deposited in the surrounding farmland through surface runoff and atmospheric diffusion90. Leaded gasoline, coal combustion and smelting were confirmed as the main sources of Pb accumulation in soils via isotopic tracing91. As an exception, the soil background of As in the rural area was higher than that in the urban-rural fringe. This anomaly was strongly associated with arsenic-containing chemical fertilizers, pesticides and feed used in agricultural production92, which should be taken seriously.

Fig. 11. Land use maps of the study area and typical risky points in the urban-rural fringe (L15-18), based on RS imagery. Maps Data: Google, ©2025 Airbus, Maxar Technologies.

Urban development and human health risk control

SMC has tremendous potential for economic growth and is regarded as a primary engine of China’s economy at this stage. Since 2022, the policy of ‘promoting high-quality urbanization at county-level’ has proposed forming advantageous industries at county level, but preventing pollution and protecting human health should take priority. The government established a high-tech industrial technology zone in the southeastern urban-rural fringe to boost industrial agglomeration and the local economy; however, sampling areas on industrial land, wasteland, farmland and rural communities next to the main traffic roads of the zone were at higher ecological risk of HM pollution, as well as higher human health risk (Fig. 11). Although these risks are generally within an acceptable range, children exposed to the high soil background caused by urbanization and industrialization are probably at non-negligible risk. In Ankang city, which is in the same province and adjacent to Shangluo city, the CR of Ni and As exceeded 10⁻⁴ as a result of industrial manufacturing, transportation and agricultural production93. In Chongqing, China, coal mining caused significant non-carcinogenic and carcinogenic risks to children94, with mean THI = 1.30 and TCR = 1.53 × 10⁻³. An investigation in Mexico indicated that over 25% of the total research area was at considerable health risk for children, with THI > 1. Given its elevated potential for health risk escalation, arsenic (As) was designated as a priority control target for risk management95.
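The comparisons in this paragraph rest on two screening limits, THI > 1 for non-carcinogenic risk and CR > 10⁻⁴ for carcinogenic risk. A minimal sketch of such a screen is given below; the function name and the use of these two values as fixed cut-offs are assumptions made for illustration, and the inputs are the Chongqing means quoted above.

```python
def screen_health_risk(thi, tcr, hi_limit=1.0, cr_limit=1e-4):
    """Flag indices above the screening limits (limits are assumed cut-offs)."""
    return {
        "non_carcinogenic_flag": thi > hi_limit,   # total hazard index check
        "carcinogenic_flag": tcr > cr_limit,       # total carcinogenic risk check
    }

# Mean values for children quoted in the text for the Chongqing coal mining case.
chongqing_children = screen_health_risk(thi=1.30, tcr=1.53e-3)
print(chongqing_children)
```

A screen of this kind only flags exceedance; it says nothing about which element or exposure pathway drives the index.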

The urban-rural fringe provided construction reserves for urban expansion through intensified conversion between agricultural and construction land96. In addition, large-scale relocation of industrial factories took place as China's industrial transformation was implemented. Significant HM pollution was induced during factory demolition and reconstruction, especially at chemical manufacturing and ferrous and non-ferrous metal processing factories, which posed serious non-carcinogenic and carcinogenic risks to children65. Quite apart from industrial and agricultural production, deficiencies in urban planning and construction supervision at county level may exacerbate the health risks. Some wasteland left by suspended or delayed construction projects was used for construction material storage and even household waste disposal, and it lacked the supervision applied in urban cores. The coexistence of agricultural and non-agricultural spaces introduced multiple pollution sources, which remains a huge challenge. Unlike in megacities, agricultural activity (e.g. fertilizers and pesticides containing As) is a major pollution source83,97.

Conclusions

This study combined GIS and pXRF techniques to screen high-risk regions of HM pollution in soils and to identify their ecological and health risks at county level. However, the pXRF technique has several limitations. Soil moisture can affect the stability of in-situ measurements. The method is poorly sensitive to low-concentration HMs: soil concentrations of Cd and Hg in most samples were below the LOD. The correlation between pXRF and ICP-MS results for As was suboptimal (R² = 0.757), markedly lower than for the other HMs. Nevertheless, pXRF remains a cost-effective, efficient and reliable approach for large-scale screening of HM contamination. In the future, new technologies (e.g. micro-XRF) can be introduced to improve low-concentration detection. Developing machine learning correction models using paired pXRF-ICP-MS datasets will improve measurement accuracy.
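As a simple stand-in for the correction models mentioned above, the sketch below fits an ordinary least-squares line that maps pXRF readings to ICP-MS values and reports R². It is a linear calibration rather than a machine learning model, the paired data are synthetic, and the function name `fit_linear_correction` is hypothetical.

```python
import numpy as np

def fit_linear_correction(pxrf, icpms):
    """Least-squares line icpms ≈ slope * pxrf + intercept, with its R²."""
    slope, intercept = np.polyfit(pxrf, icpms, deg=1)  # highest degree first
    predicted = slope * pxrf + intercept
    ss_res = np.sum((icpms - predicted) ** 2)          # residual sum of squares
    ss_tot = np.sum((icpms - icpms.mean()) ** 2)       # total sum of squares
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic paired measurements (mg/kg), for illustration only.
rng = np.random.default_rng(1)
icpms = rng.uniform(5.0, 40.0, size=60)                # reference method
pxrf = 1.2 * icpms + 3.0 + rng.normal(scale=2.0, size=60)  # biased, noisy field readings

slope, intercept, r2 = fit_linear_correction(pxrf, icpms)
corrected = slope * pxrf + intercept                   # pXRF readings mapped to the ICP-MS scale
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R2={r2:.3f}")
```

In practice a model like this would be fitted per element and validated on held-out pairs; covariates such as soil moisture could be added as further predictors.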

At present, people in SMC are exposed to a high soil background: Ni, Cu, Zn, As and Pb exceeded the 1990 soil background values of Shangzhou to varying degrees, reaching 1.39, 2.26, 2.12, 1.33 and 2.30 times those values, respectively. The soil background of HMs in the urban-rural fringe was significantly higher than in the urban and rural areas, except for As. In high-risk regions, Cd posed a slight ecological pollution risk, with an average Igeo of 1.19, while the remaining HMs were considered clean and safe. Of the high soil background regions, 25% were at moderate or higher ecological risk, with a maximum RI of 346.16. As, Cr and Ni were shown to be highly associated with non-carcinogenic and carcinogenic health risks. These human health risks were generally acceptable, whereas children were more susceptible and showed higher health risks than adults, with THI = 1.46 and TCR = 1.52 × 10⁻⁴. An industrial source (46.97%) and an agricultural source (28.27%) were identified as the principal components. In this study, the linear urban space favoured HM accumulation around the main roads. Besides, rapid land conversion and regulatory deficiency in the urban-rural fringe may exacerbate the risks to human health. Therefore, the urban-rural fringe should be a control priority at county level. It is noteworthy that the urban-rural fringe has been increasingly recognized as a critical transitional zone requiring targeted environmental governance, with multi-tiered government interventions implemented in recent years. To facilitate the green industrial transition, priority should be given to developing clean energy sectors (particularly photovoltaic recycling and battery technology innovation) and circular economy practices (e.g. industrial symbiosis for raw material utilization). Polluting enterprises need to undergo green transformation by introducing advanced technology. Industrial development zones should be located away from agricultural production areas. Improving land use efficiency and reducing wasteland in the urban-rural fringe will benefit pollution prevention and risk control.
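The two ecological indices summarized above can be sketched in their commonly used forms: Igeo = log2(C / (1.5 × B)) and RI as the sum of T × C / B over elements, where C is the measured concentration, B the background value and T a toxic response factor. The factor values below are the ones widely used in the literature and are an assumption here, as is applying the formulas to the background ratios quoted in this section; the results are illustrative and do not reproduce the study's Igeo or RI values.

```python
import math

# Widely used toxic response factors (assumed; not confirmed by this section).
TOXIC_RESPONSE = {"Hg": 40, "Cd": 30, "As": 10, "Pb": 5,
                  "Cu": 5, "Ni": 5, "Cr": 2, "Zn": 1}

def igeo(concentration, background):
    """Geo-accumulation index: log2 of concentration over 1.5 x background."""
    return math.log2(concentration / (1.5 * background))

def risk_index(concentrations, backgrounds, factors=TOXIC_RESPONSE):
    """Potential ecological risk index: sum of factor * C / B per element."""
    return sum(factors[m] * concentrations[m] / backgrounds[m]
               for m in concentrations)

# Ratios to the 1990 Shangzhou background quoted in the text, used as C with B = 1.
ratios = {"Ni": 1.39, "Cu": 2.26, "Zn": 2.12, "As": 1.33, "Pb": 2.30}
unit_background = {metal: 1.0 for metal in ratios}

for metal, ratio in ratios.items():
    print(metal, round(igeo(ratio, 1.0), 2))
print("partial RI (five elements only):",
      round(risk_index(ratios, unit_background), 2))
```

Because Cd and Hg carry the largest factors, a partial RI that omits them will understate the index reported for the high-risk regions.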