Introduction

Ground motion prediction equations (GMPEs) estimate the intensity measures of ground motion that are essential for structural design and seismic risk evaluation. Several techniques and parameters, such as response spectrum, frequency content measures, earthquake magnitude, epicentral distance, fault type, soil type, and site classification parameters, have been adopted for developing GMPEs. The parameter most commonly predicted by a GMPE is peak ground acceleration (PGA). In the study of PGA components, horizontal movements are deemed more critical for engineering than vertical ones, prompting most research to focus on horizontal aspects1,2,3,4. Typically, one of three approaches is used: first, the larger of the two horizontal components is selected; second, both horizontal components are considered; and third, the geometric mean of the estimates from both components is computed.

Mechanistic and empirical models, such as stochastic models and attenuation relations, have been developed, resulting in numerous GMPEs. From 1964 to 2021, 485 GMPEs for PGA and 316 for spectral ordinates were developed5. The accuracy of predictions relies on both the quality and quantity of data used to estimate the parameters. However, beyond the data itself, a key factor in inferring a GMPE is choosing the most suitable functional form. Initially, Esteva and Rosenblueth proposed a GMPE to predict PGA using a simple exponential function6. Several models based on synthetic data were then proposed, and as the field advanced, models became more complex by incorporating the effects of amplification, attenuation, fault mechanisms, etc. Boore and Joyner presented strong ground motion prediction equations developed from earthquakes in California and Italy using regression methods, with peak acceleration as the most common measure. The data was extracted from the updated NGA-West 2 database7. Bommer et al. introduced an empirical approach to deriving GMPEs for estimating short-period response spectral ordinates across Europe and the Middle East, covering a magnitude range of 3.0 to 7.68. Boore and Atkinson formulated a GMPE whose dependent variables include PGA, PGV, and 5%-damped PSA. Notably, the horizontal components derived are not simple geometric means but are calculated using the 50th percentile values from all possible orientations3. Campbell and Bozorgnia developed an empirical ground motion model that estimates PGA, PGV, PGD, and 5% damped linear elastic response spectra for periods between 0.01 and 10 s. It explicitly incorporates effects such as magnitude saturation, attenuation dependent on magnitude, style of faulting, rupture depth, hanging-wall geometry, and site response, both linear and nonlinear.
The model’s results suggest that ground motion predictions were fairly consistent across different distances, with significantly increased readings during reverse-faulting scenarios where ruptures don’t reach the surface. Higher ground motions were also observed in some strike-slip ruptures with limited surface expression, though other strike-slip cases showed divergent results. The model’s reliability may decrease near the extremes of its magnitude range9. Ansari et al. developed advanced models for predicting tunnel damage in seismically active and landslide-prone regions, using Artificial Neural Networking (ANN) and deep learning techniques. The multi-hazard damage prediction model and the seismic tunnel damage prediction model both employ feed-forward neural networks trained with critical input parameters, such as PGA, source-to-site distance, overburden depth, tunnel depth, lining thickness, and geological strength index. Validated against significant seismic events like the Kobe (1995), Chi-Chi (1999), Mid-Niigata (2004), and Wenchuan (2008) earthquakes, these models demonstrated high predictive accuracy and reliability, providing crucial seismic design recommendations to enhance post-disaster infrastructure serviceability. However, limitations include the uneven distribution of seismic data and constraints on model generalizability across varied geological conditions, necessitating further validation for broader application. The research outputs also feature damage indexing, predictive graphs, and globally applicable design guidelines10,11. In the following years, the structure of GMPEs was adjusted several times to account for factors such as amplification, attenuation, faulting mechanisms, damping, and the uncertainties observed in ground motions. These changes resulted in highly intricate functional models. A significant limitation of this type of parametric model is the need for a predefined functional structure.

The essential need for reliable earthquake prediction has long been a subject of debate, challenging even the most esteemed scientists. Precisely predicting the exact time, location, and magnitude of an earthquake is exceedingly complex due to the multitude of seismic precursors and other indicators of potential quakes. However, advancements in artificial intelligence (AI) and the aggregation of data from these indicators offer a promising avenue for improving earthquake prediction accuracy12,13. Advanced models such as machine learning algorithms, ANN, and fuzzy logic have become increasingly prevalent in earthquake engineering for interpreting trends and yielding accurate outcomes. Furthermore, AI enhances computational efficiency and reduces error rates, offering significant advantages in tasks such as phase picking, early warning systems, ground-motion prediction, tomography, and geodesy. This study introduces an ANN-based ground motion prediction model for shallow earthquakes, with the goal of enhancing earthquake hazard safety evaluation. The model incorporates key input variables, including moment magnitude, fault type, epicentral distance, and soil type, and predicts PGA at 5% damping. After the model was assessed, a genetic algorithm (GA) was integrated with the ANN model in order to predict potential earthquake scenarios.

Modeling peak ground acceleration

Before explaining the modeling of PGA in detail, an overview of the methodologies adopted in this study is shown in Fig. 1. The entire study consists of two main stages: data processing and modeling for prediction. In the following sections, more details will be provided for each stage.

Fig. 1 Workflow of the study to predict PGA at 5% damping using ANN-GA.

Data acquisition and preprocessing of PGA

The data were sourced from the Pacific Earthquake Engineering Research Center (PEER). This database encompasses a comprehensive collection of ground motions recorded during shallow earthquakes in active tectonic areas around the world. For this study, data from 100 unscaled original earthquake response spectra were gathered, specifically focusing on no-pulse records that do not consider the effects of velocity pulses due to rupture directivity. The input variables for the study include moment magnitude, fault type, epicentral distance, and soil type, with the output variable being PGA. Based on PEER, fault types are categorized by rake angle as summarized in Table 1. Meanwhile, soil type is grouped according to the National Earthquake Hazards Reduction Program (NEHRP) standards, using Vs30 (the time-averaged shear-wave velocity to a depth of 30 m) as summarized in Table 2.

Table 1 Fault mechanism according to rake angle.
Table 2 Soil type classification on the basis of Vs30.

In this study, ground motion data were collected from recordings featuring earthquake magnitudes between 5.0 and 8.0 and epicentral distances not exceeding 200 km. Any records lacking complete information were omitted from the dataset. After these adjustments, the study comprised a total of 883 ground motions from 89 events, as detailed in Table 3.

Table 3 Earthquake events used in analysis.

The statistical properties of the experimental data are summarized in Table 4. Each input parameter was coded as X1, X2, X3, and X4, corresponding to moment magnitude, fault type, epicentral distance, and soil type, respectively. The fault mechanism was assigned a number from 0 to 4, and the soil type was given a value from 1 to 4, representing soil types B to E, as shown in Table 2. No data were recorded for soil type A.
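The soil-type coding described above can be sketched in a few lines of Python. This is an illustration only: the Vs30 boundaries below are the commonly cited NEHRP values and are assumed to match Table 2, and the names `nehrp_site_class`, `SOIL_CODE`, and `encode_soil_type` are hypothetical helpers, not part of the study's workflow.

```python
# Assumed NEHRP Vs30 boundaries in m/s (taken to match Table 2):
# A: > 1500, B: 760-1500, C: 360-760, D: 180-360, E: < 180.

def nehrp_site_class(vs30: float) -> str:
    """Return the NEHRP site class letter for a Vs30 value in m/s."""
    if vs30 > 1500:
        return "A"
    if vs30 > 760:
        return "B"
    if vs30 > 360:
        return "C"
    if vs30 >= 180:
        return "D"
    return "E"

# Integer codes for X4 as described in the text: soil types B to E map to 1 to 4.
SOIL_CODE = {"B": 1, "C": 2, "D": 3, "E": 4}

def encode_soil_type(vs30: float) -> int:
    """Map a Vs30 value to the integer soil code used for input X4."""
    site_class = nehrp_site_class(vs30)
    if site_class not in SOIL_CODE:
        # The dataset contains no class A records, so no code is defined for it.
        raise ValueError(f"no code defined for site class {site_class}")
    return SOIL_CODE[site_class]
```

For example, a site with Vs30 = 500 m/s falls in class C and would be coded as X4 = 2 under these assumptions.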

Table 4 Descriptive statistics of the data used in this research.

The dataset was analyzed by examining the pairwise associations among the input variables X1, X2, X3, and X4 through a Pearson correlation matrix. The results, shown in Fig. 2a, reveal no substantial linear correlation among these variables. All correlation coefficients are either close to zero or fall well below the adopted threshold (< 90%), indicating a lack of strong linear relationships. These findings highlight the relative independence of the variables, suggesting that linear modeling techniques may not be suitable for capturing their interdependencies. Figure 2b compares the distribution of PGA values with a normal distribution curve. The data are skewed, with a large concentration of values at lower PGA levels, and deviate from the symmetry of the normal distribution, suggesting that the PGA data are not normally distributed.
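A minimal sketch of this kind of screening is given below, using NumPy on synthetic stand-in data. The arrays are randomly generated purely for illustration and are not the PEER records; the lognormal shape chosen for `y` is an assumption meant only to mimic a right-skewed PGA sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 883  # same sample size as the study; the values themselves are synthetic

# Synthetic stand-ins for the four inputs (NOT the PEER data).
X = np.column_stack([
    rng.uniform(5.0, 8.0, n),    # X1: moment magnitude
    rng.integers(0, 5, n),       # X2: fault type code, 0 to 4
    rng.uniform(0.0, 200.0, n),  # X3: epicentral distance in km
    rng.integers(1, 5, n),       # X4: soil type code, 1 to 4
])

# Pearson correlation matrix; rowvar=False treats each column as a variable.
corr = np.corrcoef(X, rowvar=False)

# Largest absolute off-diagonal coefficient, to compare against a threshold.
off_diagonal = corr[~np.eye(4, dtype=bool)]
max_abs_corr = float(np.abs(off_diagonal).max())

# Sample skewness of a right-skewed stand-in for PGA; a clearly positive
# value indicates a departure from the symmetry of a normal distribution.
y = rng.lognormal(mean=-2.0, sigma=1.0, size=n)
skewness = float(np.mean((y - y.mean()) ** 3) / y.std() ** 3)
```

With real data, `X` and `y` would be replaced by the recorded inputs and PGA values.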

Fig. 2 (a) Pearson’s correlation heatmap matrix of data space and (b) data distribution of Y.

To prevent features with larger numerical ranges from dominating the training, the datasets were normalized to the range between \(a=0.1\) and \(b=0.9\) prior to training the model, as given in Eq. (1)14.

$$x_{n}=\left(b-a\right)\left(\frac{x-x_{min}}{x_{max}-x_{min}}\right)+a$$
(1)
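Equation (1) can be illustrated with a short NumPy sketch. The function names `minmax_scale` and `minmax_unscale` are hypothetical, and the inverse mapping is included only as a convenience for converting network outputs back to physical units; it is not described in the text.

```python
import numpy as np

def minmax_scale(x, a=0.1, b=0.9):
    """Apply Eq. (1): map each column of x linearly onto the range [a, b]."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    span = x_max - x_min
    if np.any(span == 0):
        raise ValueError("a constant feature cannot be scaled")
    return (b - a) * (x - x_min) / span + a

def minmax_unscale(x_n, x_min, x_max, a=0.1, b=0.9):
    """Invert Eq. (1) to recover values in the original units."""
    x_n = np.asarray(x_n, dtype=float)
    return (x_n - a) / (b - a) * (x_max - x_min) + x_min

# Worked example: magnitudes 5.0, 6.5 and 8.0 map to 0.1, 0.5 and 0.9.
scaled = minmax_scale([5.0, 6.5, 8.0])
```

In the worked example the middle value gives (0.9 - 0.1)(6.5 - 5.0)/(8.0 - 5.0) + 0.1 = 0.5.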

Modeling and prediction of PGA

An ANN is a computational framework inspired by the neural architecture of the human brain. It consists of interconnected units known as neurons, which are organized into various layers such as an input layer, several hidden layers, and an output layer. Each neuron in these layers is linked through weights that influence the signals transmitted between neurons. ANNs are employed in machine learning for their ability to adaptively learn from data, enabling them to perform tasks such as pattern recognition, classification, and regression15. For an ANN with two hidden layers, the final output \(\:y\) can be expressed as:

$$y=f\left(\sum_{k=1}^{m_{2}}w_{k}^{\left(3\right)}f\left(\sum_{j=1}^{m_{1}}w_{jk}^{\left(2\right)}f\left(\sum_{i=1}^{n}w_{ij}^{\left(1\right)}x_{i}+b_{j}^{\left(1\right)}\right)+b_{k}^{\left(2\right)}\right)+b^{\left(3\right)}\right)$$
(2)

Where \(x\) is the vector of input features, \(w\) denotes the weight matrices, \(b\) denotes the bias values, and \(f\) is the activation function. \(n\), \(m_{1}\), and \(m_{2}\) are the numbers of neurons in the input layer, first hidden layer, and second hidden layer, respectively16. In this study, an ANN model with two hidden layers, using the settings summarized in Table 5, was developed in MATLAB to map four input features to a single output.

Table 5 Value of hyperparameter settings for ANN.
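As an illustration of Eq. (2), the forward pass of a network with two hidden layers can be sketched in NumPy as follows. This is not the MATLAB model developed in the study: the tanh activation, the random weights, and the function names are assumptions made for the example, and the layer sizes simply mirror a four-input, single-output network with 25 neurons per hidden layer.

```python
import numpy as np

def init_params(n=4, m1=25, m2=25, seed=0):
    """Create random weights and zero biases (illustrative, untrained)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (n, m1))   # w^(1): input layer to hidden layer 1
    b1 = np.zeros(m1)                    # b^(1)
    W2 = rng.normal(0.0, 0.5, (m1, m2))  # w^(2): hidden layer 1 to hidden layer 2
    b2 = np.zeros(m2)                    # b^(2)
    w3 = rng.normal(0.0, 0.5, m2)        # w^(3): hidden layer 2 to output
    b3 = 0.0                             # b^(3)
    return W1, b1, W2, b2, w3, b3

def forward(x, params, f=np.tanh):
    """Evaluate Eq. (2) for a batch x of shape (samples, n)."""
    W1, b1, W2, b2, w3, b3 = params
    h1 = f(x @ W1 + b1)      # innermost sum over the inputs i
    h2 = f(h1 @ W2 + b2)     # middle sum over hidden neurons j
    return f(h2 @ w3 + b3)   # outer sum over hidden neurons k

def count_parameters(params):
    """Total number of trainable weights and biases."""
    return sum(np.size(p) for p in params)

params = init_params()
x = np.full((3, 4), 0.5)     # three samples of four normalized inputs
y = forward(x, params)
```

With four inputs and two hidden layers of 25 neurons, such a network has 4·25 + 25 + 25·25 + 25 + 25 + 1 = 801 trainable parameters.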

During the training and testing steps, the performance of ANN models with different nodes in the hidden layers was evaluated using the total goodness function (TGF), as utilized by several authors17,18,19, as given in Eq. (3). This evaluation is crucial because it allows for the selection of the best weight and bias values for the ANN model.

$$TGF=\frac{1}{N}\sum_{i=1}^{2}n_{i}\left(R^{2}+\frac{1}{e^{MSE}}\right)_{i}$$
(3)

In which

$$MSE=\frac{1}{n}\sum_{i=1}^{n}\left[a_{i}-O_{i}\right]^{2}$$
(4)
$$R^{2}=1-\frac{\sum_{i=1}^{n}\left(a_{i}-O_{i}\right)^{2}}{\sum_{i=1}^{n}\left(a_{i}-\bar{a}\right)^{2}}$$
(5)

Where \(a_{i}\), \(O_{i}\), \(\bar{a}\), \(n\), and \(N\) are the target, the output, the mean value of the target, the number of data in the training or testing step of the network, and the total number of data, respectively. Following the modeling, a GA, originally developed by John Holland20, was utilized in conjunction with the ANN to predict PGA. To simulate earthquake scenarios, a cost function based on Euclidean distance was determined as follows:

$$\text{For function } f:\mathbb{R}^{4}\to \mathbb{R},\ \text{find } \overrightarrow{x}\in \mathbb{R}^{4}\ \text{such that } \sqrt{\left(f_{n}\left(\overrightarrow{x}\right)-1.7016\right)^{2}}=0$$
(6)
$$\overrightarrow{x}_{i}=\left\{X_{1,i},X_{2,i},X_{3,i},X_{4,i}\right\},\quad i=1,\dots ,n$$
(7)

Where \(\overrightarrow{x}\) is a vector that stores the input parameters, and \(X_{1}\), \(X_{2}\), \(X_{3}\), and \(X_{4}\) are the input parameters. The hyperparameter settings of the GA in MATLAB are summarized in Table 6. The population consists of 25 individuals with a double vector representation, and the algorithm runs for a maximum of 25 generations. A uniform creation function initializes the population, with rank scaling and roulette selection enhancing diversity and selection pressure. One elite individual is carried over to the next generation. The mutation function uses adaptive feasible strategies, applied to 50% of the population, while a two-point crossover method is employed with a crossover fraction of 50%. Migration occurs forward at 10% intervals, with 90% of the population migrating. A penalty-based approach addresses nonlinear constraints, and the cost limit is set to zero.

Table 6 Value of hyperparameter settings for GA.
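The search defined by Eq. (6) can be sketched with a small genetic algorithm in Python. This is a simplified illustration rather than the MATLAB configuration of Table 6: `surrogate_ann` is a hypothetical linear stand-in for the trained network, the search bounds are assumed to be the normalized range of Eq. (1), Gaussian mutation clipped to the bounds replaces the adaptive feasible mutation, and migration and nonlinear constraints are omitted. The population size, number of generations, single elite, rank scaling, roulette selection, and two-point crossover follow the description above.

```python
import numpy as np

TARGET_PGA = 1.7016        # target value appearing in Eq. (6)
LOWER, UPPER = 0.1, 0.9    # assumed bounds: normalized input range of Eq. (1)

def surrogate_ann(x):
    """Hypothetical stand-in for the trained network f_n (not the study's model)."""
    x1, x2, x3, x4 = x
    return 3.0 * x1 + 0.2 * x2 - 2.0 * x3 + 0.5 * x4

def cost(x):
    """Euclidean distance between the predicted and the target PGA."""
    return float(np.sqrt((surrogate_ann(x) - TARGET_PGA) ** 2))

def run_ga(pop_size=25, generations=25, crossover_fraction=0.5,
           sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(LOWER, UPPER, size=(pop_size, 4))   # uniform creation
    # Rank scaling: selection weight depends only on rank (1 = best).
    weights = 1.0 / np.sqrt(np.arange(1, pop_size + 1))
    prob = weights / weights.sum()
    history = []
    for _ in range(generations):
        costs = np.array([cost(ind) for ind in pop])
        pop = pop[np.argsort(costs)]                       # best individual first
        history.append(float(costs.min()))
        children = [pop[0].copy()]                         # one elite survives
        while len(children) < pop_size:
            # Roulette selection of two parents using the rank-based weights.
            p1, p2 = pop[rng.choice(pop_size, size=2, p=prob)]
            child = p1.copy()
            if rng.random() < crossover_fraction:
                # Two-point crossover: copy the segment between two cut points.
                i, j = sorted(rng.choice(5, size=2, replace=False))
                child[i:j] = p2[i:j]
            else:
                # Gaussian mutation, clipped so the child stays within bounds.
                child = np.clip(child + rng.normal(0.0, sigma, 4), LOWER, UPPER)
            children.append(child)
        pop = np.array(children)
    costs = np.array([cost(ind) for ind in pop])
    best = int(np.argmin(costs))
    history.append(float(costs[best]))
    return pop[best], float(costs[best]), history
```

Calling `run_ga(seed=s)` for several seeds and comparing the returned costs gives a rough sense of the run-to-run variability that comes from the stochastic operators.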

Results and discussion

Figure 3a-b displays the performance of the ANN-based models during both the training and testing phases. The results indicate that the developed networks perform exceptionally well in both stages, as evidenced by the impressive R² values. Specifically, the training phase achieved an R² value of 0.9976, while the testing phase reached 0.9982. These high R² values demonstrate the excellent predictive capability of the ANN-based models.

The ANN architecture described in Fig. 3c features two hidden layers, each with 25 neurons, resulting in a configuration of 4-25-25-1. This specific setup has been identified as optimal for achieving the highest total goodness value. Figure 3d illustrates the relative importance of each input variable, revealing that soil type and epicentral distance contribute the most (41.17%) and the least (2.67%), respectively, to the output of the developed model.
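The total goodness function of Eqs. (3) to (5), used to rank candidate architectures, can be sketched as follows. The reading that the sum in Eq. (3) runs over the training and testing subsets is an interpretation of the definitions given with the equations, and the function names are hypothetical.

```python
import numpy as np

def mse(target, output):
    """Mean squared error, Eq. (4)."""
    target, output = np.asarray(target, float), np.asarray(output, float)
    return float(np.mean((target - output) ** 2))

def r2(target, output):
    """Coefficient of determination, Eq. (5)."""
    target, output = np.asarray(target, float), np.asarray(output, float)
    ss_res = np.sum((target - output) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def tgf(train, test):
    """Total goodness function, Eq. (3).

    `train` and `test` are (target, output) pairs; each subset contributes
    R^2 + exp(-MSE), weighted by its share of the total number of data.
    """
    subsets = [train, test]
    total = sum(len(t) for t, _ in subsets)
    weighted = sum(len(t) * (r2(t, o) + 1.0 / np.exp(mse(t, o)))
                   for t, o in subsets)
    return float(weighted / total)
```

Under this reading, a network that reproduces every target exactly has R² = 1 and MSE = 0 in both subsets, so the TGF reaches its upper bound of 2; any prediction error lowers it.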

To solve the problem, at the start of the optimization process, the GA assesses each individual by their cost values, targeting the individual with the minimum cost relative to the set input variables. The efficacy of the GA during the optimization is presented in Fig. 3e. It is important to highlight that the best result from the optimization is the average of 15 simulation runs, reflecting the GA’s inherent randomness. This approach ensures the robustness and reliability of the optimization results, taking into account the stochastic nature of the GA process.

Fig. 3 Performance of developed network during (a) training and (b) testing steps, (c) topology of developed network, (d) relative importance of different independent parameters, and (e) solving performance of genetic algorithm.

Overall, the combined use of ANN and GA in this study has proven to be highly effective in predicting ground motions, providing valuable insights for seismic hazard assessments and earthquake engineering. The integration of these advanced computational techniques offers a powerful tool for improving the safety and resilience of structures in earthquake-prone areas.

Figure 4 illustrates the relationships between Peak Ground Acceleration (PGA) and four key seismic factors: moment magnitude (X1), fault type (X2), epicentral distance (X3), and soil type (X4). Each plot reveals the unique and complex influence of these factors on PGA. All of these plots were obtained for a scenario with a moment magnitude of 5, fault type of 0, epicentral distance of 0.44, and soil type of 1. Starting with moment magnitude, we observe a strong non-linear correlation where larger magnitudes correspond to increased PGA values, but the relationship also contains peaks and troughs, indicating that other factors may modulate the effect. Fault type, while showing variation in PGA, has a relatively lower and more limited impact, suggesting that, for this case, fault type does not significantly affect ground acceleration compared to other parameters. Epicentral distance shows a clear inverse relationship with PGA: closer distances to the epicenter result in higher PGA, a physically intuitive outcome since seismic waves lose energy as they travel farther. Soil type, however, exhibits a non-linear effect, where certain soil types significantly amplify the ground acceleration, reflecting the well-known influence of soil characteristics on seismic wave behavior.

Fig. 4 Effect of four key seismic factors: (a) moment magnitude (X1), (b) fault type (X2), (c) epicentral distance (X3), and (d) soil type (X4) on peak ground acceleration (Y).

For this specific case, among these factors, moment magnitude and epicentral distance stand out as the most influential in determining PGA. Moment magnitude drives the energy release of an earthquake, directly affecting ground acceleration, while epicentral distance dictates how much energy reaches a specific location. Soil type plays a critical, albeit secondary, role, especially in areas prone to amplification due to soft soils. For the case under consideration, moment magnitude emerges as the most important factor, as it fundamentally governs the energy produced during an earthquake, which in turn influences all subsequent seismic behavior, including PGA.

Table 7 presents the outcomes for a set of 20 distinct fault combinations achieved by the ANN-GA system. It is well established that reverse faults can yield elevated PGAs, especially in cases characterized by substantial fault slip and proximity to the Earth’s surface. In contrast, PGAs associated with normal-oblique faults are typically lower than those from reverse and strike-slip faults, which is consistent with previous results21,22,23. Existing literature suggests that, given the same earthquake magnitude, distance to the site, and site condition, the ground motions from normal-faulting earthquakes tend to be smaller than those from strike-slip earthquakes by about 20%. Upon careful examination of the data, a notable trend becomes apparent: for reverse and reverse-oblique faults (with X2 values of 2 and 3), even when the earthquake magnitude is substantial, the optimized PGA values are observed at greater epicentral distances. This observation suggests that these fault types are capable of generating heightened PGAs even at more substantial distances from the epicenter. In the case of normal faults (with an X2 value of 1), our analysis indicates that the maximum PGA is most prominent at shorter epicentral distances, with the impact diminishing as distance increases. However, for normal-oblique faults (with an X2 value of 4), the obtained results appear inconsistent with the underlying physical principles of the problem. According to existing knowledge, the PGA values of normal-oblique faults at larger epicentral distances are typically negligible, so the model results, which suggest higher PGAs at greater distances, raise questions regarding the validity of the optimization procedure. Further investigation and validation may be warranted to reconcile this discrepancy between the model results and the known physics of normal-oblique fault behavior. It should also be noted that fault parameter uncertainty, including fault location, slip rate, rupture mechanism, and fault geometry, is crucial in predictions.

Table 7 Predicted scenarios for shallow earthquakes.

The calibration of ANN-based models for region-specific seismic data is a critical step in ensuring accurate and reliable predictions of Peak Ground Acceleration (PGA). The selected data are of high quality and capture localized features such as Vs30, fault mechanisms, and attenuation characteristics for region-specific seismic prediction. Furthermore, model robustness was enhanced through hyperparameter tuning, cross-validation, and the application of regularization techniques to mitigate overfitting. These measures help the ANN generalize from the training data to unseen events, contributing to the broader generalizability of PGA predictions. Incorporating diverse and representative datasets further strengthens the model’s adaptability to varying seismic conditions, making it suitable for practical seismic hazard evaluations.

Moreover, the analysis of the seismic data highlights a distinct pattern in PGA related to both epicentral distance and moment magnitude. The findings indicate a pronounced reduction in PGA as the epicentral distance increases, supporting the inverse square law. This law suggests that ground shaking intensity decreases as the distance from the seismic source increases, corroborating earlier research3,4,24. The study also uncovers a slight positive correlation between PGA and moment magnitude. Furthermore, it is observed that softer soils register higher PGA compared to harder soils. These insights highlight the importance of considering both epicentral distance and moment magnitude for a thorough assessment of seismic impacts on ground motion and structural integrity. This aids in the development of more effective seismic hazard assessments and infrastructure resilience strategies.

Conclusion

This study investigates the application of Artificial Neural Networks (ANN) for developing Ground Motion Prediction Equations (GMPE) using PEER NGA data. The findings demonstrate that ANNs provide a robust and effective alternative to traditional empirical methods for predicting ground motion, highlighting their potential in seismic hazard assessments and earthquake engineering. Key conclusions include:

  • The ANN model exhibits high predictive accuracy, proving to be a valuable tool for estimating ground motion, which is critical for designing safer structures in earthquake-prone regions.

  • Machine learning techniques are crucial for analyzing complex, nonlinear patterns in seismic data, thereby improving the reliability of seismic hazard assessments.

  • Optimization insights from the study point to areas with potential seismic risks, helping to prioritize safety measures.

  • There is substantial potential for further development and application of ANN models in earthquake prediction, which could enhance preparedness and mitigation strategies against earthquake impacts.

  • The study encourages ongoing research using machine learning in seismology, emphasizing the significance of advanced data-driven methods for a deeper understanding of ground motion behavior.

  • Although the ANN model includes key seismic parameters, future iterations could incorporate additional factors, such as directivity effects, dip angles, and hypocentral distances, to better capture complex seismic phenomena.

  • The model’s calibration for specific seismic regions may reduce its broader applicability. Expanding the dataset to include high-resolution, multi-regional seismic data would enhance the model’s robustness and generalizability, making it adaptable to various tectonic settings.

This research sets the stage for more sophisticated, accurate, and practical approaches to understanding and managing seismic risks.