Abstract
Peak ground acceleration (PGA) is frequently used to describe ground motions, and defining it accurately for a given zone is critical for structural engineering design. This study developed novel models for predicting PGA using a composite Artificial Neural Network-Gravitational Search Algorithm (ANN-GSA) and Response Surface Methodology (RSM). The paper presents the prediction of PGA for the seismotectonic setting of Iraq, which is considered the first such attempt for the Iraqi region. The magnitude of the earthquake, the average shear-wave velocity, the focal depth, and the distance between the station and the earthquake source were used as input parameters. The proposed models are constructed using a database of 187 previous ground motion records; this dataset is also used to evaluate the effect of each parameter on PGA. In general, the results demonstrate that the newly proposed models exhibit a high degree of correlation, mean values close to 1.0, a low coefficient of variation, low errors, and an acceptable performance index value compared to actual PGA values. However, the composite ANN-GSA model performs better than the RSM model.
Keywords Peak ground acceleration (PGA); Artificial neural network (ANN); Response Surface Methodology (RSM); Analyse factorial design; Gravitational Search Algorithm (GSA); Analysis of variance (ANOVA)
1 INTRODUCTION
Seismic hazard analysis is a critical step in the engineering phase. Seismological characteristics of earthquakes include their distance, magnitude, soil effects, and kind of faulting. The engineering parameters of an earthquake can be classified into two broad categories: 1) parameters in the response domain; and 2) parameters in the time domain. Pseudo-spectral acceleration (PSA) is a frequently used response domain parameter. Peak ground acceleration (PGA), peak ground velocity (PGV), and peak ground displacement (PGD) are the three major time-domain class parameters. Both of these categories could be used to evaluate the hazards associated with construction. It has been demonstrated that spectral parameters are more effective than time-domain parameters (Luco and Cornell 2007). However, time-domain parameters are more appropriate for applications due to their independence from the structures under consideration (Al-Zuhairi et al., 2021). As a result, PGV, PGD, and PGA are frequently used in seismic risk assessments.
PGA is a well-known earthquake engineering metric that can be utilized for structural analysis and risk assessment during a seismic event. This critical parameter can be approximated using various techniques, including physical modelling and on-site inspections (Alavi and Gandomi 2011). However, implementing such methods is inconvenient, time-consuming, costly, and frequently impossible (Gandomi et al., 2011). An alternative way of estimating PGA is to use attenuation relationships, which are critical in seismic analysis. PGA is often defined using several independent variables, including the magnitude of the earthquake, the distance between the source and the site, the local site conditions, and the features of the earthquake source (Güllü and Erçelebi 2007, Gandomi et al. 2011).
Due to the tremendous complexity and non-linearity of the PGA, it is not easy to establish a correlation between it and the predictors. Soft computing techniques (SCTs) are widely utilized in engineering research to handle a variety of classification and prediction problems (Hanoon et al. 2017a,b, Banyhussan et al. 2020), and more recently to predict ground motion characteristics (Alavi et al. 2011, Gandomi et al. 2011). SCTs are typically used to resolve complex numerical optimization problems and nonlinear systems. Numerous research problems in a variety of fields of science have been theoretically and analytically articulated utilizing soft computing techniques (Hanoon et al., 2021). The SCTs comprise several categories, for example, ANFIS (Adaptive Network-based Fuzzy Inference System), ANN (Artificial Neural Networks), SVM (Support Vector Machine), FL (Fuzzy Logic), and OA (Optimization Algorithms) (Jang and Topal 2014). Each category of soft computing also has a fine-grained set of algorithms; for example, the OA category includes the GSA (gravitational search algorithm), PSO (particle swarm optimization), GA (genetic algorithm), ABC (Artificial Bee Colony), ACO (Ant Colony Optimization), and DE (Differential Evolution). Pragmatic modelling and design using SCTs are still hotly debated topics, particularly in engineering modelling. The primary distinction between SCTs and traditional models is that SCTs are based on experimental data rather than theoretical and/or analytical derivations. SCT models are frequently complex and often cannot be expressed explicitly. As a result, they are best suited for inclusion as a component of a larger computer program, which limits their application (Hanoon et al., 2021).
The artificial neural network (ANN) is the most widely used forecasting method in soft computing, having been successfully employed to address complicated pattern identification and analysis issues in a wide variety of domains, including earthquake engineering. ANNs have been widely used in recent years due to their superior pattern recognition capability, which is advantageous for various problems. The number of hidden nodes in an ANN model is critical, as a poor choice can result in an overfitted model (Hanoon et al. 2021, Hason et al. 2021).
Prediction equations based on site geology and event propagation have been developed using experimental and physical correlations. Empirical models based on mathematical procedures are used to explain the regression analysis findings at a specific location with a massive amount of data (Mahmod et al., 2017). Seismic wave models can also be utilized when there is insufficient data available to make an accurate determination. Response Surface Methodology (RSM) and Design of Experiments (DOE) have lately been used to create correlations between attenuation and other information processing methods (Hason et al. 2020). RSM relies on the results of experiments to predict outcomes. In engineering, the RSM technique can be used to construct an acceptable model for establishing a relationship between the input factors and the responses of a given problem (Hason et al., 2020).
The primary goal of optimization methods is to find the values of a set of parameters that minimize or maximize objective functions under given constraints. The ANN-GSA and RSM are used in this study to create two models of peak ground acceleration. The proposed models are based on a dataset of 187 records of strong earthquakes that struck Iraq (2004-2020). This research primarily contributes to the expanding body of knowledge on PGA and seismic activity assessment, especially in Iraqi tectonic regions. Furthermore, this study facilitates the use of combined ANN-GSA and RSM methods for earthquake and hazard forecasting. To date, this is the first attempt to derive explicit and implicit PGA models for the Iraq tectonic region from a variety of parameters. However, there are many unanswered questions about the extent to which these variables affect PGA. Thus, a factorial design of experiments is used to examine the impact of the various parameters on PGA in Iraq. The current research is based on data from international earthquake stations, which is a limitation. The study is also limited to the selected parameters (REpi, Mw, Vs30, and FD) that were considered to affect the PGA response. A central composite design (CCD) was used to model the interactions between the parameters. According to the results, Mw and Vs30 have the most significant impact on PGA, followed by FD, while REpi has the least impact on PGA.
2 STUDY AREA
Iraq is located in south-western Asia within the longitudinal (45°38'- 45°48') and latitudinal (28°5'-37°22') coordinates (Hasan et al. 2014). The topography of Iraq resembles a basin, as it is surrounded on the north and east by mountains that rise to about 3,500 meters above sea level. Iraq has a total surface area of almost 437 thousand km2 (Abbas et al. 2020, Abbas et al. 2020).
Figure 1 depicts the seismic and topographical conditions in the research region (Iraq) and the adjacent areas. The Bitlis-Zagros Fold and Thrust Belt, which stretches from Turkey and Iraq to the Strait of Hormuz, was formed after the collision of the Anatolian plate with the Iranian plate in the north and northeast of Iraq. This belt produced the Mw 7.3 earthquake of November 2017, which killed 539 people and affected thousands of others in Iraq and Iran (Abdulnaby et al. 2014, Shafiqu and Sa’ur 2016). Plate-boundary seismicity is also linked to the boundaries in the Gulf of Aden and the Red Sea, which are both widening. The border region of Iraq, Iran, and Turkey, where the Zagros and Bitlis tectonic zones meet, experiences frequent earthquakes (Ghalib et al., 2006).
Figure 1 Seismo-geographical map of Iraq and the Arabian Peninsula (red arrows: plate motion in cm per year; blue lines: plate boundaries; red lines: country boundaries) (Abdulnaby et al. 2020).
There are numerous faults in the Mesopotamian Foredeep, including the Badra-Amarah fault (Iraq's most seismically active fault), the Euphrates fault (the boundary between the Mesopotamian Foredeep and the Stable Platform), and the Al-Refaee, Kut, and Hummar faults (north of Basra). Iraq is divided into three main tectonic areas (Fouad and Sissakian 2011), which are, in order from northeast to southwest: the Bitlis–Zagros Fold and Thrust Belt, the Mesopotamian Foredeep, and the Inner (stable) Arabian Plate (Onur et al. 2017). Seismological and seismotectonic studies in Iraq have clearly shown that seismic activity varies from moderate to high in the northern and north-eastern areas and decreases in the southern and south-eastern regions (Abd Alridha and Jasem 2013).
3 METHODS AND MATERIALS
3.1 A description of the database's structure
As illustrated in Figure 2, more than 1800 historical earthquakes with magnitudes ranging from 3.0 to 7.3 struck the research region. Table 1 shows that 187 ground motions occurred between 2004 and 2020, with magnitudes of 4.5 to 7.3, representing the mild, moderate, strong, and significant earthquakes in the research area and its surroundings. In order to build and verify the suggested models, the dataset was split into 150 records and 37 records, respectively. The database covers a wide range of moment magnitude (Mw), average shear-wave velocity (Vs30), focal depth (FD), and closest epicentral distance (REpi). The FD, REpi, and Mw values were taken from worldwide datasets (NOAA, CSEM-EMSC, IRIS, and the USGS). The actual PGA and Vs30 values, on the other hand, were sourced from the GSHAP and USGS databases, respectively. The mapping and comparison were made using ArcGIS.
Given suitable input data, models derived using SCTs can be developed and then refined as further data become available. In the modelling process, the quantity of data is equally important, as it influences the accuracy of the model in its intended form. In addition, the sample size and the parameter combinations affect the performance of a model built on these inputs. To show how the seismic parameters were included in the suggested models, Table 1 summarizes the input data.
According to the recommendations, the minimum ratio of the number of datasets to the number of variables is 3 for model applicability, and a ratio greater than 5 is advised for additional safety (Frank and Todeschini 1994). Thus, of the 187 datasets, 150 datasets (80%) were utilized to create the models, while the remaining 37 datasets (20%) were used to verify the models. In the current study, the ratios for constructing and testing the suggested models were 150/5 = 30 and 37/5 = 7.4, respectively. Both ratios exceed the recommended threshold (i.e., 5).
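The ratio check described above can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name `records_per_variable` is hypothetical, and the divisor of 5 is taken directly from the text (it is assumed here to count the four inputs plus the response).

```python
def records_per_variable(n_records: int, n_variables: int) -> float:
    """Ratio of dataset size to number of variables (rule of thumb: > 5)."""
    return n_records / n_variables


N_TOTAL = 187      # total ground motion records in the database
N_VARIABLES = 5    # divisor used in the text (assumed: 4 inputs + 1 response)
RECOMMENDED = 5.0  # threshold attributed to Frank and Todeschini (1994)

n_build = round(0.80 * N_TOTAL)   # 150 records for model construction
n_verify = N_TOTAL - n_build      # 37 records for verification

ratio_build = records_per_variable(n_build, N_VARIABLES)
ratio_verify = records_per_variable(n_verify, N_VARIABLES)

print(f"construction ratio: {ratio_build:.1f}, "
      f"verification ratio: {ratio_verify:.1f}")
print("both above threshold:",
      ratio_build > RECOMMENDED and ratio_verify > RECOMMENDED)
```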
3.2 Computational Intelligence (CI)
Computational Intelligence (CI), which is also known as the soft computing technique (SCT), is commonly utilized to solve nonlinear systems, complicated mathematical optimization problems, and non-differentiable problems (Jang and Topal 2014). One of the main objectives of the present paper is to predict accurately the PGA of Iraq’s tectonic regions employing the composite of ANN and GSA. The composite algorithm is built on four effective independent input variables (Mw, Vs30, FD, and REpi) against the dependent response (PGA) as follows:

$$PGA = f(M_w, V_{s30}, FD, R_{Epi}) \qquad (1)$$
3.2.1 Artificial Neural Network (ANN)
An ANN represents a data processing system that has been developed as a generalized numerical model of biological neurons. In general, the main benefits of utilizing ANNs are their ability to detect errors and to keep training so as to improve their performance when dealing with new learning data (Gholami et al., 2013). ANN techniques are capable of modelling the complicated numerical relationship between the input parameters (i.e., Mw, Vs30, FD, and REpi) and the target response (i.e., PGA). A backpropagation (BP) neural network and the Levenberg-Marquardt (LM) training method were used in the current study to train, test, and verify the results. The LM training algorithm was selected for its speed and performance, and because it yields fewer localization errors (Payal et al. 2014). Nevertheless, the LM algorithm needs a significant quantity of operating memory (Kukolj and Levi 2004).
In the ANN, several settings must be determined before training, testing, and verifying the data, i.e., the variables, the number of hidden layers, the learning rate, and the number of outputs. Based on the variables selected in this research, which are represented by four parameters (Mw, Vs30, FD, and REpi), the network used to determine the PGA response of the tectonic study area was built from the following layers: (i) input layer; (ii) hidden layer; and (iii) output layer, as displayed in Figure 3. The input layer comprised the four parameters, i.e., the Mw, Vs30, FD, and REpi values. The input variables were multiplied by the weights of the individual connections and summed at every neuron of the hidden layer to compute the output in the output layer. Tan-sigmoid and linear activation functions were utilized in the hidden and output layers, respectively, to cover the whole range of the response values (PGA). The number of neurons in the hidden layer and the learning rate were chosen by the GSA algorithm, which determines the most suitable values of both to achieve the optimum solution. Hence, the performance of the ANN can be improved. The GSA is thus combined with the ANN to create a new algorithm, called the composite ANN-GSA, with minimum prediction error.
The BP (backpropagation) algorithm in multi-layer feed-forward networks is the standard algorithm for training complex nonlinear connections. The performance index of the BP algorithm is the mean square error (MSE), which is determined from the difference between the target and the network outputs (Eq. 2).

$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - o_i\right)^2 \qquad (2)$$

Where $N$, $t_i$, and $o_i$ represent the number of learning patterns, the target outputs, and the network outputs, respectively.
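As an illustration of the network structure and performance index described above, the following minimal sketch evaluates one forward pass of a single-hidden-layer network (tan-sigmoid hidden layer, linear output) and the mean square error. It does not implement Levenberg-Marquardt training; the function names, the random weight initialization, and the sample input values are assumptions made only for the example.

```python
import math
import random


def init_network(n_inputs, n_hidden, seed=0):
    """Random weights for an (n_inputs -> n_hidden -> 1) network (illustrative)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = rng.uniform(-1, 1)
    return w_hidden, b_hidden, w_out, b_out


def forward(x, net):
    """Weighted sum + tanh in the hidden layer, linear output neuron."""
    w_hidden, b_hidden, w_out, b_out = net
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def mse(targets, outputs):
    """Mean square error between target and network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


# Four normalized inputs (standing in for Mw, Vs30, FD, REpi), 16 hidden neurons.
net = init_network(n_inputs=4, n_hidden=16)
sample = [0.5, 0.2, 0.1, 0.8]   # made-up normalized record
print("network output:", forward(sample, net))
```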
3.2.2 GSA technique
Among the many approaches for finding a good solution to an optimization problem, heuristic algorithms are some of the most effective (Rashedi et al., 2009). GSA is a contemporary heuristic algorithm based on Newton's laws of gravitation and motion. Suitable structural variables and input parameter values are necessary for an ANN to reflect the correlation between inputs and output. Consequently, the challenge concerns the number of hidden nodes and the learning rate used in an ANN algorithm. Hybridized versions of soft computing models have been developed to overcome these drawbacks. In this study, these problems are addressed by using the gravitational search algorithm (GSA) to search for the optimal values. The PSO algorithm is one of the distinguished optimization approaches introduced in the literature within the soft computing implementation for related structural engineering problems (Yaseen et al., 2018, Chen et al. 2018, Kaveh and Talatahari 2012). In order to determine the optimal ANN variables (the number of neurons in every hidden layer and the learning rate), the GSA (heuristic algorithm) is merged with the ANN.
In the suggested algorithm, the agents are treated as objects, and their masses are used to gauge their performance. The objects attract one another according to Newton's equations of motion and gravity (Schutz 2003). The effect of one mass on the other masses is illustrated in Figure 4. The heavier masses in this figure correspond to good solutions and travel more slowly than the lighter masses, which provides the exploitation step of the algorithm.
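Eq. (3) defines the agent location.

$$X_i = \left(x_i^1, \ldots, x_i^d, \ldots, x_i^n\right), \quad i = 1, 2, \ldots, N \qquad (3)$$

The expressions $x_i^d$, $n$, and $N$ represent the location of the ith agent in the dth dimension, the dimension of the search space, and the number of agents, respectively. Besides, the gravitational force acting on target i from target j can be formulated as (Rashedi et al. 2009):

$$F_{ij}^d(t) = G(t)\,\frac{M_i(t)\,M_j(t)}{R_{ij}(t) + \varepsilon}\left(x_j^d(t) - x_i^d(t)\right) \qquad (4)$$

$$G(t) = G_0\, e^{-\alpha t / T} \qquad (5)$$

$$R_{ij}(t) = \left\lVert X_i(t), X_j(t) \right\rVert_2 \qquad (6)$$

Where $M_i$ and $M_j$ are the masses of the ith and jth agents, $G(t)$ is the gravitational constant at time t, $R_{ij}(t)$ is the Euclidean distance between the ith and jth agents, and $\varepsilon$ is a small constant. $G_0$ is the initial value of the gravitational constant, $\alpha$ is a decay coefficient, and $T$ is the maximum number of iterations (the total age of the system). The total force acting on the ith agent is denoted as:

$$F_i^d(t) = \sum_{j \in Kbest,\, j \neq i} rand_j\, F_{ij}^d(t) \qquad (7)$$

Where $Kbest$ is the set of the first k agents with the best fitness (objective function) value; k decreases linearly with time (Rashedi et al. 2009) and becomes 2% of the initial number of agents at the last iteration, and $rand_j$ is a random number in the interval (0, 1). According to Newton's law of motion, the agent's acceleration, velocity, and position at iteration t in dimension d are given in Eqs. (8-10):

$$a_i^d(t) = \frac{F_i^d(t)}{M_i(t)} \qquad (8)$$

$$v_i^d(t+1) = rand_i\, v_i^d(t) + a_i^d(t) \qquad (9)$$

$$x_i^d(t+1) = x_i^d(t) + v_i^d(t+1) \qquad (10)$$

According to the GSA algorithm, the gravitational and inertial masses are presumed to be equal, and they are updated from the fitness values following Rashedi et al. (2009):

$$M_{ai} = M_{pi} = M_{ii} = M_i, \quad i = 1, 2, \ldots, N \qquad (11)$$

$$m_i(t) = \frac{fit_i(t) - worst(t)}{best(t) - worst(t)} \qquad (12)$$

$$M_i(t) = \frac{m_i(t)}{\sum_{j=1}^{N} m_j(t)} \qquad (13)$$

$$best(t) = \min_{j \in \{1, \ldots, N\}} fit_j(t) \qquad (14)$$

$$worst(t) = \max_{j \in \{1, \ldots, N\}} fit_j(t) \qquad (15)$$

Where $fit_i(t)$ represents the ith agent's fitness value at iteration t, and $best(t)$ and $worst(t)$ are written for a minimization problem. The suggested algorithm stages are shown in Figure 5. A larger inertial mass improves search accuracy since the agent's movement is slow. In contrast, a larger gravitational mass attracts more agents, resulting in a quicker convergence rate.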
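The GSA steps described above can be sketched in a few dozen lines. The sketch below follows the standard formulation of Rashedi et al. (2009) for a minimization problem and is applied to a toy sphere function rather than to the ANN objective of this study; the function name `gsa_minimize`, the default values of `g0` and `alpha`, the clamping of positions to the bounds, and the single random factor per agent pair are all assumptions made for illustration.

```python
import math
import random


def gsa_minimize(objective, bounds, n_agents=20, n_iter=100,
                 g0=100.0, alpha=20.0, seed=0):
    """Minimal gravitational search sketch (minimization)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    k_final = max(1, round(0.02 * n_agents))  # Kbest size at the last iteration
    best_x, best_f = None, math.inf

    for t in range(n_iter):
        fit = [objective(x) for x in pos]
        for x, f in zip(pos, fit):            # remember the best point seen so far
            if f < best_f:
                best_x, best_f = list(x), f

        # Masses from fitness: the best agent is heaviest, the worst has mass 0.
        best, worst = min(fit), max(fit)
        if best == worst:
            raw = [1.0] * n_agents
        else:
            raw = [(f - worst) / (best - worst) for f in fit]
        total = sum(raw)
        mass = [m / total for m in raw]

        g = g0 * math.exp(-alpha * t / n_iter)  # decaying gravitational constant
        k = round(n_agents + (k_final - n_agents) * t / max(1, n_iter - 1))
        elite = sorted(range(n_agents), key=fit.__getitem__)[:k]

        # Acceleration a_i = F_i / M_i, so the agent's own mass cancels out.
        acc = [[0.0] * dim for _ in range(n_agents)]
        for i in range(n_agents):
            for j in elite:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                pull = rng.random() * g * mass[j] / (dist + 1e-12)
                for d in range(dim):
                    acc[i][d] += pull * (pos[j][d] - pos[i][d])

        # Velocity and position updates, clamped to the search bounds.
        for i in range(n_agents):
            for d in range(dim):
                vel[i][d] = rng.random() * vel[i][d] + acc[i][d]
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

    return best_x, best_f


def sphere(x):
    """Toy objective with its minimum (0) at the origin."""
    return sum(v * v for v in x)


solution, value = gsa_minimize(sphere, [(-5.0, 5.0)] * 2)
print("best point:", solution, "objective:", value)
```

In the composite ANN-GSA setting, the toy objective would be replaced by a function that trains the network for a candidate number of hidden neurons and learning rate and returns its error.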
3.2.3 Composite ANN-GSA development
ANNs are trained on a dataset referred to as the training data. During training, the network’s parameters are optimized. The training procedure includes two significant actions: initialization and optimization. A solid and effective training procedure requires both initialization and optimization (Alavi and Gandomi 2011). The flow chart in Figure 6 illustrates the soft computing techniques (SCTs) used in this work.
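The mean absolute error (MAE), given by Eq. (16), is used as the measure of accuracy between the actual values and the values predicted by the proposed models (Hanoon et al. 2016):

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|PGA_{act,i} - PGA_{pred,i}\right| \qquad (16)$$

where $n$, $PGA_{act}$, and $PGA_{pred}$ represent the total number of records used for model construction and the actual and predicted values of the PGA, respectively.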
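The mean absolute error can be computed directly from paired actual and predicted values. The short sketch below is illustrative only; the function name and the sample PGA values are made up and are not taken from the study's database.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up PGA values (m/s2) used only to exercise the function.
pga_actual = [1.0, 2.0, 3.0]
pga_predicted = [1.5, 2.0, 2.0]
print("MAE:", mean_absolute_error(pga_actual, pga_predicted))
```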
3.2.4 Data pre-processing for composite ANN-GSA technique
A dataset of 150 points, representing 80% of the total records (187 points), is processed as learning data. On the other hand, the remaining dataset of 20% (37 points) verifies the composite ANN-GSA model.
The total data were divided into two main parts. The first part (80% of the total data) was further divided into three subsets, as follows:
1. Training phase: 70% of the data.
2. Testing phase: 15% of the data.
3. Validation phase: 15% of the data.
The second main part, which accounts for 20% of the total data and whose values are not included in the first part, was used as external data to verify further the model's accuracy.
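The two-level split described above can be sketched as follows. Since 150 records do not divide evenly into 70/15/15, the subset sizes used here (105, 22, and 23) are an assumption, as are the random shuffling, the seed, and the function name; the study does not state how individual records were assigned.

```python
import random


def split_records(n_total=187, n_build=150, seed=0):
    """Split record indices into training/testing/validation and external sets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)

    build, external = indices[:n_build], indices[n_build:]

    # Integer arithmetic avoids floating-point rounding of 70% and 15%.
    n_train = n_build * 70 // 100   # 105 records
    n_test = n_build * 15 // 100    # 22 records; validation takes the rest
    train = build[:n_train]
    test = build[n_train:n_train + n_test]
    validation = build[n_train + n_test:]
    return train, test, validation, external


train, test, validation, external = split_records()
print(len(train), len(test), len(validation), len(external))
```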
In implementing any ANN model, the main problem is the selection of the datasets relevant to the problem under examination. To achieve stability during the analysis in this study, all input and output datasets were normalized to the range 0 to 1 before training the network, using the max-min normalization criterion of Eq. (17). Normalizing the data records gives all of the computation parameters equal priority.

$$X_n = \frac{X - X_{min}}{X_{max} - X_{min}} \qquad (17)$$

Where $X$, $X_n$, $X_{min}$, and $X_{max}$ are the actual parameter value, the normalized value of the specific parameter, and the minimum and maximum values of the database, respectively.
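A minimal sketch of the max-min normalization is given below. The function name is hypothetical, and the sample values are chosen only to span the magnitude range (4.5 to 7.3) reported for the database.

```python
def min_max_normalize(values):
    """Scale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; range is zero")
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative magnitudes spanning the range reported for the database.
magnitudes = [4.5, 5.9, 7.3]
print(min_max_normalize(magnitudes))
```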
3.3 Statistical model designation by RSM
To construct a new statistical model, two phases should be considered: (i) deriving the final model according to available datasets; and (ii) carrying out a parametric analysis according to the principles of engineering and the physics of the problem. The first phase is carried out using the Response Surface Methodology (RSM). The second phase focuses on engineering principles and must be carried out through a parametric analysis by an engineer who understands the problem being modelled. This study mainly aims to analyze the effect of individual parameters on PGA.
RSM is a statistical and mathematical technique that models and evaluates problems using second-order polynomial (quadratic) models (Box and Draper 1987). With this technique, an optimal output value can be found by exposing the response to various factors (Montgomery 2017). The central composite design (CCD) technique of the response surface methodology is extremely flexible when the preliminary lower and upper limits of the intended parameters are exceeded. Using the CCD process, it is still possible to obtain the best possible values for parameters that lie well outside the initial fixed range. This study aims to learn more about the factors that determine the PGA. To determine how the design parameters interact, the CCD approach of RSM was utilized in conjunction with the DOE method (a statistical approach). This method checks the impact of the parameters on the selected response (Antony 2014).
Factorial designs make it possible to discover the ideal combination of factors and their interactions, which is not possible with standard optimization techniques. In addition, mathematical models can be generated using these designs as a starting point. For the RSM, Eq. (18) represents a second-order polynomial model, which is commonly used to assess the impact of many variables on a response based on the datasets collected:

$$y = \beta_0 + \sum_{i=1}^{k}\beta_i x_i + \sum_{i=1}^{k}\beta_{ii} x_i^2 + \sum_{i=1}^{k-1}\sum_{j=i+1}^{k}\beta_{ij} x_i x_j \qquad (18)$$

Where $y$ denotes the estimated response, $x_i$ and $x_j$ are coded variables, $k$ is the number of variables, $\beta_0$ denotes the constant, $\beta_i$ denotes the linear coefficient, $\beta_{ii}$ denotes the quadratic coefficient, and $\beta_{ij}$ denotes the interactive coefficient (Montgomery 2017). The most significant characteristics illustrating PGA behaviour were picked after conducting a trial study and a literature review (Gandomi et al. 2016). As a result, the formulation of PGA must take into account the link between the response and the specified parameters as described before in Eq. (1). The RSM mechanism is depicted in Figure 7.
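To make the structure of the second-order model concrete, the sketch below expands a vector of coded variables into its constant, linear, quadratic, and interaction terms and evaluates the polynomial for a given coefficient vector. The function names and the ordering of the terms are assumptions for illustration, and no coefficients from the study are used.

```python
from itertools import combinations


def quadratic_terms(x):
    """Terms of a full second-order model: 1, x_i, x_i^2, and x_i * x_j (i < j)."""
    terms = [1.0]
    terms += list(x)
    terms += [v * v for v in x]
    terms += [a * b for a, b in combinations(x, 2)]
    return terms


def predict(coefficients, x):
    """Evaluate the second-order polynomial for coded variables x."""
    terms = quadratic_terms(x)
    if len(coefficients) != len(terms):
        raise ValueError(f"expected {len(terms)} coefficients")
    return sum(c * t for c, t in zip(coefficients, terms))


# With four coded variables the full model has 1 + 4 + 4 + 6 = 15 terms.
print(len(quadratic_terms([0.0, 0.0, 0.0, 0.0])))
```

In practice the coefficients would be estimated by least squares from the design points, which statistical packages such as Minitab perform internally.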
4 RESULTS AND DISCUSSION
The results of the two proposed models, the composite ANN-GSA algorithm and the RSM, are presented and discussed in this section, followed by a comparison of the proposed models with the actual PGA values. Validation and verification are conducted to investigate the degree of accuracy of the implicit and explicit models. Finally, the impact of the independent variables on the dependent response (PGA) is examined to identify which parameters have a significant influence when selecting the ground motion components.
4.1 Composite ANN-GSA algorithm model
It should be noted that no single algorithm produces an exact outcome for all optimization problems. Therefore, the GSA algorithm was run with different population sizes (60, 80, and 100) to identify the population size that achieved the minimum objective function, as shown in Table 2. To achieve the minimum error between the actual and predicted PGAs, the training operation of the ANN was repeated numerous times using a high number of iterations (i.e., 1000).
GSA parameters, number of neurons in each hidden layer, and ANN learning rate based on population size.
MATLAB software was used to run the algorithm for the selected population sizes to obtain the number of neurons in the hidden layer and the learning rate, as shown in Table 3. The mean absolute error (MAE), which represents the objective function, of the composite ANN-GSA for the various population sizes is shown in Figure 8. This figure demonstrates that the optimal solution for the ANN-GSA is obtained with a population size of 100, since it reaches the lowest MAE after about 125 iterations, whereas population sizes of 60 and 80 need about 160 and 275 iterations, respectively, to reach the minimum error. Accordingly, the ANN was run using the variables that achieved the minimum MAE for this population size, resulting in a high calculation accuracy of the proposed model.
4.2 ANN training and validation
To choose the best number of neurons in every hidden layer and the best learning rate, the GSA is utilized to optimize the ANN, which is run using the values of the input parameters (Mw, Vs30, FD, and REpi) and the output response of the actual PGA for Iraq’s tectonic regions. The potential of different ANN structures was examined. Figure 9 depicts the ANN structure with four input variables, sixteen neurons in the hidden layer, and a single output variable. This structure was evaluated to determine the optimized conditions of the ANN-GSA model for Iraq’s tectonic region.
In the ANN, the learning dataset of the composite ANN-GSA is divided into three categories: training data, testing data, and validation data, as shown in the regression graphs of Figure 10. The x-axis (target) and y-axis (output) represent the actual and predicted PGA, respectively. The correlation coefficients (R) for the training, testing, and validation records show a good correlation between the actual PGA and the PGA predicted by the ANN-GSA algorithm.
Most of the residual errors (differences between target and output) are small and lie near the zero-error line, as revealed by the 20-bin error histogram of Figure 11. The performance of the LM algorithm (Levenberg-Marquardt backpropagation) throughout training is plotted in Figure 12 using MATLAB R2019b. The variation of the MSE (mean square error) with training epochs is drawn, where the best validation performance is 0.897 at epoch 2.
4.3 PGA model using RSM
To capture the PGA response against the other variables (Mw, Vs30, FD, and REpi), response surface methodology (RSM) is employed using Minitab software. The ANOVA (analysis of variance) and model summary results are presented in Table 3. Moreover, the T-value and P-value are two vital factors in evaluating the correlation between variables. The T-value and P-value are connected (Winship and Zhuo 2020); they go hand in hand, like Tweedledee and Tweedledum. The closer the T-value is to 0 (whether negative or positive), the more probable it is that there is no notable difference between variables, as shown in Table 3. The P-value is used to determine the strength of the evidence in the data provided. Generally, the lower the P-value, the greater the sample evidence for a significant correlation (Chaubey 1993). By convention, a P-value higher than 5% is called not statistically significant and vice versa (Altman and Bland 1995). Accordingly, the factors in Table 3 with a P-value < 0.05 are statistically significant in the findings.
A practical measure of the degree of multi-collinearity in a regression is the Variance Inflation Factor (VIF), which quantifies how strongly two or more predictors in the regression are associated. For more information, see Robinson and Schumacker (2009). The VIF is, therefore, an essential part of examining interaction effects in multiple regression. According to Table 3, the VIF values are around 1, which indicates that the regression is well conditioned and free of multi-collinearity.
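The meaning of a VIF close to 1 can be illustrated with a simplified two-predictor case, where the VIF reduces to 1/(1 - r^2) and r is the Pearson correlation between the two predictors. With four predictors, as in this study, each VIF comes from a multiple regression of one predictor on the others, which is not shown here. The function names and the sample values are assumptions for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Uncorrelated predictors give VIF = 1; correlated predictors inflate it.
print(vif_two_predictors([1, 2, 3, 4], [1, -1, -1, 1]))
print(vif_two_predictors([1, 2, 3, 4], [1, 2, 3, 5]))
```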
As mentioned previously, out of the 187 datasets, 80% (i.e., 150 datasets) are adopted for constructing the explicit model, and the remaining 20% (37 datasets) are used for verifying the final model. A factorial multi-linear regression fit is conducted to find the best correlation between the dependent response and the independent parameters (PGA versus Mw, Vs30, FD, and REpi). The explicit equation is formulated to be:
Following the building of the model, it is essential to assess its statistical trustworthiness. From a statistical point of view, a model must satisfy one condition for its fit to be accepted as reliable: the normality of the residuals. A residual denotes the difference between an experimentally determined value and the value predicted by the model.
The findings of the PGA multi-linear regression fit are shown in Figure 13 (Minitab), where the residual plots are presented for demonstration purposes. The plots of the differences between predicted and observed values show few outliers in the dataset. These graphs provide good evidence to support the use of the suggested PGA model. The reliability of the model is verified since the condition for model validity has been satisfied.
4.4 Comparative study
A subset of 20% (37 records) of the overall dataset (187 records) is employed to evaluate the models proposed by the ANN-GSA algorithm and the RSM. These data were not used in the model building process. The comparisons are conducted between the prediction models, and the outcomes achieved by the proposed models on the verification records are presented in this section. The outcomes show that the proposed composite model (ANN-GSA) performs better than the model produced by the RSM.
The standard deviation (SD) measures the dispersion of the data; a smaller SD indicates less dispersion, and vice versa. The coefficient of variation (CoV), in turn, measures the relative amount of variation and reflects the agreement between output and input data. According to Pimentel-Gomes (2000), a CoV value of less than 10% indicates great precision, whereas values of 20–30% indicate low precision, and values of more than 30% indicate very low precision. Table 4 demonstrates that both proposed models have reasonable CoV values. The CoV values for the two models (ANN-GSA and RSM) were 8.739% and 10.362%, respectively, indicating good accuracy in determining the target values. Furthermore, mean values close to 1.0 were obtained for both models (1.061 and 1.07). The RSM model is slightly less accurate than the ANN-GSA model.
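The coefficient of variation of the actual-to-predicted ratios can be computed as sketched below. The use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used, and the ratios are made-up values for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """CoV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Made-up actual/predicted PGA ratios used only to exercise the function.
ratios = [0.9, 1.0, 1.1]
print(f"mean = {statistics.mean(ratios):.3f}, "
      f"CoV = {coefficient_of_variation(ratios):.3f}%")
```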
The correlation coefficient (R) is an inadequate indicator of a model's predictive effectiveness since R is not sensitive to output values multiplied by a constant (Gandomi et al. 2011). Accordingly, another recommended function should be employed to assess the performance of the proposed models. Their performance may be assessed by comparing their performance indices (PI), which combine R with the relative root mean square error (RRMSE) (Eq. 20).

$$PI = \frac{RRMSE}{1 + R} \qquad (20)$$

with

$$RRMSE = \frac{1}{\left|\bar{a}\right|}\sqrt{\frac{\sum_{i=1}^{n}\left(a_i - p_i\right)^2}{n}}, \qquad R = \frac{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)\left(p_i - \bar{p}\right)}{\sqrt{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^2 \sum_{i=1}^{n}\left(p_i - \bar{p}\right)^2}}$$

Where $a_i$ refers to the actual value, $p_i$ refers to the predicted output, $\bar{a}$ refers to the average of the actual values, and $\bar{p}$ refers to the average of the predicted outputs for $n$ samples.
The value of PI lies between 0 and $\infty$. PI and R are inversely related. Thus, to obtain a perfect prediction performance, the PI value should be close to 0, which means a higher R with a lower RRMSE. As shown in Table 5, the PI values are very small, about 0.0613 and 0.0862 for the ANN-GSA and RSM models, respectively, which indicates that the proposed models predict the actual PGA values accurately. According to Table 5, the composite model produced by the ANN-GSA algorithm exhibits better statistical performance than the model generated by the RSM approach across the reported measures, including the mean and the PI.
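The sketch below computes a performance index of the form RRMSE/(1 + R), a commonly used definition that is assumed here, and shows why R alone can be misleading: doubling every prediction leaves R unchanged while the index rises. The function names and sample values are illustrative and not taken from the study.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def rrmse(actual, predicted):
    """Root mean square error relative to the mean of the actual values."""
    n = len(actual)
    mean_a = sum(actual) / n
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return rmse / abs(mean_a)


def performance_index(actual, predicted):
    """Assumed form of the index: RRMSE / (1 + R); lower is better."""
    return rrmse(actual, predicted) / (1.0 + correlation_coefficient(actual, predicted))


actual = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]   # every prediction doubled: R stays 1, errors are large
print("R  =", correlation_coefficient(actual, scaled))
print("PI =", performance_index(actual, scaled))
```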
As shown in Figure 14, the high frequency of the (PGA-act./PGA-pred.) ratio in the histogram illustrates the high accuracy of the suggested equations, since a model has a respectable level of predictive accuracy when the ratio of actual to predicted values is close to 1.0.
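In addition, the absolute relative error (ARE) may be a preferred measure for assessing a model's prediction capability through the distribution of the relative errors (Bagheri et al., 2012). Expressed as a percentage, the ARE is the absolute difference between the actual and predicted values divided by the actual value.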
Ideally, the frequency should decrease as the ARE% increases. This is seen clearly in Figure 15, where the proposed models show the highest frequencies at the lowest ARE values (less than 10%) and the lowest frequencies at the largest ARE values (more than 17.5%). Therefore, the two PGA models have a very satisfactory error distribution.
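The error distribution described above can be sketched as follows. The snippet assumes the standard percentage definition of ARE (absolute difference divided by the actual value, times 100). The bin width of 2.5%, the function names, and the sample values are hypothetical choices for illustration only.

```python
def absolute_relative_error(actual, predicted):
    """ARE in percent for each record (assumed standard definition)."""
    return [100.0 * abs(a - p) / abs(a) for a, p in zip(actual, predicted)]


def are_frequencies(are_values, bin_width=2.5):
    """Count how many ARE values fall in each bin, keyed by the bin's lower edge."""
    counts = {}
    for value in are_values:
        lower_edge = bin_width * int(value // bin_width)
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical actual and predicted values, not taken from the study's database
actual = [100.0, 200.0, 50.0]
predicted = [97.0, 212.0, 50.0]
errors = absolute_relative_error(actual, predicted)
print(errors)
print(are_frequencies(errors))
```

A satisfactory model in the sense of the text would show counts that fall off as the lower edge of the bin increases.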
4.5 Parametric analyses
To evaluate the impact of individual parameters on PGA, Figure 16 (a-c) shows the PGA as a function of pairs of input parameters. It can be observed that increasing individual variables up to a specific range leads to an increase in the PGA values, as depicted in Figure 16. By contrast, the PGA decreases to less than 2.5 m/s² as another parameter increases. Thus, the parametric analyses can directly support the assessment of PGA through the selection of suitable parameters.
The results of the current study were examined further using the interaction plot for the response, as shown in Figure 17. One factor may affect the response differently at different levels of another factor, in which case the variables interact. Any point where the lines cross denotes an interaction between two variables. The non-parallel lines in the interaction plot (Figure 17 (a)) reveal a significant interaction between Mw and VS30. Similar outcomes were observed for FD (Figure 17 (b)). By contrast, a slightly weak interaction was found between REpi and the response, as presented in Figure 17 (c).
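The reading of non-parallel lines as an interaction can be made concrete with a small two-level example. This is a minimal sketch of the standard two-factor interaction contrast. The function name, the level labels, and the cell means are hypothetical and are not taken from Figure 17.

```python
def interaction_effect(cell_means):
    """Two-level interaction contrast between factors A and B.

    cell_means maps (a_level, b_level) -> mean response, with levels
    'low' or 'high'. Parallel lines in an interaction plot give 0.
    """
    # Effect of raising A while B is held at its low level
    effect_a_at_b_low = cell_means[("high", "low")] - cell_means[("low", "low")]
    # Effect of raising A while B is held at its high level
    effect_a_at_b_high = cell_means[("high", "high")] - cell_means[("low", "high")]
    # Half the difference between the two slopes
    return (effect_a_at_b_high - effect_a_at_b_low) / 2.0


# Hypothetical cell means: parallel lines (no interaction)
parallel = {("low", "low"): 1.0, ("high", "low"): 2.0,
            ("low", "high"): 1.5, ("high", "high"): 2.5}
# Hypothetical cell means: the slope changes with the level of B
non_parallel = {("low", "low"): 1.0, ("high", "low"): 2.0,
                ("low", "high"): 1.5, ("high", "high"): 3.5}
print(interaction_effect(parallel), interaction_effect(non_parallel))
```

A contrast of zero corresponds to parallel lines, while a larger absolute value corresponds to lines that diverge or cross.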
4.6 Screening analysis
Screening analysis, or analysis of a factorial design, is highly important for selecting the necessary input parameters. It offers a valuable method for assessing the contribution of every predictive parameter to the response. To that end, Minitab software was used to conduct the screening analysis by entering the actual values of the response against the other parameters. The findings of the screening analysis are shown in Figure 18 as a standardized diagram. This figure demonstrates that the parameters with the most significant impact on the peak ground acceleration (PGA) are, in order, the earthquake magnitude (Mw) and the average shear-wave velocity (VS30), followed by the two remaining parameters.
5 CONCLUSION
This research assesses and investigates the peak ground acceleration (PGA) of Iraq's tectonic zones. Based on data from 2004-2020, a database of 187 historical records was analyzed to construct implicit and explicit predictive models utilizing soft computing techniques (such as the composite ANN-GSA algorithm). This study contributes to the growing literature on PGA and seismic activity assessment. Besides, the contribution of this paper includes facilitating the use of artificial intelligence for earthquake and hazard predictions. The following are the main findings of this study:
- The collected datasets can be utilized to construct a novel model for forecasting peak ground acceleration, especially for the Iraq tectonic region.
- Soft computing algorithms and Response Surface Methodology (RSM) are effective, practical, and efficient tools for engineering problems. They provide an optimized solution with sufficient accuracy from various parameters and can be easily employed for predicting PGA values.
- The implicit and explicit models produced by this study are considered the first attempt to predict PGA values for the Iraqi tectonic zone. Besides, the models can be efficiently employed to provide PGA estimates in a spreadsheet or by hand calculation.
- The statistical examination resulted in mean, SD, and CoV values of 1.061, 0.093, and 8.739%, respectively, for the composite model (ANN-GSA) and about 1.07, 0.111, and 10.36%, respectively, for the RSM model. Besides, minimum values were obtained for the absolute relative error (ARE) for both models. These statistical results indicate good accuracy and agreement of the forecasts made by the soft computing models. The RSM approach is slightly less accurate than the ANN-GSA technique.
- The Mw and the VS30 are the most influential factors, in accord with the standardized plot.
- Data that can be utilized to predict the maximum PGA are rarely available. As a result, further data should be used to assess and refine the proposed peak ground acceleration models and to investigate a broader range of parameters.
One of the most prominent limitations of the current paper is the use of data only from within the Iraq region. Therefore, the current study suggests generalizing the models to a wider range that includes data from other countries. Further improvements may be made to obtain more accurate results, including employing the parameters used in this study for the related domain. Besides, additional data covering a wide range of durations and parameters for Iraq's tectonic zones are recommended for comparison with the findings obtained herein, since this research is the first attempt to propose an Iraqi PGA formula.
Acknowledgements
The authors would like to express their gratitude for the support provided by their related organizations.
- Funding: The author(s) received no financial support for this research article, authorship, or publication.
References
- Abbas, M. R., Hason, M. M., Ahmad, B. B., and Abbas, T. R. (2020). Surface roughness distribution map for Iraq using satellite data and GIS techniques. Arabian Journal of Geosciences, 13(17), 839.
- Abd Alridha N and Jasem NA, (2013). Seismicity evaluation of central and southern Iraq. Iraqi Journal of Science. 54(4), 911-918.
- Abdulnaby W, Mahdi H, Numan NMS and Al-Shukri H, (2014). Seismotectonics of the Bitlis–Zagros Fold and Thrust Belt in Northern Iraq and Surrounding Regions from Moment Tensor Analysis. Pure and Applied Geophysics. 171: (7), 1237-1250.
- Abdulnaby W, Onur T, Gök R, Shakir AM, Mahdi H, Al-Shukri H, Numan NM, Abd NA, Chlaib HK and Ameen TH (2020) Probabilistic seismic hazard assessment for Iraq. Journal of Seismology.
- Alavi AH and Gandomi AH, (2011). Prediction of principal ground-motion parameters using a hybrid method coupling artificial neural networks and simulated annealing. Computers and Structures. 89: (23-24), 2176-2194.
- Alavi AH, Gandomi AH, Modaresnezhad M and Mousavi M, (2011). New ground-motion prediction equations using multi expression programming. Journal of Earthquake Engineering. 15: (4), 511-536.
- Altman DG and Bland JM, (1995). Statistics notes: Absence of evidence is not evidence of absence. Bmj. 311: (7003), 485.
- Al-Zuhairi, A. H., Al-Ahmed, A. H. A., Hanoon, A. N., & Abdulhameed, A. A. (2021). Structural behavior of reinforced hybrid concrete columns under biaxial loading. Latin American Journal of Solids and Structures, 18(6).
- Antony J (2014). Design of experiments for engineers and scientists, Elsevier.
- Bagheri M, Bagheri M, Gandomi AH and Golbraikh A, (2012). Simple yet accurate prediction method for sublimation enthalpies of organic contaminants using their molecular structure. Thermochimica acta. 543, 96-106.
- Banyhussan QS, Hanoon AN, Al-Dahawi A, Yıldırım G and Abdulhameed AA, (2020). Development of gravitational search algorithm model for predicting packing density of cementitious pastes. Journal of Building Engineering. 27, 100946.
- Box GE and Draper NR (1987). Empirical model-building and response surfaces, John Wiley & Sons.
- Chaubey YP, (1993). Resampling-based multiple testing: Examples and methods for p-value adjustment. Technometrics. 35: (4), 450-451.
- Chen, X.L., Fu, J.P., Yao, J.L. and Gan, J.F., (2018). Prediction of shear strength for squat RC walls using a hybrid ANN–PSO model. Engineering with Computers, 34(2), 367-383
- Fouad SF and Sissakian VK, (2011). Tectonic and structural evolution of the Mesopotamia Plain. Iraqi Bulletin of Geology and Mining (4), 33-46.
- Frank IE and Todeschini R, (1994). The Data Analysis Handbook. Amsterdam, Elsevier.
- Gandomi AH, Alavi AH, Mousavi M and Tabatabaei SM, (2011). A hybrid computational approach to derive new ground-motion prediction equations. Engineering Applications of Artificial Intelligence. 24: (4), 717-732.
- Gandomi M, Soltanpour M, Zolfaghari MR and Gandomi AH, (2016). Prediction of peak ground acceleration of Iran's tectonic regions using a hybrid soft computing technique. Geoscience Frontiers. 7: (1), 75-82.
- Ghalib HA, Aleqabi GI, Ali BS, Saleh BI, Mahmood DS, Gupta IN, Wagner RA, Shore PJ, Mahmood A and Abdullah S, (2006). Seismic characteristics of northern Iraq and surrounding regions. Proceedings of the 28th Seismic Research Review: Ground-Based Nuclear Explosion Monitoring Technologies, 40-48.
- Gholami M, Cai N and Brennan R, (2013). An artificial neural network approach to the problem of wireless sensors network localization. Robotics and Computer-Integrated Manufacturing. 29: (1), 96-109.
- Güllü H and Erçelebi E, (2007). A neural network approach for attenuation relationships: An application using strong ground motion data from Turkey. Engineering Geology. 93: (3-4), 65-81.
- Hanoon AN, Al Zand AW and Yaseen ZM, (2021). Designing new hybrid artificial intelligence model for CFST beam flexural performance prediction. Engineering with Computers. 1-27.
- Hanoon AN, Jaafar M, Hejazi F and Abdul Aziz FN, (2016). Energy absorption evaluation of reinforced concrete beams under various loading rates based on particle swarm optimization technique. Engineering Optimization, 1-19.
- Hanoon AN, Jaafar M, Hejazi F and Abdul Aziz FN, (2017a). Energy absorption evaluation of reinforced concrete beams under various loading rates based on particle swarm optimization technique. Engineering Optimization. 49: (9), 1483-1501.
- Hanoon AN, Jaafar M, Hejazi F and Aziz FNA, (2017b). Strut-and-tie model for externally bonded CFRP-strengthened reinforced concrete deep beams based on particle swarm optimization algorithm: CFRP debonding and rupture. Construction and Building Materials. 147, 428-447.
- Hasan G, Al Kubaisy MA, Nahhas FH, Ali AA, Othman N and Hason MM, (2014). Sulfur Dioxide (SO2) Monitoring Over Kirkuk City Using Remote Sensing Data. Journal of Civil & Environmental Engineering. 4(5), 1.
- Hason, M. M., Hanoon, A. N., & Abdulhameed, A. A. (2021). Particle swarm optimization technique based prediction of peak ground acceleration of Iraq’s tectonic regions. Journal of King Saud University - Engineering Sciences.
- Hason MM, Hanoon AN, Al Zand AW, Abdulhameed AA and Al-Sulttani AO, (2020). Torsional Strengthening of Reinforced Concrete Beams with Externally-Bonded Fibre Reinforced Polymer: An Energy Absorption Evaluation. Civil Engineering Journal. 6, 69-85.
- Jang H and Topal E, (2014). A review of soft computing technology applications in several mining problems. Applied Soft Computing. 22, 638-651.
- Kaveh, A. and Talatahari, S., (2012). A hybrid CSS and PSO algorithm for optimal design of structures. Structural Engineering and Mechanics, 42: (6), 783-797.
- Kukolj D and Levi E, (2004). Identification of complex systems based on neural and Takagi-Sugeno fuzzy model. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics). 34: (1), 272-282.
- Luco N and Cornell CA, (2007). Structure-specific scalar intensity measures for near-source and ordinary earthquake ground motions. Earthquake Spectra. 23: (2), 357-392.
- Mahmod HM, Aznieta AFN and Gatea SJ, (2017). Evaluation of rubberized fibre mortar exposed to elevated temperature using destructive and non-destructive testing. KSCE Journal of Civil Engineering. 21: (4), 1347-1358.
- Montgomery DC (2017). Design and analysis of experiments, John Wiley & Sons.
- Onur T, Gök R, Abdulnaby W, Mahdi H, Numan NM, Al‐Shukri H, Shakir AM, Chlaib HK, Ameen TH and Abd NA, (2017). A Comprehensive Earthquake Catalogue for Iraq in Terms of Moment Magnitude. Seismological Research Letters. 88: (3), 798-811.
- Payal A, Rai CS and Reddy BVR, (2014). Artificial Neural Networks for developing localization framework in Wireless Sensor Networks. 2014 International Conference on Data Mining and Intelligent Computing (ICDMIC).
- Pimentel-Gomes F, (2000). Course of experimental statistics. Piracicaba: FEALQ. 15.
- Rashedi E, Nezamabadi-pour H and Saryazdi S, (2009). GSA: A Gravitational Search Algorithm. Information Sciences. 179: (13), 2232-2248.
- Robinson C and Schumacker RE, (2009). Interaction effects: centering, variance inflation factor, and interpretation issues. Multiple linear regression viewpoints. 35: (1), 6-11.
- Shafiqu, Q. S. A.-D. M., & Sa’ur, R. H. (2016). Data Base for Dynamic Soil Properties of Seismic Active Zones in Iraq. Journal of Engineering, 22(7), 1–18.
- Schutz B, (2003). Gravity from the Ground Up: An Introductory Guide to Gravity and General Relativity. Cambridge, Cambridge University Press.
- Winship C and Zhuo X, (2020). Interpreting t-Statistics under Publication Bias: Rough Rules of Thumb. Journal of Quantitative Criminology. 36: (2), 329-346.
- Yaseen, Z.M., Tran, M.T., Kim, S., Bakhshpoori, T. and Deo, R.C., (2018). Shear strength prediction of steel fiber reinforced concrete beam using hybrid intelligence models: a new approach. Engineering Structures, 177, 244-255.
Edited by
- Editor: Pablo Andrés Muñoz Rojas
Publication Dates
- Publication in this collection: 06 June 2022
- Date of issue: 2022
History
- Received: 07 Jan 2022
- Reviewed: 19 Apr 2022
- Accepted: 03 May 2022