Introduction

Accurate and reliable predictions of groundwater flow are crucial to sustainable groundwater management methods with rising human and climate pressure on groundwater resources (Rojas et al. 2008). Without a detailed and accurate knowledge of the aquifer, rational and reliable calculation of hydrogeological parameters not only increases the precision of these experiments but also decreases the time spent on realistic study, thus further enhancing the efficiency of the work. In recent decades, significant attempts have been made to develop techniques for the determination of ideal groundwater parameters and uncertainty connected with insecurity in the quantitative model forecast in the estimated parameters. The result was a number of reverse methods for groundwater models (Carrera et al. 2005). Various aquifer systems have different hydro-geological parameter values like Transmissivity (T) (a function of hydraulic conductivity (k)) and storativity (S). Those hydraulic parameters are essential for hydro-geological and environmental studies such as the assessment of groundwater resources and aquifer vulnerability assessments. A model could have several parameters to specify, often in the absence of sufficient knowledge, leading to parametric uncertainty (Singh and Hespanha 2010). Stochastic uncertainty is mainly attributed to the difficulty of identifying the distribution of aquifer properties in space. For a long time, it has been debated in the hydrologic community how to achieve the relevant parameters for every modeling research (Marshall et al. 2004).

Estimation of the aquifer parameter is generally achieved using pumping test information. Analyzing pumping tests involves obtaining the parameters using trial-and-error, graphical approach (traditional methods), or least square error estimation. In these approaches, the hydraulic parameters are usually considered to be fixed and unique. In reality, pumping test data is noisy in field pumping experiments, meaning that no two pump tests can result in the same drawdown traces except under ideal conditions. Typically, when using the constant rate pumping test study, the data are matched with one of several standard curves according to the aquifer type. During this process, only one excellent point should be chosen, and the hydraulic parameters of the aquifer should be calculated accordingly. Once the best match point is obtained and the associated parameters are determined, these parameters are used in modeling without questioning or quantifying the uncertainty of aquifer parameters. For example, if it tried to apply the previous concept to actual pumping test data (for confined aquifer type, as an example) and tried to choose the best matching point, we would have different results for the same pumping test data. Figure 1 shows the results of two cases for choosing the best matching point to detect the values of transmissivity (T) and storativity (S); the two cases give different values for T & S.

Fig. 1
figure 1

Analysis of pumping test data for a Case (1) and b case (2)

Consequently, if one conducts several pumping tests on a well under the same conditions, different estimates of aquifer parameters may be obtained. Using single values for the aquifer hydraulic parameters may lead to erroneous and unreliable flow modeling and further transport predictions due to uncertainty arising from this estimation. To obtain the most optimal of these models of decision support, data should also be contained on the uncertainties of each decision-making option that could influence management strategy selection (Borsuk et al. 2004).

In the field of groundwater modeling, Markov chain Monte Carlo (MCMC) is used to sample the posterior distribution of an aquifer of uncertain permeability for a geostatistical model. The reverse model was created and included both direct log-hydraulic conductivity measurements using MCMC method as well as head measurements (standard state or transient) (Lu et al. 2004). Hassan et al. (2009) evaluated parameter uncertainty for a density-driven groundwater flow model utilizing a Bayesian statistical framework and MCMC sampling approach to address the inverse problem (Hassan et al. 2009).

A new algorithm has been described and proven by Fu and Gómez (2009) to produce a stochastic log conductivity based on log conductivity and piezometric head data based on Markov's Monte Carlo chain block hypothesis (Fu and Gómez 2009). Uncertainties are applied to a synthetic isotropic lognormal 2D confined aquifer with an unvarying natural-gradient flowing condition.

Jardani et al. (2012) found that there was an alluvial, heterogeneous, semi-confined aquifer transmitting field that was related to the tidal-influenced estuary (Jardani et al. 2012). Ward and Fox (2012) used Bayesian inference to quantify the parameters of aquifer uncertainty from pumping test data on three examples: the first for synthetic data derived from the Thies (1935) model, the second for synthetic and field data analyzed for the Boulton delayed yield model, and the last for synthetic data from the Hunt-Scott multilayer model (Ward and Fox 2012; Hunt and Scott 2007).

Bardsley and Fox (2012) investigated linear inverse problems with nonnegativity requirements and presented MCMC approach for such problems, whose output (samples) can be utilized to get estimates of unknowns as well as to quantify uncertainty in those estimates (Bardsley and Fox 2012). Shi et al. (2014) assessed parameter uncertainty of groundwater simulations with a specific focus on uranium reactive modeling by using the Monte Carlo Chain method as a solution to the challenge of nonlinearity models and the non-precision parameter (Shi et al. 2014). Zheng and Han (2016) assessed the applicability of MCMC in assessing the uncertainty of watershed-scale water quality (WWQ) modeling (Zheng and Han 2016).

One of the main advantages of adopting MCMC method is that it provides an estimate of the uncertainty associated with the aquifer parameters. This is quite important, especially in light of how climate change is affecting everything. The hydrological cycle is significantly more uncertain as a result of climate change. Aquifers will be subject to variations in soil behavior, groundwater levels, and recharge rates due to changes in temperatures, precipitation patterns, and streamflow (Kidmose et al. 2013, Swin et al. 2022, Osman et. al. 2022). Extreme weather conditions, such prolonged droughts, can also affect aquifer responses and add uncertainty to parameter estimation. Therefore, providing aquifer parameter range, associated uncertainties, and probability distributions is an added value to the hydrological community specially when considering climate change.

In a multi-aquifer groundwater system in China, Ha et al. (2020) investigated the applicability of various methods for calculating hydraulic parameters. A hybrid algorithm, GALMA (a genetic algorithm (GA) paired with the Levenberg–Marquardt (LM) algorithm), is used with the Neuman and Witherspoon model and ratio approach to prevent errors in the graphic-analytical process and to improve the efficiency and accuracy (Ha et al. 2020). Zhang et al. (2020) assessed the effects of the shorter pumping test time on the predicted hydraulic conductivity. The Monte Carlo approach was first used to create a heterogeneous hydraulic conductivity K field. Then, to carry out numerical pumping tests, a MODFLOW groundwater flow model was built (Zhang et al. 2020).

Singh and Tripura (2022) used pumping test analysis to estimate the hydraulic parameters in order to evaluate the formation of an aquifer system on a hilly terrain using 16 bore wells (Singh and Tripura 2022). To obtain the Aquifer hydraulic parameters, Li et al. (2023) created a novel dimensionless-form analytical solution for variable-rate pumping tests using piecewise-constant approximations for varied pumping rates (Li et al. 2023). For unconfined alluvial aquifers in Iran, Dashti et al. (2023) used machine learning models to assess transmissivity by pumping test data. Pumping tests and hydrogeological data for 96 pumping wells were gathered, with 30% of the data being utilized for testing and 70% being used for training (Dashti et al. 2023).

Finally, the main objective of this paper is to develop a tool to quantify the uncertainty in parameter estimation for confined and leaky aquifers (fully penetrating wells) obtained from traditional pumping test analysis using MCMC based on Bayesian inference technique.

Materials and methods

Theoretical background

The following sections discuss the main principals of the methods and techniques that are used in this paper.

Bayesian statistical inference

Bayesian statistical inference techniques are an alternative to the traditional statistical assessment process. Bayesian procedures enable the combination of pre-existing understanding of model parameters and observed information with the output of the model. This leads to a likelihood distribution in the parameter space that sums up uncertainty concerning the parameters; this is regarded as the reverse distribution. The possibility to utilize a precise prior should be based on existing parameter knowledge (Marshall et al. 2004). Optimum values of the preceding studies’ parameters can be calculated usefully. A parameter posterior distribution set \({\text{P}}\left( {{\Theta }/{\text{D}}} \right)\) may be found through the implementation of Bayes ' theorem as follows (Marshall et al. 2004):

$${\text{P}}\left( {\Theta /D} \right) = \frac{{P\left( \Theta \right)P\left( {D/\Theta } \right)}}{P\left( D \right)}$$
(1)

where \({\text{P}}\left( {\Theta /D} \right)\): The posterior distribution. \(P\left( {D/\Theta } \right)\): Model parameters are appropriately set, the likelihood function. \(P\left( D \right)\): The probability of observed data \(\left( D \right)\). \({\text{P}}\left( \Theta \right)\): The prior distribution.

It should be noted that if the existing information is scarce, the posterior distribution follows an analogical form as before. However, the posterior distribution is shaped more by the information than by the preceding distribution when big information sets are in existence. As well, if the previous distribution is chosen to reflect a vague previous understanding, the fresh data in the sample will predominate \({\text{ P}}\left( {{\Theta }/{\text{D}}} \right)\). This includes all the information accessible from previous knowledge as well as from data gathered.

Markov Chain Monte Carlo (MCMC)

MCMC technique is a random multi-dimensional settings method in which the value of the next sample parameter is simply dependent on the present sampling point parameter value. In MCMC, the model parameters posterior distribution is created by drawing of many thousand samples and by computing their empirical average, default and percentiles, similar to the Monte-Carlo simplicity of simulation, except because Monte Carlo pulls samples and probably not upgraded from what was actually a previous distribution (Marshall et al. 2004).

The MCMC methods are conducted by a random or autonomous chain. An autonomous chain recommends using candidate states at every stage, despite of the present state, with the same likelihood distribution. Although many separate MCMC sample algorithms exist, the Metropolis Hastings algorithm and the Adaptive Metropolis algorithm are in place. The Metropolis–Hastings algorithm creates a Markov chain series of samples which represents an altered step in the space parameter. The suggested parameter samples are made using an appropriate arbitrary probability distribution called the proposition density or the jump, Adaptive Metropolis algorithm is a possible solution, using the process history to "tune" the proposed distribution appropriately. The adaptive algorithm Metropolis (AM) continually adapts to the destination distribution.

A. Metropolis–Hastings algorithm

The use of the Metropolis–Hastings algorithm to draw samples from the rear distribution can be explained simply (Marshall et al. 2004):

  1. (1)

    With a random beginning parameter vector, start the simulation at iteration I = 0.\(\Theta = \Theta^{^\circ }\).

  2. (2)

    Generate a proposed value \(\Theta^{*}\) for \(\Theta\) from a density of the proposal according to the current value \(\Theta^{i} of \Theta\).

  3. (3)

    Compute a probability of acceptance \(\propto\) (depending on \(\Theta^{i} ,\Theta^{*}\), the density of proposal and the model) that determines whether the proposed value \(\Theta^{*}\) is accepted or not.

  4. (4)

    Accept \(\Theta^{i + 1} = \Theta^{*}\) with the likelihood \(\propto\), Otherwise \(\Theta^{i + 1} = \Theta^{i} .\)

  5. (5)

    Continue with steps 2 through 4 and do an iteration increment of I

This algorithm disregards the challenging step of assessing moments and other post-distribution data from the Bayesian reference, as a reliable estimation of these moments may be obtained once a large parameter sample is produced. As a result, the sample moments and distribution moments converge for a given beginning value and density of the proposal. When it comes to distribution, the parameter sequence \({\Theta }^{{\text{i}}} { }\) has converged to a stationary state. MCMC system utilizes either a single website or a block update to the Metropolis–Hastings algorithm. Candidate values are produced in turn in single-site updated applications for each element. A candidate value \(\theta_{i}^{\prime }\) from densities with a single \(\left( { q_{i} \left( {\Theta ,\theta_{i}^{\prime } } \right) } \right)\) is proposed that depends on the present value \(\Theta\). The transition from the present state \(\Theta = \left( {\theta_{1} , \ldots ,\theta_{i} , \ldots ,\theta_{N} } \right)\) to the proposed state \(\Theta = \left( {\theta_{1} , \ldots ,\theta_{i}^{\prime } , \ldots ,\theta_{N} } \right)\) is accepted with probability:

$$\alpha = \min \cdot \left( {1,\frac{{p\left( {\theta \prime /D} \right)q_{{i\left( {\theta \prime ,\theta_{i} } \right)}} }}{{p\left( {\theta /D} \right)q_{i} \left( {\Theta ,\theta_{i}^{\prime } } \right)}}} \right)$$
(2)
B. Adaptive metropolis algorithm

The adaptation significantly impacts the size and spatial orientation of the proposed distribution. In addition, in practice, the new algorithm is easy to introduce and apply. The description of the Adaptive Metropolis (AM) is based on the Metropolis algorithm of the classic random walk (Marshall et al. 2004) and its modification. One significant benefit of the AM algorithm is that at the start of the simulation, the cumulative data is used. This indicates that the search becomes more effective at an early point of the simulation because of the quick start of adaption, which decreases the number of features necessary for evaluation.

Research methodology

The procedures followed to achieve the research goals are summarized in the flow chart presented in Fig. 2. The main program is created as a combination between solving a forward problem and an inverse (backward) problem. The forward problem entails calculating the drawdown at a specific location for a different time based on known values of transmissivity and storativity (T and S) and hydraulic Resistance (C) using Theis equation and Hantush and Jacob equation (Batu 1998).

Fig. 2
figure 2

Flowchart for the general applied methodology

Building MCMC code (final program)

The primary MCMC program's steps for solving the inverse problem are shown in Fig. 3. The main MCMC code is developed using FORTRAN program. Identifying the hydraulic parameters, which are two (T & S) for confined aquifers and three (T & S & C) for leaking aquifers, and preparing the real data (actual drawdown) for the completed pumping test, constitute the first step of the program.

Fig. 3
figure 3

Procedure steps to create the final program (MCMC code)

The prior distribution for each hydraulic parameter is unknown so it is assumed to be uniform. At the first iteration (i), a random value for \(T_{i} , S_{i}\) is chosen from the uniform prior distribution, and they are utilized in the forward problem with the equations to produce the anticipated drawdown (calculated drawdown) at the stated location of the observed well and at the same times of the pumping test. The difference (error) between the calculated and actual drawdown for the various time periods is then determined. Then, inverse, log, and exponential functions that represent three likelihood functions are computed. The identical prior computations are then repeated for the iteration (i + 1) and another value for each of the hydraulic parameters, \(T_{i + 1} , S_{i + 1}\) is chosen randomly. In order to determine whether the chosen values \(T_{i} , S_{i}\) are accepted or not, the values of each likelihood function and the corresponding probabilities are used to calculate the value of \(\alpha\), which is compared with a random value (\(u\)). The previous steps are repeated for subsequent iterations until reaching the maximum iterations. The aforementioned processes provide a posterior distribution that includes all accepted values of T and S.

Verification of the program analytically

The verification process (see Fig. 4) is based on the available pumping test data, passes through four main steps. The first step is related to using the MCMC code. In MCMC code, three main likelihood functions (L.H.) are used: inverse, log, and exponential. For each likelihood function (L.H.), the accepted values of the hydraulic parameters (posterior distribution) are collected and two distributions lognormal and normal distribution are tested to represent the accepted values. For each assumed distribution per each likelihood function, three central tendency measures (mean, median, and mode) are tested.

Fig. 4
figure 4

Verification process for the program analytically

For each measure (mean for example), the root mean square error (RMSE) between the accepted values and the tested distribution is calculated for each hydraulic parameter (\({\text{RMSE}}_{T} , {\text{RMSE}}_{S} , {\text{RMSE}}_{C} ).\) Then some dimensionless factors are calculated \((F_{1} = {\text{RMSE}}_{T} /\mu_{T} , F_{2} = {\text{RMSE}}_{S} /\mu_{S } ,F_{3} = {\text{RMSE}}_{C} /\mu_{C} )\), and factors \((F_{{{\text{av}}{.}}} = \left( {F_{1} + F_{2} } \right)/2\)) for confined aquifers, and \((F_{{{\text{av}}{.}}} = \left( {F_{1} + F_{2} + F_{3} } \right)/3\)) for leaky aquifer (see Fig. 4). The best distribution (lognormal or normal) for each likelihood function is detected based on the minimum value of \((F_{{{\text{av}}{.}}}\)).

The second step deals with the selection of the best likelihood function and the associated best distribution (lognormal or normal) and best representing central measure to represent the case study graphically, using the standard/reference curves. Then, 90% confidence level for each hydraulic parameter is determined. The third step is to use the traditional methods (aquifer test—aquatesolve software) to detect one single value for T, S, and C. The final step is the comparison between the single values of the aquifer hydraulic parameters using the traditional methods and the confidence limit of 90% for the parameters resulting from the MCMC code.

Proof the concept numerically

By developing 3D numerical groundwater models (using the Modflow program), a hypothetical case is generated with known hydraulic parameter values for confined and semi-confined aquifer layers in order to demonstrate the effectiveness of the developed MCMC tool. To validate the idea of the program quantitatively, the aquifer parameters produced by the developed MCMC tool were compared with the known values and with the values produced by the traditional methods.

Application of the program for real case

The proposed MCMC code is applied for one actual real pumping test data in El-Wahat El-Baharya at Egypt in order to test the applicability of the MCMC code.

Results and discussion

Verification of the program analytically

Three hypothetical cases for confined aquifer and another three hypothetical cases for the semi- confined aquifer are used to verify the MCMC code (Bashandy et al. 2022), but due to the limitation of the manuscript size only one case for each aquifer type is presented and for more details about the additional two cases for each aquifer type refer to Bashandy et al. 2022.

Verification process: hypothetical case for confined aquifer

A. Description of the hypothetical case

The hypothetical case deals with 22 drawdown records for pumping test data, with a well discharge rate of 0.835 m3/minute as shown in Table 1 (Fetter 2014). The confined aquifer thickness is 14.6 m and fully penetrating well, and the observed well is located about 251.3 m away from the pumping well.

Table 1 Pumping test data for the confined aquifer–hypothetical case

B. Step 1: results of using MCMC code (comprehensive tool)

The prior distribution is assumed as uniform distribution for the parameters T&S. The lower and upper limits for the transmissivity (T) are assumed to be 0.01, and 1.0 m2/min., respectively. The lower and upper limits for the storativity (S) are assumed to be 1E−6, and 1E−4, respectively. The resulting parameters (posterior histogram) and distribution using the three likelihood functions: Inverse, log, and exponential for the accepted values for (normal and log-normal distribution) based on the mean values as an example are shown in Fig. 5.

Fig. 5
figure 5

Posterior distribution for the hypothetical case of confined aquifer, case (1)—mean values

Based on the minimum value of the dimensionless factor Fav. the best distribution for each likelihood function (based on mean values) is chosen as shown in Table 2 (see bold numbers for the selected distribution). For mean central tendency, the best distribution for inverse Likelihood function is lognormal distribution and for Log and exponential Likelihood functions the best distribution is normal distribution.

Table 2 Weighted average values for inverse-log-exponential likelihood function based on mean values as a measure for central tendency

The same previous process is repeated through using median and mode as measures for central tendency, and the results are shown in Tables 3 and 4.

Table 3 Weighted average values for inverse-log-exponential likelihood function based on median values as a measure for central tendency
Table 4 Weighted average values for inverse-log-exponential likelihood function based on mode values as a measure for central tendency

C. Step 2: selection of best likelihood function

Selection of the best likelihood function is achieved by plotting the three selected distributions for each central tendency measure with the Theis type curve and selecting the best matching one for each central tendency measure then select the best likelihood function. Figures 6, 7, and 8 show the comparison between curves resulting from mean, median and mode (transmissivity–storativity) values in case of (inverse-log-exponential) likelihood functions with Theis curve. From the previous figures, the best likelihood function is log form-normal distribution using mode values as a measure of central tendency (best values). At this case, the lower and upper limits of 90% confidence interval can be determined for T and S.

Fig. 6
figure 6

Comparison between (Inverse-Log-Exp.) likelihood function in case of mean values with Theis type curve

Fig. 7
figure 7

Comparison between (Inverse-Log-Exp.) likelihood function in case of median values with Theis type curve

Fig. 8
figure 8

Comparison between (Inverse-Log-Exp.) likelihood function in case of mode values with Theis type curve

D. Step 3: detecting the aquifer parameters using traditional methods

Two traditional software (aquifer test—aquatesolve) are used to detect the single values of the transmissivity and storativity (T &S) as follows:

  • From aquifer test program: T = 0.09 m2/min., S = 2.4E−5

  • From aquatesolve program: T = 0.099 m2/min., S = 1.4E−5

E. Step 4: comparison between results from MCMC code and traditional methods

A comparison between the values of the hydraulic aquifer parameters resulted from MCMC code and traditional methods is as shown in Table 5. Figure 9 illustrates the plots for the aquifer parameters resulted from the traditional methods and the best values and confidence interval (lower and upper limits) resulted from MCMC code using Theis curve. As shown in Fig. 9, the values of aquifer hydraulic parameters resulted from the traditional programs are located inside the 90% confidence interval resulted from MCMC code.

Table 5 Different hydrogeological values using MCMC code and traditional programs—hypothetical case (1) for confined aquifer
Fig. 9
figure 9

Best fitting comparison between results of MCMC code and traditional programs values with Theis type curve

Verification process: hypothetical case for leaky aquifer

A. Description of the hypothetical case

This case deals with a pumping well with a flow rate of 0.0946 m3/min. Time-Drawdown data from an observation well located away from the pumping well at 29 m are illustrated in Table 6 with aquifer thickness of 133 m, and aquitard thickness is 4.3 m (Fetter 2014).

Table 6 Pumping test data for the semi-confined aquifer- hypothetical case

B. Step 1: results of using MCMC code (comprehensive tool)

The lower and upper limits of the aquifer parameters (T, S, C) for uniform distributions (prior distribution) are assumed as in Table 7.

Table 7 Lower and upper limits (prior distribution) for the semi-confined aquifer- hypothetical case

The same process followed in the hypothetical case of the confined aquifer is used here. Figure 10 represents the posterior distribution resulted from the three likelihood functions assuming mean values for the measure of the central tendency. Table 8 illustrates the best selected distribution for each likelihood function based on using the mean values as an example, for more details for the results for median and mode refer to Bashandy et al. 2022.

Fig. 10
figure 10

Posterior distribution for the hypothetical case of semi-confined aquifer—mean values

Table 8 Weighted average value for inverse-log-exponential likelihood function based on mean values—semi-confined aquifer

C. Step 2: selection of best likelihood function

In order to select the best likelihood function, the best selected distributions should be plotted on a standard/reference curve. Due to that in the leaky aquifer, there is no reference curve such as Theis curve in the confined aquifer, the actual measured pumping data (drawdown vs time) is used as reference curve under name "actual". Plot best distributions of (drawdown vs time) for all likelihood functions (using mean, median, and mode), on the same graph of actual curve and select the best fitting likelihood near to the actual curve, see Fig. 11. Comparison shows that the best fitting likelihood is exponential likelihood–normal distribution—Mode value which has the best consistent curve with actual records.

Fig. 11
figure 11

Comparison of the actual curve and best distributions for the three likelihood functions; semi-confined aquifer hypothetical case

D. Step 3: detecting the aquifer parameters using traditional methods

The results of the aquifer hydraulic parameters using the traditional programs are as below:

  • From aquifer test program: T = 0.01 m2/min., S = 2.0E−4, C = 590,000 min.

  • From aquatesolve program: T = 0.0195 m2/min., S = 7.6E−5, C = 4,400,000 min.

E. Step 4: comparison between results from MCMC code and traditional methods

Table 9 contains the results of the comparison between the aquifer parameters resulted from MCMC code and traditional methods. Figure 12 illustrates the plots for the aquifer parameters resulted from the traditional methods and the best values and confidence interval (lower and upper limits) resulted from MCMC code using the actual curve. The values of aquifer hydraulic parameters in Fig. 12 resulted from the traditional programs are located inside the 90% confidence interval resulted from MCMC code.

Table 9 Different hydrogeological values using MCMC code and traditional programs—hypothetical case for semi-confined aquifer
Fig. 12
figure 12

Best fitting comparison between results of MCMC code and traditional programs values with actual curve

Proof the concept numerically (uncertainty quantification of aquifer parameters)

Building a groundwater model (using the Modflow program) as a decision tool to simulate the response of a hypothetical (confined and semi-confined) aquifer with well-known hydrogeological parameters allowed the comprehensive tool (MCMC code) results to be proven. The applied methodology consists of three key steps: Groundwater model’s preparation step, the verification step, and the uncertainty quantification step.

Through the first step of the groundwater model’s preparation, a three-dimension finite difference Processing Modflow (PM) model is originally developed, supporting the first official release of MODFLOW-88. A hypothetical case is built for confined and leaky aquifer with the following characteristics; the model domain is 20 × 5 km, divided into 200 columns and 50 rows for simulating groundwater flow (cell size 100 m × 100 m) as shown in Fig. 13. Initial groundwater levels gradient taken 2/1000, starting with 100 m at the left boundary and no flow in the north and south directions. Aquifer thickness is 200 m (confined aquifer) and aquitard thickness is 20 m (for leaky aquifer). One pumping well is assumed and located at the center of the grids (x = 10,000 m, y = 2500 m) with flow rate of 4000m3/d, one observed well is located 100 m away from the pumping well, fully penetrating wells (see Fig. 13). Finally, the hydraulic conductivity was assumed to be anisotropic and heterogeneous with well-known average values for T, S (for confined aquifer) and C for leaky aquifer.

Fig. 13
figure 13

Hypothetical model grid size for groundwater modeling

Second step deals with the verification process through getting the pumping test data form the groundwater model at the observed well, then solve for the hydrogeological parameters using the traditional methods and using MCMC code (following the same steps described in the evaluation process). Then finally compare between the real hydrogeological parameters and the same from traditional methods and the best value and 90% confidence interval using MCMC code.

Third step studies the effect of uncertainty quantification of the hydrogeological parameters on the long run. Through this step the groundwater model (using GMS Modflow software) is modified to have eight pumping wells with 4000m3/d/well with one observed well located at the center of the grids with 82 m static head as shown in Fig. 14. Run the model with several scenarios as follows:

  • Scenario (1): Use the real values of the aquifer parameters

  • Scenario (2): Use the values of the aquifer parameters form the traditional methods.

  • Scenario (3): Use the best values of the aquifer parameters from MCMC code.

  • Scenario (4): Use the lower limit of 90% level of confidence for the aquifer parameters from MCMC code

  • Scenario (5): Use the upper limit of 90% level of confidence for the aquifer parameters from MCMC code

Fig. 14
figure 14

The hypothetical model grid size for uncertainty quantification process

Plot the drawdown over 50 years for the first three scenarios, and plot the drawdown with time for the whole scenarios on one graph in order to illustrate the effect of selecting the values of aquifer parameters on the drawdown on the long run condition for the decision making. Finally, the model should be re-run for ten’s times using the aquifer parameters between the lower and upper limit from MCMC code, and determine the uncertainty quantification measures of the drawdown based on the uncertainty of the values of the aquifer parameters. In the following session one hypothetical case is illustrated for confined and leaky aquifer.

Hypothetical case for confined aquifer

For the preparation stage, the confined aquifer classified into 20 hydrogeological layers. Assign anisotropic and heterogeneous conductivity medium as a cell by cell entry method using a grid size 100 × 100 m. The real average Transmissivity is 6912 m2/d. Anisotropic and heterogeneous vertical conductivity with a ratio 10% of the horizontal conductivity is assigned as a cell by cell entry method using a grid with the same columns and rows number. Assign a homogeneous and anisotropic storativity medium as a single value for each layer in the hypothetical aquifer, with average real S = 9E−6. For the verification step, Table 10 illustrates the resulted hydrogeological parameters. For the MCMC code, the best likelihood function is inverse form—normal distribution using mean value as the measure of central tendency, for more details refer to Bashandy et al. 2022.

Table 10 Results of different hydrogeological values using MCMC code and traditional programs comparing with real data in case of confined aquifer

For the uncertainty quantification step, the groundwater model is run for the five scenarios as described before. Figure 15 illustrates the effect of using different values of the aquifer hydraulic parameters for the long run (real known values, from traditional method (aquifer test), best value from MCMC method and lower and upper limit of MCMC method). It is obvious that there are different changes in the simulated piezometric head for each value of the aquifer parameter from about 75.2 to 76.9 m. Figure 16 illustrates the comparison of the simulated drawdown map for the first three scenarios over 50 years. The maximum simulated drawdown value using the real hydrogeological parameters, the traditional methods, and best values from MCMC are 3.38, 3.05, and 3.77 m, respectively.

Fig. 15
figure 15

Drawdown versus time for the five scenarios—case of confined aquifer

Fig. 16
figure 16

Drawdown over 50 years for uncertainty quantification process—case of confined aquifer a scenario 1, b scenario 2, and c scenario 3

For the calculation of uncertainty quantification, the numerical model is run ten’s times on the simulated hydraulic parameter values resulted from the MCMC code started with the lower limit value and ended by the higher limit value of 90% confidence interval to get the simulated drawdown each time. The statistical measures of the simulated drawdown are calculated as shown in Table 11.

Table 11 Uncertainty quantification of the drawdown values- confined aquifer

The variation in the simulated drawdown in the observed well as shown in Table 11 varies from a minimum of 3.65 to a maximum of 5.35 m, while this drawdown is 3.05 m in case of using the traditional methods. So, the decision maker here cannot depend on the simulated drawdown using the aquifer parameters resulted from the traditional methods which is (3.05 m) for 50 years to take the right decision. But the decision maker should depend on the simulated drawdown using variable aquifer parameters (due to uncertainty in aquifer parameter estimation) starting for the lower limit to the upper limit resulted from MCMC method as obvious in Table 11.

So, instead of depending on one drawdown value resulted from the deterministic modeling (Traditional Programs), it is recommended to depend on the quantified drawdown based on the created range of the aquifer hydraulic parameters using MCMC method in order to take the uncertainty in the estimation of aquifer parameters into consideration.

Hypothetical case for leaky aquifer

The real average Transmissivity is 2600 m2/d, and average real storativity S = 5.7E−4 in case of semi-confined aquifer, and hydraulic resistance (C) with 1E + 5 with aquitard thickness of 20 m. Anisotropic and heterogeneous vertical conductivity with a ratio 10% of the horizontal conductivity is assigned. The results of the hydrogeological parameters are shown in Table 12. For the MCMC code, the best likelihood function is exponential likelihood—normal distribution using mode value as the measure of central tendency, for more details refer to Bashandy et al. 2022.

Table 12 Results of different hydrogeological values using MCMC code and traditional programs comparing with real data in case of semi-confined aquifer

Figure 17 illustrates the drawdown versus time for using the aquifer parameters resulted from the five scenarios. Figure 18 illustrates the comparison of the drawdown map for the first three scenarios over 50 years. The maximum simulated drawdown value using the real hydrogeological parameters, the traditional methods, and best values from MCMC are 29, 26.5, and 30.5 m, respectively.

Fig. 17
figure 17

Simulated hydraulic head resulted from all semi-confined aquifer scenarios

Fig. 18
figure 18

Drawdown over 50 years for uncertainty quantification process—case of semi-confined aquifer a scenario 1, b scenario 2, and c scenario 3

Table 13 illustrates the statistical measures for the simulated drawdown due to using several values of the hydraulic parameters (uncertainty quantification) from MCMC code started from the lower limit of 90% confidence interval and ended by the maximum limit.

Table 13 Uncertainty quantification of the drawdown values- semi-confined aquifer

Table 13 shows that there is a wide range for the simulated drawdown from 22 to 33 m through using the lower and upper limit of MCMC method, while the simulated drawdown using the traditional method is one unique value of 26.5 m. So, the decision maker should take this variation in the simulated drawdown (due to uncertainty) in order to take the right decision.

Application of the program for real case

The real case is located at El-Wahat El-Baharya which located in the eastern desert—Egypt. This was part of work for project of the groundwater aquifers exploration by groundwater sector- ministry of water resources and irrigation. The pumping well is located at coordinates (29.33605556) E, (28.68211111) N. The field data (drawdown vs time) is available for 76 record, see Table 14. The pumping well discharge is 3.37 m3/min., and aquifer thickness is 130 m. The observed well is located at 22.9 m away from the pumping well.

Table 14 Pumping test data for the confined aquifer—real case study

A. Step 1: results of using MCMC code (comprehensive tool)

By following the same methodology used in the hypothetical case for confined aquifer, Fig. 19 shows the results of parameters posterior distribution for the real case study assuming mean values as measure of central tendency. Also Table 15 contains the values of aquifer parameters for the three likelihood functions using the mean values, and the same have been developed for the median and mode cases as shown in Tables 16 and 17.

Fig. 19
figure 19

Posterior distribution for the real case of confined aquifer—mean values

Table 15 Aquifer values for inverse-log-exponential likelihood function based on mean values for the real case study
Table 16 Aquifer values for inverse-log-exponential likelihood function based on median values for the real case study
Table 17 Aquifer values for inverse-log-exponential likelihood function based on mode values for the real case study

B. Step 2: Selection of best likelihood function

Figures 20, 21, and 22 show the comparison between the best distributions resulting from mean, median and mode values. From the previous figures, the best likelihood function is inverse form—normal distribution using mode values.

Fig. 20
figure 20

Comparison between likelihood functions in case of mean values—real case study

Fig. 21
figure 21

Comparison between likelihood functions in case of median values—real case study

Fig. 22
figure 22

Comparison between likelihood functions in case of mode values—real case study

C. Step 3: Detecting the aquifer parameters using traditional methods

The results from the two traditional programs (aquifer test – aquatesolve) are as follows: -

  • From aquifer test program: T = 0.39 m2/min., S = 1.5E-4

  • From aquatesolve program: T = 0.248 m2/min., S = 7.79E-6

D. Step 4: Comparison between results from MCMC code and traditional methods

Figure 23 shows the values of aquifer hydraulic parameters for the real case. Table 18 illustrates the results for the traditional methods and MCMC code.

Fig. 23
figure 23

Best fitting comparison between results of MCMC code and traditional programs values—real case study

Table 18 Different hydrogeological values using MCMC code and traditional programs—real case study

Summary and conclusions

A new comprehensive MCMC tool is developed to predict the most reliable range of aquifer hydraulic parameters with high confidence to reduce the uncertainty in using a single value resulting from using the traditional methods. The newly created tool can assist decision-makers since numerous groundwater models are crucial, and uncertainty must be considered in establishing aquifer characteristics, especially when employing these models to make the proper decision. The new MCMC tool provides a robust framework for quantifying uncertainty in aquifer parameter estimates. Such a framework can inform real-world decision-making processes. The main advantages of adopting MCMC can be summarized as follows.

  • Decision-makers can gain insights into the range of possible values for aquifer parameters. This will result in a more informed understanding of the limitations of groundwater models, and also will prompt for relying on more dynamic and adaptive management measures.

  • By exploring the parameter space MCMC facilitates the calibration of groundwater models and leads to improved model accuracy and reliability which is one of the important assets for the decision maker.

  • Decision-makers often need to assess risks associated with various water management scenarios and need the answer of What if Scenario. MCMC allows for the generation of probabilistic predictions that enable the evaluation of potential outcomes under different scenarios. Furthermore, adopting such techniques will help in quantifying the risk and the development of adaptive strategies to address uncertainties.

  • The ability to consider parameter uncertainties allows for the identification of robust solutions that perform well across a range of possible conditions.

  • MCMC provides a framework for adaptive management, as the model can be continuously updated when new data becomes available, ensuring that models remain relevant and effective in responding to changing hydrological conditions.

  • There are several limitations and challenges are associated with the availability and quality of pumping test data. The limitation includes sparse data coverage, insufficient information about the overall aquifer behavior, limited duration of pumping test which may cause the data collected during the test does not fully capture long-term changes, limited knowledge of the subsurface geology which may impact the selected approach for the solution. Overcoming these challenges requires careful planning, robust field investigations, and the integration of various data sources and modeling techniques to improve the reliability of aquifer parameter estimations based on pumping test data. However, the proposed MCMC will provide a tool for overcoming these limitations and variability. As MCMC relies on the available data to generate meaningful posterior distributions and make reliable inferences. The better-quality data will be reflecting in less uncertainty and reduces the acceptable range of the parameter under investigation. Poor quality and limited data may lead to increased uncertainty in parameter estimation. Sparse or poor-quality data can result in broader and less precise posterior distributions. Insufficient data may also hinder the convergence of the Markov chains, making it difficult to obtain a representative sample from the posterior distribution. To overcome the limitations on data when using MCMC, additional sources of information may be incorporated, such as expert knowledge or supplementary data from different monitoring techniques. Moreover, sensitivity analyses can play an importance role to assess the reliability of the results in the absence of abundant and high-quality data.