Introduction

The widespread distribution of groundwater makes it a potentially significant factor in human adaptation to climate variations, as well as in social and economic growth. Ensuring adequate access to water for a continuously expanding population is increasingly important. Groundwater, which accounts for approximately 34% of the Earth's freshwater, is a vital natural resource (Islam et al. 2023). A comprehensive understanding of groundwater resources in a specific region is crucial for promoting sustainable development (Tariq et al. 2022). This resource is extensively used for residential, industrial, and agricultural purposes in many regions of the world, underscoring the need for effective management and conservation across sectors (da Silva Monteiro et al. 2022).

Previous research has acknowledged that the demand for groundwater is rapidly increasing, leading to frequent overuse and significant strain on limited groundwater resources (Sadasivuni et al. 2015; Kuntamalla et al. 2018). Heightened demand for water is a critical issue in tropical and subtropical areas due to improper irrigation practices, excessive exploitation, urbanization, and climate change (Arsène et al. 2018). The movement and occurrence of groundwater are influenced by factors such as geomorphology, geology, fractures, weathering, porosity, drainage patterns, soil characteristics, and climate conditions (Das and Nag 2017). To effectively assess, monitor, and conserve potential groundwater zones, it is crucial to use appropriate tools. Geospatial approaches are among the most efficient and innovative methods for identifying groundwater potential (Basharat et al. 2022). These approaches can be implemented at both local and regional levels by analyzing integrated ground and remote sensing data within a GIS platform to obtain more precise and comprehensive insights into groundwater potential and distribution.

Pourtaghi and Pourghasemi (2014) used the frequency ratio (FR) method to delineate groundwater potential zones (GWPZs). The authors used well locations as indicators and demonstrated the effectiveness of the FR method in assessing groundwater potential. The Shannon entropy model has also shown exceptional performance in delineating groundwater potential, surpassing the accuracy achieved by the FR model. Recent case studies have shown promising outcomes through the integration of geospatial and remote sensing techniques in groundwater mapping using the analytic hierarchy process (AHP) model. When combined with geospatial and remote sensing methodologies, the AHP model has been found to be effective and efficient, with impressive predictive accuracy (Razandi et al. 2015).

China, particularly in the Yellow River basins, has a significant concentration of underground water resources (Wen et al. 2013; Hou et al. 2017). Groundwater depletion in the North China Plain (NCP) has been occurring for many years, primarily as a result of human consumption (Taylor et al. 2013). The widespread use of irrigation-based agriculture, combined with rapid industrial and population growth, has led to excessive extraction of groundwater (Huang et al. 2015; Pan et al. 2017). Over the past few decades, rising water demand has caused a decline in groundwater tables throughout the whole NCP, as seen in both on-site and satellite-based gravimetric studies (Shen et al. 2015; Zhao et al. 2019). The declining groundwater tables have had a number of consequences, such as riverbed desiccation, land subsidence, deterioration of water quality, and the requirement for deeper wells (Zheng et al. 2010). The majority of the North China Plain's deep groundwater extraction occurs in Tianjin City and the central and eastern regions of Hebei Province. The level of deep groundwater utilization in most areas of Henan Province is quite low (Shi et al. 2011). A comparative analysis and performance assessment were undertaken to evaluate the effectiveness of the methodologies used in the previously stated models, and various approaches from the existing literature were investigated and taken into consideration. The geospatial methodologies of the analytic hierarchy process (AHP) and frequency ratio (FR) were chosen for further inquiry due to their limited exploration in prior studies.

The objective was to identify the groundwater potential zones (GWPZs) and evaluate the relationship between selected conditioning factors and groundwater inventory data in order to address the growing water demands of the Henan Province North China Plain (HPNCP). This research will provide valuable guidance to policymakers and managers for future hydrogeological studies, including the identification of optimal locations for boreholes or wells. Ultimately, this will help reduce the costs associated with hydrogeological investigations. This study also supports United Nations Sustainable Development Goal 6, target 6.4, which aims to ensure sustainable withdrawals and supply of freshwater to address water scarcity.

Study area overview

The HPNCP is located in the central part of China, between latitudes 34.5° N and 36.5° N and longitudes 113° E and 116° E (Fig. 1). It covers various counties and cities within Henan Province and is located in the lower reaches of the Yellow River basin. The western region is dominated by the Taihang Mountains, while the southern area is situated near the Yellow River. The eastern border is adjacent to Shandong Province, which defines the boundaries of the region. The HPNCP is an important region in North China that encompasses several key cities and districts, i.e., Anyang, Hebi, Puyang (AHP); Jiaozuo, Xinxiang (JX); Zhengzhou, Kaifeng (ZK); Xuchang, Luohe, Pingdingshan (XLP); and Zhumadian, Nanyang (ZN). The climate in the HPNCP is transitional, with a humid subtropical climate in the south and a warm temperate climate in the north. The average annual temperature ranges from 12 to 16 °C, with the hottest months being July and August, when temperatures can reach up to 35 °C. Winters are cold, with January being the coldest month and temperatures often dropping below freezing. The area receives an average annual precipitation of 600–800 mm, concentrated in the summer months under the influence of the East Asian monsoon system.

Fig. 1 Layout base map of the Henan Province North China Plain study area

Hydrogeology of study area

The study area is characterized by Quaternary unconsolidated formations whose thickness increases gradually, from less than 100 m in the mountain pediments to over 180 m in the middle plain and more than 400 m in specific eastern areas of the Henan Plains. The aquifer primarily consists of sandy gravel and sand in the mountain pediments, transitioning to medium-coarse sand and some silt in the western region and, further east, to silt, fine sand, and mild clay. Groundwater occurs as pore water in the unconsolidated Quaternary deposits, with the aquifer media primarily composed of medium and fine sand, occasionally incorporating coarse sand (Shi et al. 2012).

Materials and methodology

Datasets

The present study used a diverse range of datasets to establish specific parameters for evaluating groundwater potential. These datasets encompassed in-situ measurements collected through field surveys and groundwater observation wells, as well as remote sensing data acquired from Sentinel-1 satellite imagery and Google Earth Pro. They were used to inventory the groundwater bodies, and the remote sensing data provided a comprehensive view of the study area. The datasets cover ten factors that are essential for determining groundwater potential. To ensure the accuracy and reliability of the data, both ground and satellite information were acquired from reputable national and international research platforms. A detailed summary of the specific data and their respective sources can be found in Table 1.

Table 1 Thematic layers and their source of collection

Methods of groundwater potential zone (GWPZ) delineation

The research was conducted in four main stages to ensure a comprehensive analysis of the groundwater potential zones (GWPZ) in the study area. The first stage involved creating an accurate inventory map of groundwater resources using geospatial analysis and well data. This phase aimed to establish a comprehensive understanding of the distribution of groundwater in the area. In the second stage, ten influential parameters were identified as key indicators for assessing groundwater potential. These parameters were derived through data analysis, using various geospatial techniques and statistical methodologies. The third stage involved applying two geospatial models, namely AHP and FR, to delineate the GWPZ. These models incorporated the influential parameters to provide a comprehensive assessment of groundwater potential in the study area. The fourth and final stage focused on validating and assessing the accuracy of the generated GWPZ. The area under the curve (AUC) technique, a robust statistical measure, was used to evaluate the performance and reliability of the models in predicting groundwater potential. Figure 2 illustrates the step-by-step methodology and the connections between activities, and Fig. 3 shows the training and testing points of the groundwater locations.

Fig. 2 Flowchart of the methodology

Fig. 3 Testing (red) and training (black) points of the groundwater locations

Surface water bodies inventory mapping

Accurately mapping surface water bodies is important for delineating groundwater potential zones (GWPZ). The inventory map was constructed using a combination of ground-based observations and remote sensing data. Sentinel-1 satellite imagery was used to identify and delineate different water bodies, and a statistical model was used to ensure accuracy. The inventory map was validated using groundwater observation wells and groundwater level data. Extensive field surveys were conducted in the HPNCP, and the inventory map was compared with and verified against actual conditions on the ground to enhance its credibility.

As a final step, the inventory data obtained in this research was divided into training and testing datasets, with a split of 70% for training and 30% for testing. The methodology used for constructing the inventory map closely aligns with the work conducted by Ismail et al. (2023), ensuring consistency and scientific rigor in the process of data collection and analysis.
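The 70/30 partition described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function name `split_inventory`, the fixed random seed, and the placeholder well identifiers are assumptions, and the count of 54 wells is taken from the Discussions section.

```python
import random


def split_inventory(points, train_fraction=0.7, seed=42):
    """Randomly split inventory points into training and testing subsets."""
    shuffled = list(points)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 54 inventoried wells.
wells = [f"well_{i:02d}" for i in range(1, 55)]
train, test = split_inventory(wells)
print(len(train), len(test))
```

With 54 points, the training subset holds 38 wells and the testing subset 16, since the 70% share is rounded to the nearest whole well.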

Preparation of parameters for conditioning groundwater potential zone (GWPZ)

The accurate selection and identification of conditioning parameters are crucial for effectively delineating the GWPZ. These parameters have a significant impact on the final output map, which makes careful consideration important (Bui et al. 2019). In the present study, ten influential conditioning factors were considered to evaluate their influence on the GWPZ in the study area: elevation, curvature, slope, rainfall, land use and land cover (LULC), soil properties, distance to roads, distance to streams, drainage density, and landform. Each parameter provides valuable insights into the hydrological characteristics and geological attributes of the research area, enabling a comprehensive evaluation of the groundwater potential. Weighted overlay analysis was employed to merge these thematic layers by assigning ranks and weights to each layer. Weighted overlay analysis is a straightforward technique that allows multiclass maps to be examined together, revealing the complex relationships and dependencies that contribute to variations in groundwater potential.

This approach supports a better understanding of the complex factors influencing groundwater availability and distribution, thereby contributing to more informed and effective water resource management decisions.
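As a hedged illustration of weighted overlay analysis, the sketch below combines ranked thematic layers pixel by pixel as the sum of layer weight times class rank. The layer names, ranks, and weights are hypothetical values chosen for the example and are not taken from this study, and small nested lists stand in for GIS rasters.

```python
def weighted_overlay(rank_rasters, weights):
    """Return the per-pixel sum of weight * rank across thematic layers.

    rank_rasters maps a layer name to a 2-D list of class ranks;
    weights maps the same layer names to weights that sum to 1.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("layer weights should sum to 1")
    names = list(rank_rasters)
    rows = len(rank_rasters[names[0]])
    cols = len(rank_rasters[names[0]][0])
    return [
        [sum(weights[n] * rank_rasters[n][r][c] for n in names) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2 x 2 layers with ranks from 1 (poor) to 5 (favorable).
ranks = {
    "slope": [[5, 1], [3, 3]],
    "rainfall": [[4, 2], [3, 5]],
}
weights = {"slope": 0.75, "rainfall": 0.25}
overlay = weighted_overlay(ranks, weights)
print(overlay)
```

Higher overlay values mark pixels where the more heavily weighted layers have favorable ranks. In practice, the same operation would be run on reclassified rasters in a GIS.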

Groundwater potential zone (GWPZ) mapping models

Groundwater presence and productivity in each aquifer are determined by a multitude of factors (Razandi et al. 2015). Using a geospatial framework, this study considered ten factors that affect groundwater, which were grouped into topographical, hydrological, geological, and other components. Thematic maps, which support the analysis of spatial correlations, were generated and superimposed onto the comprehensive inventory map, along with groundwater data. To facilitate this analysis, the conditioning factors were converted to a raster grid format with cells measuring 30 × 30 m. Subsequently, bivariate and multivariate statistical analysis methods were applied individually to compare their efficiency in delineating the GWPZ.

AHP model

The weighting of criteria was established through pairwise ranking using the rank-sum method within the analytic hierarchy process (AHP). The AHP, initially formulated by Saaty (1990) and commonly referred to as the Saaty method (Coyle 2004), serves as a valuable tool for managing complex decisions. It enables decision-makers to navigate complex problems and identify optimal solutions. Notably, the strength of the model lies in its ability to evaluate the consistency of the judgments through the consistency index and the consistency ratio (Eqs. 1 and 2). The resulting matrix was examined for consistency. If the consistency ratio is less than 0.1, the weights obtained for the criteria are applied in the analysis. The approach encompasses three fundamental principles: decomposition, relative judgment, and priority synthesis. Pairwise evaluation compares elements, such as variables and indicators, with respect to the hierarchical level immediately above. Ratings are assigned to features as numerical values within a comparison matrix. From this framework, the relative weights of all aspects are computed based on their hierarchy of importance.

The Saaty rating scale was used to determine the relative importance of factors in the pairwise matrix. This approach uses a numerical scale (ranging from 1 to 9) to assign values that indicate the relative significance of each factor with respect to every other factor. Please refer to Supplementary Data Table S1 for an illustration of the Saaty pairwise rating scale.

Step 1 Principal Eigenvalue (λmax) was computed by the eigenvector technique.

Step 2 Consistency Index (CI) was calculated from the following equation.

$$CI = \frac{\lambda_{\max} - n}{n - 1}$$
(1)

where n is the number of criteria or factors.

Step 3 The consistency ratio (CR) is a measure of the consistency of the pairwise comparison matrix, and it is provided by the equation.

$$CR = \frac{CI}{RI}$$
(2)

where RI is the random consistency index, that is, the consistency index of a randomly generated matrix, whose value was obtained from Saaty (1980). The consistency ratio (CR) should be less than 0.10 for the weights to be considered consistent; otherwise, the corresponding weights must be re-evaluated to avoid inconsistency.
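Steps 1 to 3 can be sketched in a few lines of code. The example below is an illustration under stated assumptions, not the computation used in this study. The 3 × 3 comparison matrix is hypothetical, the principal eigenvector is approximated by power iteration, and the random index values are the commonly tabulated ones attributed to Saaty (1980). The function names `ahp_weights` and `consistency_ratio` are introduced only for this sketch.

```python
# Commonly tabulated random index (RI) values for matrices of order 3 to 10.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24,
                7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector and lambda_max by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]  # normalise so the weights sum to 1
    # At convergence (A w)_i / w_i equals lambda_max for every row i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max


def consistency_ratio(lambda_max, n):
    """Compute CR = CI / RI with CI = (lambda_max - n) / (n - 1), for 3 <= n <= 10."""
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical pairwise comparison matrix for three factors (Saaty 1-9 scale).
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
weights, lambda_max = ahp_weights(pairwise)
cr = consistency_ratio(lambda_max, len(pairwise))
print([round(x, 3) for x in weights], round(lambda_max, 4), round(cr, 4))
```

For this matrix the consistency ratio falls well below the 0.10 threshold, so the resulting weights would be accepted. A larger value would signal that the pairwise judgments need to be revised.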

FR model

The frequency ratio (FR) is a prominent method in geospatial research that is used to assess correlations between various measures. Numerous studies have employed the frequency ratio method to assess groundwater potential across different geographical areas. The model is constructed by analyzing the statistical relationship between multiple conditioning factors and the presence of wells within each class of the relevant factor (Online Appendix-1). The frequency ratio is numerically represented by Eq. 3 (Pham et al. 2015).

This method provides valuable insights into the probability of groundwater presence, as determined by the spatial distribution of wells and their relationship with the influencing factors, making it a valuable tool for groundwater potential assessment and management strategies.

$$FR = \frac{N_{i}^{p} / N}{N_{i}^{lp} / N^{l}}$$
(3)

where \(N_{i}^{p}\) = the number of pixels with wells in each class of a contributing factor; N = the total number of pixels with wells in the study area; \(N_{i}^{lp}\) = the number of pixels in each class of a contributing factor; and \(N^{l}\) = the total number of pixels in a contributing factor (Table 2).
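The frequency ratio can be illustrated with a short sketch. It assumes the standard formulation, in which the share of well pixels falling in a class is divided by the share of the study area occupied by that class. The slope classes and pixel counts below are hypothetical and are not taken from this study.

```python
def frequency_ratio(well_pixels_by_class, class_pixels_by_class):
    """Compute FR for each class of one conditioning factor.

    FR = (well pixels in class / all well pixels)
         / (pixels in class / all pixels of the factor).
    """
    total_wells = sum(well_pixels_by_class.values())
    total_pixels = sum(class_pixels_by_class.values())
    if total_wells == 0 or total_pixels == 0:
        raise ValueError("need at least one well pixel and one class pixel")
    fr = {}
    for cls, n_class in class_pixels_by_class.items():
        if n_class == 0:
            raise ValueError(f"class {cls!r} has no pixels")
        n_wells = well_pixels_by_class.get(cls, 0)
        fr[cls] = (n_wells / total_wells) / (n_class / total_pixels)
    return fr


# Hypothetical slope classes (degrees) with invented pixel counts.
wells_per_class = {"0-3": 30, "3-10": 6, "10-21": 2, ">21": 0}
pixels_per_class = {"0-3": 6000, "3-10": 2500, "10-21": 1000, ">21": 500}
for cls, value in frequency_ratio(wells_per_class, pixels_per_class).items():
    print(cls, round(value, 2))
```

In this invented example, only the gentlest slope class has a value above 1, meaning that wells are over-represented there relative to the area it covers.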

Table 2 AHP normalized pairwise comparison matrix for GWPZ delineation

Delineation of groundwater potential

Delineation of groundwater potential using the FR model

In the FR model, the weight for each class is determined not by the features of the conditioning factor but by the spatial distribution of wells within each class. In contrast, the AHP model assigns weights based on the characteristics of the conditioning factor. The FR value is calculated for each class of every conditioning variable. A value greater than 1 indicates a strong correlation between the class and groundwater occurrence, while a value less than 1 suggests a weaker correlation. These intervals represent different zones of groundwater potential (Lee and Pradhan 2006). Eq. 4 was used to generate the groundwater potential zone (GWPZ) map for the Henan North China Plain (Guru et al. 2017).

$$GWPZ = \sum FR_{ij}$$
(4)

where FRij denotes the FR values for each conditioning variable and ∑ indicates the summation over all conditioning variables. By applying the FR model and calculating the GWPZ index using Eq. 4, the GWPZs within the HPNCP were delineated (Fig. 8a–j).
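The summation in Eq. 4 can be sketched as a per-pixel lookup and sum. This is a simplified illustration that uses small nested lists in place of GIS rasters. The factor names, class codes, and FR values are hypothetical, and the function name `gwpz_index` is introduced only for this example.

```python
def gwpz_index(class_rasters, fr_tables):
    """Sum, pixel by pixel, the FR value of the class each factor takes there.

    class_rasters maps a factor name to a 2-D list of class codes;
    fr_tables maps the same factor name to a {class code: FR value} table.
    """
    names = list(class_rasters)
    rows = len(class_rasters[names[0]])
    cols = len(class_rasters[names[0]][0])
    index = [[0.0] * cols for _ in range(rows)]
    for name in names:
        raster, table = class_rasters[name], fr_tables[name]
        for r in range(rows):
            for c in range(cols):
                index[r][c] += table[raster[r][c]]
    return index


# Hypothetical 2 x 2 class rasters for two conditioning factors.
classes = {
    "slope": [[1, 2], [1, 2]],
    "rainfall": [[1, 1], [2, 2]],
}
fr_values = {
    "slope": {1: 1.5, 2: 0.5},
    "rainfall": {1: 1.25, 2: 0.75},
}
print(gwpz_index(classes, fr_values))
```

The resulting index surface would then be grouped into potential classes, for example from very low to very high, to produce the final zone map.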

Validation of the groundwater potential zone (GWPZ) mapping

Ensuring the reliability and empirical value of the GWPZ demarcation is of utmost importance, so a validation process was implemented. Groundwater yield, a secondary indicator, was used as the validation metric, following the approach of previous researchers. Using groundwater yield rather than relying solely on specific hydraulic parameters is widely accepted for assessing the accuracy of GWPZ models (Arabameri et al. 2019). Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were employed. This analysis quantifies the accuracy of the prediction system in determining the occurrence or non-occurrence of groundwater (Shah et al. 2021). For both the analytic hierarchy process (AHP) and frequency ratio (FR) models, a higher AUC value indicates a higher level of accuracy in predicting groundwater potential. The model with the highest AUC is considered superior.
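To make the AUC criterion concrete, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen location with groundwater receives a higher model score than a randomly chosen location without it. The labels and scores are invented for illustration, and the function name `roc_auc` is an assumption of this example rather than part of the study's workflow.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical test points: 1 = well present, 0 = well absent, with model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(round(roc_auc(labels, scores), 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the two models can be compared on the same testing points by their AUC values.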

Results

The results for the study area are given below.

Elevation

The flow of groundwater and surface discharge is influenced by topography. Infiltration is positively associated with low-altitude areas, because low-lying places tend to accumulate surface water that drains through downstream networks following severe storms. The elevation of the research area is therefore a critical factor for the groundwater potential zone (GWPZ), owing to its inverse relationship with groundwater occurrence (Karimi-Rizvandi et al. 2021). For this analysis, elevation information was obtained from the ALOS World 3D 30 m digital surface model (AW3D30 DSM), which has a spatial resolution of 30 m. The elevation ranges from −29 to 1700 m above mean sea level (MSL). The elevation values were divided into five categories using the natural breaks method, as illustrated in Fig. 4A.

Fig. 4 (A) Elevation and (B) slope angle

Slope

The slope gradient is a crucial parameter that strongly influences groundwater potential, as it directly affects the infiltration of rainwater and surface runoff in the region. Steeper slopes have a negative impact on groundwater reservoirs due to increased runoff and reduced water infiltration. Gentler slopes, on the other hand, promote water percolation and form potential recharge areas (Kamali Maskooni et al. 2020). Within the HPNCP, the slope varies across five classes: < 5°, 0–3.2°, 3.3–9.9°, 10–21°, and > 35–81.4°, as illustrated in Fig. 4B. The highest recorded slope in the study area was 81.4°, while the lowest was 0°. Understanding the distribution of slope gradients is crucial for assessing groundwater potential and identifying regions that are more favorable for groundwater recharge and preservation. Slope analysis thus supports informed decisions on the sustainable management of water resources in the area.

Landform

The landform layer illustrates the various landforms and structural features associated with the GWPZ. The landform of an area depends on the structural evolution of its geological formations. Based on their origin, seven geomorphological features were identified, including plain, medium gradient hill, high gradient hill, high gradient mountain, flat, and dissected plain (Fig. 5A). Parts of the HPNCP with high-altitude rocky terrain, such as plateaus, piedmont slopes, structural and residual hills, and inselberg complexes, were found to have medium to low groundwater potential. As a result, low weights were assigned to these areas. Conversely, alluvial and floodplain areas typically exhibit ideal conditions for groundwater storage due to their high rates of infiltration and were therefore assigned higher weights.

Fig. 5 (A) Geomorphology and (B) curvature

Curvature

The curvature map provides insights into the potential for water retention across different surface areas. Generally, concave surfaces tend to collect and retain more water (Pham et al. 2019). The curvature analysis of the Henan Province North China Plain (HPNCP) used the 30-m resolution ALOS DSM data. The result was categorized into concave, flat, and convex segments, as shown in Fig. 5B.

Rainfall

Climatic parameters, particularly precipitation, exert a significant influence on groundwater recharge and are essential for groundwater potential mapping (da Silva Monteiro et al. 2022). Rainfall data for 1970 to 2022 were acquired from the Climate Research Unit (CRU). The spatial distribution was mapped from these data using the inverse distance weighting (IDW) method within the ArcGIS 10.8 environment. The research region experiences an average annual precipitation of approximately 1100 mm, as depicted in Fig. 6A. To evaluate the influence of the annual average precipitation pattern on groundwater potential within the study area, the precipitation data were categorized into five groups, ranging from less than 520 mm to more than 760 mm. Regions with precipitation below 520 mm were assigned a lower weight, indicating their limited potential. Regions with precipitation exceeding 760 mm, located predominantly in the southeastern and northeastern parts of the study area, were assigned a higher weight. The weights assigned to the individual factors and their respective subclasses were determined by considering the concentration of precipitation and its impact on groundwater recharge.
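The idea behind IDW interpolation can be sketched in a few lines. This is a generic illustration and does not reproduce the ArcGIS implementation or its default settings. The station coordinates, rainfall values, and the power parameter of 2 are assumptions made for the example.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (xi, yi, value) samples."""
    numerator = 0.0
    denominator = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # the query point coincides with a sample
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical stations: (x in km, y in km, annual rainfall in mm).
stations = [(0.0, 0.0, 500.0), (10.0, 0.0, 700.0), (0.0, 10.0, 650.0)]
print(round(idw(stations, 5.0, 0.0), 1))
```

Nearby stations dominate the estimate because the weights fall off with distance raised to the chosen power. Evaluating the function on every cell of a grid yields a continuous rainfall surface.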

Fig. 6 (A) Rainfall and (B) drainage density

Drainage density

Drainage density is an important indicator of terrain permeability and has an inverse relationship with water retention. Areas with low drainage density are strong indicators of probable water retention, while higher drainage density indicates a reduced likelihood of groundwater presence due to increased runoff. In this investigation, the drainage density within the study region was calculated using the line density technique in ArcGIS 10.8. The study area was then reclassified into five classes: very low (0–0.1 km/km²), low (0.2–0.2 km/km²), moderate (0.3–0.31 km/km²), high (0.4–0.5 km/km²), and very high (0.6–1 km/km²) (Fig. 6B). Areas with greater drainage density have a higher potential for runoff, while those with low drainage density are more likely to have a higher potential for groundwater recharge.

Land use and land cover (LULC)

Groundwater recharge, occurrence, and availability are significantly influenced by land use and land cover (LULC). LULC plays a vital role in the hydrological cycle, influencing water infiltration and surface runoff patterns. Different LULC categories have distinct impacts on groundwater potential. Typically, bare ground and built-up areas exhibit lower groundwater potential, while areas with vegetation and proximity to water reservoirs show higher groundwater potential. In this research, the LULC map was produced using Sentinel-2 data and the Google Earth Engine. The map was classified into eight classes, enabling the assessment of their individual impacts on groundwater, as depicted in Fig. 7A.

Fig. 7 (A) Land use and land cover and (B) soil

Soil

Soil plays a crucial role in influencing the amount of groundwater present. Analyzing soil characteristics enables the identification of its various types and properties. The movement of groundwater and the infiltration of surface water into the ground are mainly controlled by the porosity and permeability of the soil. Hence, understanding soil is crucial for assessing local groundwater levels. The results of the soil classification are depicted in Fig. 7B. The study revealed nine soil types in the region, including Chromic Luvisols, Lithosols, Mollic Gleysols, Calcaric Gleysols, Chromic Cambisols, and Orthic Acrisols. The movement and infiltration of water differ among these soils according to their properties (Fig. 8).

Fig. 8 Groundwater-related factors for GWPZ mapping based on class FR values for the HPNCP: (a) slope, (b) elevation, (c) LULC, (d) distance to road, (e) distance to streams, (f) rainfall, (g) drainage density, (h) landform, (i) soil, and (j) curvature

Discussions

Based on the analysis of hydrologic, geologic, topographic, and climatic data, the downstream area has high groundwater potential, as indicated by the spatial and statistical analysis of the input factors. The soils of the study area, mostly Chromic Luvisols, Lithosols, and Mollic Gleysols, have considerable potential for holding groundwater because of their porosity and permeability. In this study area, fifty-four wells were analyzed to assess the potential occurrence of groundwater. Initially, thematic maps related to groundwater were created and overlaid onto the well inventory map. Using these overlapping areas, every factor related to groundwater was then evaluated with the analytic hierarchy process (AHP) and frequency ratio (FR) in the ArcGIS environment, and the layers were combined by weighted linear combination. The values assigned by the models to each class are displayed in Supplementary Data Table S1.

The AHP-based groundwater potential zone map showed that 3.34 and 2.90% of the total area have very low and low potential, respectively. There is little difference between the low and moderate zones, as the moderate zone covers only 5.04% of the total area. The high and very high zones cover 15.81 and 72.94%, respectively. The high potential zone for groundwater was determined using the AHP model (Fig. 9A). This zone is characterized by a slope area greater than 35 m, a drainage density greater than 0.2 km/km², and the presence of plains with both older and younger silt deposits, valley terrace deposits, and farmland areas. The altitude in these areas ranges from −29 to 1730 m. Areas with sloped terrain and weathered granite were identified as having a high groundwater potential. The results showed that groundwater potential is primarily influenced by elevation, soil type, landform, drainage density, curvature, land use/cover, slope angle, and rainfall.

Fig. 9 (A) AHP-based and (B) FR-based GWPZ maps of the HPNCP

According to the final groundwater potential map generated with the FR method, approximately 72% of the total area exhibits good groundwater potential. The very low, low, moderate, high, and very high groundwater potential zones cover approximately 3.2, 2.80, 5.11, 15.9, and 72.91% of the area, respectively, which together account for the entire area (Fig. 9B; Table 3).

Table 3 Coverage areas of different groundwater potential zones

The correlation between groundwater presence and altitude reveals that the elevation range of −29 to 1730 m exhibits the highest FR value. As elevation increases, the FR value decreases, indicating that regions characterized by lower elevation are associated with higher groundwater potential, especially in relation to low slope angles (0–2.8°). Flat terrain is identified as having excellent groundwater potential, because flat topography facilitates greater water penetration and storage. Classes with gently sloping terrain (2.8–9.26°) have FR values > 1 and are considered zones with significant groundwater potential due to their undulating topography and limited surface runoff. This indicates a positive correlation between low slopes and groundwater potential. However, areas with very steep terrain (> 35°) are considered weak groundwater potential zones due to high runoff and low infiltration rates. As the slope within a given area increases, the likelihood of groundwater availability decreases because of reduced infiltration capacity.

The GWPZ was also greatly influenced by parameters such as drainage density and stream distance. Stream distances of 0–2000 m and 2100–4600 m exhibit FR values > 1. Drainage density measures the total length of stream segments of all orders per unit area. It is categorized into five classes based on the range of density, namely very low (0–0.1 km/km²), low (0.2–0.2 km/km²), moderate (0.3–0.31 km/km²), high (0.4–0.5 km/km²), and very high (0.6–1 km/km²). About 61% of the studied area falls into the moderate drainage density class, indicating a high potential for groundwater occurrence. Greater drainage density indicates a higher potential for runoff, while low drainage density indicates a higher probability of groundwater recharge.

Rainfall was also taken into consideration in the GWPZ analysis. Areas that receive higher amounts of rainfall experience higher rates of groundwater recharge and therefore have higher potential. Conversely, regions with lower rainfall receive smaller weights due to their limited recharge potential. The HPNCP is a diverse region that includes plains, plateaus, hills, and mountain ranges (Fig. 5A). The geomorphology shows that the terrain is relatively flat. The map covers a plain area of 7747 km², high-gradient mountains covering 243.56 km², and medium-gradient hills spanning 533.06 km². Among these, plains are considered favorable for groundwater storage due to their gentle slopes, spacious basins, and concave shapes that can retain water. As a result, they were given a high weight in the assessment. The hills, however, primarily consist of solid rock formations with minimal soil content and lack the fractures and caverns essential for groundwater, so they were assigned a lower weight. In mountainous regions, groundwater is also derived from interstitial water within Tertiary unconsolidated coarse-grained geological formations, resulting in a moderate to high potential for groundwater availability.

Additionally, changes in land use and land cover (LULC) can alter recharge rates by influencing evapotranspiration, drainage, and the overall groundwater recharge cycle (Fig. 7A). The land is classified into eight classes: water (2%), trees (1%), flooded area (0.0%), crops (68%), built-up areas (25%), bare land (0.0%), clouds (0.0%), and rangeland (4%). The largest LULC class is cultivated land, while rangeland covers the smallest area. Water bodies, cultivated areas, and unproductive fields are recognized as significant reservoirs for storing groundwater, owing to the ability of agricultural land to facilitate water infiltration through soil pores. Land development has a significant impact because the loss of porous surfaces in urban areas hinders water infiltration into the soil. Consequently, urban areas, which are characterized by high levels of impervious surfaces, are assigned lower weights in the evaluation, reflecting their limited ability to contribute to groundwater recharge. Water bodies, crops, and rangeland received the highest weights, emphasizing their significant role in storing groundwater, whereas barren land, built-up areas, and clouds were assigned the lowest weights, as outlined in the GWPZ classification presented in Online Appendix 1.
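The way class ratings and factor weights combine into a single score can be illustrated with a weighted linear combination, a common overlay approach that is assumed here and is not necessarily the exact procedure of this study. In the sketch below, the rating scale, the factor weights, and all names are hypothetical; only the ordering of the LULC classes (water, crops, and rangeland highest; bare land, built-up areas, and clouds lowest) follows the text, and the actual values are those in Online Appendix 1.

```python
# Hypothetical LULC ratings on a 1-5 scale (5 = most favourable for recharge).
# Only the high/low ordering follows the text; the numbers are illustrative.
LULC_RATING = {
    "water": 5, "crops": 5, "rangeland": 5,
    "trees": 3, "flooded": 3,          # assumed mid-range, not from the study
    "bare": 1, "built_up": 1, "clouds": 1,
}


def potential_index(ratings, weights):
    """Weighted linear combination: sum of weight * rating over all factors."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("factor weights are expected to sum to 1")
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical two-factor example: LULC and rainfall ratings for two cells.
weights = {"lulc": 0.4, "rainfall": 0.6}
cropland_cell = {"lulc": LULC_RATING["crops"], "rainfall": 3}
urban_cell = {"lulc": LULC_RATING["built_up"], "rainfall": 3}

cropland_index = potential_index(cropland_cell, weights)
urban_index = potential_index(urban_cell, weights)
```

With identical rainfall ratings, the cropland cell scores higher than the built-up cell purely because of its LULC rating, mirroring the weighting logic described above.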

The ROC curve is a widely used technique in binary classification for assessing and comparing the effectiveness of different models. The curve plots the true positive rate (TPR) against the false positive rate (FPR) across different thresholds of a model's predicted probability (Li et al. 2021). A model that achieves a higher area under the ROC curve (AUC) is considered to have superior performance. In this study, the AHP and FR models were evaluated using the AUC metric. AHP yielded an AUC of 0.70 (70%), indicating a satisfactory balance between TPR and FPR. This suggests that the model limits, although it does not eliminate, false positives and false negatives. AHP therefore emerges as a suitable option for classification tasks that emphasize both precision and recall, as highlighted by Wahla et al. (2022). FR yielded an AUC of 0.69. This indicates a relatively high TPR combined with a relatively low FPR, making the model suitable where accurate detection of true positives is crucial and the occurrence of false positives is less important; compared with AHP, however, it implies a slightly higher FPR and a slightly lower TPR. In summary, each model has its own strengths and weaknesses, and the selection of the appropriate model depends on the specific requirements of the given task. FR strikes a balance between TPR and FPR, which is valuable when reducing false positives is crucial. AHP, in contrast, is better suited for recognizing true positives, even if it involves a trade-off with false negatives.
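How TPR and FPR are obtained across thresholds can be sketched in a few lines of Python. The example below is illustrative only: the labels (1 for a productive well location, 0 otherwise), the scores, and the function name are hypothetical and do not reproduce the data of this study.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct score used as a threshold.

    A location is predicted positive when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical validation labels and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
```

Lowering the threshold moves the curve from (0, 0) towards (1, 1); each step to the right is a false positive gained, and each step upwards is a true positive gained.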

The findings of this study reveal that the groundwater potential in most parts of the province is rated as moderate. However, regions with exceptionally high groundwater potential are predominantly located in the northern and central areas of the HPNCP. This increased potential is attributed to the abundance of sedimentary plains and gradual slopes, which facilitate efficient water infiltration. To validate the GWPZ assessment, the outputs of the AHP and FR models were compared with existing well data. The comparison showed strong agreement between the GWPZ classification and the field data, supporting the reliability and accuracy of this approach in assessing groundwater potential. Additionally, a sensitivity analysis was conducted to gain a more comprehensive understanding of the impact of individual factors on groundwater potential.

Validation of the models

Model verification is an essential stage of the modeling process, as it underpins the scientific credibility of the research (Barakat et al. 2023). The AUC method has been used in a number of studies to evaluate GWPZ performance. The receiver operating characteristic (ROC) curve is a well-accepted tool for assessing accuracy and has been applied widely to evaluate methodologies across a range of water-related research questions. The AHP and FR models in this study were both validated using the ROC graph. The two models that resulted from the final classification of GWPZ were analyzed and verified. AUC values range from 0 to 1. A value below 0.5 means that the model's classification is insufficient and must be re-evaluated, whereas a value close to 1 indicates a clearly defined result (Pourghasemi and Rossi 2016).
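The AUC and the 0.5 criterion described above can be illustrated with a short sketch that integrates a ROC curve by the trapezoidal rule. The input points and function names are hypothetical; the study's own curves are those shown in its ROC figure, not these values.

```python
def auc_from_points(points):
    """Area under a ROC curve given (FPR, TPR) points sorted by increasing FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid between neighbouring points
    return area


def needs_reevaluation(auc):
    """Apply the criterion from the text: an AUC below 0.5 is insufficient."""
    return auc < 0.5


# Hypothetical ROC curve and the diagonal of a random classifier.
example_curve = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
chance_line = [(0.0, 0.0), (1.0, 1.0)]

example_auc = auc_from_points(example_curve)
chance_auc = auc_from_points(chance_line)
```

The diagonal yields an AUC of 0.5, which is why values at or below that level indicate a classification no better than chance.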

The prediction and accuracy rates were derived from the area under the curves of the two models, AHP and FR. These models incorporated the weighted rankings of all predictor factors, along with the ten conditioning factors (Figs. 10 and 11). About 70% of the wells (95 wells) were used for model training, while the remaining 30% (54 wells) were set aside for model validation. Model 1 incorporated AHP factor weights obtained through pairwise comparison and achieved an overall success rate of 70%, the higher of the two outcomes for mapping. Model 2, based on FR, achieved a success rate of 69.9%. Both models are competent for GWPZ mapping. However, in this investigation AHP demonstrated a slightly higher predictive capability than FR. In the broader context, the AUC predictability of GWPZ was determined to be between good and excellent, as illustrated in Fig. 11.
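A random partition of the well inventory into training and validation subsets, as described above, might be sketched as follows. The well identifiers, the seed, and the function name are hypothetical; only the subset sizes (95 and 54 wells) come from the text, and the study does not state how its partition was drawn.

```python
import random


def split_wells(well_ids, n_train, seed=0):
    """Shuffle the wells reproducibly and return (training, validation) lists."""
    if not 0 < n_train < len(well_ids):
        raise ValueError("n_train must leave at least one well in each subset")
    shuffled = list(well_ids)              # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable partition
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers for the 149 wells (95 training + 54 validation).
well_ids = list(range(1, 150))
training, validation = split_wells(well_ids, n_train=95)
```

Fixing the seed keeps the partition repeatable, so both models can be trained and validated on exactly the same wells.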

Fig. 10 ROC/AUC curves for the AHP and FR models

Fig. 11 Score meter charts showing the accuracy of the GWPZ maps based on FR and AHP model performance and validation for the HPNCP

Conclusion

On the basis of the results, it is concluded that the GWPZ analysis effectively generated spatial correlation values between groundwater productivity data and each relevant factor. These values were computed using 70% of the wells as a training dataset and 30% of the wells for validation. The findings indicate that areas with good to excellent groundwater potential are closely associated with high-yield wells, which suggests that the models adeptly classified the study area in terms of well distribution. Reliability analysis further demonstrated that the AHP model, with an AUC of 70%, marginally outperformed the FR model, which had an AUC of 69.9%. These results underscore the considerable accuracy of the models and attest to the efficacy and capability of these methodologies in delineating GWPZ. The groundwater potential maps can be useful in various decision-making processes for the sustainable management of groundwater resources in the HPNCP. The research advocates the adoption of GIS technology in conjunction with remote sensing data for future groundwater investigations, an approach that can reduce cost, save time, and enhance accuracy. However, incorporating sufficient data and information that directly or indirectly influence groundwater storage and occurrence could further enhance the effectiveness of remote sensing. The methodology can help identify GWPZ areas and thereby support decision-makers in the study area in planning for sustainability.

Strengths and limitations

Our research exhibits notable strengths in the analysis of Groundwater Prospective Zones (GWPZ). By integrating Landsat-8 satellite data with observation well information, the approach significantly advances regional knowledge of GWPZ and provides precise groundwater assessment and mapping, with a specific emphasis on the identification of prospective zones. The investigation of ten groundwater conditioning factors enhances our understanding and provides guidance for future studies on groundwater zoning.

However, it is important to acknowledge certain limitations of this study. In using satellite data, some spatial and temporal resolution issues were found that affect the accuracy of the results. Furthermore, this study focuses on ten conditioning factors; other influential parameters, such as soil characteristics and socio-economic aspects, were excluded. Additionally, the focus on the Henan Province North China Plain raises concerns about the direct applicability of our findings to areas with different geographical and climatic conditions.