Determination of groundwater potential using ensemble machine learning models in GIS (Case Study: Birjand plain)

Document Type: Original Article

Authors

1 MSc. Alumni, Department of Surveying Engineering, Faculty of Surveying Engineering and Spatial Information, University of Tehran, Tehran, Iran

2 Master of Science (MSc), Civil Engineering, Water and Hydraulic Structures, Young Researchers and Elite Club, Mashhad Branch, Islamic Azad University, Mashhad, Iran

3 Associate Professor, Department of Civil Engineering, University of Birjand, Birjand, Iran

4 M.Sc. Graduate, Department of Water and Hydraulic Structure, K. N. Toosi University of Technology, Tehran, Iran

Abstract

Predicting groundwater potential is essential for the systematic development and planning of water resources. The main purpose of this study was to develop ensemble machine learning models that combine random forest (RF), logistic regression (LR), and Naïve Bayes (NB) classifiers with the random subspace (RS) algorithm to predict groundwater potential areas in the Birjand plain. To this end, geo-hydrological data from 37 groundwater wells (number of wells, well locations, and groundwater level or water table) and 17 hydrological, topographic, geological, and environmental criteria were used. The least squares support vector machine (LSSVM) feature selection method was used to identify the effective criteria and thereby improve the performance of the machine learning algorithms. Finally, groundwater potential prediction maps were prepared using the RF-RS, LR-RS, and NB-RS models. The performance of these models was evaluated using the area under the curve (AUC) and other statistical indicators. The results showed that the RF-RS hybrid model (AUC = 0.867) has very high predictive ability for groundwater potential in the study area. Elevation was also found to be the most important criterion for predicting groundwater potential in the study area. The results of the present study can support decision-making and planning for the optimal use of groundwater resources.
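
The ensemble design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the Birjand well records, and approximates the random subspace method with `BaggingClassifier` configured to subsample features rather than samples. The sample size, the subspace fraction (`max_features=0.5`), and the number of ensemble members are illustrative assumptions, and the LSSVM feature selection step is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data (assumption): 17 columns mimic the 17 criteria,
# and the binary label mimics high/low groundwater potential.
X, y = make_classification(
    n_samples=400, n_features=17, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Base classifiers named in the abstract; hyperparameters are assumptions.
base_models = {
    "RF-RS": RandomForestClassifier(n_estimators=50, random_state=0),
    "LR-RS": LogisticRegression(max_iter=1000),
    "NB-RS": GaussianNB(),
}

auc_scores = {}
for name, base in base_models.items():
    # Random subspace: every member sees all training samples
    # (bootstrap=False) but only a random half of the features.
    rs_model = BaggingClassifier(
        base,
        n_estimators=10,
        max_features=0.5,
        bootstrap=False,
        bootstrap_features=False,
        random_state=0,
    )
    rs_model.fit(X_train, y_train)
    proba = rs_model.predict_proba(X_test)[:, 1]
    auc_scores[name] = roc_auc_score(y_test, proba)

for name, auc in auc_scores.items():
    print(f"{name}: AUC = {auc:.3f}")
```

The scores printed here reflect only the synthetic data and are not comparable to the AUC reported for the study area.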
