Prediction of River TDS and EC Values Using Machine Learning Methods

Article type: Research paper

Authors

1 M.Sc. Student in Civil and Environmental Engineering, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz

2 Assistant Professor, Department of Environmental Engineering, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz

3 Assistant Professor, Shahid Chamran University of Ahvaz

Abstract

This study evaluates the performance of two models, random forest (RF) and Gaussian process regression (GPR), in simulating river electrical conductivity (EC) and total dissolved solids (TDS) at stations west of Lake Urmia (Chehriq-Olya, Dizj and Tepik), using discharge and bicarbonate values for the period 1350-1399 in the Solar Hijri calendar (1971-2020). In the test phase, the EC simulation error of the RF model at the Chehriq-Olya, Dizj and Tepik stations is about 356, 36 and 47 percent lower, respectively, than that of the GPR model. For TDS, the RF model gave lower error and higher efficiency at the Chehriq-Olya station, while the GPR model did so at the other two stations. At the Dizj and Tepik stations, no significant difference between the simulation errors of the RF and GPR models was observed in the training phase, but in the test phase the error of the GPR model is about 66 and 74 percent lower, respectively, than that of the RF model. Overall, the results showed that, given the simulated confidence bounds for EC and TDS, the performance of both models is acceptable; for TDS values, however, the two models behave differently at the Dizj and Tepik stations.

Keywords


Article title [English]

Prediction of TDS and EC Values of River Using Machine Learning Methods

Authors [English]

  • Mohammad Hadi Jabbar Matoori 1
  • Ahmad Fathi 2
  • Farshad Ahmadi 3
1 M.Sc. Student, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran
2 Assistant Professor, Department of Environmental Engineering, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran
3 Shahid Chamran University of Ahvaz
Abstract [English]

This study examines the performance of the random forest (RF) and Gaussian process regression (GPR) models in simulating electrical conductivity (EC) and total dissolved solids (TDS) at stations west of Lake Urmia (Chehriq-Olya, Dizj and Tepik), using river discharge and bicarbonate (HCO3) values for the period 1971-2020. According to the RMSE and NSE statistics, the RF model simulated EC at the studied stations with lower error and higher efficiency than the GPR model. In the test phase, the EC simulation error of the RF model at the Chehriq-Olya, Dizj and Tepik stations is about 356, 36 and 47% lower, respectively, than that of the GPR model. Overall, the results showed that, given the simulated confidence intervals for EC and TDS, the performance of both models is acceptable; for TDS values, however, the two models behave differently at the Dizj and Tepik stations.
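
As a rough illustration of the two statistics named above, the following minimal sketch computes RMSE and NSE (Nash-Sutcliffe efficiency) for two sets of simulated values. The data values, the variable names and the helper `percent_lower` are hypothetical and are not taken from the paper. The abstract does not state which baseline its "percent lower" figures use, so the sketch shows both conventions as assumptions.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def percent_lower(error_low, error_high, baseline):
    """Gap between two errors as a percentage of a chosen baseline error."""
    return 100.0 * (error_high - error_low) / baseline


# Hypothetical EC values (not from the paper), for illustration only.
observed = [400.0, 420.0, 450.0, 480.0, 500.0]
sim_rf = [405.0, 418.0, 455.0, 470.0, 495.0]
sim_gpr = [380.0, 440.0, 430.0, 500.0, 530.0]

rmse_rf = rmse(observed, sim_rf)
rmse_gpr = rmse(observed, sim_gpr)

print("RF  RMSE:", round(rmse_rf, 2), "NSE:", round(nse(observed, sim_rf), 3))
print("GPR RMSE:", round(rmse_gpr, 2), "NSE:", round(nse(observed, sim_gpr), 3))

# Two possible conventions for "RF error is X percent lower than GPR error".
print("Relative to GPR error:", round(percent_lower(rmse_rf, rmse_gpr, rmse_gpr), 1))
print("Relative to RF error: ", round(percent_lower(rmse_rf, rmse_gpr, rmse_rf), 1))
```

With non-negative errors, the first convention (gap divided by the larger error) can never exceed 100 percent, while the second (gap divided by the smaller error) can. A figure such as the 356 percent reported for Chehriq-Olya is therefore only possible under the second convention or a similar one, although the abstract does not say which was used.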

Keywords [English]

  • Gaussian Process
  • Lake Urmia
  • Random Forest
  • Regression
  • Water Quality