Strategic imputation of groundwater data using machine learning: Insights from diverse aquifers in the Chao-Phraya River Basin

Yaggesh Kumar Sharma, Seokhyeon Kim, Amir Saman Tayerani Charmchi, Doosun Kang, Okke Batelaan

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Effective groundwater monitoring is essential for sustainable water management, particularly in data-sparse regions. To address inconsistencies in groundwater level data, we developed a machine learning framework for robust data imputation and tested it in the Chao-Phraya River (CPR) Basin, a region facing significant groundwater challenges due to high population density and ecological importance. Our study evaluated five models for filling gaps in monthly groundwater level data across various locations, aquifer depths, and data loss scenarios: K-Nearest Neighbors (KNN), Multiple Imputation by Chained Equations (MICE), Multilayer Perceptron (MLP), Random Forest (RF), and Soft Imputation (SI). Results show that MICE performs well in high-density well environments, while SI excels with lower well density, maintaining Pearson correlation coefficients (R) above 0.80 and RMSE values below 6 even at 10% data loss. The Coefficient of Variation (COV) analysis also confirmed that the imputed data remain stable and reliable. However, the study also reveals a significant decrease in model performance in regions with fewer wells, as indicated by increased RMSE and reduced R. Our findings indicate that machine learning models are capable of handling groundwater level observations with missing data, and that the well density in a region has a significant impact on these models' performance. Imputation techniques should therefore be tailored to each aquifer's specific characteristics and surroundings to obtain accurate groundwater data.
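The evaluation loop described in the abstract (hide a fraction of known values, impute them, then score the imputed values against the hidden truth with R and RMSE) can be illustrated with a minimal sketch. This is not the authors' framework: the synthetic data, the hand-written nearest-neighbour imputer, and all function names (`make_synthetic_levels`, `mask_random`, `knn_impute`, `evaluate`) are assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random


def make_synthetic_levels(n_months=120, n_wells=6, seed=0):
    """Hypothetical monthly levels: well offset + shared seasonal cycle + noise."""
    rng = random.Random(seed)
    offsets = [rng.uniform(5.0, 30.0) for _ in range(n_wells)]
    return [
        [offsets[w] + 2.0 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 0.2)
         for w in range(n_wells)]
        for t in range(n_months)
    ]


def mask_random(data, loss_fraction, seed=1):
    """Hide a random fraction of cells; return the masked copy and hidden cells."""
    rng = random.Random(seed)
    cells = [(t, w) for t in range(len(data)) for w in range(len(data[0]))]
    hidden = set(rng.sample(cells, int(loss_fraction * len(cells))))
    masked = [[None if (t, w) in hidden else v for w, v in enumerate(row)]
              for t, row in enumerate(data)]
    return masked, sorted(hidden)


def knn_impute(masked, k=5):
    """Fill each gap with the mean of the target well's value in the k months
    whose readings at the other wells are closest to the gap's month."""
    filled = [row[:] for row in masked]
    n_wells = len(masked[0])
    for t, row in enumerate(masked):
        for w in range(n_wells):
            if row[w] is not None:
                continue
            candidates = []
            for s, other in enumerate(masked):
                if s == t or other[w] is None:
                    continue
                shared = [(row[j], other[j]) for j in range(n_wells)
                          if j != w and row[j] is not None and other[j] is not None]
                if shared:
                    dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                    candidates.append((dist, other[w]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if not nearest:  # fallback: mean of the well's observed values
                nearest = [r[w] for r in masked if r[w] is not None]
            filled[t][w] = sum(nearest) / len(nearest)
    return filled


def rmse(true, pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(true, pred)) / len(true))


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def evaluate(loss_fraction, k=5):
    """Score imputed values against the hidden truth at the masked cells only."""
    truth = make_synthetic_levels()
    masked, hidden = mask_random(truth, loss_fraction)
    filled = knn_impute(masked, k=k)
    true_vals = [truth[t][w] for t, w in hidden]
    pred_vals = [filled[t][w] for t, w in hidden]
    return pearson_r(true_vals, pred_vals), rmse(true_vals, pred_vals)


if __name__ == "__main__":
    for loss in (0.10, 0.30, 0.50):
        r, err = evaluate(loss)
        print(f"loss={loss:.0%}  R={r:.3f}  RMSE={err:.3f}")
```

Because the synthetic wells share one clean seasonal signal, the scores here say nothing about real aquifers; the sketch only shows the mask-impute-score pattern. Swapping `knn_impute` for another imputer, or reducing `n_wells` to mimic lower well density, would follow the same structure.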

Original language: English
Article number: 101394
Number of pages: 17
Journal: Groundwater for Sustainable Development
Volume: 28
Publication status: Published - Feb 2025

Keywords

  • Chao-Phraya River Basin
  • Data imputation
  • Drought
  • Groundwater management
  • Machine learning models
  • Well density
