Proceedings of the
European Safety and Reliability Conference (ESREL2026)
14 – 19 June 2026, Braga, Portugal

Exploring Data-Driven Solutions to Predict Hydrogen Solubility in Saline Environments for Underground Hydrogen Storage

Sandrely Pereira da Silva

Department of Production Engineering, Federal University of Pernambuco (UFPE), Brazil.

sandrely.pereira@ufpe.br

Caio Bezerra Souto Maior

Center of Informatics, UFPE, Brazil.

caio.maior@ufpe.br

Isis Didier Lins

Department of Production Engineering, UFPE, Brazil.

isis.lins@ufpe.br

ABSTRACT

The global transition toward low-carbon energy systems has intensified interest in hydrogen as a sustainable energy carrier due to its high gravimetric energy density and compatibility with renewable energy sources. Underground hydrogen storage (UHS) in geological formations such as aquifers, depleted reservoirs, and salt caverns has emerged as a promising solution for large-scale and seasonal energy storage. However, accurately predicting hydrogen solubility in saline aqueous environments remains a key challenge, as it directly influences gas loss, chemical stability, and operational safety. In this study, a machine learning (ML) approach was applied to model hydrogen solubility in saline solutions, addressing the limitations of conventional thermodynamic approaches in capturing complex, non-linear interactions. A dataset of 255 experimental observations, each comprising pressure, temperature, salinity, and hydrogen solubility, was analyzed through exploratory data analysis, physically motivated feature engineering, and permutation-based feature selection. Among the evaluated models, CatBoost demonstrated superior predictive performance and robustness. Using the complete dataset, the optimized CatBoost model achieved a coefficient of determination of R² = 0.9973, a root mean squared error (RMSE) of 0.0125, and a mean absolute error (MAE) of 0.0050, indicating excellent accuracy and strong generalization capability. These results highlight the effectiveness of gradient boosting-based ML methods for modeling hydrogen solubility under saline conditions and demonstrate their potential to support feasibility assessments and risk evaluation in UHS applications.

Keywords: Hydrogen, Underground Hydrogen Storage, Machine Learning, Data-Driven Modeling, Energy Storage.
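
The three metrics reported in the abstract (R², RMSE, and MAE) follow their standard definitions, which can be sketched as below. This is a minimal illustration only, not the authors' evaluation code: the function name regression_metrics and the sample values are hypothetical, and the abstract does not state the units of solubility, so RMSE and MAE are simply in whatever units the target variable uses.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Made-up illustrative values, NOT data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.4f} RMSE={rmse:.4f} MAE={mae:.4f}")
```

Because RMSE squares each residual before averaging, it penalizes large errors more heavily than MAE. A reported RMSE of 0.0125 against an MAE of 0.0050 is therefore consistent with a small number of comparatively larger residuals.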