Session: Methods for Uncertainty Quantification, Sensitivity Analysis, and Prediction 2
Paper Number: 156304
156304 - Explainable Machine Learning for Data-Driven Turbulence Modeling in Compressible Fluid Flows Using SHAP Analysis
Abstract:
Scientific Machine Learning (SciML) is reshaping critical scientific applications, particularly in domains vital for national security. Its transformative potential is evident in enhancing analytical capabilities and addressing complex challenges in fluid dynamics. The Reynolds-averaged Navier–Stokes (RANS) equations are fundamental for simulating compressible fluid flows; however, they are often hindered by model-form errors that can lead to inaccurate predictions. Recent work by Parish et al. [AIAA 2023-2126; https://doi.org/10.2514/6.2023-2126] introduced a data-driven turbulence modeling strategy aimed at refining RANS models. This approach employs multi-step training across eight diverse datasets, including channel flows at various Reynolds numbers (Re = 180, 395, and 590), duct flow at Re = 3500, flow over a periodic hill, and hypersonic boundary layers under cold wall conditions.
The study emphasizes assessing the credibility of SciML in turbulence modeling, particularly in predicting discrepancies in the Reynolds stress term; the assessment spans data provenance, domain knowledge, explainability, code correctness, validation, and uncertainty quantification. By focusing on hyperparameter sensitivity and rigorously evaluating model performance on out-of-distribution datasets, the research enhances the reliability of machine learning models in capturing elusive model-form errors. The findings demonstrate robust improvements across various scenarios, including wall-bounded flows, jet flows, and hypersonic boundary layer flows, thereby advancing the understanding of turbulence modeling.
Validation approaches in SciML, particularly for data-driven turbulence modeling in RANS closures, are essential for ensuring the reliability and generalizability of developed models. Cross-validation is frequently employed to assess model performance and mitigate overfitting. By partitioning the dataset into multiple subsets or "folds," the model is trained on a combination of these folds and validated on the remaining fold, ensuring comprehensive evaluation. This process is crucial in turbulence modeling, where flow characteristics can vary significantly.
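The fold construction described above can be sketched as a leave-one-dataset-out split over the eight flow cases. This is a minimal illustrative sketch; the dataset labels below are hypothetical stand-ins for the study's actual cases, and the real procedure used various combinations rather than strictly one held-out case per fold.

```python
# Leave-one-dataset-out cross-validation sketch over flow cases.
# Dataset labels are hypothetical stand-ins, not the study's file names.
DATASETS = [
    "channel_Re180", "channel_Re395", "channel_Re590",
    "duct_Re3500", "periodic_hill",
    "hypersonic_bl_coldwall_a", "hypersonic_bl_coldwall_b",
    "hypersonic_bl_coldwall_c",
]

def make_folds(datasets):
    """Yield (train, validation) splits, holding out one case per fold."""
    for held_out in datasets:
        train = [d for d in datasets if d != held_out]
        yield train, [held_out]

folds = list(make_folds(DATASETS))
```

Holding out an entire flow case (rather than random samples within a case) is what makes the validation meaningful here, since flow characteristics differ far more across cases than within one.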
In our study, we utilized high-resolution direct numerical simulation (DNS) data for data-driven RANS closures. The iterative training procedure involved several steps: setting up and solving a baseline RANS computation, learning discrepancy models for the anisotropy tensor, computing corrections, and re-converging the solver. The loop was terminated after ten iterations, although full convergence had not been reached at that point. Various combinations of the eight datasets were created, and the iterative training procedure was performed for each fold, with results tested on a channel flow dataset at Re = 590, the periodic hill (in-distribution), and the NASA hump (out-of-distribution). The results for velocity profiles, turbulent kinetic energy (TKE), and Reynolds shear and normal stresses in the x, y, and z directions indicated that the machine learning correction term aligns well with trends observed in the DNS data.
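The iterative training loop above can be sketched schematically. This is a toy sketch under stated assumptions: `solve_rans` and `learn_discrepancy` are hypothetical stand-ins (a synthetic mean-flow profile and a damped correction step), not the study's actual solver or neural network, and the synthetic "DNS" target is fabricated solely to make the loop runnable.

```python
# Sketch of the iterative model-consistent training loop: solve a baseline
# RANS computation, learn a discrepancy correction, apply it, re-converge.
# solve_rans and learn_discrepancy are hypothetical placeholders.
import math

N = 64
baseline = [math.tanh(3.0 * i / (N - 1)) for i in range(N)]  # mean flow
dns = [b + 0.1 for b in baseline]                            # synthetic "DNS" target

def solve_rans(correction):
    """Stand-in for re-converging the RANS solver with a correction term."""
    return [b + c for b, c in zip(baseline, correction)]

def learn_discrepancy(field, target):
    """Stand-in for learning an anisotropy-tensor discrepancy model."""
    return [0.5 * (t - f) for f, t in zip(field, target)]  # damped update

correction = [0.0] * N
for _ in range(10):  # loop terminated after ten iterations
    field = solve_rans(correction)
    delta = learn_discrepancy(field, dns)
    correction = [c + d for c, d in zip(correction, delta)]

residual = max(abs(f - t) for f, t in zip(solve_rans(correction), dns))
```

In this toy setting the damped update halves the discrepancy each iteration, so ten iterations leave a small but nonzero residual, mirroring the abstract's observation that the loop was stopped before full convergence.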
Our analysis of hyperparameters revealed that reducing the width of the neural network (NN) yielded better convergence compared to increasing epochs or decreasing the depth of the NN. Optimal hyperparameter settings were identified to mitigate oscillatory behavior, although some deviations in the buffer layer persisted. The training procedure demonstrated strong performance across various combinations of training datasets, enhancing model credibility and applicability in practical turbulent flow scenarios. This systematic investigation has ensured the accuracy and reliability of fluid dynamics simulations modeled by RANS equations, contributing to the refinement of RANS simulations and influencing broader areas within fluid dynamics and machine learning.
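A sweep over NN width, depth, and epoch count like the one described above can be enumerated as a simple grid. The specific values below are hypothetical, chosen only to illustrate the structure of such a study, not the settings actually used.

```python
# Illustrative hyperparameter grid for a width/depth/epochs sensitivity
# study; the candidate values are hypothetical, not the study's settings.
from itertools import product

widths = [16, 32, 64, 128]   # neurons per hidden layer
depths = [2, 4, 6]           # number of hidden layers
epochs = [200, 500, 1000]    # training epochs

grid = [{"width": w, "depth": d, "epochs": e}
        for w, d, e in product(widths, depths, epochs)]
# Each configuration would be run through the iterative training procedure
# and scored on held-out flow cases.
```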
At the upcoming ASME VVUQ conference, we will present explainable machine learning models for turbulence closures utilizing SHAP (SHapley Additive exPlanations) analysis. This approach is crucial for interpreting model predictions and understanding the influence of input features on the output, thereby enhancing the transparency and trustworthiness of machine learning applications in turbulence modeling. By focusing on explainability, we aim to foster greater confidence in deploying machine learning techniques within critical scientific applications, ensuring that these models not only perform well but are also interpretable and actionable in real-world scenarios. This work is supported by the DOE-NNSA ASC program.
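The SHAP attribution idea can be illustrated with an exact brute-force Shapley computation on a toy surrogate. In practice one would use the `shap` library's approximations for a real closure network; the linear "closure" and feature values below are hypothetical, used only because the exact definition fits in a few lines and its result is easy to verify.

```python
# Exact Shapley values by enumeration over feature coalitions; absent
# features are set to a baseline. The toy linear "closure" below is a
# hypothetical stand-in for a trained turbulence-closure model.
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Return the exact Shapley value of each feature of input x."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without = [x[j] if j in subset else baseline[j]
                           for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without))
    return phi

# Toy linear surrogate over three flow features (illustrative weights).
weights = [2.0, -1.0, 0.5]
predict = lambda z: sum(w * v for w, v in zip(weights, z))
phi = shapley_values(predict, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
# For a linear model, phi[i] equals weights[i] * (x[i] - baseline[i]).
```

The additivity property — the attributions sum to the difference between the prediction at `x` and at the baseline — is what makes SHAP useful for tracing which input features drive a closure model's output.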
Presenting Author: Uma Balakrishnan Sandia National Laboratories
Presenting Author Biography: Dr. Uma Balakrishnan is a highly motivated and technically adept professional with a Ph.D. in Applied Mathematics, specializing in Computational Fluid Dynamics. Currently employed at Sandia National Laboratories, she focuses on scientific machine learning, turbulence closures, anomaly detection, and generative AI. Dr. Balakrishnan has published numerous papers in peer-reviewed journals, holds a patent, and has presented at various conferences, delivering many invited talks. Her expertise encompasses a wide range of areas, including computational fluid dynamics, statistical mechanics, computational biology, and AI/ML, making her a valuable asset in her field.
Authors:
Uma Balakrishnan Sandia National Laboratories
William J. Rider Sandia National Laboratories
Matthew Barone Sandia National Laboratories
Eric Joshua Parish Sandia National Laboratories
Explainable Machine Learning for Data-Driven Turbulence Modeling in Compressible Fluid Flows Using SHAP Analysis
Paper Type
Technical Presentation Only