A Simple and Scalable Kernel Density Approach for Reliable Uncertainty Quantification in Atomistic Machine Learning
Status: PubMed-not-MEDLINE; Language: English; Country: United States; Medium: print-electronic
Document type: journal article
PubMed: 41099625
PubMed Central: PMC12557357
DOI: 10.1021/acs.jpclett.5c02595
Machine learning models are increasingly used to predict material properties and accelerate atomistic simulations, but the reliability of their predictions depends on how representative the training data are. We present a scalable, GPU-accelerated uncertainty quantification framework based on k-nearest-neighbor kernel density estimation (KDE) in a PCA-reduced descriptor space. The method efficiently detects sparsely sampled regions in large, high-dimensional data sets and provides a transferable, model-agnostic uncertainty metric without requiring the retraining of costly model ensembles. The framework is validated across diverse case studies that vary in (i) chemistry, (ii) prediction model (including a foundational neural network), (iii) the descriptors used for the KDE, and (iv) the property whose uncertainty is sought. In all cases, the KDE-based score reliably flags extrapolative configurations, correlates well with conventional ensemble-based uncertainties, and highlights regions of reduced prediction trustworthiness. The approach offers a practical route to improving the interpretability, robustness, and deployment readiness of ML models in materials science.
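The idea summarized in the abstract (project descriptors with PCA, then estimate the local training-data density from the k nearest neighbors) can be illustrated with a minimal CPU-only NumPy sketch. This is not the authors' GPU-accelerated implementation. The function names, the values of `k`, `bandwidth`, and `n_components`, the synthetic descriptors, the omission of the kernel normalization constant, and the use of the negative log density as the score are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the mean and the top principal axes of X (rows are samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are principal axes,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(X, mean, axes):
    """Project descriptors onto the retained principal axes."""
    return (X - mean) @ axes.T

def knn_kde_score(train_Z, query_Z, k=10, bandwidth=1.0):
    """Negative log of an unnormalized Gaussian KDE that sums only over
    the k nearest training points (hypothetical score definition)."""
    # Brute-force squared distances, shape (n_query, n_train).
    d2 = ((query_Z[:, None, :] - train_Z[None, :, :]) ** 2).sum(axis=-1)
    # The k smallest squared distances per query (order within k is arbitrary).
    knn_d2 = np.partition(d2, k - 1, axis=1)[:, :k]
    # Log-sum-exp keeps the result finite for points far from the data.
    a = -knn_d2 / (2.0 * bandwidth**2)
    m = a.max(axis=1, keepdims=True)
    log_density = m[:, 0] + np.log(np.exp(a - m).sum(axis=1)) - np.log(len(train_Z))
    return -log_density

# Synthetic 20-dimensional "descriptors": 5 high-variance directions, 15 noisy ones.
rng = np.random.default_rng(0)
scales = np.array([3.0] * 5 + [0.3] * 15)
train_X = rng.normal(size=(500, 20)) * scales
inside_X = rng.normal(size=(5, 20)) * scales      # drawn like the training data
offset = np.array([20.0] * 5 + [0.0] * 15)
outside_X = rng.normal(size=(5, 20)) * scales + offset  # far from the training data

mean, axes = fit_pca(train_X, n_components=5)
train_Z = project(train_X, mean, axes)
score_in = knn_kde_score(train_Z, project(inside_X, mean, axes))
score_out = knn_kde_score(train_Z, project(outside_X, mean, axes))
print("in-distribution scores:", np.round(score_in, 1))
print("extrapolative scores:  ", np.round(score_out, 1))
```

In this toy setting the shifted points receive much larger scores than points drawn from the training distribution, which is the qualitative behavior the abstract describes. The brute-force distance matrix is only practical for small data sets; a large-scale version would need an efficient nearest-neighbor search.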