PM6-ML: The Synergy of Semiempirical Quantum Chemistry and Machine Learning Transformed into a Practical Computational Method

2025 Jan 28; 21 (2): 678-690. Epub 2025 Jan 3

Status: PubMed-not-MEDLINE. Language: English. Country: United States. Medium: print-electronic.

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid39752295

Machine learning (ML) methods offer a promising route to the construction of universal molecular potentials with high accuracy and low computational cost. It is becoming evident that integrating physical principles into these models, or utilizing them in a Δ-ML scheme, significantly enhances their robustness and transferability. This paper introduces PM6-ML, a Δ-ML method that synergizes the semiempirical quantum-mechanical (SQM) method PM6 with a state-of-the-art ML potential applied as a universal correction. The method demonstrates superior performance over standalone SQM and ML approaches and covers a broader chemical space than its predecessors. It is scalable to systems with thousands of atoms, which makes it applicable to large biomolecular systems. Extensive benchmarking confirms PM6-ML's accuracy and robustness. Its practical application is facilitated by a direct interface to MOPAC. The code and parameters are available at https://github.com/Honza-R/mopac-ml.
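The Δ-ML idea described in the abstract can be illustrated with a minimal sketch: the ML model is trained on the difference between a reference energy and the baseline energy, and at prediction time its output is added to the baseline result. Everything named below (`make_delta_ml`, `delta_ml_target`, the toy energy functions) is hypothetical and chosen for illustration only; it is not the API of MOPAC, TorchMD-NET, or the mopac-ml repository, and the toy functions merely stand in for a real PM6 calculation and a trained ML potential.

```python
from typing import Callable, Sequence, Tuple

# One atom: element symbol plus Cartesian coordinates (units are arbitrary here).
Atom = Tuple[str, float, float, float]
EnergyFn = Callable[[Sequence[Atom]], float]


def delta_ml_target(e_reference: float, e_baseline: float) -> float:
    """Training label for the correction: reference energy minus baseline energy."""
    return e_reference - e_baseline


def make_delta_ml(baseline: EnergyFn, correction: EnergyFn) -> EnergyFn:
    """Compose a baseline method with a learned correction: E = E_base + dE_ML."""
    def total_energy(atoms: Sequence[Atom]) -> float:
        return baseline(atoms) + correction(atoms)
    return total_energy


# Toy stand-ins (assumptions): a real setup would call an SQM code for the
# baseline and evaluate a trained ML potential for the correction.
def toy_baseline(atoms: Sequence[Atom]) -> float:
    return -1.0 * len(atoms)


def toy_correction(atoms: Sequence[Atom]) -> float:
    return 0.25 * len(atoms)


water = [("O", 0.0, 0.0, 0.0), ("H", 0.96, 0.0, 0.0), ("H", -0.24, 0.93, 0.0)]
combined = make_delta_ml(toy_baseline, toy_correction)
energy = combined(water)
print(energy)
```

Because the correction only has to learn the residual between the baseline and the reference, the baseline method carries the physics, which is the property the abstract credits for the improved robustness and transferability. For actual calculations, the MOPAC interface and parameters in the repository linked above are the place to start.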


