PM6-ML: The Synergy of Semiempirical Quantum Chemistry and Machine Learning Transformed into a Practical Computational Method
Status: PubMed-not-MEDLINE. Language: English. Country: United States. Medium: print-electronic.
Document type: journal article
PubMed: 39752295
PubMed Central: PMC11780751
DOI: 10.1021/acs.jctc.4c01330
Machine learning (ML) methods offer a promising route to the construction of universal molecular potentials with high accuracy and low computational cost. It is becoming evident that integrating physical principles into these models, or utilizing them in a Δ-ML scheme, significantly enhances their robustness and transferability. This paper introduces PM6-ML, a Δ-ML method that synergizes the semiempirical quantum-mechanical (SQM) method PM6 with a state-of-the-art ML potential applied as a universal correction. The method demonstrates superior performance over standalone SQM and ML approaches and covers a broader chemical space than its predecessors. It is scalable to systems with thousands of atoms, which makes it applicable to large biomolecular systems. Extensive benchmarking confirms PM6-ML's accuracy and robustness. Its practical application is facilitated by a direct interface to MOPAC. The code and parameters are available at https://github.com/Honza-R/mopac-ml.
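The Δ-ML scheme mentioned in the abstract can be illustrated with a minimal sketch: the total energy is the baseline (here standing in for PM6) plus a learned correction, and the correction is trained on the difference between reference and baseline energies. All function names and the toy energy functions below are hypothetical placeholders chosen for illustration; they are not the mopac-ml or MOPAC interface, and the numbers carry no physical meaning.

```python
from typing import Callable, List, Sequence, Tuple

# One atom: (element symbol, x, y, z). Units are irrelevant for this toy example.
Atom = Tuple[str, float, float, float]
Geometry = Sequence[Atom]
EnergyFn = Callable[[Geometry], float]


def delta_ml_energy(geometry: Geometry, baseline: EnergyFn, correction: EnergyFn) -> float:
    """Delta-ML prediction: baseline energy plus the learned correction."""
    return baseline(geometry) + correction(geometry)


def delta_training_targets(reference: Sequence[float], baseline: Sequence[float]) -> List[float]:
    """Targets the correction model is fitted to: reference minus baseline."""
    if len(reference) != len(baseline):
        raise ValueError("reference and baseline must have the same length")
    return [ref - base for ref, base in zip(reference, baseline)]


# Hypothetical stand-ins for the real calculations (assumptions, not real methods):
def toy_baseline(geometry: Geometry) -> float:
    """Placeholder for a semiempirical calculation: -10.0 per atom."""
    return -10.0 * len(geometry)


def toy_correction(geometry: Geometry) -> float:
    """Placeholder for an ML correction: +0.5 per atom."""
    return 0.5 * len(geometry)


if __name__ == "__main__":
    water: Geometry = [
        ("O", 0.0, 0.0, 0.0),
        ("H", 0.96, 0.0, 0.0),
        ("H", -0.24, 0.93, 0.0),
    ]
    print(delta_ml_energy(water, toy_baseline, toy_correction))
    print(delta_training_targets([-28.5, -57.0], [-30.0, -60.0]))
```

Because the model only has to learn the residual between the baseline and the reference, the baseline supplies the underlying physics, which is the rationale the abstract gives for the improved robustness and transferability.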