Most cited article - PubMed ID 35686612
Non-Covalent Interactions Atlas benchmark data sets 5: London dispersion in an extended chemical space
Machine learning (ML) methods offer a promising route to the construction of universal molecular potentials with high accuracy and low computational cost. It is becoming evident that integrating physical principles into these models, or utilizing them in a Δ-ML scheme, significantly enhances their robustness and transferability. This paper introduces PM6-ML, a Δ-ML method that synergizes the semiempirical quantum-mechanical (SQM) method PM6 with a state-of-the-art ML potential applied as a universal correction. The method demonstrates superior performance over standalone SQM and ML approaches and covers a broader chemical space than its predecessors. It is scalable to systems with thousands of atoms, which makes it applicable to large biomolecular systems. Extensive benchmarking confirms PM6-ML's accuracy and robustness. Its practical application is facilitated by a direct interface to MOPAC. The code and parameters are available at https://github.com/Honza-R/mopac-ml.
- Publication type
- Journal Article MeSH
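The Δ-ML idea in the abstract above (a baseline SQM energy plus a learned correction) can be illustrated with a minimal sketch. Everything named here is hypothetical: `delta_ml_energy`, the `Geometry` alias, and the two toy callables are illustrative stand-ins, not the MOPAC interface or the PM6-ML code from the linked repository.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical geometry type: (element symbol, x, y, z) per atom.
Geometry = Sequence[Tuple[str, float, float, float]]
EnergyFn = Callable[[Geometry], float]


def delta_ml_energy(geometry: Geometry, baseline: EnergyFn, correction: EnergyFn) -> float:
    """Generic delta-ML composition: baseline energy plus a learned correction.

    `baseline` stands in for a semiempirical method such as PM6 and
    `correction` for an ML model trained on (reference - baseline) differences.
    Both are assumptions of this sketch, not real library calls.
    """
    return baseline(geometry) + correction(geometry)


# Toy stand-ins so the sketch runs; the numbers carry no physical meaning.
def toy_baseline(geometry: Geometry) -> float:
    return -1.0 * len(geometry)


def toy_correction(geometry: Geometry) -> float:
    return 0.25 * len(geometry)


water = [("O", 0.0, 0.0, 0.0), ("H", 0.96, 0.0, 0.0), ("H", -0.24, 0.93, 0.0)]
total = delta_ml_energy(water, toy_baseline, toy_correction)
```

In this arrangement the ML model only has to learn the residual between the baseline and the reference method, which is the property the abstract credits for improved robustness and transferability.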
Accurate estimation of protein-ligand binding affinity is the cornerstone of computer-aided drug design. We present a universal physics-based scoring function, named SQM2.20, addressing key terms of binding free energy using semiempirical quantum-mechanical computational methods. SQM2.20 incorporates the latest methodological advances while remaining computationally efficient even for systems with thousands of atoms. To validate it rigorously, we have compiled and made available the PL-REX benchmark dataset consisting of high-resolution crystal structures and reliable experimental affinities for ten diverse protein targets. Comparative assessments demonstrate that SQM2.20 outperforms other scoring methods and reaches a level of accuracy similar to much more expensive DFT calculations. In the PL-REX dataset, it achieves excellent correlation with experimental data (average R² = 0.69) and exhibits consistent performance across all targets. In contrast to DFT, SQM2.20 provides affinity predictions in minutes, making it suitable for practical applications in hit identification or lead optimization.
- MeSH
- Ligands MeSH
- Proteins * metabolism MeSH
- Drug Design * MeSH
- Thermodynamics MeSH
- Protein Binding MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Ligands MeSH
- Proteins * MeSH
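The "average R²" figure quoted in the SQM2.20 abstract above is a per-target correlation averaged over targets. A minimal sketch of such a calculation follows, assuming R² means the squared Pearson correlation between computed scores and experimental affinities (the abstract does not specify the definition). The function names and the numbers in the usage lines are made up for illustration and are not PL-REX data.

```python
from typing import Dict, Sequence, Tuple


def pearson_r2(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient between two equal-length series."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0.0 or var_m == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov * cov / (var_p * var_m)


def average_r2(per_target: Dict[str, Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Mean of per-target R² values; each target maps to (scores, affinities)."""
    values = [pearson_r2(scores, affinities) for scores, affinities in per_target.values()]
    return sum(values) / len(values)


# Illustrative, invented numbers only.
toy = {
    "target_a": ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),  # perfectly correlated
    "target_b": ([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]),  # weakly correlated
}
mean_r2 = average_r2(toy)
```

Averaging per target rather than pooling all ligands keeps a single well-behaved target from dominating the summary statistic.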
Accurate estimates of intermolecular interaction energy, ΔE, are crucial for modeling the properties of organic electronic materials and many other systems. For a diverse set of 50 dimers comprising up to 50 atoms (Set50-50, with 7 of its members being models of single-stacking junctions), benchmark ΔE data were compiled. They were obtained by the focal-point strategy, which involves computations using the canonical variant of the coupled cluster theory with singles, doubles, and perturbative triples [CCSD(T)] performed while applying a large basis set, along with extrapolations of the respective energy components to the complete basis set (CBS) limit. The resulting ΔE data were used to gauge the performance for the Set50-50 of several density-functional theory (DFT)-based approaches, and of one of the localized variants of the CCSD(T) method. This evaluation revealed that (1) the proposed "silver standard" approach, which employs the localized CCSD(T) method and CBS extrapolations, can be expected to provide accuracy better than two kJ/mol for absolute values of ΔE, and (2) from among the DFT techniques, computationally by far the cheapest approach (termed "ωB97X-3c/vDZP" by its authors) performed remarkably well. These findings are directly applicable in cost-effective yet reliable searches of the potential energy surfaces of noncovalent complexes.
- Keywords
- CCSD(T), DFT, interaction energy, noncovalent interactions, supramolecular junctions
- MeSH
- Dimerization MeSH
- Electronics * MeSH
- Physical Phenomena MeSH
- Polymers MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Polymers MeSH
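The focal-point strategy mentioned in the Set50-50 abstract above combines energy components extrapolated to the CBS limit with a higher-order correction. The sketch below uses one commonly applied form: a two-point X⁻³ extrapolation of the correlation energy plus a CCSD(T) − MP2 correction. This is an assumption for illustration only; the abstract does not state which extrapolation formula or composite scheme the authors used, and the function names are hypothetical.

```python
def extrapolate_x3(e_small: float, e_large: float, x_small: int, x_large: int) -> float:
    """Two-point X^-3 extrapolation of a correlation energy.

    Assumes E(X) = E_CBS + A * X**-3, where X is the basis-set cardinal
    number (e.g. 3 for triple-zeta, 4 for quadruple-zeta).
    """
    if x_large <= x_small:
        raise ValueError("x_large must exceed x_small")
    xs3, xl3 = x_small ** 3, x_large ** 3
    return (xl3 * e_large - xs3 * e_small) / (xl3 - xs3)


def focal_point_energy(e_hf: float, e_mp2_corr_cbs: float, delta_ccsd_t: float) -> float:
    """Composite estimate: HF energy + CBS MP2 correlation + [CCSD(T) - MP2] correction.

    The correction term is typically evaluated in a smaller basis set;
    this particular composition is an assumption of the sketch.
    """
    return e_hf + e_mp2_corr_cbs + delta_ccsd_t


# Synthetic check values generated from E(X) = -1.0 + 0.27 * X**-3 (not real data).
e_tz = -1.0 + 0.27 / 3 ** 3
e_qz = -1.0 + 0.27 / 4 ** 3
e_cbs = extrapolate_x3(e_tz, e_qz, 3, 4)
```

For interaction energies, the same composition would be applied to the dimer and to each monomer, with ΔE taken as the difference.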