Representations of lipid nanoparticles using large language models for transfection efficiency prediction
Language: English; Country: England, United Kingdom; Medium: print
Document type: journal article, grant-supported research
Grant support
Sanofi
PubMed
38810107
PubMed Central
PMC11629694
DOI
10.1093/bioinformatics/btae342
PII: 7684951
- MeSH
- lipids * chemistry
- liposomes
- messenger RNA metabolism
- nanoparticles * chemistry
- transfection * methods
- Publication type
- journal article
- grant-supported research
- Substance names
- Lipid Nanoparticles
- lipids *
- liposomes
- messenger RNA
MOTIVATION: Lipid nanoparticles (LNPs) are the most widely used vehicles for mRNA vaccine delivery. The structure of the lipids composing the LNPs can have a major impact on the effectiveness of the mRNA payload. Several properties should be optimized to improve delivery and expression including biodegradability, synthetic accessibility, and transfection efficiency.

RESULTS: To optimize LNPs, we developed and tested models that enable the virtual screening of LNPs with high transfection efficiency. Our best method uses the lipid Simplified Molecular-Input Line-Entry System (SMILES) as inputs to a large language model. Large language model-generated embeddings are then used by a downstream gradient-boosting classifier. As we show, our method can more accurately predict lipid properties, which could lead to higher efficiency and reduced experimental time and costs.

AVAILABILITY AND IMPLEMENTATION: Code and data links available at: https://github.com/Sanofi-Public/LipoBART.
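
The two-stage idea in the abstract (SMILES string to embedding, then embedding to a gradient-boosting classifier) can be illustrated with a small sketch. This is not the authors' code and does not use their model: `embed_smiles` is a hypothetical stand-in that counts a few SMILES characters where the paper uses large language model embeddings, the booster is a simplified decision-stump implementation written for this example, and the molecules and labels are toy placeholders (label 1 simply means "contains nitrogen"), not transfection measurements.

```python
import math

# Hypothetical toy vocabulary; the paper uses LLM embeddings instead.
VOCAB = ["C", "N", "O", "(", ")", "="]

def embed_smiles(smiles):
    """Stand-in embedding: count of each vocabulary character in the SMILES."""
    return [float(smiles.count(tok)) for tok in VOCAB]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Pick the (feature, threshold) split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            ml = sum(left) / len(left)
            mr = sum(right) / len(right)
            err = (sum((r - ml) ** 2 for r in left)
                   + sum((r - mr) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    return None if best is None else best[1:]

def fit_gradient_boosting(X, y, n_rounds=20, lr=0.5):
    """Simplified boosting on log-loss; assumes both classes are present."""
    pos = sum(y) / len(y)
    base = math.log(pos / (1.0 - pos))  # initial log-odds
    scores = [base] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of log-loss with respect to the score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, ml, mr = stump
        stumps.append(stump)
        scores = [s + lr * (ml if row[j] <= t else mr)
                  for row, s in zip(X, scores)]
    return base, lr, stumps

def predict_proba(model, x):
    """Probability of the positive class for one embedding vector."""
    base, lr, stumps = model
    score = base + sum(lr * (ml if x[j] <= t else mr)
                       for j, t, ml, mr in stumps)
    return sigmoid(score)

# Toy molecules with placeholder labels (1 = contains nitrogen).
SMILES = [
    "CCCCCCCCN(C)C", "CCCCCCCCCCCC(=O)O", "CCCCCCCCCCN", "CCCCCCCCCCCCO",
    "CCN(CC)CCCCCCCC", "CCCCCCCCCCCCCC", "CCCCCCCCOC(=O)CCN(C)C",
    "CCCCCCCCCC(=O)OC",
]
labels = [int("N" in s) for s in SMILES]

model = fit_gradient_boosting([embed_smiles(s) for s in SMILES], labels)
print(predict_proba(model, embed_smiles("CCCCCCN")))
```

In a realistic setting the embedding function would be replaced by vectors from a pretrained chemical language model and the booster by an established library implementation; the interface between the two stages (one fixed-length vector per lipid) stays the same.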
DataSentics Brno 602 00 Czech Republic
Digital R and D Sanofi Cambridge MA 02141 United States
Digital R and D Sanofi Toronto ON M5V 1V6 Canada
mRNA Center of Excellence Marcy L'Etoile Sanofi 69280 France
mRNA Center of Excellence Sanofi Waltham MA 02451 United States