GastroGPT: Development and controlled testing of a proof-of-concept customized clinical language model
C. Simsek, M. Ucdal, E. de-Madaria, A. Ebigbo, P. Vanek, O. Elshaarawy, TA. Voiosu, G. Antonelli, R. Turró, JP. Gisbert, OP. Nyssen, C. Hassan, H. Messmann, R. Jalan
Status: not indexed; Language: English; Country: Germany
Document type: journal articles
NLK
Directory of Open Access Journals
since 2013
Free Medical Journals
since 2013
PubMed Central
since 2013
Europe PubMed Central
since 2013
Open Access Digital Library
since 2013-01-01
ROAD: Directory of Open Access Scholarly Resources
since 2013
Thieme Connect Journals Open Access
since 2013
PubMed
40860687
DOI
10.1055/a-2637-2163
Knihovny.cz E-resources
- Publication type
- journal articles MeSH
BACKGROUND AND STUDY AIMS: Current general-purpose artificial intelligence (AI) large language models (LLMs) demonstrate limited efficacy in clinical medicine, often constrained to question-answering, documentation, and literature summarization roles. We developed GastroGPT, a proof-of-concept specialty-specific, multi-task, clinical LLM, and evaluated its performance against leading general-purpose LLMs across key gastroenterology tasks and diverse case scenarios. METHODS: In this structured analysis, GastroGPT was compared with three state-of-the-art general-purpose LLMs (LLM-A: GPT-4, LLM-B: Bard, LLM-C: Claude). Models were assessed on seven clinical tasks and overall performance across 10 simulated gastroenterology cases varying in complexity, frequency, and patient demographics. Standardized prompts facilitated structured comparisons. A blinded expert panel rated model outputs per task on a 10-point Likert scale, judging clinical utility. Comprehensive statistical analyses were conducted. RESULTS: A total of 2,240 expert ratings were obtained. GastroGPT achieved significantly higher mean overall scores (8.1 ± 1.8) compared with GPT-4 (5.2 ± 3.0), Bard (5.7 ± 3.3), and Claude (7.0 ± 2.7) (all P < 0.001). It outperformed comparators in six of seven tasks (P < 0.05), except follow-up planning. GastroGPT demonstrated superior score consistency (variance 34.95) versus general models (97.4-260.35) (P < 0.001). Its performance remained consistent across case complexities and frequencies, unlike the comparators (P < 0.001). Multivariate analysis revealed that model type significantly predicted performance (P < 0.001). CONCLUSIONS: This study pioneered development and comparison of a specialty-specific, clinically-oriented AI model to general-purpose LLMs. GastroGPT demonstrated superior utility overall and on key gastroenterology tasks, highlighting the potential for tailored, task-focused AI models in medicine.
Digestive Endoscopy Unit, Humanitas Research Hospital Department of Gastroenterology, Milan, Italy
Division of Gastroenterology, Faculty of Medicine, Hospital Universitario de la Princesa, Madrid, Spain
Division of Gastroenterology, Universitätsklinikum Augsburg, Augsburg, Germany
Dr Balmis General University Hospital, Alicante, Spain
Endoscopy Unit, Teknon Medical Center, Barcelona, Spain
Gastroenterology and Hepatology, Johns Hopkins Medical Institutions Campus, Baltimore, United States
Gastroenterology, Colentina Hospital, Bucharest, Romania
Hospital Universitario de la Princesa, Madrid, Spain
Internal Medicine, Hacettepe University Faculty of Medicine, Ankara, Turkey
National Liver Institute, Shebeen El-Kom, Egypt
Citations provided by Crossref.org
- 000
- 00000naa a2200000 a 4500
- 001
- bmc25020640
- 003
- CZ-PrNML
- 005
- 20251014150312.0
- 007
- ta
- 008
- 251007e20250806gw f 000 0|eng||
- 009
- AR
- 024 7_
- $a 10.1055/a-2637-2163 $2 doi
- 035 __
- $a (PubMed)40860687
- 040 __
- $a ABA008 $b cze $d ABA008 $e AACR2
- 041 0_
- $a eng
- 044 __
- $a gw
- 100 1_
- $a Simsek, Cem $u Gastroenterology & Hepatology, Johns Hopkins Medical Institutions Campus, Baltimore, United States
- 245 10
- $a GastroGPT: Development and controlled testing of a proof-of-concept customized clinical language model / $c C. Simsek, M. Ucdal, E. de-Madaria, A. Ebigbo, P. Vanek, O. Elshaarawy, TA. Voiosu, G. Antonelli, R. Turró, JP. Gisbert, OP. Nyssen, C. Hassan, H. Messmann, R. Jalan
- 520 9_
- $a BACKGROUND AND STUDY AIMS: Current general-purpose artificial intelligence (AI) large language models (LLMs) demonstrate limited efficacy in clinical medicine, often constrained to question-answering, documentation, and literature summarization roles. We developed GastroGPT, a proof-of-concept specialty-specific, multi-task, clinical LLM, and evaluated its performance against leading general-purpose LLMs across key gastroenterology tasks and diverse case scenarios. METHODS: In this structured analysis, GastroGPT was compared with three state-of-the-art general-purpose LLMs (LLM-A: GPT-4, LLM-B: Bard, LLM-C: Claude). Models were assessed on seven clinical tasks and overall performance across 10 simulated gastroenterology cases varying in complexity, frequency, and patient demographics. Standardized prompts facilitated structured comparisons. A blinded expert panel rated model outputs per task on a 10-point Likert scale, judging clinical utility. Comprehensive statistical analyses were conducted. RESULTS: A total of 2,240 expert ratings were obtained. GastroGPT achieved significantly higher mean overall scores (8.1 ± 1.8) compared with GPT-4 (5.2 ± 3.0), Bard (5.7 ± 3.3), and Claude (7.0 ± 2.7) (all P < 0.001). It outperformed comparators in six of seven tasks ( P < 0.05), except follow-up planning. GastroGPT demonstrated superior score consistency (variance 34.95) versus general models (97.4-260.35) ( P < 0.001). Its performance remained consistent across case complexities and frequencies, unlike the comparators ( P < 0.001). Multivariate analysis revealed that model type significantly predicted performance ( P < 0.001). CONCLUSIONS: This study pioneered development and comparison of a specialty-specific, clinically-oriented AI model to general-purpose LLMs. GastroGPT demonstrated superior utility overall and on key gastroenterology tasks, highlighting the potential for tailored, task-focused AI models in medicine.
- 590 __
- $a NOT INDEXED
- 655 _2
- $a journal articles $7 D016428
- 700 1_
- $a Ucdal, Mete $u Internal Medicine, Hacettepe University Faculty of Medicine, Ankara, Turkey
- 700 1_
- $a de-Madaria, Enrique $u Dr Balmis General University Hospital, Alicante, Spain
- 700 1_
- $a Ebigbo, Alanna $u Division of Gastroenterology, Universitätsklinikum Augsburg, Augsburg, Germany
- 700 1_
- $a Vanek, Petr $u Palacky University Olomouc, Olomouc, Czech Republic
- 700 1_
- $a Elshaarawy, Omar $u Liverpool University Hospitals NHS Foundation Trust, Liverpool, United Kingdom of Great Britain and Northern Ireland $u National Liver Institute, Shebeen El-Kom, Egypt
- 700 1_
- $a Voiosu, Theodor Alexandru $u Gastroenterology, Colentina Hospital, Bucharest, Romania
- 700 1_
- $a Antonelli, Giulio $u Sapienza University of Rome, Digestive and Liver Disease Unit, Azienda Ospedaliera Sant'Andrea, Roma, Italy $1 https://orcid.org/0000000317973864
- 700 1_
- $a Turró, Román $u Endoscopy Unit, Teknon Medical Center, Barcelona, Spain
- 700 1_
- $a Gisbert, Javier P $u Division of Gastroenterology, Faculty of Medicine, Hospital Universitario de la Princesa, Madrid, Spain
- 700 1_
- $a Nyssen, Olga P $u Hospital Universitario de la Princesa, Madrid, Spain
- 700 1_
- $a Hassan, Cesare $u Digestive Endoscopy Unit, Humanitas Research Hospital Department of Gastroenterology, Milan, Italy
- 700 1_
- $a Messmann, Helmut $u Division of Gastroenterology, Universitätsklinikum Augsburg, Augsburg, Germany
- 700 1_
- $a Jalan, Rajiv $u University College Hospital London Medical School, London, United Kingdom of Great Britain and Northern Ireland
- 773 0_
- $w MED00200138 $t Endoscopy international open $x 2364-3722 $g Vol. 13 (20250806), p. a26372163
- 856 41
- $u https://pubmed.ncbi.nlm.nih.gov/40860687 $y Pubmed
- 910 __
- $a ABA008 $b sig $c sign $y - $z 0
- 990 __
- $a 20251007 $b ABA008
- 991 __
- $a 20251014150318 $b ABA008
- 999 __
- $a ok $b bmc $g 2410875 $s 1258796
- BAS __
- $a 3
- BAS __
- $a PreBMC-PubMed-not-MEDLINE
- BMC __
- $a 2025 $b 13 $c - $d a26372163 $e 20250806 $i 2364-3722 $m Endoscopy international open $n Endosc Int Open $x MED00200138
- LZP __
- $a Pubmed-20251007