Background: Classifying diseases into ICD codes has relied mainly on humans reading large amounts of written material, such as discharge diagnoses, chief complaints, medical histories, and operation records, as the basis for classification. Coding is both laborious and time-consuming: a professionally trained disease coder takes about 20 minutes per case on average. An automatic code classification system can therefore significantly reduce human effort. Objectives: This paper aims to construct a machine learning model for ICD-10 coding that automatically determines the corresponding diagnosis codes based solely on free-text medical notes. Methods: We apply Natural Language Processing (NLP) and a Recurrent Neural Network (RNN) architecture to classify ICD-10 codes from natural language texts with supervised learning. Results: In experiments on data from a large hospital, our predictions reach an F1-score of 0.62 on ICD-10-CM codes. Conclusion: The developed model can significantly reduce the manpower and time spent on coding compared with a professional coder.
- MeSH
- electronic data processing methods MeSH
- deep learning * MeSH
- electronic health records MeSH
- international classification of diseases * MeSH
- neural networks, computer MeSH
- machine learning MeSH
- information storage and retrieval methods statistics and numerical data MeSH
- data visualization MeSH
- natural language processing MeSH
- Publication type
- research support, non-U.S. gov't MeSH