• This record comes from PubMed

Multidimensional Machine Learning Model to Calculate a COVID-19 Vulnerability Index

2023 Jul 15; 13(7). [epub] 20230715

Status PubMed-not-MEDLINE Language English Country Switzerland Media electronic

Document type Journal Article

In Colombia, the first case of COVID-19 was confirmed on 6 March 2020. As of 13 March 2023, Colombia had registered 6,360,780 confirmed positive cases of COVID-19, representing 12.18% of the total population. In 2020, the National Administrative Department of Statistics (DANE) in Colombia published a COVID-19 vulnerability index, which estimates the vulnerability (per city block) to COVID-19 infection. However, DANE considered only demographic and health factors and left out others that the related literature associates with an increased risk of COVID-19, such as environmental and mobility data. The proposed multidimensional index considers variables of different types (unemployment rate, gross domestic product, citizens' mobility, vaccination data, and climatological and spatial information), with which the incidence of COVID-19 is calculated and compared with the incidence obtained from the COVID-19 vulnerability index provided by DANE. The collection, data preparation, modeling, and evaluation phases of the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology were followed to construct the index. The multidimensional index was evaluated using multiple machine learning models to calculate the incidence of COVID-19 cases in the main cities of Colombia. The results showed that the best-performing model for predicting the incidence of COVID-19 in Colombia was the Extra Trees Regressor, which obtained an R-squared of 0.829. This work is a first step toward a multidimensional analysis of COVID-19 risk factors, which has the potential to support decision making in public health programs. The results are also relevant for calculating vulnerability indexes for other viral diseases, such as dengue.
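
The R-squared reported in the abstract is the coefficient of determination: the share of variance in the observed incidence that the model's predictions account for. A minimal sketch of how such a score could be computed follows. The function name `r_squared` and the incidence values are assumptions made up for illustration; they are not data or code from the study.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = fmean(observed)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical incidence values (cases per 100,000), for illustration only.
observed = [120.0, 95.0, 210.0, 160.0, 80.0]
predicted = [110.0, 100.0, 190.0, 170.0, 90.0]
score = r_squared(observed, predicted)
```

Under this definition a perfect prediction scores 1.0 and always predicting the observed mean scores 0.0, so a value such as 0.829 indicates predictions that capture most, but not all, of the variation in incidence. A model that does worse than the mean yields a negative score.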
