Socioeconomic factors and machine learning algorithms applied to neglected diseases risk prediction

Case study in the municipalities of the Goiás State and Federal District, Brazil

Authors

T. Gioia, Juliana Ramalho Barros, Renato Rodrigues da Silva

DOI:

https://doi.org/10.18055/Finis28635

Abstract

Analyzing the relationship between socioeconomic variables and neglected tropical diseases can help managers design public policies to reduce case numbers. This study used machine learning algorithms to evaluate which socioeconomic variables are most important for the risk classification of three neglected diseases: leprosy, cutaneous leishmaniasis, and dengue. Three decision-tree-based algorithms were evaluated: Random Forest (RF), XGBoost, and C5.0. The study area comprised the municipalities of the state of Goiás and the Federal District, Brazil. For the dengue risk classes, both RF and XGBoost achieved accuracy above 0.6, and both highlighted low-income conditions, literacy, and race as the most important predictive variables. For the leprosy risk classes, all three algorithms achieved accuracy above 0.6 and indicated water supply, literacy, race, and housing as important variables. For the tegumentary leishmaniasis risk classes, the algorithms achieved accuracy below 0.4, which made it unfeasible to evaluate possible predictive variables for the model. The three algorithms showed similar predictive performance, with RF slightly ahead. The most important socioeconomic variables for predicting the dengue and leprosy risk classes were similar.
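
To make the idea of ranking predictive variables concrete, the following is a minimal sketch, not the authors' pipeline. The abstract does not say which importance measure the study used, so this example swaps in permutation importance (the drop in accuracy when one variable is shuffled) applied to a depth-1 decision tree. The data are synthetic, and the feature names (`low_income_share`, `illiteracy_rate`, `unrelated_noise`) and the two risk labels are hypothetical placeholders.

```python
import random

def accuracy(y_true, y_pred):
    """Share of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_stump(X, y, labels=("low", "high")):
    """Fit a depth-1 decision tree: one feature, one threshold."""
    best_acc, best_rule = -1.0, None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            # Try both orientations of the split.
            for below, above in (labels, labels[::-1]):
                pred = [above if row[j] > t else below for row in X]
                acc = accuracy(y, pred)
                if acc > best_acc:
                    best_acc, best_rule = acc, (j, t, below, above)
    j, t, below, above = best_rule
    return lambda row: above if row[j] > t else below

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when a single column is shuffled."""
    rng = random.Random(seed)
    base = accuracy(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(base - accuracy(y, [predict(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return base, importances

# Synthetic "municipalities": hypothetical features, NOT the study's data.
FEATURES = ["low_income_share", "illiteracy_rate", "unrelated_noise"]
rng = random.Random(42)
X = [[rng.random() for _ in FEATURES] for _ in range(200)]
# Toy assumption: risk depends only on the first feature.
y = ["high" if row[0] > 0.5 else "low" for row in X]

model = fit_stump(X, y)
base, importances = permutation_importance(model, X, y)

print(f"accuracy on the toy data: {base:.2f}")
for name, imp in sorted(zip(FEATURES, importances), key=lambda p: -p[1]):
    print(f"{name}: {imp:.3f}")
```

For brevity the sketch scores the model on the same rows it was fitted on; a real analysis would measure accuracy and importance on held-out municipalities. With a fitted RF, XGBoost, or C5.0 model, only the `predict` callable would need to change.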

Published

2022-12-30

How to Cite

Gioia, T., Juliana Ramalho Barros, & Renato Rodrigues da Silva. (2022). Socioeconomic factors and machine learning algorithms applied to neglected diseases risk prediction: Case study in the municipalities of the Goiás State and Federal District, Brazil. Finisterra, 57(121), 109–123. https://doi.org/10.18055/Finis28635

Issue

Vol. 57 No. 121 (2022)

Section

Thematic Section - Health and territory in a post-pandemic context