Analysis of stunting in East Java, Indonesia using random forest and geographically weighted random forest regression

Yuliani Setia Dewi
https://orcid.org/0000-0002-1831-9082
Silvia Hastuti
https://orcid.org/0009-0009-7055-7457
Mohamad Fatekurohman

Abstract

Stunting is a global public health priority. The World Health Organization (WHO) classifies a country’s public health problem as chronic when the stunting prevalence rate exceeds 20%. In Indonesia, the prevalence of stunting reached 24.4% in 2021. This study analyzes the factors that correlate with stunting prevalence in East Java Province using two machine learning methods: Random Forest Regression (RFR) and Geographically Weighted Random Forest (GWRF). Under the RFR method, the factors that correlate with stunting prevalence are the number of babies who receive early breastfeeding initiation, the number of malnourished toddlers, and the number of active integrated health posts. The RFR model yields an RMSE of 3.014, a MAPE of 11.69%, and an R² of 0.8168. Under the GWRF method, the regencies/cities fall into six groups according to the similarity of the factors that correlate with stunting in each. The GWRF method performs better than RFR, with an RMSE of 1.023, a MAPE of 4.45%, and an R² of 0.9788.
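
The three accuracy measures reported above (RMSE, MAPE, and R²) can be illustrated with a minimal sketch. It assumes the standard textbook definitions, with R² taken as one minus the ratio of residual to total sum of squares; the abstract does not state which formulas the authors used. The function names and the prevalence values are hypothetical and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the response."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    relative_errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative_errors) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical stunting prevalence values (%) for four regions; NOT study data.
actual = [20.0, 25.0, 30.0, 15.0]
predicted = [22.0, 24.0, 27.0, 16.0]

print(f"RMSE = {rmse(actual, predicted):.3f}")
print(f"MAPE = {mape(actual, predicted):.2f}%")
print(f"R2   = {r_squared(actual, predicted):.4f}")
```

A lower RMSE and MAPE together with a higher R² indicate a closer fit, which is the basis on which the abstract ranks GWRF above RFR.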

Article Details

How to Cite
Dewi, Y. S., Hastuti, S., & Fatekurohman, M. (2024). Analysis of stunting in East Java, Indonesia using random forest and geographically weighted random forest regression. Brazilian Journal of Biometrics, 42(3), 213–224. https://doi.org/10.28951/bjb.v42i3.679
Section
Articles
