Brazilian Journal of Biometrics http://200.131.250.9/index.php/BBJ <p class="western" align="justify"><strong><span style="font-family: Arial;"><span style="font-family: Arial,serif;"><span lang="en-US">Promoting the development and application of statistical and data science methods to biological sciences. </span></span></span></strong><span style="font-family: Arial;"><span style="font-family: Arial,serif;"><span lang="en-US">The general objective of the journal is to publish original research papers that explore, promote and extend <span class="fontstyle0">statistical, mathematical and data science </span>methods in applied biological sciences.</span></span></span><span style="font-family: Arial;"><span style="font-family: Arial,serif;"><span lang="en-US"><br /></span></span></span></p> <p class="western" align="justify"><span style="font-family: Arial;"><span style="font-family: Arial,serif;"><span lang="en-US">Brazilian Journal of Biometrics is the official journal of the <a href="http://www.rbras.org.br/" target="_blank" rel="noopener">Brazilian Region of the International Biometric Society (RBras)</a>.</span></span></span></p> en-US <p><strong>Authors who publish with this journal agree to the following terms:</strong><br /><br /></p> <ol type="a"> <ol type="a"> <li><strong>Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a <a href="http://creativecommons.org/licenses/by/3.0/" target="_new">Creative Commons Attribution License</a> that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.</strong></li> <li><strong>Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.</strong></li> 
<li><strong>Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See <a href="http://opcit.eprints.org/oacitation-biblio.html" target="_new">The Effect of Open Access</a>).</strong></li> </ol> </ol> tales.jfernandes@ufla.br (Tales Jesus Fernandes) scalon@ufla.br (João Domingos Scalon) Fri, 30 Aug 2024 16:26:43 -0300 OJS 3.3.0.10 http://blogs.law.harvard.edu/tech/rss 60 Improving class probability estimates in asymmetric health data classification: An experimental comparison of novel calibration methods http://200.131.250.9/index.php/BBJ/article/view/684 <p>In the context of health data classification, imbalanced and asymmetric class distributions can significantly impact the performance of machine learning models. One critical aspect affected by these issues is the reliability of class probability estimates, which are crucial for informed decision-making in healthcare applications. Instead of predicting class values directly for a classification problem, it can be more convenient to predict the probability of an observation belonging to each possible class. This research aims to address the challenges posed by imbalanced and asymmetric responses in health data classification by evaluating the effectiveness of recent calibration methods in improving class probability estimates. We propose Beta calibration techniques and the Stratified Brier score and Jaccard's Score as novel calibration methods and evaluation metrics respectively. The experimental comparison involves implementing and assessing various calibration techniques to determine their impact on model performance and calibration accuracy of simulated and healthcare datasets with varying imbalance ratios. 
Our results show that the Beta calibration method consistently improved the classifiers' predictive ability. The findings of this study provide valuable insights into selecting the most suitable calibration method for enhancing class probability estimates in healthcare-related machine learning tasks.</p> Olushina Olawale Awe, Babatunde Adebola Adedeji, Ronaldo Dias Copyright (c) 2024 Olushina Olawale Awe, Babatunde Adebola Adedeji, Ronaldo Dias https://creativecommons.org/licenses/by/4.0 http://200.131.250.9/index.php/BBJ/article/view/684 Fri, 30 Aug 2024 00:00:00 -0300 Stratified sampling for roots biomass quantification in shifting cultivation in Amazon Brazil http://200.131.250.9/index.php/BBJ/article/view/663 <p>Several countries have been paying attention to carbon stocks and balances in the soil, a characteristic related to land management and use. Among the biomes that contribute greatly to the maintenance of these stocks, the Amazon biome, which has great diversity by area, stands out. With the advances in markets aimed at buying carbon credits, estimates of the values of these stocks are highly susceptible to the intrinsic characteristics of the location. In order to solve these problems, several soil sampling techniques have been used to estimate these values. However, soil sampling techniques vary greatly in the amount of soil sampled, directly impacting the values of these estimates. In this sense, the present work aims to evaluate the point and interval estimates of carbon stocks in the soil in a peripheral region of the Brazilian Amazon, in the state of Maranhão. For this, three soil sampling techniques were compared: the large monolith (LM), the small monolith (SM) and the auger (RA). The efficiency of a stratified sampling plan (STR), in which the different sampled depths are treated as strata, was compared with that of simple random sampling (SRS), and the amplitudes of its intervals were compared with those obtained by simulation with the Bootstrap technique. 
The samples were washed and separated into &lt; 2 mm and &gt; 2 mm fractions for two biological groups (babassu roots and other roots). For interval estimates with the LM collection method, roots larger than 2 mm have a total of 2.56 to 4.62 t ha⁻¹, and for smaller roots, 1.67 to 4.33 t ha⁻¹. As for babassu roots, these values ranged from 0.38 to 1.44 t ha⁻¹ and those smaller than 2 mm from 0.86 to 2.43 t ha⁻¹. In contrast, the LM collection method can be replaced by SM and RA only for thick roots (&gt; 2 mm). Regarding the STR sampling plan, the variance of the total was reduced in relation to the SRS. The bootstrap technique reduced the amplitude of the intervals for the total, showing an improvement in accuracy. Therefore, estimates of carbon stocks can be made with the RA method for stored carbon, but for carbon that will return to the atmosphere the LM method is the most suitable.</p> João Thiago Rodrigues de Sousa, Mike Lovatto, Marllon Fernando, Santiago Germán Delgado, Idemauro Antonio Rodrigues de Lara Copyright (c) 2024 João Thiago Rodrigues de Sousa, Mike Lovatto, Marllon Fernando, Santiago Germán Delgado, Idemauro Antonio Rodrigues de Lara https://creativecommons.org/licenses/by/4.0 http://200.131.250.9/index.php/BBJ/article/view/663 Fri, 30 Aug 2024 00:00:00 -0300 Preliminary estimators of population mean using ranked set sampling in the presence of measurement error and non-response error with applications and simulation study http://200.131.250.9/index.php/BBJ/article/view/702 <p>In order to estimate the population mean in the presence of both non-response and measurement errors that are uncorrelated, the paper presents some novel estimators employing ranked set sampling by utilizing auxiliary information. 
Up to the first order of approximation, the equations for the bias and mean squared error of the suggested estimators are derived, and it is found that the proposed estimators outperform the other existing estimators analysed in this study. Investigations using simulation studies and numerical examples show how well the suggested estimators perform in the presence of measurement and non-response errors. The relative efficiency of the suggested estimators compared to the existing estimators has been expressed as a percentage, and the impact of measurement errors has been expressed as a percentage computation of measurement errors.</p> Rajesh Singh, Anamika Kumari Copyright (c) 2024 Rajesh Singh, Anamika Kumari https://creativecommons.org/licenses/by/4.0 http://200.131.250.9/index.php/BBJ/article/view/702 Fri, 30 Aug 2024 00:00:00 -0300 Regression models applied to rhizosphere data: A bibliometric review http://200.131.250.9/index.php/BBJ/article/view/692 <p>The interaction of soil with plant roots in the rhizosphere plays an important role in various ecosystem services and food production, and it has been the focus of numerous studies. In turn, statistical modeling can aid in a more comprehensive understanding of this interaction, such as the application of regression models to rhizosphere data. Thus, the main objective of this work was to develop the first bibliometric analysis on regression models applied to rhizosphere data. Bibliometric data were obtained from Web of Science and Scopus databases. We used the topic search query ((“Rhizosphere”) AND (“Regression models” OR “Regression model” OR “Generalized Linear Models” OR “Generalized Linear Model”)) to search for scientific articles that contain these keywords in their title, abstract, or keywords. The search encompassed articles published from 1900 to 2022, resulting in 34 articles, with the earliest record dating back to 1985. 
While studies of the rhizosphere are increasing, few studies apply regression models to their data. The use of more advanced techniques, such as Generalized Linear Models (GLM), Artificial Neural Network (ANN), Random Forest Model (RFM), Support Vector Machines (SVM), and Generalized Boosted Regression Modeling (GBM), became evident from 2016 onwards, which was associated with the computational advances and the development of artificial intelligence. Some articles demonstrated that the use of more robust models can provide more meaningful results to the researcher. Only one article was published in a journal dedicated to statistics, highlighting the diffusion of regression models into various fields. Collaborations involving co-authorship between researchers from different countries have led to higher citation rates, increasing the importance of the research to the scientific community. Perhaps one of the most notable limitations to increasing research using regression models is the absence of a statistician or researcher within the research groups who is well versed in statistical models and procedures.</p> Aline Martineli Batista, Fábio Prataviera Copyright (c) 2024 Aline Martineli Batista, Fábio Prataviera https://creativecommons.org/licenses/by/4.0 http://200.131.250.9/index.php/BBJ/article/view/692 Fri, 30 Aug 2024 00:00:00 -0300 Analysis of stunting in East Java, Indonesia using random forest and geographically weighted random forest regression http://200.131.250.9/index.php/BBJ/article/view/679 <p>Stunting is one of the problems that the world is currently focused on resolving urgently. The World Health Organization (WHO) stipulates that a country’s public health problem is considered chronic if the stunting prevalence rate exceeds 20%. The prevalence rate of stunting in Indonesia in 2021 reached 24.4%. 
This study aims to analyze factors that correlate with the prevalence of stunting in East Java Province using two machine learning methods: Random Forest Regression (RFR) and Geographically Weighted Random Forest (GWRF). Based on the RFR method, the factors that correlate with the prevalence of stunting are the number of babies who get early breastfeeding initiation, the number of malnourished toddlers, and the number of active integrated health posts. The RFR method yields an RMSE of 3.014, a MAPE of 11.69%, and an R² of 0.8168. The factors that correlate with the prevalence of stunting based on the GWRF method are divided into six groups according to the similarity of factors that correlate with stunting in the regency/city. The GWRF method gives better results than the RFR, as indicated by an RMSE of 1.023, a MAPE of 4.45%, and an R² of 0.9788.</p> Yuliani Setia Dewi, Silvia Hastuti, Mohamad Fatekurohman Copyright (c) 2024 Yuliani Setia Dewi, Silvia Hastuti, Mohamad Fatekurohman https://creativecommons.org/licenses/by/4.0 http://200.131.250.9/index.php/BBJ/article/view/679 Fri, 30 Aug 2024 00:00:00 -0300 A comprehensive statistical analysis of Malaria dynamics in the Adamawa region of Cameroon, from 2018 to 2022 http://200.131.250.9/index.php/BBJ/article/view/703 <p>Malaria remains a prominent public health concern in Cameroon, with the potential for epidemic outbreaks, necessitating a robust understanding of its dynamics. This paper uses routinely collected surveillance data from health facilities in the Adamawa Region since January 2018. By applying statistical analysis, this study aims to enhance comprehension, enable data predictions, and facilitate informed decision-making for public health policy implementation. Focusing on weekly health district data spanning from 2018 to 2022, our analysis employs key statistical metrics for central tendency, data spread, distribution shape, and variable dependence. 
The study reveals distinctive trends, highlighting peak malaria transmission periods consistently occurring between August and November each year. The highest weekly recorded case count in any health district reached 1,294. The data exhibit leptokurtic distributions, skewed to the left of the median. In 2022, 11% of the population was reported to have contracted malaria. Despite an overall region-wide average growth rate of -1.21% over the past five years, maintaining vigilant attention to this critical health issue is imperative. Auto dependence analysis indicates that observations are weekly correlated, assuming the time series is stationary. The stationarity has been confirmed by the ADF and KPSS tests that we performed. This comprehensive data analysis helps our understanding of the malaria landscape in the Adamawa Region of Cameroon. The paper also recommends the inclusion of additional variables in data collection for a more holistic perspective. These findings provide a basis for the formulation and implementation of targeted interventions by relevant stakeholders, aiding in the prediction of future cases and ultimately contributing to the effective management of malaria in the region.</p> Apollinaire Batoure Bamana, Ezekiel Dangbe, Hamadjam Abboubakar, Mahdi Shafiee Kamalabad Copyright (c) 2024 Apollinaire Batoure Bamana, Ezekiel Dangbe, Hamadjam Abboubakar, Mahdi Shafiee Kamalabad https://creativecommons.org/licenses/by/4.0 http://200.131.250.9/index.php/BBJ/article/view/703 Fri, 30 Aug 2024 00:00:00 -0300 Bayesian modeling of the Gompertz curve for meat quails growth data considering different error distributions http://200.131.250.9/index.php/BBJ/article/view/699 <p>This study applied the Gompertz model to quail growth data, assuming symmetric and asymmetric homoscedastic and heteroscedastic error distributions (Normal, Student's t, Skew normal, and Skew t), under a Bayesian framework. 
Model selection criteria included the Bayesian Deviance Information Criterion (DIC) and the analysis of residual standard deviation (σ), as well as graphical assessment of the fit. For both the homoscedastic error structures (males: DIC=7.186; σ=10.73; females: DIC=5.572; σ=11.88) and the heteroscedastic structures (males: DIC=6.493; σ=0.795; females: DIC=4.405; σ=0.824), the best fits were obtained by considering the Skew t distribution for errors. In homoscedastic fits, significant residual asymmetry (λ) was observed only for female quails (CI(λ)=[-8.039;-0.340]), whereas in heteroscedastic fits, the parameter was not significant for either sex. Additionally, heteroscedasticity (δ) captured in the fits was significant for both sexes (males: CI(δ)=[1.66;2.13] and females: CI(δ)=[1.80;2.26]). Understanding animal growth is crucial for optimizing management and feeding practices, reducing time and costs in production. In this case, the use of nonlinear models considering heteroscedastic and/or asymmetric residual structures contributes to greater accuracy in decision-making.</p> Mateus Zubioli Faccin, Robson Marcelo Rossi Copyright (c) 2024 Mateus Zubioli Faccin, Robson Marcelo Rossi https://creativecommons.org/licenses/by/4.0 http://200.131.250.9/index.php/BBJ/article/view/699 Fri, 30 Aug 2024 00:00:00 -0300