Clustering calf growth curves using quantile regression and unsupervised learning
Abstract
The study of growth characteristics can be crucial to the profitability of animal and plant production. An important consideration when modeling growth is the potential presence of heterogeneous variances across the sample. Quantile Regression (QR) imposes no distributional assumptions on the model error, such as normality or constant variance, which makes it an attractive alternative to conventional regression models. By fitting several quantiles, it can also provide more information about the relationship between the independent variable and the response. This study analyzes the weights, in kilograms, of 28 calves recorded over the 26 weeks after birth. The objectives were to examine QR as an alternative to conventional methods for growth data in the presence of asymmetry and heterogeneous residual variances, and to use it to classify animals into groups with different growth patterns. The clusters obtained by QR were then compared with clusters obtained by unsupervised machine learning algorithms, which are now widely used. QR proved to be a more robust alternative to conventional regression models and produced clusters that are competitive with those from unsupervised machine learning algorithms. It can therefore be recommended both for inference and as a reference for clustering.
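As a rough illustration of the idea of grouping animals by quantile curves, the sketch below fits straight-line quantile curves to pooled weights and assigns each calf to the curve closest to its own records. It is not the authors' procedure: the linear growth model, the quantile levels 0.25, 0.50 and 0.75, the nearest-curve rule, the toy weights, and the names `check_loss`, `fit_linear_quantile` and `mean_abs_distance` are all assumptions made for illustration.

```python
from itertools import combinations


def check_loss(residual, tau):
    """Check (pinball) loss that quantile regression minimises."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def fit_linear_quantile(xs, ys, tau):
    """Fit y = a + b*x at quantile level tau by exhaustive search.

    With two coefficients, an optimal line always passes through two
    observations with distinct x, so trying every such pair is exact.
    This is only practical for small samples (assumed here).
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a valid candidate
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        loss = sum(check_loss(y - (a + b * x), tau) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]


# Hypothetical, noise-free toy data (NOT the study's 28 calves):
# (birth weight in kg, weekly gain in kg) for five calves.
weeks = [0, 4, 8, 12, 16]
growth = {"A": (30.0, 3.0), "B": (32.0, 3.5), "C": (34.0, 4.0),
          "D": (36.0, 4.5), "E": (38.0, 5.0)}
weights = {calf: [w0 + gain * t for t in weeks]
           for calf, (w0, gain) in growth.items()}

# Pool all observations and fit one curve per quantile level.
xs = [t for _ in weights for t in weeks]
ys = [y for series in weights.values() for y in series]
taus = (0.25, 0.50, 0.75)
lines = {tau: fit_linear_quantile(xs, ys, tau) for tau in taus}


def mean_abs_distance(series, line):
    """Mean absolute gap between one calf's weights and a fitted line."""
    a, b = line
    return sum(abs(y - (a + b * t)) for t, y in zip(weeks, series)) / len(weeks)


# Assign each calf to the quantile curve closest to its own records.
groups = {calf: min(taus, key=lambda tau: mean_abs_distance(series, lines[tau]))
          for calf, series in weights.items()}

for calf, tau in groups.items():
    print(calf, tau)
```

On this toy data the slowest calves fall in the 0.25 group and the fastest in the 0.75 group. Labels obtained this way could then be cross-tabulated against the labels from an unsupervised method such as k-means; a real analysis would more likely use a dedicated quantile regression routine and possibly a nonlinear growth model.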
Article details

This work is licensed under a Creative Commons Attribution 4.0 International License.