The reasons why the Regression Tree Method is more suitable than General Linear Model to analyze complex educational datasets
DOI:
https://doi.org/10.21814/rpe.18044
Keywords:
Regression tree model, General linear model, National Exam of Upper Secondary Education (ENEM), Complex datasets
Abstract
Any quantitative method is shaped by rules or assumptions that constitute its rationale. These assumptions determine the conditions and constraints under which evidence can be constructed. In this article, we argue that the rationale of the Regression Tree Method is more suitable than that of the General Linear Model for analyzing complex educational datasets. We also apply the CART algorithm of the Regression Tree Method and Multiple Linear Regression to a model with 53 predictors, taking as the outcome the students' reading scores in the 2011 edition of the National Exam of Upper Secondary Education (ENEM; N = 3,670,089), which is a complex educational dataset. This empirical comparison illustrates how the Regression Tree Method is better suited than the General Linear Model to furnishing evidence about non-linear relationships and to dealing with ordinal variables and nominal variables with many categories. We conclude that the Regression Tree Method constructs better evidence about the relationships between the predictors and the outcome in complex datasets.
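The contrast the abstract draws for non-linear relationships can be illustrated with a small sketch. This is not the authors' analysis: the data are synthetic, there is a single hypothetical predictor rather than 53, and the tree is a minimal CART-style implementation written for illustration only. All names (`fit_tree`, `fit_line`, `r_squared`) and the simulated score pattern are assumptions.

```python
import random
from statistics import mean

def sse(ys):
    """Sum of squared deviations from the mean of ys."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys, min_leaf):
    """Threshold minimizing the children's total SSE, or None if no valid split."""
    best_cost, best_t = None, None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        cost = sse(left) + sse(right)
        if best_cost is None or cost < best_cost:
            best_cost, best_t = cost, t
    return best_t

def fit_tree(xs, ys, depth=2, min_leaf=10):
    """Grow a small regression tree; a leaf is the mean outcome of its cases."""
    t = best_split(xs, ys, min_leaf) if depth > 0 else None
    if t is None:
        return mean(ys)
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    return (t, fit_tree(*zip(*left), depth - 1, min_leaf),
               fit_tree(*zip(*right), depth - 1, min_leaf))

def predict_tree(node, x):
    """Follow the splits down to a leaf and return its mean."""
    while isinstance(node, tuple):
        t, left, right = node
        node = left if x <= t else right
    return node

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def r_squared(ys, preds):
    """Proportion of outcome variance explained by the predictions."""
    m = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - m) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic, assumed pattern: scores are higher only in the middle range of x.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(500)]
ys = [500 + (50 if 3 < x < 7 else 0) + rng.gauss(0, 5) for x in xs]

tree = fit_tree(xs, ys)
intercept, slope = fit_line(xs, ys)
r2_tree = r_squared(ys, [predict_tree(tree, x) for x in xs])
r2_line = r_squared(ys, [intercept + slope * x for x in xs])
print(f"tree R^2 = {r2_tree:.3f}, linear R^2 = {r2_line:.3f}")
```

Because the simulated relationship rises and then falls, a single straight line explains almost none of the variance, while two levels of splits are enough to isolate the middle range. In-sample R² is used here only to keep the sketch short; a real comparison would evaluate both models on held-out data.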
License
(c) All rights reserved Portuguese Journal of Education 2021
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license.
1. Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution 4.0 CC-BY-SA License, which allows the work to be shared with acknowledgement of authorship and of its initial publication in this journal;
2. Authors are authorized to enter into separate, additional contracts for non-exclusive distribution of the version of the work published in this journal (e.g., depositing it in an institutional repository or publishing it as a book chapter), with acknowledgement of authorship and of its initial publication in this journal;
3. Authors are permitted and encouraged to publish and distribute their work online (e.g., in institutional repositories or on their personal page), since this can increase the impact and citation of the published work (see The Effect of Open Access).