The reasons why the Regression Tree Method is more suitable than General Linear Model to analyze complex educational datasets

Authors

Gomes, C. M. A.; Lemos, G. C.; Jelihovschi, E. G.

DOI:

https://doi.org/10.21814/rpe.18044

Keywords:

Regression tree model, General linear model, National Exam of Upper Secondary Education (ENEM), Complex datasets

Abstract

Every quantitative method is shaped by rules and assumptions that constitute its rationale. These assumptions determine the conditions and constraints under which evidence can be constructed. In this article, we argue that the rationale of the Regression Tree Method is better suited than that of the General Linear Model to analyzing complex educational datasets. We also apply the CART algorithm of the Regression Tree Method and Multiple Linear Regression to a model with 53 predictors, taking as the outcome students' reading scores in the 2011 edition of the National Exam of Upper Secondary Education (ENEM; N = 3,670,089), which is a complex educational dataset. This empirical comparison illustrates how the Regression Tree Method is better suited than the General Linear Model to furnishing evidence about non-linear relationships and to handling ordinal variables and nominal variables with many categories. We conclude that the Regression Tree Method constructs better evidence about the relationships between the predictors and the outcome in complex datasets.
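The contrast between the two rationales can be illustrated with a minimal sketch. It is not the authors' analysis and does not use the ENEM data: the ten data points, the step at x = 5, and the helper names `sse`, `best_split`, and `fit_line` are all hypothetical, chosen only to show how a single CART-style split captures a non-linear (step-shaped) relationship that a straight line cannot.

```python
# Illustrative sketch only: synthetic data, not the ENEM dataset.
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """CART-style search: pick the threshold on x that minimizes total SSE."""
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (t, total, mean(left), mean(right))
    return best


def fit_line(xs, ys):
    """Ordinary least squares with a single predictor: (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical step-shaped relationship: the score jumps once x reaches 5.
xs = list(range(10))
ys = [400.0 if x < 5 else 600.0 for x in xs]

threshold, tree_sse, left_mean, right_mean = best_split(xs, ys)
intercept, slope = fit_line(xs, ys)
line_sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

print("tree split at x >=", threshold, "with SSE", tree_sse)
print("linear fit SSE", round(line_sse, 1))
```

On this toy data the single split at x = 5 reproduces the outcome exactly, while the fitted line leaves a large residual error. A full CART implementation repeats the same search recursively over all predictors and then prunes the tree; the sketch shows only one split on one predictor.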

Published

2021-12-30

How to Cite

Gomes, C. M. A., Lemos, G. C., & Jelihovschi, E. G. (2021). The reasons why the Regression Tree Method is more suitable than General Linear Model to analyze complex educational datasets. Portuguese Journal of Education, 34(2), 42–63. https://doi.org/10.21814/rpe.18044

Section

Articles