Tree based algorithms: implementing Conditional Inference Trees and Forest for old age Frailty Index predictions

Speaker: Carlotta Montorsi (Luxembourg Institute of Socio-Economic Research (LISER), Luxembourg)
Title: Tree based algorithms: implementing Conditional Inference Trees and Forest for old age Frailty Index predictions
Time: Wednesday, 2021.02.24, 10:00 a.m. (CET)
Place: fully virtual (contact Dr. Jakub Lengiewicz to register)
Format: 30 min. presentation + 30 min. discussion

Abstract: Tree-based algorithms are prediction algorithms introduced by Morgan and Sonquist (1963) and popularized by Breiman et al. (1984) about 20 years later. These algorithms aim at predicting an outcome “out of sample” based on a number of covariates. This is done by partitioning the space of the regressors into non-overlapping regions. When the task is regression, the predicted outcome is simply the average outcome of the units reaching each terminal node. Various methods exist for growing trees while avoiding overfitting: Conditional Inference Trees, introduced by Hothorn et al. (2006), prevent overfitting by making each split conditional on a sequence of statistical tests.
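As a rough illustration of the idea in the abstract, the Python sketch below grows a small regression tree in which a node is split only if a permutation test rejects independence between the outcome and at least one covariate. It is a simplified stand-in, not the procedure of Hothorn et al. (2006) or any library's implementation: the covariance-type test statistic, the Bonferroni correction, the function names (perm_pvalue, best_cut, grow, predict) and the synthetic data are all assumptions made for this example.

```python
import random


def avg(values):
    return sum(values) / len(values)


def perm_pvalue(x, y, rng, n_perm=200):
    """Permutation p-value for the association between covariate x and outcome y."""
    mx, my = avg(x), avg(y)
    xc = [a - mx for a in x]

    def stat(ys):
        # Absolute covariance-type statistic; the mean of y is unchanged by permutation.
        return abs(sum(a * (b - my) for a, b in zip(xc, ys)))

    observed = stat(y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += stat(shuffled) >= observed
    return (hits + 1) / (n_perm + 1)


def best_cut(x, y, min_leaf):
    """Cut point on x that minimises the squared error around the two child means."""
    pairs = sorted(zip(x, y))
    best_sse, best_c = None, None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between tied covariate values
        sse = 0.0
        for side in (pairs[:i], pairs[i:]):
            m = avg([v for _, v in side])
            sse += sum((v - m) ** 2 for _, v in side)
        if best_sse is None or sse < best_sse:
            # Cut at the largest covariate value that is sent to the left child.
            best_sse, best_c = sse, pairs[i - 1][0]
    return best_c


def grow(X, y, rng, alpha=0.05, min_leaf=10):
    """Grow a tree; stop when no covariate is significantly associated with y."""
    leaf = {"value": avg(y)}  # regression prediction: mean outcome in the node
    k = len(X[0])
    if len(y) < 2 * min_leaf:
        return leaf
    pvals = [perm_pvalue([row[j] for row in X], y, rng) for j in range(k)]
    j = min(range(k), key=lambda idx: pvals[idx])
    if min(1.0, pvals[j] * k) > alpha:  # Bonferroni-adjusted stopping rule
        return leaf
    cut = best_cut([row[j] for row in X], y, min_leaf)
    if cut is None:
        return leaf
    left = [i for i, row in enumerate(X) if row[j] <= cut]
    right = [i for i, row in enumerate(X) if row[j] > cut]
    return {
        "feature": j,
        "cut": cut,
        "left": grow([X[i] for i in left], [y[i] for i in left], rng, alpha, min_leaf),
        "right": grow([X[i] for i in right], [y[i] for i in right], rng, alpha, min_leaf),
    }


def predict(node, row):
    """Follow the splits down to a terminal node and return its mean outcome."""
    while "value" not in node:
        node = node["left"] if row[node["feature"]] <= node["cut"] else node["right"]
    return node["value"]


# Synthetic data (assumption): the outcome jumps when covariate 0 exceeds 0.5,
# while covariate 1 is pure noise.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [(1.0 if row[0] > 0.5 else 0.0) + rng.gauss(0, 0.1) for row in X]

tree = grow(X, y, rng)
print("root splits on covariate", tree["feature"], "at", round(tree["cut"], 3))
print("predictions:", predict(tree, [0.9, 0.5]), predict(tree, [0.1, 0.5]))
```

The test acts as the stopping rule here: a node that shows no significant association with any covariate becomes a terminal node, so the tree stops growing without a separate pruning step.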

In the upcoming presentation I will give a brief theoretical introduction to tree-based models, with particular focus on Conditional Inference Trees and Conditional Inference Forests. Then, I will show their implementation on real data, namely for predicting a Frailty Index of individuals aged 50+ from different European countries. Moreover, I will compare the predictive performance of these algorithms with other traditional ML methods and for different subsamples of the training set. Finally, I will discuss the “best predictors” identified by the most accurate of these algorithms in the different subsamples.
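A comparison of out-of-sample predictive performance can be sketched generically as follows: fit each candidate on a training part of the data and rank the candidates by mean squared error on a held-out part. The two candidates (a mean-only baseline and a one-covariate least-squares line), the hold-out design, the helper names and the synthetic data are assumptions for illustration only; they are not the methods or data used in the talk.

```python
import random


def fit_mean(X, y):
    """Baseline: always predict the training mean."""
    m = sum(y) / len(y)
    return lambda row: m


def fit_line(X, y):
    """Least-squares line on the first covariate only."""
    xs = [row[0] for row in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, y)) / sxx
    return lambda row: my + slope * (row[0] - mx)


def holdout_mse(fit, X_train, y_train, X_test, y_test):
    """Fit on the training part, return mean squared error on the held-out part."""
    model = fit(X_train, y_train)
    return sum((model(r) - t) ** 2 for r, t in zip(X_test, y_test)) / len(y_test)


# Synthetic data (assumption): a linear signal plus noise.
rng = random.Random(1)
X = [[rng.random()] for _ in range(300)]
y = [2.0 * row[0] + rng.gauss(0, 0.1) for row in X]
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

scores = {
    name: holdout_mse(fit, X_train, y_train, X_test, y_test)
    for name, fit in [("mean", fit_mean), ("line", fit_line)]
}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

Repeating the same comparison on different subsamples of the training data shows whether the ranking of the methods is stable.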

References:
[1] Morgan, J. N. and Sonquist, J. A. (1963). Problems in the Analysis of Survey Data, and a Proposal. Journal of the American Statistical Association, 58(302), 415–434.
[2] Breiman, L., Friedman, J., Stone, C. and Olshen, R. (1984). Classification and Regression Trees. Taylor & Francis, Belmont, https://doi.org/10.1201/9781315139470.
[3] Hothorn, T., Hornik, K. and Zeileis, A. (2006). Unbiased Recursive Partitioning: A Conditional Inference Framework. Journal of Computational and Graphical Statistics.
[4] Brunori, P. and Neidhöfer, G. (2021). The Evolution of Inequality of Opportunity in Germany: A Machine Learning Approach. Review of Income and Wealth.