Application of Bayesian additive regression trees in the development of credit scoring models in Brazil

Abstract

Paper aims

This paper compares the performance of Bayesian additive regression trees (BART), Random Forest (RF), and the logistic regression model (LRM) in the development of credit scoring models.

Originality

The BART methodology is rarely used to analyze credit scoring data. The database, provided by Serasa-Experian, contains information on direct retail consumer credit operations. Credit bureau variables are also rarely used in academic papers.

Research method

Several models were fitted and their performances were compared using standard methods.

Main findings

The analysis confirms that the BART model outperforms the LRM on the analyzed data. RF outperformed the LRM only on the balanced sample. The best-fitted BART model outperformed RF.

Implications for theory and practice

The paper suggests that using BART or RF may yield better results in credit scoring modelling.

Keywords
Credit; Machine learning; Logistic regression; BART; Random Forest

Associação Brasileira de Engenharia de Produção, Av. Prof. Almeida Prado, Travessa 2, 128 - 2º andar - Room 231, 05508-900 São Paulo - SP - Brazil
E-mail: production@editoracubo.com.br