Acessibilidade / Reportar erro

Genome-enabled prediction through quantile random forest for complex traits

Predição genômica por meio do random forest quantílitico para características complexas

ABSTRACT:

Quantile Random Forest (QRF) is a non-parametric methodology that combines the advantages of Random Forest (RF) and Quantile Regression (QR). Specifically, this approach can explore non-linear functions, determining the probability distribution of a response variable and extracting information from different quantiles instead of just predicting the mean. This evaluated the performance of the QRF in the genomic prediction for complex traits (epistasis and dominance). In addition, compare the accuracies obtained with those derived from the G-BLUP. The simulation created an F2 population with 1,000 individuals and genotyped for 4,010 SNP markers. Besides, twelve traits were simulated from a model considering additive and non-additive effects, QTL (Quantitative trait loci) numbers ranging from eight to 120, and heritability of 0.3, 0.5, or 0.8. For training and validation, the 5-fold cross-validation approach was used. For each fold, the accuracies of all the proposed models were calculated: QRF in five different quantiles and three G-BLUP models (additive effect, additive and epistatic effects, additive and dominant effects). Finally, the predictive performance of these methodologies was compared. In all scenarios, the QRF accuracies were equal to or greater than the methodologies evaluated and proved to be an alternative tool to predict genetic values in complex traits.

Key words:
genomic selection; accuracy; epistasis; dominance; prediction

Universidade Federal de Santa Maria Universidade Federal de Santa Maria, Centro de Ciências Rurais , 97105-900 Santa Maria RS Brazil , Tel.: +55 55 3220-8698 , Fax: +55 55 3220-8695 - Santa Maria - RS - Brazil
E-mail: cienciarural@mail.ufsm.br