
Applications of linear mixed models in Cynodon spp. Breeding

Abstract

Species of the genus Cynodon are among the most cultivated forage crops in the world due to their high yield and nutritional quality, and their use in cattle feeding has been associated with gains in animal weight and increased milk production. The objective of this study was to model covariance structures in Cynodon spp. clones and to study the changes in the ranking of the selected genotypes, since correct modeling of the covariance structure is believed to affect that ranking. A total of 202 genotypes were evaluated in an experiment conducted in an augmented block design with four replications and four harvests. The genotypes were assessed for plant height, green weight, percentage of dry matter, and plant vigor. Nineteen repeated measures models with different covariance structures were tested. The best-fitted model adopts the CORH covariance structure for the genetic effects. Correct modeling of the covariance structure affected the ranking of genotypes in all variables evaluated.

Keywords:
Covariance structure; repeated measures; forage breeding; REML/BLUP

INTRODUCTION

Characterized by its rich genetic diversity and broad adaptation to different soil types, Cynodon is a genus of warm-season grasses globally distributed in tropical, subtropical and temperate regions (Singh et al. 2023, Soares et al. 2023). In Brazil, Cynodon spp. have been widely used as pasture forage for dairy cattle. Low production cost, ease of handling, good digestibility and palatability combined with grazing tolerance and high response to fertilization make Cynodon spp. a compelling forage choice for livestock (Araújo et al. 2018, Baxter et al. 2022).

In Cynodon breeding programs, the phenotype of a given genotype is assessed through repeated measurements over the crop cycle. However, because the same individual is harvested repeatedly over time, these measurements are correlated (Kozak and Piepho 2018, Evangelista et al. 2023).

Mixed models provide a unified strategy for longitudinal data analysis that facilitates the treatment of correlated experimental information, heterogeneous variances, and unbalanced data sets (Chaves et al. 2021, Evangelista et al. 2023). Given these broad advantages, the mixed model methodology has been widely applied in perennial crop breeding programs (Faveri et al. 2015, Rocha et al. 2018, Shalizi and Isik 2019, Brito da Silva et al. 2020, Ferreira et al. 2020, Ferreira et al. 2021, Malikouski et al. 2021, Evangelista et al. 2023).

Multiple statistical approaches are adopted to predict effects and model the covariance structure and correlation between measurements. However, intermediate covariance structures may be more efficient for evaluating perennial crops. These structures can assume homogeneity or heterogeneity of variances and covariances between measures and can be applied to the several random factors in the statistical model (Mariguele et al. 2011, Faveri et al. 2015, Lara et al. 2019, Shalizi and Isik 2019, Evangelista et al. 2023).

Multivariate models are the most robust for analyzing repeated measurements, considering the correlation between crop seasons or measurements (Mariguele et al. 2011, Faveri et al. 2015, Shalizi and Isik 2019). However, fitting the model becomes challenging when considering more than three measurements due to the large number of parameters to be estimated and the high correlations between repeated measures, resulting in non-convergence in the variance component estimation process (Resende 2007, Resende et al. 2014).

The appropriate modeling of variance and covariance structures and temporal correlations between repeated measurements is essential for obtaining more accurate predictions of genetic and non-genetic effects (Faveri et al. 2015, Evangelista et al. 2023). Despite the advantages of using mixed model methodologies in perennial crop breeding, there are few studies involving the estimation of genetic parameters and longitudinal data analysis in forage species (Rocha et al. 2018, Lara et al. 2019, Ferreira et al. 2020, Ferreira et al. 2021), and none with Cynodon.
Since it is believed that the correct modeling of the covariance structure affects the ranking of elite genotypes, the objectives of this study were to (i) model covariance structures in Cynodon spp. clones to find the structure that best represents the data evaluated and (ii) study the changes in the ranking of the selected genotypes when using the simplest and the best-fitted models.

MATERIAL AND METHODS

The experiment was conducted in the experimental field of Embrapa Dairy Cattle, Valença, Rio de Janeiro, Brazil (lat 22° 14' 44" S, long 43° 42' 01" W, alt 560 m asl) in 2012 using an augmented block design with four replications. Each experimental plot consisted of one plant, and plants were spaced 0.5 m apart. A total of 197 progenies from self-fertilization of the cultivar Grama Estrela Roxa and five commercial checks were evaluated. The five checks Florona, Porto Rico, Roxa, Tifton 68 and Tifton 85 were referred to as T1, T2, T3, T4 and T5, respectively. Four harvests were made to evaluate plant height (PH), green weight (GW), percentage of dry matter (DM) and plant vigor (PV).

Plant height (cm) was obtained from the arithmetic mean of three random measurements in each plot, measured from the ground level to the curve of the last fully expanded leaf. Plant green weight (kg plot⁻¹) was assessed by cutting the plots to a 10 cm stubble height with a gasoline-powered trimmer and then weighing the hand-harvested green biomass. The percentage of dry matter (%) was obtained by sampling plants from each plot, which were dried in a forced ventilation oven at 65 °C for 72 hours. The samples were weighed again (dry weight) and the dry matter percentage was determined by the ratio between the dry weight and the fresh green weight. Phenotypic plant vigor was rated on a visual scale by three evaluators, ranging from 1 to 5, where 1 indicates low plant vigor and 5 indicates high plant vigor.

The mixed model methodology (restricted maximum likelihood/best linear unbiased prediction - REML/BLUP) was applied to estimate the variance components and to predict genotypic values, according to Patterson and Thompson (1971) and Henderson and Quaas (1976), respectively. The repeatability statistical model was given by:

$$y = Xm + Zg + Ti + Qp + e,$$

where $y$ (n × 1) is the vector of phenotypic data; $m$ (j × 1) is the vector of measurement-replication combination effects (assumed fixed), added to the overall mean; $g$ (ji × 1) is the vector of genotypic effects within measurement (assumed random), $g \sim NID(0, \sigma_g^2)$; $i$ (ij × 1) is the vector of genotype × measurement interactions (random), $i \sim NID(0, \sigma_{gm}^2)$; $p$ (k × 1) is the vector of permanent plot effects (random), $p \sim NID(0, \sigma_p^2)$; and $e$ (n × 1) is the vector of residuals (random), $e \sim NID(0, \sigma_e^2)$. $\sigma_g^2$ is the genetic variance, $\sigma_{gm}^2$ is the variance of the genotype × measurement interactions, $\sigma_p^2$ is the variance of the permanent plot effects and $\sigma_e^2$ is the residual variance. The capital letters $X$ (n × j), $Z$ (n × ji), $T$ (n × ij) and $Q$ (n × k) represent the incidence matrices for the $m$, $g$, $i$ and $p$ effects, respectively.

The covariance structures tested to model the residual effects were compound symmetry (CS), heterogeneous diagonal (DIAGH), first-order heterogeneous autoregressive (AR1H), second-order heterogeneous autoregressive (AR2H), third-order heterogeneous autoregressive (AR3H), power (PWR), heterogeneous power (PWRH) and unstructured (US). After modeling the residual effects, the permanent plot effects were modeled with the identity (IDV), DIAGH, heterogeneous compound symmetry (CORH), AR1H, AR2H and US structures. Lastly, the genotypic effect within measurement was modeled considering the DIAGH, CORH, first-order factor analytic (FA1) and US covariance structures. A total of nineteen repeated measures models with different covariance structures were tested (Table 1).
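To make the difference between some of these structures concrete, the following minimal Python sketch builds CS, DIAGH and CORH matrices for four harvests and counts their parameters. The variances and the correlation are hypothetical values chosen only for illustration, and the function names are not part of ASReml-R or of the original analysis.

```python
def compound_symmetry(n, variance, rho):
    """CS: one common variance and one common correlation (2 parameters)."""
    return [[variance if i == j else rho * variance for j in range(n)]
            for i in range(n)]


def diag_heterogeneous(variances):
    """DIAGH: one variance per harvest and no covariance (n parameters)."""
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]


def corh(variances, rho):
    """CORH: one variance per harvest plus a common correlation (n + 1 parameters)."""
    n = len(variances)
    sd = [v ** 0.5 for v in variances]
    return [[variances[i] if i == j else rho * sd[i] * sd[j] for j in range(n)]
            for i in range(n)]


def n_parameters(structure, n):
    """Number of (co)variance parameters for n measurements."""
    counts = {"CS": 2, "DIAGH": n, "CORH": n + 1, "US": n * (n + 1) // 2}
    return counts[structure]


# Hypothetical variances for four harvests and a hypothetical correlation.
harvest_variances = [4.0, 9.0, 16.0, 25.0]
common_rho = 0.5

cs_matrix = compound_symmetry(4, 9.0, common_rho)
diagh_matrix = diag_heterogeneous(harvest_variances)
corh_matrix = corh(harvest_variances, common_rho)

for row in corh_matrix:
    print(row)
```

With four harvests the unstructured matrix needs 10 parameters, against 5 for CORH, which is one reason intermediate structures are easier to fit.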

Table 1
Values obtained by the Akaike information criterion (AIC) for all the models tested with the different covariance matrix structures and accuracy values

For models with the same number of fixed effects, the residual likelihoods are comparable and therefore information criteria based on the residual likelihood can be used (Verbyla 2019). The goodness-of-fit of the models to the data was tested by the Akaike information criterion (AIC) (Akaike 1974), with the lowest AIC value indicating the best-fitted model. To model the block effect, considering the nature of this effect as fixed or random, a modification of the AIC, proposed by Verbyla (2019), was adopted.
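As an illustration of the selection rule, the sketch below computes the standard AIC, $-2\log L + 2p$, for a few candidate structures and picks the lowest value. The log-likelihoods and parameter counts are hypothetical, and the code does not implement the Verbyla (2019) modification used for the block effect.

```python
def aic(log_likelihood, n_params):
    """Standard AIC from a (residual) log-likelihood and a parameter count."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Hypothetical (log-likelihood, number of variance parameters) per structure.
candidates = {
    "CS": (-1520.4, 2),
    "DIAGH": (-1512.9, 4),
    "AR1H": (-1505.1, 5),
    "US": (-1503.8, 10),
}

scores = {name: aic(ll, p) for name, (ll, p) in candidates.items()}
best = min(scores, key=scores.get)  # lowest AIC indicates the best-fitted model

for name in sorted(scores, key=scores.get):
    print(f"{name}: AIC = {scores[name]:.1f}")
print("Selected structure:", best)
```

In this made-up example the unstructured model has the highest likelihood, but its extra parameters are penalized enough that a more parsimonious structure is selected.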

The significance of the random effects of the best-fitted model was tested using the likelihood ratio test (LRT) (Rao 1973). This test evaluates the contribution of each random effect to the model by comparing the log-likelihood at convergence (L) and the deviance of the model with and without the effect being tested. The difference between the two deviances is then compared with the chi-square distribution with one degree of freedom at the 0.01 probability level.
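The test can be sketched in a few lines of Python, assuming the one-degree-of-freedom chi-square reference described above. The log-likelihood values are hypothetical; the p-value uses the identity that, for one degree of freedom, the chi-square survival function equals `erfc(sqrt(x / 2))`.

```python
import math


def likelihood_ratio_test(loglik_full, loglik_reduced):
    """LRT statistic and p-value (chi-square, 1 df) for dropping one random effect."""
    statistic = 2.0 * (loglik_full - loglik_reduced)  # difference between deviances
    statistic = max(statistic, 0.0)                   # guard against rounding noise
    p_value = math.erfc(math.sqrt(statistic / 2.0))   # chi-square(1) upper tail
    return statistic, p_value


# Hypothetical log-likelihoods with and without the genotypic effect.
lrt_statistic, lrt_p_value = likelihood_ratio_test(-1505.1, -1512.3)
is_significant = lrt_p_value < 0.01

print(f"LRT = {lrt_statistic:.2f}, p = {lrt_p_value:.2e}, significant: {is_significant}")
```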

Phenotypic variance ($\hat{\sigma}_{phen}^2$) and accuracy ($r_{\hat{g}g}$) were obtained according to Resende et al. (2014):

$$\hat{\sigma}_{phen}^2 = \hat{\sigma}_g^2 + \hat{\sigma}_p^2 + \hat{\sigma}_{gp}^2 + \hat{\sigma}_e^2,$$

$$r_{\hat{g}g} = \sqrt{1 - \frac{PEV}{\hat{\sigma}_g^2}}$$

where $\hat{\sigma}_g^2$ is the genetic variance; $\hat{\sigma}_p^2$ is the variance of the permanent plot effect; $\hat{\sigma}_{gp}^2$ is the variance of the genotype × measurement interaction and $\hat{\sigma}_e^2$ is the residual variance. PEV is the prediction error variance, extracted from the diagonal of the generalized inverse of the coefficient matrix of the mixed model equations.
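A short numerical sketch of these two quantities follows, assuming the accuracy expression is read as the square root of one minus the ratio of PEV to the genetic variance. All variance components and the PEV are hypothetical and are not taken from Table 2.

```python
import math


def phenotypic_variance(var_g, var_p, var_gp, var_e):
    """Sum of the genetic, permanent plot, interaction and residual variances."""
    return var_g + var_p + var_gp + var_e


def selective_accuracy(pev, var_g):
    """Accuracy as sqrt(1 - PEV / genetic variance)."""
    if var_g <= 0.0 or pev > var_g:
        raise ValueError("PEV must not exceed a positive genetic variance")
    return math.sqrt(1.0 - pev / var_g)


# Hypothetical variance components and prediction error variance.
var_g, var_p, var_gp, var_e = 2.0, 0.5, 0.8, 3.0
pev = 1.2

var_phen = phenotypic_variance(var_g, var_p, var_gp, var_e)
accuracy = selective_accuracy(pev, var_g)

print(f"phenotypic variance = {var_phen:.2f}")
print(f"selective accuracy = {accuracy:.2f}")
```

The smaller the PEV relative to the genetic variance, the closer the accuracy gets to 1.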

The broad-sense heritability ($\hat{H}_C^2$) was estimated according to Cullis et al. (2006):

$$\hat{H}_C^2 = 1 - \frac{PEV}{2\hat{\sigma}_g^2},$$

The concordance of the selected genotypes in each pair of measurements was calculated using the Kappa coefficient (K) (Cohen 1960), given by:

$$K = \frac{A - C}{D - C}$$

where A is the number of matching selected genotypes, D is the number of selected genotypes (20) and C is the number of genotypes expected to be selected by chance (C = bD, where b is the selection percentage). Statistical analyses were performed in Rbio (Bhering 2017) and R software (R Core Team 2024) using the ASReml-R 4.1 package (Butler et al. 2017).
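The Kappa coefficient can be computed as in the sketch below. It assumes that the selection percentage b is the ratio of the 20 selected genotypes to the 202 evaluated, which the text does not state explicitly, and the number of matches passed in is only an example.

```python
def kappa_coefficient(n_matching, n_selected, selection_fraction):
    """Concordance K = (A - C) / (D - C), with C = b * D matches expected by chance."""
    expected_by_chance = selection_fraction * n_selected
    return (n_matching - expected_by_chance) / (n_selected - expected_by_chance)


N_GENOTYPES = 202  # genotypes evaluated in the experiment
N_SELECTED = 20    # D in the formula

# Assumption: b is taken as selected / evaluated genotypes.
selection_fraction = N_SELECTED / N_GENOTYPES

# Example: 18 of the 20 selected genotypes coincide between two rankings.
kappa = kappa_coefficient(18, N_SELECTED, selection_fraction)
print(f"K = {kappa:.3f}")
```

K equals 1 when both selections coincide completely and 0 when the overlap is no better than chance.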

RESULTS AND DISCUSSION

Table 1 (models M1 and M2) shows the results obtained by the Akaike information criterion (AIC) with adaptations from Verbyla (2019), when considering the block effect as fixed or random, for the traits plant height (PH), green weight (GW), percentage of dry matter (DM) and plant vigor (PV). The AIC (Akaike 1974) is widely used to select the best-fitted model (Resende and Alves 2020, Evangelista et al. 2023). This criterion makes it possible to compare models with the same number of fixed effects, considering the maximum value of the logarithm of the likelihood function and the number of parameters estimated by the model. The modification of the AIC, proposed by Verbyla (2019), allows the comparison between models with different numbers of fixed effects.

Verbyla (2019) proposes that by partitioning the total log-likelihood into two portions, a marginal (residual) likelihood and a conditional likelihood, the models evaluated become comparable even if they have a different number of fixed effects. Thus, this modification made it possible to test whether the best-fitted model considers the block effect as fixed (M2) or as random (M1). For all the traits evaluated, the best-fitted model, according to the AIC modified by Verbyla (2019), was the one that considered the block effect as fixed. This choice corroborates Resende and Duarte (2007), who propose that if the number of blocks is less than or equal to five, it is preferable to treat this effect as fixed. Once the nature of the block effect had been defined as fixed, other covariance structures were tested for the other effects of the model, for the four traits evaluated (Table 1).

For the residual effects, the covariance structures that had the best fit were AR2H (M5) for plant height, DIAGH (M3) for dry matter percentage, AR1H for green weight, and AR3H for plant vigor. All these selected covariance structures assume heterogeneity of variances, showing the effect of the environmental conditions on the residual effects of the harvests evaluated (Faveri et al. 2015).

Heterogeneity of variances and covariances is often found in perennial crops; however, it is usually not taken into account in studies involving these crops (Acharya et al. 2020, Brito da Silva et al. 2020, Rodrigues et al. 2020). Similar results have been observed in Cynodon and other perennial forage grasses with the adoption of the compound symmetry structure, which assumes homogeneity of variances (Ferreira et al. 2020, Ferreira et al. 2021).

Three out of the four traits evaluated showed an autoregressive heterogeneous structure (ARH) for the residual modeling (Table 1). This structure adopts a serial correlation as the component of the residual covariances between crop seasons and measurements. Autoregressive structures, especially AR1H, have been used in spatial analysis, considering that, as two individuals move away from each other, the correlation between them decreases (Andrade et al. 2020, Bernardeli et al. 2021). The same concept can be incorporated into studies of repeated measures over time (Faveri et al. 2015, Verbyla et al. 2021) since, as the interval between two measurements increases, the correlation between them tends to decrease, due to the environmental factors that influence the period of each harvest and the differences associated with the plant's maturity stage.
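The decay of correlation with the interval between harvests can be seen in a minimal sketch of an AR1H matrix, in which the correlation between harvests i and j is rho raised to |i - j| and each harvest keeps its own variance. The variances and the value of rho are hypothetical.

```python
def ar1h(variances, rho):
    """AR1H covariance: cov(i, j) = sd_i * sd_j * rho ** |i - j|."""
    n = len(variances)
    sd = [v ** 0.5 for v in variances]
    return [[sd[i] * sd[j] * rho ** abs(i - j) for j in range(n)]
            for i in range(n)]


def correlation_matrix(cov):
    """Convert a covariance matrix into the corresponding correlation matrix."""
    n = len(cov)
    sd = [cov[i][i] ** 0.5 for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


# Hypothetical residual variances for four harvests and a hypothetical rho.
residual_cov = ar1h([4.0, 9.0, 16.0, 25.0], 0.6)
residual_corr = correlation_matrix(residual_cov)

# Correlation of the first harvest with harvests at lags 1, 2 and 3.
lag_correlations = residual_corr[0][1:]
print([round(r, 3) for r in lag_correlations])
```

Only one correlation parameter is estimated, yet harvests that are further apart in time are automatically less correlated.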

The use of the Akaike information criterion was effective in selecting the model that includes the appropriate covariance structure for the residual effects. The effectiveness of AIC has been shown in several previous studies (Faveri et al. 2015, Pereira et al. 2018, Cavanaugh and Neath 2019, Melo et al. 2020, Resende and Alves 2020, Evangelista et al. 2023). The best-fitted statistical model for the residual effects, identified by the lowest AIC value, was incorporated into the modeling of the permanent plot effect.

When modeling the permanent plot effect, the DIAGH structure had the best fit only for green weight, while the best-fitted model for the other traits was determined by the IDV covariance structure (Table 1). In fact, for green weight, there is likely to be heterogeneity in the variation of this effect over the crop seasons in Cynodon spp. and other perennial crops, due to the changing biotic and abiotic conditions that influence the differential expression of this trait. The better fit of a structure with homogeneous variance for the permanent plot effect in other traits can be explained by the fact that the intensity of this effect did not vary throughout the measurements of such traits.

The modeling of genotypic effects considered the best fit for the previously modeled effects. CORH was found to be the covariance structure that best fitted the model for all the traits evaluated (Table 1). This structure allows heterogeneity of variance between the measures and assumes a common correlation between them. The most complete model is the unstructured one, as it allows particular predictions for each measurement by considering each of them as a separate variable (Piepho 1997). However, this approach becomes prohibitive when the number of measurements is high (five or more), leading to the model not converging (Mariguele et al. 2011). As multiple measurements are taken on the same genotype over time, adopting the CORH structure to explain the genotypic effect makes biological sense, since a correlation between measures from the same individual is expected.

Selecting the best-fitted model also resulted in better estimates of selective accuracy for GW when compared to the simplest model, which assumes homogeneous variances (Table 1). For this trait, accuracy was 0.59 under the compound symmetry model and 0.63 under the best-fitted model. In contrast, for the other traits selective accuracy did not increase when the best-fitted model was selected. PH had an accuracy of 0.89 for the simplest model and 0.86 for the best-fitted model. DM showed accuracy values of 0.72 and 0.68 for the compound symmetry model and the heterogeneous correlation model, respectively. Finally, the selective accuracy of PV was 0.77 for M2 and 0.73 for the best-fitted model.

According to Resende and Alves (2020), the accuracy parameter has the property of informing the correct arrangement of genotypes for selection purposes, as well as inferring the reliability of genotypic values. For PH, DM and PV, accuracy did not increase when the best-fitted model was adopted (Table 1), but the accuracy value was classified as high for the simplest model (M2), which is normally used in repeatability analyses. However, it is important to consider that the accuracy values found for the compound symmetry model (M2) may be overestimated as the most appropriate structure was not used to represent the data studied, which influences the estimation of genetic and non-genetic parameters used in the calculation of selective accuracy.

For GW, which had the lowest values among the traits evaluated, the selective accuracy increased from 0.59 (M2) to 0.63 (M17) with the adoption of the best-fitted model (Table 1). For greater reliability in selecting promising genotypes, accuracy values above 0.70 are recommended (Resende and Alves 2020). Therefore, modeling the genetic and non-genetic effects for GW brought the accuracy closer to the recommended value, so that selection can be conducted together with other parameters that reflect reliability.

The significance of the genotypic variance was found in both the simplest model (M2) and the best-fitted model, which makes the selection practice valid. Considering the model selected by AIC, the best-fitted model assumes heterogeneous variances for almost all the effects (Table 2). The permanent plot effect was the only one that, after modeling, continued to adopt homogeneous variances for the PH, DM and PV traits. The fact that the permanent plot effect does not show heterogeneity of variances for most of the traits may be associated with the nature of this effect. As a genotype has constant behavior over multiple measurements, it makes sense for this variance to be constant throughout the measurements. GW showed heterogeneous variance for this same effect, with DIAGH as the best-fitting structure.

Table 2
Variance components and coefficients of determination of the simplest model and the best model selected by AIC

In general, the genotypic variance increased in most of the cuts when comparing the best-fitted model with the simplest model (Table 2). The genotypic variances differed across the measurements, showing that a model that adopts homogeneous variances, such as M2, does not represent the real nature of the data (Melo et al. 2020). In addition, modeling these matrices made it possible to identify the differential contribution of each effect to the phenotypic variance in the different harvests, which would not have been possible if only the simplest model had been considered.

The modeling of the genotypic effect was also reflected in the average heritability, which increased considerably under the best-fitted model (Table 2). According to the classification proposed by Resende and Alves (2020), heritability was of high magnitude (0.50 < h²g < 0.80) for PH and PV, indicating that selection is likely to be successful. For the DM and GW traits, even though this parameter remained in the moderate class (0.30 < h²g < 0.50), modeling these effects resulted in an important increase in heritability. Another parameter, already discussed, that supports experimental precision is accuracy (Tables 1 and 2). The accuracy values found in this study are classified as high (>0.70) according to Resende and Alves (2020) for most of the traits, which, together with the heritability, supports the reliability of selection.

The prediction of genetic values is also influenced by the model adopted and its respective covariance structures (Lara et al. 2019, Verbyla et al. 2021, Evangelista et al. 2023). This influence acts directly on the ranking of the best genotypes as well as on the set of individuals selected under each model (Table 3). For the PH trait, two of the 20 selected genotypes differed according to the model adopted. This difference was larger for the other traits evaluated, with four different genotypes among those selected for DM and GW and five for PV (Table 3). These counts correspond to differences of 10%, 20%, 20% and 25% in the group of selected genotypes for the PH, DM, GW, and PV traits, respectively.
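The percentage difference between the groups selected by two models is simple set arithmetic, sketched below. The function name and the genotype labels are hypothetical placeholders, not the genotypes listed in Table 3; the example only reproduces the case of two differing genotypes out of 20.

```python
def selection_difference(selected_a, selected_b):
    """Count and percentage of genotypes selected by model A but not by model B."""
    set_a, set_b = set(selected_a), set(selected_b)
    if len(set_a) != len(set_b):
        raise ValueError("both models must select the same number of genotypes")
    not_shared = set_a - set_b
    return len(not_shared), 100.0 * len(not_shared) / len(set_a)


# Hypothetical labels: 20 genotypes per model, 18 of them in common.
selected_m2 = [f"G{i}" for i in range(1, 21)]    # G1 ... G20
selected_m17 = [f"G{i}" for i in range(3, 23)]   # G3 ... G22

n_different, percent_different = selection_difference(selected_m2, selected_m17)
print(n_different, percent_different)  # 2 10.0
```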

Table 3
Ranking of the 20 Cynodon spp. genotypes selected based on the genetic values obtained by the simplest (M2) and best-fitted (M17) models

The selection of inferior genotypes due to the use of an inadequate model can result in reduced genetic gains. Thus, modeling genetic and non-genetic effects represents an efficient strategy for optimizing a breeding program and a powerful tool for selecting the best genotypes. Differences in predicted genetic values and changes in the ranking of selected individuals are concerns in breeding programs due to the risk of genotypes being wrongly selected. The importance of modeling covariance structures is directly associated with obtaining more reliable inferences about the reality of the data, as well as more accurate selection within a breeding program.

Modeling covariance structures across repeated measures proved to be efficient in selecting superior individuals, similar to the results found in studies involving the evaluation of different crop seasons (Melo et al. 2020, Evangelista et al. 2023). The combined use of appropriate statistical methods and high-quality phenotyping enables the selection of truly superior genotypes and, consequently, the optimization of a breeding program (Stringer et al. 2017, Melo et al. 2020, Evangelista et al. 2023).

According to the Kappa coefficient (K) (Cohen 1960), the concordances between the genotypes selected in pairs of measures, for all the traits and considering the best-fitted model, ranged from 0.78 to 0.94. The concordances between the 10% best genotypes for plant height in the four measures were: measure 1 × measure 2: 0.89; measure 1 × measure 3: 0.83; measure 1 × measure 4: 0.89; measure 2 × measure 3: 0.83; measure 2 × measure 4: 0.83; measure 3 × measure 4: 0.94. For dry matter percentage, the concordances were: measure 1 × measure 2: 0.78; measure 1 × measure 3: 0.83; measure 1 × measure 4: 0.78; measure 2 × measure 3: 0.78; measure 2 × measure 4: 0.83; measure 3 × measure 4: 0.83. For green weight, the concordances were: measure 1 × measure 2: 0.83; measure 1 × measure 3: 0.83; measure 1 × measure 4: 0.89; measure 2 × measure 3: 0.94; measure 2 × measure 4: 0.89; measure 3 × measure 4: 0.89. Lastly, for plant vigor, the concordances were: measure 1 × measure 2: 0.94; measure 1 × measure 3: 0.78; measure 1 × measure 4: 0.83; measure 2 × measure 3: 0.78; measure 2 × measure 4: 0.78; measure 3 × measure 4: 0.83.
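To show how such Kappa values arise, the sketch below computes Cohen's Kappa for the agreement between two measures that each classify genotypes as selected or not selected. It assumes 202 genotypes and 20 selected genotypes per measure (the 10% best); the function name and the numbers of shared genotypes are illustrative assumptions, since the overlaps themselves are not reported in the text.

```python
def kappa_for_selection(n_genotypes, n_selected, n_shared):
    """Cohen's Kappa for two selected/not-selected classifications.

    Both measures are assumed to select the same number of genotypes;
    n_shared is the number of genotypes selected in both measures.
    """
    selected_in_both = n_shared
    selected_in_neither = n_genotypes - 2 * n_selected + n_shared
    observed = (selected_in_both + selected_in_neither) / n_genotypes
    p_selected = n_selected / n_genotypes
    expected = p_selected ** 2 + (1 - p_selected) ** 2  # agreement by chance
    return (observed - expected) / (1 - expected)


# Assumed: 202 genotypes, 20 selected per measure, hypothetical overlaps.
for shared in (16, 17, 18, 19):
    print(shared, round(kappa_for_selection(202, 20, shared), 2))
# 16 0.78
# 17 0.83
# 18 0.89
# 19 0.94
```

Under these assumptions, the four distinct Kappa values reported above (0.78, 0.83, 0.89 and 0.94) are the values obtained when 16, 17, 18 and 19 of the 20 selected genotypes coincide between two measures.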

It can be inferred that selection can be carried out on any cut, as the performance of the best genotypes based on the average of the four measurements reveals a consistent pattern. Across measures there were only variations in the values obtained for the traits, few changes in the ranking order, and a high degree of concordance among the genotypes selected for all pairs of measurements, as evidenced by the Kappa coefficient. Given this high concordance, selection based on the average of the cuts for each trait becomes feasible. As a result, significant gains were obtained in all traits evaluated in this study. The direct gain for PH was 21.19% when selecting the 10% best genotypes. GW, DM and PV showed direct gains of 37.19%, 6.52% and 20.92%, respectively.
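A direct selection gain of this kind can be sketched as follows. The sketch assumes that the gain is the mean predicted genotypic effect of the selected genotypes expressed as a percentage of the overall trait mean, which is a common convention in REML/BLUP analyses but is not spelled out in this section. The function name and all input values are hypothetical.

```python
def selection_gain_percent(predicted_effects, overall_mean, proportion=0.10):
    """Gain (%) from selecting the top proportion of genotypes.

    predicted_effects: predicted genotypic effects (deviations from the mean).
    overall_mean: general mean of the trait.
    """
    n_selected = max(1, round(len(predicted_effects) * proportion))
    best = sorted(predicted_effects, reverse=True)[:n_selected]
    mean_effect = sum(best) / len(best)
    return 100.0 * mean_effect / overall_mean


# Hypothetical effects for 10 genotypes (they sum to zero) and a hypothetical mean.
effects = [5.0, 3.0, 1.0, 0.0, -1.0, -1.0, -2.0, -2.0, -1.0, -2.0]
gain = selection_gain_percent(effects, overall_mean=20.0, proportion=0.20)
print(gain)  # 20.0
```

With 202 genotypes and the default proportion of 0.10, the function would select 20 genotypes, which matches the selection intensity used in the text.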

Modeling covariance structures and identifying the best-fitted model generate more reliable results when estimating variance components, predicting genotypic values, and selecting superior genotypes for the traits evaluated. In addition, it is possible to make reliable selections based on the average of the four cuts, which can facilitate decision-making in Cynodon spp. breeding programs.

CONCLUSION

Model 17, with the heterogeneous compound symmetry (CORH) covariance structure, was the best-fitted model for all the traits evaluated. Using the best-fitted model changed the ranking of the selected genotypes, showing that this type of analysis should be used in breeding programs.

ACKNOWLEDGEMENTS

To Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Capes, Finance Code 001), to Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and to Fundação de Amparo à Pesquisa do Estado de Minas Gerais (Fapemig).

REFERENCES

  • Acharya JP, Lopez B, Gouveia BT, Oliveira IB, Resende MFR, Muñoz PR, Rios EF (2020) Breeding alfalfa (Medicago sativa L.) adapted to subtropical agroecosystems. Agronomy 10:742
  • Akaike H (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control 19:716-723
  • Andrade MHML, Fernandes Filho CC, Fernandes MO, Bastos AJR, Guedes ML, Marçal TS, Gonçalves FMA, Pinto CABP, Zotarelli L (2020) Accounting for spatial trends to increase the selection efficiency in potato breeding. Crop Science 60:2354-2372
  • Araújo ED, Borges AC, Dias NM, Ribeiro DM (2018) Effects of gibberellic acid on Tifton 85 bermudagrass (Cynodon spp.) in constructed wetland systems. PLoS One 13:1-26
  • Baxter LL, Anderson WF, Gates RN, Rios EF, Hancock DW (2022) Moving warm-season forage bermudagrass (Cynodon spp.) into temperate regions of North America. Grass and Forage Science 77:141-150
  • Bernardeli A, Rocha JRASC, Borém A, Lorenzoni R, Aguiar R, Silva JNB, Bueno RD, Alves RS, Jarquin D, Ribeiro C, Lamas Costa MDB (2021) Modeling spatial trends and enhancing genetic selection: An approach to soybean seed composition breeding. Crop Science 61:976-988
  • Bhering LL (2017) Rbio: A tool for biometric and statistical analysis using the R platform. Crop Breeding and Applied Biotechnology 17:187-190
  • Brito da Silva V, Daher RF, Souza YP, Menezes BRS, Santos EA, Freitas RS, Oliveira ES, Stida WF, Cassaro S (2020) Assessment of energy production in full-sibling families of elephant grass by mixed models. Renewable Energy 146:744-749
  • Butler DG, Cullis BR, Gilmour AR, Gogel BG, Thompson R (2017) ASReml-R reference manual version 4. VSN International Ltd, Hemel Hempstead, 188p
  • Cavanaugh JE, Neath AA (2019) The Akaike information criterion: Background, derivation, properties, application, interpretation, and refinements. Wiley Interdisciplinary Reviews: Computational Statistics 11:e1460
  • Chaves SFS, Alves RM, Alves RS, Sebbenn AM, Resende MDV, Dias LAS (2021) Theobroma grandiflorum breeding optimization based on repeatability, stability and adaptability information. Euphytica 217:211
  • Cohen J (1960) A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20:37-46
  • Cullis BR, Smith AB, Coombes NE (2006) On the design of early generation variety trials with correlated data. Journal of Agricultural, Biological and Environmental Statistics 11:381-393
  • Evangelista JSPC, Peixoto MA, Coelho IF, Ferreira FM, Marçal TS, Alves RS, Chaves SFS, Rodrigues EV, Laviola BG, Resende MDV, Dias KOG, Bhering LL (2023) Modeling covariance structures and optimizing Jatropha curcas breeding. Tree Genetics & Genomes 19:21
  • Faveri J, Verbyla AP, Pitchford WS, Venkatanagappa S, Cullis BR (2015) Statistical methods for analysis of multi-harvest data from perennial pasture variety selection trials. Crop and Pasture Science 66:947-962
  • Ferreira FM, Bhering LL, Fernandes FD, Lédo FJS, Rangel JHA, Kopp M, Câmara TMM, Silva VQR, Machado JC (2021) Optimal harvest number and genotypic evaluation of total dry biomass, stability, and adaptability of elephant grass clones for bioenergy purposes. Biomass and Bioenergy 149:106104
  • Ferreira FM, Rocha JRASC, Alves RS, Elizeu AM, Benites FRG, Resende MDV, Sobrinho FS, Bhering LL (2020) Estimates of repeatability coefficients and optimum number of measures for genetic selection of Cynodon spp. Euphytica 216:70
  • Henderson CR, Quaas RL (1976) Multiple trait evaluation using relatives’ records. Journal of Animal Science 43:1188-1197
  • Kozak M, Piepho HP (2018) What’s normal anyway? Residual plots are more telling than significance tests when checking ANOVA assumptions. Journal of Agronomy and Crop Science 204:86-98
  • Lara LAC, Santos MF, Jank L, Chiari L, Vilela MM, Amadeu RR, Santos JPR, Pereira GS, Zeng ZB, Garcia AAF (2019) Genomic selection with allele dosage in Panicum maximum Jacq. G3: Genes, Genomes, Genetics 9:2463-2475
  • Malikouski RG, Peixoto MA, Morais AL, Elizeu AM, Rocha JRASC, Zucoloto M, Bhering LL (2021) Repeatability coefficient estimates and optimum number of harvests in graft/rootstock combinations for “tahiti” acid lime. Acta Scientiarum - Agronomy 43:1-10
  • Mariguele KH, Resende MDV, Viana JMS, Silva FF, Silva PSL, Knop FC (2011) Métodos de análise de dados longitudinais para o melhoramento genético da pinha. Pesquisa Agropecuária Brasileira 46:1657-1664
  • Melo VL, Marçal TS, Rocha JRASC, Anjos RSR, Carneiro RCS, Carneiro JES (2020) Modeling (co)variance structures for genetic and non-genetic effects in the selection of common bean progenies. Euphytica 216:77
  • Patterson HD, Thompson R (1971) Recovery of inter-block information when block sizes are unequal. Biometrika 58:545-554
  • Pereira FAC, Carvalho SP, Rezende TT, Oliveira LL, Maia DRB (2018) Selection of Coffea arabica L. hybrids using mixed models with different structures of variance-covariance matrices. Coffee Science 13:304-311
  • Piepho HP (1997) Analyzing genotype-environment data by mixed models with multiplicative terms. Biometrics 53:761-766
  • R Core Team (2024) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. Available at <https://www.r-project.org/>.
  • Rao CR (1973) Linear statistical inference and its applications. Wiley, Hoboken, 625p
  • Resende MDV (2007) Matemática e estatística na análise de experimentos e no melhoramento genético. Embrapa Florestas, Colombo, 561p
  • Resende MDV, Alves RS (2020) Linear, generalized, hierarchical, bayesian and random regression mixed models in genetics/genomics in plant breeding. Functional Plant Breeding Journal 2:1-31
  • Resende MDV, Duarte JB (2007) Precisão e controle de qualidade em experimentos de avaliação de cultivares. Pesquisa Agropecuária Tropical 37:182-194
  • Resende MDV, Silva FF, Azevedo CF (2014) Estatística matemática, biométrica e computacional: modelos mistos, multivariados, categóricos e generalizados (REML/BLUP), inferência bayesiana, regressão aleatória, seleção genômica, QTL, GWAS, estatística espacial e temporal, competição, sobrevivência. UFV, Viçosa, 881p
  • Rocha JRASC, Marçal TS, Salvador FV, Silva AC, Machado JC, Carneiro PCS (2018) Genetic insights into elephantgrass persistence for bioenergy purpose. PLoS One 13:1-16
  • Rodrigues EV, Rocha JRASC, Alves RS, Teodoro PE, Laviola BG, Resende MDV, Carneiro PCS, Bhering LL (2020) Selection of jatropha genotypes for bioenergy purpose: An approach with multitrait, multiharvest and effective population size. Bragantia 79:346-355
  • Shalizi MN, Isik F (2019) Genetic parameter estimates and GxE interaction in a large cloned population of Pinus taeda L. Tree Genetics & Genomes 15:46
  • Singh L, Wu Y, McCurdy JD, Stewart BR, Warburton ML, Baldwin BS, Dong H (2023) Genetic diversity and population structure of bermudagrass (Cynodon spp.) revealed by genotyping-by-sequencing. Frontiers in Plant Science 14:1155721
  • Soares PR, Galhano C, Gabriel R (2023) Alternative methods to synthetic chemical control of Cynodon dactylon (L.). A systematic review. Agronomy for Sustainable Development 45:51
  • Stringer JK, Atkin FC, Gezan SA (2017) Statistical approaches in plant breeding: Maximising the use of genetic information. In Campos H and Caligari PDS (eds) Genetic improvement of tropical crops. Springer International Publishing, Berlin, p. 1-320
  • Verbyla AP (2019) A note on model selection using information criteria for general linear models estimated using REML. Australian and New Zealand Journal of Statistics 61:39-50
  • Verbyla AP, Faveri J, Deery DM, Rebetzke GJ (2021) Modelling temporal genetic and spatio-temporal residual effects for high-throughput phenotyping data. Australian and New Zealand Journal of Statistics 63:284-308

Publication Dates

  • Publication in this collection
    09 Sept 2024
  • Date of issue
    2024

History

  • Received
    22 Feb 2024
  • Accepted
    10 Apr 2024
  • Published
    01 May 2024
Crop Breeding and Applied Biotechnology Universidade Federal de Viçosa, Departamento de Fitotecnia, 36570-000 Viçosa - Minas Gerais/Brasil, Tel.: (55 31)3899-2611, Fax: (55 31)3899-2611 - Viçosa - MG - Brazil
E-mail: cbab@ufv.br