Pesquisa Agropecuária Brasileira
Pesq. agropec. bras.
0100-204X
1678-3921
Embrapa Secretaria de Pesquisa e Desenvolvimento; Pesquisa Agropecuária Brasileira
Abstract:
The objective of this work was to develop a package in the R environment to automate and facilitate regression analyses. Named easyreg, the package provides five functions. The er1 function performs analyses with 13 models, including linear, nonlinear, and mixed models. The er2 function takes lack of fit into account in the analyses and in the following designs: completely randomized, randomized blocks, Latin squares, and repeated Latin squares. The regplot function generates graphs; the bl function estimates two-segment models; and the regtest function tests the equality of parameters and the identity of models. These functions allow a large number of analyses and bring practicality and versatility to regression analysis.
The R environment (R Core Team, 2017) was created in 1996 by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand (Peternelli & Mello, 2011), and has since been developed further by collaborators worldwide. Among other advantages, it is easy to program, so its functions can be extended, and its system of “packages” containing specific functions considerably increases its capacity for analysis. R is widely used in universities, research centers, and businesses. It is an important tool for the analysis and manipulation of data that competes with the best statistical software, with the advantage of being free of charge and freely available for Mac, Windows, and Linux platforms.
The R base system and its packages provide many functions that can perform regression analyses. However, these functions generally handle separate steps: different functions are needed to fit a model, test parameters, or analyze residuals. This gives greater control over the analysis, but for less experienced users the number of functions involved can make the analysis a very difficult task.
Many packages have been developed that offer functions for automating analyses. They include “multcomp” (Hothorn et al., 2008), “pedigreemm” (Vazquez et al., 2010), “ExpDes” (Ferreira et al., 2013), “easyanova” and “ds” (Arnhold, 2013, 2014), “GGEBiplotGUI” (Frutos et al., 2014), “ScottKnott” (Jelihovschi et al., 2014), “lsmeans” (Lenth, 2016), and “agricolae” (Mendiburu, 2016). These packages perform analyses either by using R base functions or by creating new ones, and thus offer users a more practical means of conducting analyses. They have been used both by less experienced users and by users seeking practicality and versatility in their analyses.
Following this approach, the easyreg package was developed to automate regression analyses with models that are very common, particularly in agricultural sciences. The package’s guide offers many examples of applications to agricultural data. The five functions included in the package (er1, er2, regplot, regtest, and bl), available in the R environment (Arnhold, 2016), are described below.
The er1 function performs regression analysis with 13 models (Table 1), including linear, nonlinear, and mixed models. It extracts the model parameters for analysis and other uses, and offers parameter tests and measures of model quality, such as the coefficient of determination, the adjusted coefficient of determination, Akaike’s information criterion (AIC), and the Bayesian information criterion (BIC). Residuals, standardized residuals, outliers, and a residual normality test are also provided. For some models, maximum and minimum values, plateaus, and breakpoints are also estimated.
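The quality measures listed above can be illustrated with a minimal sketch for the linear model (model 1 in Table 1). It is written in plain Python rather than R and is not the code used by er1. The data are hypothetical, and the AIC and BIC are computed from the Gaussian log-likelihood with the residual variance counted as one extra parameter, which is an assumption about the convention used.

```python
import math

def fit_linear(x, y):
    """Ordinary least squares for y = a + b*x, with fit-quality measures."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual SS
    tss = sum((yi - my) ** 2 for yi in y)                        # total SS
    p = 2                                  # regression coefficients: a and b
    r2 = 1 - rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p)
    # Gaussian log-likelihood evaluated at the ML variance estimate rss / n
    loglik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    k = p + 1                              # assumption: variance counts as a parameter
    aic = -2 * loglik + 2 * k
    bic = -2 * loglik + k * math.log(n)
    return {"a": a, "b": b, "r2": r2, "adj_r2": adj_r2, "aic": aic, "bic": bic}

# Hypothetical data, for illustration only
dose = [1, 2, 3, 4, 5, 6]
response = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
quality = fit_linear(dose, response)
for name, value in quality.items():
    print(name, round(value, 4))
```

When candidate models from Table 1 are fitted to the same data, lower AIC or BIC values indicate the preferred model, while the adjusted coefficient of determination penalizes the plain coefficient for the number of parameters.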
Table 1. Models available in the er1, regplot, and regtest functions.

| Name | Mathematical description | Model number |
|---|---|---|
| Linear | y ~ a + bx | 1 |
| Quadratic | y ~ a + bx + cx^2 | 2 |
| Linear plateau | y ~ a + b × (x - c) × (x ≤ c) | 3 |
| Quadratic plateau | y ~ (a + bx + c × I(x^2)) × (x ≤ -0.5 × b/c) + (a + I(-b^2 / (4c))) × (x > -0.5 × b/c) | 4 |
| Two linear | y ~ ifelse(x ≥ d, (a - c × d) + (b + c) × x, a + b × x) | 5 |
| Exponential | y ~ a × exp(bx) | 6 |
| Logistic | y ~ a × (1 + b × exp(-c × x))^(-1) | 7 |
| von Bertalanffy | y ~ a × (1 + b × exp(-c × x))^3 | 8 |
| Brody | y ~ a × (1 + b × exp(-c × x)) | 9 |
| Gompertz | y ~ a × exp(-b × exp(-c × x)) | 10 |
| Lactation curve | y ~ (a × x^b) × exp(-c × x) | 11 |
| Ruminal degradation curve | y ~ a × (1 - exp(-c × x)) | 12 |
| Logistic bi-compartmental | y ~ (a / (1 + exp(2 - 4 × c × (x - e)))) + (b / (1 + exp(2 - 4 × d × (x - e)))) | 13 |
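To make the piecewise notation in Table 1 concrete, the sketch below writes three of the models (3, 4, and 10) as plain Python functions and evaluates them at hypothetical parameter values. The function names and numbers are illustrative assumptions and are not part of the package. The logical factors such as (x ≤ c) in the table act as 0/1 indicators.

```python
import math

def linear_plateau(x, a, b, c):
    # Model 3: a + b*(x - c) while x <= c, then the constant plateau a
    return a + b * (x - c) * (1.0 if x <= c else 0.0)

def quadratic_plateau(x, a, b, c):
    # Model 4: quadratic up to the vertex x0 = -0.5*b/c,
    # then the plateau a - b^2/(4c) (the value of the quadratic at x0)
    x0 = -0.5 * b / c
    if x <= x0:
        return a + b * x + c * x ** 2
    return a - b ** 2 / (4 * c)

def gompertz(x, a, b, c):
    # Model 10: sigmoid growth curve with asymptote a
    return a * math.exp(-b * math.exp(-c * x))

# Hypothetical parameter values, for illustration only
print(linear_plateau(80, a=50, b=0.4, c=100))    # below the breakpoint
print(linear_plateau(120, a=50, b=0.4, c=100))   # on the plateau
print(quadratic_plateau(4, a=10, b=4, c=-0.5))   # at the vertex x0 = 4
print(quadratic_plateau(9, a=10, b=4, c=-0.5))   # on the plateau
print(gompertz(0, a=100, b=3, c=0.1))
```

In both plateau models the two branches meet at the breakpoint, so the fitted curve is continuous there.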
Mixed models are fitted with the nlme function. Models in which all coefficients are random can also be estimated.
The er2 function performs regression analysis based on the lack-of-fit method. It handles completely randomized designs, randomized complete block designs, Latin squares, and repeated Latin squares. The models considered are linear, quadratic, and cubic. The function estimates the model parameters and offers parameter tests (taking the design and the lack of fit into account), as well as the coefficient of determination and the adjusted coefficient of determination.
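The idea behind a lack-of-fit analysis can be sketched for the simplest case: a linear model fitted to replicated observations in a completely randomized design. The residual sum of squares is split into pure error (variation among replicates at the same level of x) and lack of fit (deviation of the level means from the fitted line). The Python code below is an illustration with hypothetical data, not the code used by er2, and it does not cover block or Latin square designs.

```python
from collections import defaultdict

def lack_of_fit_linear(x, y):
    """Split the residual SS of y = a + b*x into lack of fit and pure error."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    # Group the replicates by level of x
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)

    # Pure error: variation of replicates around their own level mean
    ss_pe = sum(
        sum((yi - sum(ys) / len(ys)) ** 2 for yi in ys)
        for ys in groups.values()
    )
    levels = len(groups)
    ss_lof = rss - ss_pe          # what the line fails to explain
    df_lof = levels - 2           # levels minus the two fitted coefficients
    df_pe = n - levels
    f = (ss_lof / df_lof) / (ss_pe / df_pe)
    return {"a": a, "b": b, "ss_lof": ss_lof, "ss_pe": ss_pe,
            "df_lof": df_lof, "df_pe": df_pe, "f": f}

# Hypothetical data: four levels with three replicates each
level = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
gain = [10.1, 9.8, 10.4, 12.2, 11.9, 12.5, 13.1, 13.4, 12.9, 13.3, 13.0, 13.6]
lof = lack_of_fit_linear(level, gain)
print(lof["df_lof"], lof["df_pe"], round(lof["f"], 2))
```

A large F value relative to the F distribution with (df_lof, df_pe) degrees of freedom suggests that the linear model is inadequate and that a higher-order model, such as the quadratic or cubic, should be considered.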
The regplot function creates graphs and allows data points and equations to be added to them. An example is given in Figure 1, in which a linear plateau model was fitted to weight gain as a function of the methionine level in turkey chicks. In the regplot function, the data should be supplied as a table with the explanatory and dependent variables in the first and second columns, respectively. The argument “design” describes the model; the model number can be found in the function’s help page and in Table 1. In addition, the number of digits (digits), the legend position (position), and the axis labels (xlab and ylab) can be defined (Figure 1).
Figure 1. Example of an application of the regplot function, with the code entered in the console and the resulting graph. The example fits a linear plateau model to daily weight gain (g) as a function of the methionine level (% of NRC) in turkey chicks.
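One simple way to estimate a linear plateau model of the kind shown in Figure 1 is a grid search over the breakpoint c: for each candidate c, the model a + b × (x - c) × (x ≤ c) is linear in a and b and can be fitted by ordinary least squares, and the candidate with the smallest residual sum of squares is kept. The Python sketch below illustrates this under stated assumptions. It is not the estimation method used by the package, and the data are hypothetical, noise-free values rather than the turkey data in the figure.

```python
def ols(z, y):
    """Least squares for y = a + b*z; returns (a, b, residual SS)."""
    n = len(z)
    mz = sum(z) / n
    my = sum(y) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    b = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / szz
    a = my - b * mz
    rss = sum((yi - (a + b * zi)) ** 2 for zi, yi in zip(z, y))
    return a, b, rss

def fit_linear_plateau(x, y, candidates):
    """Grid search over the breakpoint c of y = a + b*(x - c)*(x <= c)."""
    best = None
    for c in candidates:
        # For a fixed c the model is linear in the transformed regressor z
        z = [(xi - c) if xi <= c else 0.0 for xi in x]
        a, b, rss = ols(z, y)
        if best is None or rss < best[3]:
            best = (a, b, c, rss)
    return best

# Hypothetical, noise-free data generated with a = 50, b = 0.5, c = 100
met = list(range(60, 150, 10))
gain = [50 + 0.5 * (m - 100) if m <= 100 else 50.0 for m in met]

a_hat, b_hat, c_hat, rss_min = fit_linear_plateau(met, gain, range(70, 131))
print(a_hat, b_hat, c_hat)
```

Here a_hat is the plateau value and c_hat the level at which the response stops increasing. A finer grid or a nonlinear least squares routine would be needed when the breakpoint does not fall on a grid value.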
Like the regplot function, the bl function also creates figures. However, this function is specific to the analysis of models with two linear segments.
The regtest function performs tests to evaluate the equality of parameters and the identity of regression models, based on the methodology of Regazzi (1993, 1999, 2003) and Regazzi & Silva (2004). With this function, tests can be applied to all models available in the er1 function (Table 1).
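The general idea of a model identity test can be sketched for two groups and the linear model: a full model with a separate line for each group is compared with a reduced model that fits a single common line, using the extra sum of squares F statistic. The Python code below is only an illustration of that principle with hypothetical data. It is not the regtest implementation, and the cited methodology may use different statistics, especially for the nonlinear models.

```python
def rss_linear(x, y):
    """Residual sum of squares of the least squares line y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def identity_test(x1, y1, x2, y2):
    """F statistic for H0: both groups share the same line."""
    n = len(x1) + len(x2)
    rss_full = rss_linear(x1, y1) + rss_linear(x2, y2)   # 4 parameters
    rss_reduced = rss_linear(x1 + x2, y1 + y2)           # 2 parameters
    df1 = 4 - 2          # parameters dropped under H0
    df2 = n - 4          # residual degrees of freedom of the full model
    f = ((rss_reduced - rss_full) / df1) / (rss_full / df2)
    return f, df1, df2

# Hypothetical data for two groups measured at the same x values
xa = [1, 2, 3, 4, 5]
ya = [2.0, 4.1, 5.9, 8.2, 9.8]
xb = [1, 2, 3, 4, 5]
yb = [1.1, 2.0, 3.2, 3.9, 5.1]

f_stat, df_num, df_den = identity_test(xa, ya, xb, yb)
print(round(f_stat, 2), df_num, df_den)
```

A large F value leads to rejecting the hypothesis that one common model describes both groups. Testing the equality of a single parameter follows the same pattern, with a reduced model that shares only that parameter.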
Finally, like easyanova (Arnhold, 2013), ExpDes (Ferreira et al., 2013), and many other packages available for the R environment, the functions of the easyreg package provide results in a practical manner. The package can therefore aid less experienced users, or users who have some difficulty in using R for data analysis. It can also help users who seek agility in the process of data analysis.
References
ARNHOLD, E. easyreg: Easy Regression. R package version 1.0. 2016. Available at: <http://CRAN.R-project.org/package=easyreg>. Accessed on: Nov. 18 2016.
ARNHOLD, E. Package in the R-environment for analysis of variance and complementary analyses. Brazilian Journal of Veterinary Research and Animal Science, v.50, p.488-492, 2013. DOI: 10.11606/issn.1678-4456.v50i6p488-492.
ARNHOLD, E. Pacote em ambiente R para automatizar estatísticas descritivas. Sigmae, v.3, p.36-42, 2014.
FERREIRA, E.B.; CAVALCANTI, P.P.; NOGUEIRA, D.A. ExpDes: experimental designs package. R package version 1.1.2. 2013. Available at: <http://CRAN.R-project.org/package=ExpDes>. Accessed on: Nov. 8 2016.
FRUTOS, E.; PURIFICACIÓN GALINDO, M.; LEIVA, V. An interactive biplot implementation in R for modeling genotype-by-environment interaction. Stochastic Environmental Research and Risk Assessment, v.28, p.1629-1641, 2014. DOI: 10.1007/s00477-013-0821-z.
HOTHORN, T.; BRETZ, F.; WESTFALL, P. Simultaneous inference in general parametric models. Biometrical Journal, v.50, p.346-363, 2008. DOI: 10.1002/bimj.200810425.
JELIHOVSCHI, E.G.; FARIA, J.C.; ALLAMAN, I.B. ScottKnott: A package for performing the Scott-Knott clustering algorithm in R. Trends in Applied and Computational Mathematics, v.15, p.3-17, 2014.
LENTH, R.V. Least-squares means: The R package lsmeans. Journal of Statistical Software, v.69, p.1-33, 2016. DOI: 10.18637/jss.v069.i01.
MENDIBURU, F. de. agricolae: Statistical procedures for agricultural research. R package version 1.2-4. 2016. Available at: <http://CRAN.R-project.org/package=agricolae>. Accessed on: Nov. 5 2016.
PETERNELLI, L.A.; MELLO, M.P. Conhecendo o R: uma visão estatística. Viçosa: Ed. da UFV, 2011. 185p.
R CORE TEAM. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing, 2017. Available at: <http://www.R-project.org/>. Accessed on: May 27 2017.
REGAZZI, A.J. Teste para verificar a identidade de modelos de regressão e a igualdade de parâmetros no caso de dados de delineamentos experimentais. Revista Ceres, v.46, p.383-409, 1999.
REGAZZI, A.J. Teste para verificar a identidade de modelos de regressão e a igualdade de alguns parâmetros num modelo polinomial ortogonal. Revista Ceres, v.40, p.176-195, 1993.
REGAZZI, A.J. Teste para verificar a igualdade de parâmetros e a identidade de modelos de regressão não-linear. Revista Ceres, v.50, p.9-26, 2003.
REGAZZI, A.J.; SILVA, C.H.O. Teste para verificar a igualdade de parâmetros e a identidade de modelos de regressão não-linear. I. Dados no delineamento inteiramente casualizado. Revista de Matemática e Estatística, v.22, p.33-45, 2004.
VAZQUEZ, A.I.; BATES, D.; ROSA, G.J.M.; GIANOLA, D.; WEIGEL, K.A. Technical note: an R package for fitting generalized linear mixed models in animal breeding. Journal of Animal Science, v.88, p.497-504, 2010. DOI: 10.2527/jas.2009-1952
Author
Emmanuel Arnhold
Universidade Federal de Goiás, Escola de Veterinária e Zootecnia, Campus Samambaia, Caixa Postal 131, CEP 74001-970 Goiânia, GO, Brazil. E-mail: earnhold@pq.cnpq.br
How to cite
Arnhold, Emmanuel. Pacote em ambiente R para análises de regressão. Pesquisa Agropecuária Brasileira [online]. 2018, v. 53, n. 07 [Accessed 8 April 2025], pp. 870-873. Available at: <https://doi.org/10.1590/S0100-204X2018000700012>. ISSN 1678-3921.