Abstract
The normal distribution has a central place in distribution theory and statistics. We propose the log-odd normal generalized (LONG) family of distributions based on log-odds and obtain some of its mathematical properties, including a useful linear representation for the new family. We investigate, as a special model, the log-odd normal power-Cauchy (LONPC) distribution. Some structural properties of the LONPC distribution are obtained, including the quantile function, ordinary and incomplete moments, generating function and some asymptotics. We estimate the model parameters using the maximum likelihood method. The usefulness of the proposed family is illustrated empirically by means of a real air pollution data set.
Key words
Generalized class; maximum likelihood estimation; normal distribution; power-Cauchy distribution; Shannon entropy
INTRODUCTION
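The normal distribution plays an important role in statistical theory and real data applications. The probability density function (pdf) and cumulative distribution function (cdf) of the normal random variable (rv) with mean $\mu$ and standard deviation $\sigma$, say $N(\mu,\sigma^2)$, are given by
$$f(x;\mu,\sigma)=\frac{1}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right),\qquad x\in\mathbb{R},$$
and
$$F(x;\mu,\sigma)=\Phi\!\left(\frac{x-\mu}{\sigma}\right),$$
where $\phi(\cdot)$ and $\Phi(\cdot)$ denote the pdf and cdf (Laplace function) of the standard normal distribution, respectively, and $\mathbb{1}_A$ stands for the indicator of the event $A$.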
Some closely related alternatives and generalizations of the normal distribution have been reported in the literature. The half-normal (HN) distribution is obtained when the $N(0,\sigma^2)$ distribution is folded about the origin, and has been used as a model for left-truncated data, which has applications in many fields (Wiper et al. 2008). Leone et al. (1961) proposed the folded-normal (FN) distribution for the case in which the measurement system produces only non-negative measurements from a normally distributed process. When the location parameter $\mu=0$, the FN rv reduces to the HN rv (Gui et al. 2013). The pdf and cdf of the HN distribution with scale parameter $\sigma>0$ are, respectively, given by
$$f(x;\sigma)=\sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma}\exp\!\left(-\frac{x^{2}}{2\sigma^{2}}\right)$$
and
$$F(x;\sigma)=\operatorname{erf}\!\left(\frac{x}{\sigma\sqrt{2}}\right),$$
where $x>0$ and $\operatorname{erf}(z)=\frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt$ is the error function.
Rogers and Tukey (1972) proposed and studied the properties of the slash-normal (SN) family, which has heavier tails than the normal distribution, i.e., it has greater kurtosis. The slash distribution is closely related to the normal distribution and is represented as the quotient of a normal rv (numerator) and a power of a uniform rv (denominator), the two rvs being independent. Hence, a rv $X$ has a slash distribution if it can be represented as $X=Z/U^{1/q}$, where $Z\sim N(0,1)$ is independent of $U\sim U(0,1)$ and $q>0$. In particular, if $q\to\infty$, then $X$ follows the standard normal distribution and, for $q=1$, we obtain the canonic (standard) slash density given by
$$f(x)=\begin{cases}\dfrac{\phi(0)-\phi(x)}{x^{2}}, & x\neq 0,\\[4pt] \dfrac{\phi(0)}{2}, & x=0.\end{cases}$$
O’Hagan and Leonard (1976) pioneered the skew-normal (SN) distribution with asymmetry parameter $\lambda\in\mathbb{R}$ (see also Azzalini 1985). We write $Z\sim SN(\lambda)$ if the pdf of $Z$ is
$$f(z;\lambda)=2\,\phi(z)\,\Phi(\lambda z),\qquad z\in\mathbb{R}.$$
Obviously, $SN(0)$ denotes the standard normal distribution $N(0,1)$.
Cooray and Ananda (2008) defined the generalized half-normal (GHN) distribution, which has both monotone and non-monotone hazard rate shapes. The pdf and cdf of the GHN distribution with positive scale parameter $\theta$ and shape parameter $\alpha$ are given by
$$f(x;\alpha,\theta)=\sqrt{\frac{2}{\pi}}\,\frac{\alpha}{x}\left(\frac{x}{\theta}\right)^{\alpha}\exp\!\left[-\frac{1}{2}\left(\frac{x}{\theta}\right)^{2\alpha}\right]$$
and
$$F(x;\alpha,\theta)=2\,\Phi\!\left[\left(\frac{x}{\theta}\right)^{\alpha}\right]-1,\qquad x>0,$$
respectively. Clearly, for $\alpha=1$, the GHN distribution reduces to the HN distribution.
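The reduction of the GHN model to the HN model can be checked numerically. The sketch below is only illustrative: it assumes the usual Cooray and Ananda (2008) parameterization, with density sqrt(2/pi) (alpha/x) (x/theta)^alpha exp{-(x/theta)^(2 alpha)/2} and cdf 2 Phi((x/theta)^alpha) - 1, and all function names are hypothetical.

```python
import math


def hn_pdf(x, sigma):
    """Half-normal density with scale sigma, evaluated at x > 0."""
    return math.sqrt(2.0 / math.pi) / sigma * math.exp(-x * x / (2.0 * sigma ** 2))


def hn_cdf(x, sigma):
    """Half-normal cdf, erf(x / (sigma * sqrt(2)))."""
    return math.erf(x / (sigma * math.sqrt(2.0)))


def ghn_pdf(x, alpha, theta):
    """GHN density (assumed parameterization) with shape alpha and scale theta."""
    z = (x / theta) ** alpha
    return math.sqrt(2.0 / math.pi) * (alpha / x) * z * math.exp(-0.5 * z * z)


def ghn_cdf(x, alpha, theta):
    """GHN cdf 2 * Phi(z) - 1 with z = (x / theta) ** alpha, written through erf."""
    return math.erf((x / theta) ** alpha / math.sqrt(2.0))


if __name__ == "__main__":
    # With alpha = 1 the two densities should agree up to rounding error.
    print(ghn_pdf(1.3, 1.0, 2.0), hn_pdf(1.3, 2.0))
```

Writing the cdf through `math.erf` avoids the need for a separate normal cdf routine, since 2 Phi(z) - 1 = erf(z / sqrt(2)).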
Any extended normal distribution becomes flexible when shape parameter(s) are added to the normal density through generalized classes reported in the literature. Some published generalizations of the normal distribution provide flexible shapes for their densities and hazard rates. Eugene et al. (2002) and Famoye et al. (2004) proposed the beta-G class and studied some properties of the beta-normal (BN) distribution. Correa et al. (2012) introduced the Kumaraswamy-normal (KwN) distribution by using the Kumaraswamy-G class originally proposed by Cordeiro and de-Castro (2011). Alzaatreh et al. (2014a) introduced the gamma-normal (GaN) distribution from the T-X (Gamma-X and Weibull-X) family defined by Alzaatreh et al. (2013). Alzaatreh et al. (2014b) defined and discussed the T-normal family. Lima et al. (2015) studied the GaN model from the gamma-G class reported by Zografos and Balakrishnan (2009). Cordeiro et al. (2012) introduced the McDonald-normal (McN) distribution using the McDonald-G class pioneered by Alexander et al. (2012). Braga et al. (2016) defined the odd log-logistic normal (OLLN) distribution from the odd log-logistic-G (OLL-G) class proposed by Gleaton and Lynch (2006).
The main objectives of the paper are to define a new extended normal class named the log-odd normal generalized (LONG) family and to derive a simple general linear representation for obtaining some of its mathematical properties. The paper unfolds as follows. In section ‘THE LONG FAMILY’, we define the LONG family and describe its motivation. In section ‘LINEAR REPRESENTATION OF LONG FAMILY DENSITY’, a linear representation for the family density is obtained. In section ‘MATHEMATICAL PROPERTIES OF LONG FAMILY’, we perform a modality analysis and obtain some useful properties of the new family such as asymptotics, moments and generating function. In section ‘LONG POWER-CAUCHY DISTRIBUTION AND ITS PROPERTIES’, some structural properties of a special model of the LONG family, namely the log-odd normal power-Cauchy (LONPC) distribution, are investigated, along with the maximum likelihood method used to estimate the parameters of the LONPC model. In section ‘APPLICATION AS AN EMPIRICAL ILLUSTRATION OF LONG FAMILY’, the usefulness of the LONPC model is illustrated by means of a real air pollution data set, and we show empirically that the LONPC distribution outperforms some well-known lifetime models. The last section offers some concluding remarks.
THE LONG FAMILY
In studying the beta-generated family, and answering at the same time the question ‘Can we use other distributions with different supports as the generators to derive different classes of distributions?’, Alzaatreh et al. (2013) proposed the T–X family of distributions.
Let $r(t)$ be the pdf and $R(t)$ be the cdf of a rv $T\in[a,b]$ for $-\infty\le a<b\le\infty$, and let $W\circ G(x)$¹ be a function of a baseline cdf $G(x)$ defined on a standard probability space so that $W\circ G(x)$ satisfies the following conditions (Alzaatreh et al. 2013):
(i) $W\circ G(x)\in[a,b]$;
(ii) $W\circ G(x)$ is differentiable and monotonically non-decreasing, and
(iii) $W\circ G(x)\to a$ as $x\to-\infty$ and $W\circ G(x)\to b$ as $x\to\infty$.
The cdf of the T–X family is defined by
$$F(x)=\int_a^{W\circ G(x)} r(t)\,dt=R\big(W\circ G(x)\big), \tag{4}$$
where $W\circ G(x)$ satisfies the conditions (i)-(iii).
The pdf corresponding to (4) reduces to (Alzaatreh et al. 2013)
$$f(x)=\left\{\frac{d}{dx}\,W\circ G(x)\right\}\,r\big(W\circ G(x)\big). \tag{5}$$
Let $G(x)$, $g(x)$, $\bar G(x)=1-G(x)$ and $Q_G(\cdot)$ be the cdf, pdf, survival function (sf) and quantile function (qf) of a baseline continuous rv. Then, the odds (O), log-odds (LO) and log-odd ratio (LOR) functions are defined by $G(x)/\bar G(x)$, $\log[G(x)/\bar G(x)]$ and $g(x)/[G(x)\,\bar G(x)]$, respectively. The use of the odds ratio is becoming very popular and has applications in reliability and survival analysis, large sample theory and discriminant analysis, among others. The LOR is also a useful measure for modeling data that exhibit a non-monotone failure rate. Distributions that are non-monotone in terms of failure rate can be monotone in terms of LOR (see Wang et al. 2003).
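The three functions of the baseline can be written down directly. The following is a minimal sketch, assuming the LOR is the derivative of the log-odds, g(x) / [G(x) (1 - G(x))], as in the log-odds rate of Wang et al. (2003); the function names and the logistic example are illustrative choices, not part of the original text.

```python
import math


def odds(G, x):
    """Odds G(x) / (1 - G(x)) of a baseline cdf G."""
    p = G(x)
    return p / (1.0 - p)


def log_odds(G, x):
    """Log-odds log[G(x) / (1 - G(x))]."""
    return math.log(odds(G, x))


def log_odds_rate(G, g, x):
    """Derivative of the log-odds, g(x) / [G(x) * (1 - G(x))]."""
    p = G(x)
    return g(x) / (p * (1.0 - p))


if __name__ == "__main__":
    # Illustrative baseline: standard logistic, whose log-odds is x itself.
    G = lambda x: 1.0 / (1.0 + math.exp(-x))
    g = lambda x: G(x) * (1.0 - G(x))
    print(log_odds(G, 0.7), log_odds_rate(G, g, 0.7))
```

The logistic baseline is a convenient check because its log-odds is the identity and its log-odds rate is constant.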
Some generalized classes have been proposed in the literature using the O function, viz. odd log-logistic-G (Gleaton and Lynch 2006), odd gamma-G (Torabi and Montazeri 2012), odd generalized exponential-G (Tahir et al. 2015), odd Burr-G (Alizadeh et al. 2017), generalized odd half-Cauchy-G (Cordeiro et al. 2017), odd Birnbaum-Saunders-G (Ortega et al. 2016) and odd Weibull-G (Bourguignon et al. 2014). Only two generalized classes have been proposed so far from the LO function, called the LO-logistic-G (Torabi and Montazeri 2014) and LO-Gumbel-G (Al-Aqtash et al. 2014, 2015) models (when ).
In this paper, we propose and study a new generalized family by considering the LO function, i.e. . Henceforth, we write the short-hands and omitting by convention the parameters throughout. Then, we define the cdf of the LONG family by
Two interpretations of this family can be given as follows. For the first interpretation, let be a rv defined on a standard probability space describing a stochastic system by the cdf . If the rv represents the odds, the risk that the system following the lifetime will not be working at time is given by . If we are interested in modeling the randomness of the LO by the cdf , the cdf of is given by
For the second interpretation of (6), we take a LONG rv and a rv with cdf , for . Then, . Since the function is always monotonic and non-decreasing, this implies that ². So, if has the LONG distribution, then has cdf given by , holding for every continuous cdf .
Note that the SN distribution is just a special case of the LONG family. In fact, we can obtain Azzalini’s model (Azzalini 1985) by taking the Burr type II as baseline in our LONG cdf (6).
The pdf corresponding to (6) is given by
whereas the associated hazard rate function (hrf) becomes
Henceforth, we denote by a rv having pdf (7) with parameters and .
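To make the construction concrete, the sketch below evaluates a cdf, pdf and hrf of this type for a user-supplied baseline. It rests on an assumption that should be checked against (6): the standard normal cdf is applied to the log-odds of G multiplied by the extra parameter, written `lam` here. If the parameter divides the log-odds instead, `lam` should be replaced by `1 / lam`. The function names and the exponential baseline are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, provides pdf() and cdf()


def long_cdf(x, lam, G):
    """Assumed form: Phi(lam * log[G(x) / (1 - G(x))])."""
    p = G(x)
    return _STD.cdf(lam * math.log(p / (1.0 - p)))


def long_pdf(x, lam, G, g):
    """Derivative of long_cdf: lam * g / (G * (1 - G)) * phi(lam * log-odds)."""
    p = G(x)
    t = lam * math.log(p / (1.0 - p))
    return lam * g(x) / (p * (1.0 - p)) * _STD.pdf(t)


def long_hrf(x, lam, G, g):
    """Hazard rate pdf / (1 - cdf)."""
    return long_pdf(x, lam, G, g) / (1.0 - long_cdf(x, lam, G))


if __name__ == "__main__":
    # Illustrative baseline: unit exponential.
    G = lambda x: 1.0 - math.exp(-x)
    g = lambda x: math.exp(-x)
    print(long_cdf(0.8, 1.5, G), long_pdf(0.8, 1.5, G, g), long_hrf(0.8, 1.5, G, g))
```

Under this assumed form the median of the new distribution coincides with the median of the baseline, because the log-odds vanishes where G(x) = 1/2.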
We emphasize that hundreds of extended distributions have been developed in the last two decades by introducing two or more parameters to a baseline distribution, for modeling data in several applied areas such as biology, oncology, environmental and medical sciences, engineering and economics. However, there is a clear need for further simple extended families in these areas, that is, new very flexible models to fit real data that present large intervals for skewness and kurtosis and heavy-tailed shapes. The LONG family aims to fill part of this gap by constructing new flexible distributions with just one additional parameter, while some other known families such as beta and Kumaraswamy have two extra parameters.
Furthermore, the basic motivations for considering the LONG family in practice are the following desired tasks:
(i) have one extra parameter, whereas some known generators such as beta and Kumaraswamy have two;
(ii) make the kurtosis more flexible compared to the baseline model;
(iii) produce skewness for symmetrical distributions;
(iv) construct heavy-tailed distributions that are not longer-tailed for modeling real data;
(v) generate distributions with symmetric, left-skewed, right-skewed and reversed-J shapes;
(vi) define special models with all types of the hrf;
(vii) provide consistently better fits than other generated models under the same baseline distribution.
Lemma 1. The qf of X can be obtained by inverting (6) as
If , the solution of the nonlinear equation X = Q(U) possesses pdf (7).
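Lemma 1 suggests inverse-transform sampling. The following sketch assumes, as a labeled assumption, that the cdf in (6) is Phi(lam * log[G / (1 - G)]), so that the qf is the baseline qf evaluated at expit(Phi^{-1}(u) / lam); the names `long_quantile`, `long_sample` and `expit` are illustrative.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()


def expit(t):
    """Numerically stable inverse logit."""
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def long_quantile(u, lam, QG):
    """Assumed qf: QG(expit(Phi^{-1}(u) / lam)) for 0 < u < 1."""
    return QG(expit(_STD.inv_cdf(u) / lam))


def long_sample(n, lam, QG, seed=0):
    """Draw n variates by plugging uniforms into the qf."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # inv_cdf is undefined at the endpoints
            out.append(long_quantile(u, lam, QG))
    return out


if __name__ == "__main__":
    QG = lambda p: -math.log(1.0 - p)  # unit exponential baseline qf
    print(long_quantile(0.5, 2.0, QG), long_sample(3, 2.0, QG))
```

The stable `expit` matters when `lam` is small, since the argument of the exponential can then become large in absolute value.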
The next lemma connects the normal distribution and the LONG family.
Lemma 2. If , then follows the LONG family.
Further, the following two lemmata provide the rth ordinary moment and Shannon entropy of X.
Lemma 3. The rth ordinary moment of X is given by
The assertion follows from Lemma 2. Moreover, since the integral in (10) has no closed form, numerical integration can be used to obtain the rth moment of the LONG family and its special models.
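A numerical evaluation of this kind can be sketched as follows. The code assumes that a LONG rv can be written as X = Q_G(expit(T / lam)) with T standard normal, which is one reading of Lemma 2 and should be verified against (10); the trapezoidal rule, the integration limits and all names are illustrative choices.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def expit(t):
    """Numerically stable inverse logit."""
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def long_moment(r, lam, QG, lo=-8.0, hi=8.0, n=4000):
    """Trapezoidal approximation of the integral of QG(expit(t / lam))**r * phi(t).

    r is taken to be a non-negative integer. For very small lam the argument
    of QG can round to 0 or 1, so the limits may need to be narrowed.
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * QG(expit(t / lam)) ** r * _STD.pdf(t)
    return total * h


if __name__ == "__main__":
    # Logistic baseline: X = T / lam, so the second moment should be near 1 / lam**2.
    logit_q = lambda p: math.log(p / (1.0 - p))
    print(long_moment(2, 2.0, logit_q))
```

With the logistic baseline the transformation collapses to X = T / lam, which gives a closed-form value to compare against.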
Lemma 4. The Shannon entropy of , when , equals
Here, and are the qf and pdf of the parent G model, respectively.
We omit the proof since it follows by applying the result in Alzaatreh et al. (2013).
LINEAR REPRESENTATION OF LONG FAMILY DENSITY
Our next task is to establish a power series representation for the cdf given by (6) in powers of , where is the baseline cdf. We take relations from Equations (39) and (40) (consult also Appendix A for notation) by which the algebraic developments are based on well-known hypergeometric functions
where
The form of the cdf given in (12) requires positive even integer powers of (see (39)), which reduces to calculating the coefficients of a power series raised to a positive integer power, with the zeroth-order coefficient equal to unity (consult Appendix B), that is
Here, by virtue of (42), we clearly conclude
where for is the Pochhammer symbol (or rising factorial). Now, the collected relations (6) and (12)–(14) imply
The multiplication of power series gives
and then
where
Next, since , the termwise differentiation of (16) gives the pdf of as
By using Bailey’s transformation (Slater 1966, p. 58) in (18) to interchange with the equivalent summation (see Srivastava and Manocha 1984, p. 100, Lemma 2), we obtain
The LONG family pdf follows from the last equation as
where is the greatest integer less than or equal to .
The exponentiated-G family with power parameter $a>0$, say exp-G($a$), has pdf given by $h_a(x)=a\,g(x)\,G(x)^{a-1}$. The corresponding cdf turns out to be $H_a(x)=G(x)^{a}$. Clearly, $h_1(x)$ is the baseline pdf $g(x)$. For an associated rv $Y_a$ this correspondence is denoted by $Y_a\sim$ exp-G($a$). Several properties of exp-G distributions have been studied by many authors in recent years.
Equation (19) can be reduced to
where
and is the exp-G density with power parameter (for ).
Equation (20) reveals that the LONG family density can be represented by an infinite linear combination of exp-G densities. Equivalently, the pdf/cdf of the LONG family possesses a linear representation in terms of the pdf/cdf of the associated exp-G random variables. Thus, some mathematical properties of the proposed family can be derived from the corresponding properties of the exp-G class. We emphasize that more than thirty-five exp-G distributions have been published so far, and all of them can be used to generate new special models of the LONG family.
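The building blocks of such a representation are easy to code. The sketch below implements the exp-G cdf G(x)^a and pdf a g(x) G(x)^(a-1), together with a truncated linear combination; the weights passed to `linear_combination_pdf` are placeholders supplied by the user, since the coefficients of (20) are not reproduced here, and all names are hypothetical.

```python
import math


def expg_cdf(x, a, G):
    """Exponentiated-G cdf G(x) ** a."""
    return G(x) ** a


def expg_pdf(x, a, G, g):
    """Exponentiated-G pdf a * g(x) * G(x) ** (a - 1)."""
    return a * g(x) * G(x) ** (a - 1.0)


def linear_combination_pdf(x, weights, G, g):
    """Finite sum of w * h_a(x) over a dict {power a: weight w} (user supplied)."""
    return sum(w * expg_pdf(x, a, G, g) for a, w in weights.items())


if __name__ == "__main__":
    G = lambda x: 1.0 - math.exp(-x)  # illustrative exponential baseline
    g = lambda x: math.exp(-x)
    print(linear_combination_pdf(0.5, {1: 0.5, 2: 0.5}, G, g))
```

Any property that is linear in the density, such as moments or the mgf, can then be accumulated term by term from the corresponding exp-G quantity.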
MATHEMATICAL PROPERTIES OF LONG FAMILY
The formulae derived throughout the paper can be easily handled in most symbolic computation software platforms such as Maple, Mathematica and Matlab. These platforms currently have the ability to deal with analytic expressions of formidable size and complexity. Established explicit expressions to calculate statistical measures can be more efficient than computing them directly by numerical integration.
The modality analysis of a continuous distribution takes as its starting point the number and magnitudes of the peaks of the related pdf. The rv belonging to the LONG family is determined by the pdf given in (7), and then the set of stationary points associated with the maxima consists of the values for which . Assuming that the baseline cdf is twice differentiable, the results are the roots of the second-order nonlinear ordinary differential equation (ODE)
Obviously, we consider this equation inside the set . For the solution by setting and reducing (21), we obtain
Integrating twice, first with respect to , and then with respect to , we have
where stands for the inverse error function³ (the related Mathematica code is InverseErf[x]) and the integration constants depend on the form of . For the sake of simplicity, we consider only the case , and then
where , being necessarily normalized.
Next, the saddle point of is determined from
Accordingly, setting , the previous equation reduces to
There is a unique solution , say, of the previous equation (22). Indeed, if , the left-hand-side expression in is greater than . Consequently, the hyperbola and the exponential function have a unique intersection inside the vertical half-strip . So, the abscissa of the pdf’s peak becomes . Obviously, the left half-plane does not contain any real solution of (22). So, since the stationary point of the pdf (7) is inside , the second-order nonlinear ODE (21) characterizes the peak which describes the mode of the rv . The rest is obvious.
We note that (22) has to be solved by numerical methods, and that the case does not affect the previous conclusion.
By these we have proved the following result.
Theorem 5. Let X be a rv coming from the LONG distribution family having twice continuously differentiable input baseline cdf G(x). Then, for all λ > 0, the rv X is unimodal with the mode
where z0 = z0(λ) is the solution of the auxiliary equation (22).
The shape of the hrf (8) of can be described by , which readily follows from the familiar formula .
ASYMPTOTICS OF THE CDF AND PDF OF LONG FAMILY. Let us consider the asymptotics of the cdf and pdf of the LONG family near to the infimum of the support interval for the baseline cdf , and also when the argument is growing to the supremum of the support set.
Proposition 6. Let Then, the asymptotics for the cdf and pdf of the LONG family, presented in Equations (6) and (7), respectively, turn out to be
Moreover, let ; then, there holds true
We omit the straightforward proofs of these results, remarking that the asymptotic of the related hrf follows ad definitionem by the previous results for the cdf and pdf.
MOMENTS. Henceforth, let exp-G . The th moment of can be obtained from (7) and (20) as
where is given by (20) and
The ordinary moments of several special LONG distributions can follow directly from (23). Further, the central moments and cumulants of can be determined from the ordinary moments using well-known formulae.
The th incomplete moment of is determined as
where the last integral can be evaluated at least numerically for most baseline distributions.
The first incomplete moment plays an important role for measuring inequality such as the mean deviations and Lorenz and Bonferroni curves.
GENERATING FUNCTION. Here, we provide a formula for the moment generating function (mgf) of . It follows from (7) that
where is the mgf of and
can be evaluated at least numerically for most baseline models.
We can obtain the mgfs of several special LONG distributions directly from both equations in (25).
Inference can be carried out in three different ways: point estimation, interval estimation and hypothesis tests. Several approaches for parameter point estimation were proposed in the literature but the maximum likelihood method is the most commonly employed. The maximum likelihood estimates (MLEs) enjoy desirable properties that can be used when constructing confidence intervals for the model parameters. Large sample theory for these estimates delivers simple approximations that work well in finite samples. The normal approximation for the MLE in distribution theory is easily handled either analytically or numerically.
Here, we consider the estimation of the unknown parameters of the new distribution by the maximum likelihood method. Let be observed values from the LONG family of distributions given by (7) with vector of parameters . The log-likelihood function related to the parameter vector becomes
where and by convention. The function can be maximized either directly by using well-known platforms such as the R (optim function), SAS (PROC NLMIXED), Ox program (MaxBFGS sub-routine) or by solving the nonlinear likelihood equations obtained by differentiation.
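The direct maximization route can be illustrated with a small standard-library sketch. It is not the authors' R, SAS or Ox code: it assumes the density lam * g / (G (1 - G)) * phi(lam * log-odds), keeps the baseline fixed, and replaces a general-purpose optimizer by a golden-section search over log(lam). Under that assumed form and a fixed baseline, the maximizer has the closed form sqrt(n / sum of squared log-odds), which the example uses as a check. All names are hypothetical.

```python
import math


def long_loglik(lam, data, G, g):
    """Log-likelihood in lam for a fixed baseline (assumed LONG density)."""
    total = 0.0
    for x in data:
        p = G(x)
        lo = math.log(p / (1.0 - p))
        total += (math.log(lam) + math.log(g(x)) - math.log(p) - math.log(1.0 - p)
                  - 0.5 * (lam * lo) ** 2 - 0.5 * math.log(2.0 * math.pi))
    return total


def golden_section_max(f, a, b, tol=1e-10):
    """Maximize a unimodal function f on [a, b] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def fit_lambda(data, G, g, lo=1e-3, hi=1e3):
    """Estimate lam by searching over log(lam), which keeps lam positive."""
    s_hat = golden_section_max(lambda s: long_loglik(math.exp(s), data, G, g),
                               math.log(lo), math.log(hi))
    return math.exp(s_hat)


if __name__ == "__main__":
    G = lambda x: 1.0 / (1.0 + math.exp(-x))  # illustrative logistic baseline
    g = lambda x: G(x) * (1.0 - G(x))
    data = [-1.2, 0.4, 0.9, -0.3, 1.7, -0.8]  # made-up values, for illustration only
    print(fit_lambda(data, G, g))
```

When the baseline parameters are unknown as well, the same log-likelihood can be handed to a multivariate optimizer, or the closed form in lam can be substituted to obtain a profile log-likelihood in the baseline parameters.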
The compactness of the parameter space and the continuity of the likelihood function on are sufficient for the existence of the MLE. Also, if this parameter space is convex and the likelihood function is strictly concave in the model parameters, then the MLE is unique when it exists. General conclusions about these items depend on the nature of the parameter space, which is related to the baseline distribution . In fact, we do not need the third derivative of the log-likelihood function with respect to the parameter (as is assumed by Cramér’s theorem regarding the asymptotics of the MLE), as stated by Theorem II in Gurland (1954), which guarantees the existence of a solution of the likelihood equation that is at the same time a consistent estimator of the involved parameter, requiring derivatives only of second order. The MLE solution remains at the same time asymptotically normal and efficient. The LONG cdf given by (6) is assumed to be twice differentiable with respect to the variable . However, for Gurland’s conditions, it should be three times differentiable with respect to the parameter vector . Since is a composite function built from the three times differentiable function G, Gurland’s conditions are satisfied. So, the regularity and the existence of the MLE with the desired properties follow.
The components of the score vector are given by
where denotes the derivative of the function with respect to . Setting and equal to zero and solving the equations simultaneously yield the MLE .
Under general regularity conditions , where is the expected information matrix and denotes asymptotic distribution. For large, can be approximated by the observed information matrix. This normal approximation for the MLE can be used for constructing approximate confidence intervals for the parameters and . Likelihood ratio statistics can be adopted in the usual way for testing hypotheses on these parameters.
LONG POWER-CAUCHY DISTRIBUTION AND ITS PROPERTIES
Rooks et al. (2010) introduced a two-parameter power-Cauchy (PC) distribution. The rv $Z$ has the PC distribution when the associated cdf and pdf are given by
$$G(x;\alpha,\sigma)=\frac{2}{\pi}\arctan\!\left[\left(\frac{x}{\sigma}\right)^{\alpha}\right],\qquad x>0, \tag{26}$$
and
$$g(x;\alpha,\sigma)=\frac{2\alpha}{\pi\sigma}\left(\frac{x}{\sigma}\right)^{\alpha-1}\left[1+\left(\frac{x}{\sigma}\right)^{2\alpha}\right]^{-1}, \tag{27}$$
respectively, where $\alpha>0$ is the shape parameter and $\sigma>0$ the scale parameter. We will write this correspondence as $Z\sim PC(\alpha,\sigma)$ in the sequel.
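For reference, the PC baseline can be coded in a few lines. The sketch assumes the usual Rooks et al. (2010) form, with cdf (2/pi) arctan((x/sigma)^alpha) and the pdf obtained by differentiating it; the quantile function follows by inverting the cdf. Function names are illustrative.

```python
import math


def pc_cdf(x, alpha, sigma):
    """Power-Cauchy cdf (2 / pi) * atan((x / sigma) ** alpha), x > 0."""
    return (2.0 / math.pi) * math.atan((x / sigma) ** alpha)


def pc_pdf(x, alpha, sigma):
    """Power-Cauchy pdf, the derivative of pc_cdf with respect to x."""
    z = x / sigma
    return (2.0 * alpha / (math.pi * sigma)) * z ** (alpha - 1.0) / (1.0 + z ** (2.0 * alpha))


def pc_quantile(p, alpha, sigma):
    """Inverse of pc_cdf: sigma * tan(pi * p / 2) ** (1 / alpha), 0 < p < 1."""
    return sigma * math.tan(0.5 * math.pi * p) ** (1.0 / alpha)


if __name__ == "__main__":
    print(pc_cdf(1.5, 1.0, 1.5), pc_quantile(0.3, 2.0, 1.5))
```

Setting `alpha = 1` gives the half-Cauchy distribution, whose median equals `sigma`.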
From (6), (7), (26) and (27), the LONPC cdf is given by
The related pdf is equal to
Henceforth, a random variable having pdf (29) is denoted by .
Note. For $\alpha=1$, equation (29) reduces to the log-odd-normal half-Cauchy (LONHC) distribution, which is not yet known in the literature.
Figures 1 and 2 display some plots of the density and hrf of when = 1 for different values of and . The plots in Figure 1 reveal that the LONPC density produces only unimodal (right-skewed) shape. The plots in Figure 2 indicate that the hrf of can have decreasing failure rate (DFR) and upside-down bathtub (UBT) shapes.
The shapes of the density and hazard rate functions of can be described analytically using Theorem 5, which identifies the modality of the LONG distribution family and hence of the LONPC distribution. We recall now that
i.e. the ordinary nonlinear second order differential equation covers the functional behavior of the cdf of the LONPC distribution given in (28) when we are looking for the saddle point of the pdf giving the mode of LONPC distributed rv . The related hazard rate shape is a corollary of this result.
The asymptotics of the LONPC cdf and pdf follow along the lines of Proposition 6. We have the following result.
Proposition 7. The cdf and the pdf of the rv X coming from the LONPC distribution possess the following asymptotic behavior either when and :
The proof is an immediate consequence of Proposition 6, the properties of the Laplace function and the fact that the standard normal density is an even function. The asymptotics of the hazard rate function can be derived easily.
The qf of is given by
where defines the inverse error function in terms of the qf of the normal distribution.
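A numerical version of the LONPC cdf, pdf and qf can be sketched as below. Two assumptions are made and should be checked against (28)-(30): the PC baseline is (2/pi) arctan((x/sigma)^alpha), and the extra parameter `lam` multiplies the log-odds inside the standard normal cdf. The normal quantile is taken from `statistics.NormalDist.inv_cdf` rather than from an inverse error function, using the identity erf^{-1}(z) = Phi^{-1}((1 + z)/2) / sqrt(2). All names are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def expit(t):
    """Numerically stable inverse logit."""
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def lonpc_cdf(x, alpha, sigma, lam):
    """Assumed LONPC cdf: Phi(lam * log-odds of the PC cdf)."""
    G = (2.0 / math.pi) * math.atan((x / sigma) ** alpha)
    return _STD.cdf(lam * math.log(G / (1.0 - G)))


def lonpc_pdf(x, alpha, sigma, lam):
    """Derivative of lonpc_cdf with respect to x."""
    z = x / sigma
    G = (2.0 / math.pi) * math.atan(z ** alpha)
    g = (2.0 * alpha / (math.pi * sigma)) * z ** (alpha - 1.0) / (1.0 + z ** (2.0 * alpha))
    return lam * g / (G * (1.0 - G)) * _STD.pdf(lam * math.log(G / (1.0 - G)))


def lonpc_quantile(u, alpha, sigma, lam):
    """Inverse of lonpc_cdf for 0 < u < 1."""
    p = expit(_STD.inv_cdf(u) / lam)
    return sigma * math.tan(0.5 * math.pi * p) ** (1.0 / alpha)


if __name__ == "__main__":
    print(lonpc_quantile(0.5, 2.3, 1.7, 0.8), lonpc_cdf(1.0, 2.3, 1.7, 0.8))
```

Under these assumptions the median equals `sigma` for every `alpha` and `lam`, since the PC cdf is 1/2 at x = sigma and the log-odds vanishes there.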
Let . Based on Lemma 2, we obtain
For , we have .
Theorem 8. The rth ordinary moment of has the following form:
where is defined by the coefficients chain
where stands for the Bernoulli number of the order .
Proof. We have ad definitionem with the aid of (10)
The Maclaurin series of the function reads (Abramowitz and Stegun 1972, p. 75, Eq. 4.3.67)
where denotes the Bernoulli number of the order . Hence,
for all real as is in accordance with (33). The power series raised by real power (see Appendix B) gives
where according to (43)
Next,
By repeating once more the whole procedure of the odd integer power of a power series reported in Appendix B combined with the multiplication of two similar structure power series, we obtain
where the coefficients can be obtained recursively.
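The recursive computation of the coefficients of a power series raised to a power can be sketched with the classical recurrence c_0 = a_0^n and c_m = (1 / (m a_0)) sum_{k=1}^{m} (k n - m + k) a_k c_{m-k}, valid when a_0 is nonzero. This is the standard textbook recurrence and is assumed, not confirmed, to coincide with the one in Appendix B; the function name is illustrative.

```python
def power_series_pow(a, n, terms):
    """Coefficients c_0..c_{terms-1} of (sum_k a[k] x^k) ** n, with a[0] != 0.

    a is the list of known coefficients (higher ones are treated as zero) and
    n may be any real exponent.
    """
    if a[0] == 0:
        raise ValueError("the recurrence needs a nonzero constant term")
    c = [a[0] ** n]
    for m in range(1, terms):
        s = 0.0
        for k in range(1, min(m, len(a) - 1) + 1):
            s += (k * n - m + k) * a[k] * c[m - k]
        c.append(s / (m * a[0]))
    return c


if __name__ == "__main__":
    # (1 + x)^2 and 1 / (1 + x), each truncated to four terms.
    print(power_series_pow([1.0, 1.0], 2, 4))
    print(power_series_pow([1.0, 1.0], -1, 4))
```

Each new coefficient needs only the previously computed ones, so the cost of producing the first M coefficients grows quadratically in M.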
Now, in establishing the computational series expansion result (31) in treating the integral expression (32) of
it remains to establish the value of the constituting integral, viz.
which proves the assertion.
The th ordinary moment of the LONPC distribution can be determined from (23). The ordinary and incomplete moments of the exponentiated-PC (exp-PC) with power parameter can follow from the procedure used in Tahir et al. (2016) and the power series
where , , , etc.
After some algebra, an alternative expression for the th moment of based on the rv can be expressed as
where , is obtained from (36).
From Equation (24), the th incomplete moment of is given by
where .
Tahir et al. (2016, Sect. 6.10) obtained the mgf of the exp-PC distribution using exponential partial Bell polynomials given by
where the sum varies over all integers such that and . These polynomials can be evaluated in Mathematica and Maple.
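Outside a computer algebra system, the partial exponential Bell polynomials can be evaluated with the standard recurrence B_{n,k} = sum_{i=1}^{n-k+1} C(n-1, i-1) x_i B_{n-i,k-1}, with B_{0,0} = 1 and B_{n,0} = B_{0,k} = 0 otherwise. The sketch below uses this recurrence instead of the explicit sum over integer partitions; the function name is illustrative.

```python
from functools import lru_cache
from math import comb


def partial_bell(n, k, x):
    """Partial exponential Bell polynomial B_{n,k}(x_1, ..., x_{n-k+1}).

    x is a sequence holding at least n - k + 1 values, with x[0] playing x_1.
    """
    x = tuple(x)

    @lru_cache(maxsize=None)
    def bell(m, j):
        if m == 0 and j == 0:
            return 1
        if m == 0 or j == 0:
            return 0
        return sum(comb(m - 1, i - 1) * x[i - 1] * bell(m - i, j - 1)
                   for i in range(1, m - j + 2))

    return bell(n, k)


if __name__ == "__main__":
    # With all arguments equal to 1 the value is a Stirling number of the second kind.
    print(partial_bell(4, 2, (1, 1, 1)))
```

Memoization through `lru_cache` keeps the recursion polynomial in n and k, which is usually sufficient for the truncated sums that arise in mgf expansions.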
They demonstrated that the mgf of , say , can be expressed as
where , comes from Equation (36) and, for while, .
Then, the mgf of follows from Equation (25) as
where can be determined from Equation (37).
The estimation of the unknown parameters of the LONPC distribution is carried out by the maximum likelihood method. Let be observed values from the LONPC distribution given by (29) with vector of parameters . The log-likelihood for is given by
where .
The MLE of can be obtained by maximizing the log-likelihood function with respect to . There are several routines available for the numerical maximization of (38), such as the R program (optim function), SAS (PROC NLMIXED) and Ox (sub-routine MaxBFGS). Alternatively, we can differentiate (38) and solve the resulting nonlinear likelihood equations.
The partial derivatives of with respect to the parameters are given by
where .
APPLICATION AS AN EMPIRICAL ILLUSTRATION OF LONG FAMILY
In this section, we fit the LONPC model along with some other competing models to a real data set. We compare the goodness-of-fit of the LONPC model with the beta-half-Cauchy (BHC) (Cordeiro and Lemonte 2011), Kumaraswamy half-Cauchy (KHC) (Ghosh 2014), gamma half-Cauchy (GHC) (Alzaatreh et al. 2016), exponentiated half-Cauchy (EHC) and PC models. For each model, we estimate the parameters by the method of maximum likelihood and adopt the Cramér–von Mises ( ), Anderson-Darling ( ) and Kolmogorov-Smirnov (K-S) statistics for model comparison purposes. In general, the smaller the values of these statistics, the better the fit to the data. The densities (for ) and parameters ( ) of the BHC, KHC, GHC and EHC models are, respectively, given by
The data set represents the results of a life testing experiment in which specimens of a type of electrical insulating fluid were subjected to a constant voltage stress. The length of time until each specimen failed or “broke down” was observed. Seven groups of specimens were tested at voltages ranging from 26 to 38 kilovolts (KV). More details about the data can be found in Nelson (1972). A summary of the data is: =76, =98.55763, =340.7395, skewness=5.14647 and kurtosis=27.29282. The MLEs (with SEs in parentheses), , and K-S statistics are listed in Table I. The three goodness-of-fit statistics in Table I indicate that the LONPC model provides the best fit. The histogram of these data and the estimated pdfs of the LONPC distribution and its competitive models are displayed in Figure 3. Furthermore, the TTT plot of these data is shown in Figure 4a. It is convex, which suggests a decreasing failure rate. Figure 4b displays the plot of the estimated LONPC hrf, which is in agreement with the TTT plot of Figure 4a.
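The three comparison statistics can be computed from any fitted cdf. The sketch below uses the classical (unmodified) Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling formulas; the starred statistics reported in the paper may include a small-sample correction that is not reproduced here, and the function name and the toy data are purely illustrative.

```python
import math


def gof_statistics(data, cdf):
    """Classical K-S, Cramér-von Mises (W2) and Anderson-Darling (A2) statistics.

    cdf is the fitted distribution function; its values must lie strictly
    between 0 and 1 at every observation for A2 to be finite.
    """
    u = sorted(cdf(x) for x in data)
    n = len(u)
    ks = max(max((i + 1) / n - u[i], u[i] - i / n) for i in range(n))
    w2 = 1.0 / (12 * n) + sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))
    a2 = -n - sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
                  for i in range(n)) / n
    return {"KS": ks, "W2": w2, "A2": a2}


if __name__ == "__main__":
    # Toy check against the uniform cdf on (0, 1); these are not the paper's data.
    print(gof_statistics([0.125, 0.375, 0.625, 0.875], lambda x: x))
```

Smaller values of all three statistics indicate closer agreement between the fitted cdf and the empirical distribution, which is how the competing models are ranked.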
CONCLUSIONS
We introduce and study the new log-odd normal generalized family of distributions and obtain some properties of a special model, the log-odd normal power-Cauchy distribution. We derive a linear representation for the family density in terms of exponentiated densities. Some structural properties of the new family and of its special distribution are determined, including quantile and generating functions, ordinary and incomplete moments and asymptotics. We estimate the model parameters by the maximum likelihood method. We compare the performance of the new distribution with other related distributions by means of a real data set using classical goodness-of-fit statistics.
1 Here and in what follows, denotes the composite function of , respectively.
2 Here, as usual, we write $\stackrel{d}{=}$ for equality of random variables in distribution.
3 The inverse error function $\operatorname{erf}^{-1}$ can be defined as the function which satisfies $\operatorname{erf}^{-1}(\operatorname{erf}(x)) = x$; it is also a particular solution of the nonlinear ODE $y'' = 2y\,(y')^{2}$.
ACKNOWLEDGMENTS
The authors wish to thank the two referees for their valuable comments which led to improvement in the earlier version of the article.
REFERENCES
- ABRAMOWITZ M and STEGUN IA. 1972. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Applied Mathematics Series 55, National Bureau of Standards, Washington, D. C., 9th Reprinted ed. New York: Dover Publications.
- AL-AQTASH R, FAMOYE F and LEE C. 2015. On generating a new family of distributions using the logit function. J Probab Stat Sci 13: 135-152.
- AL-AQTASH R, LEE C and FAMOYE F. 2014. Gumbel-Weibull distribution: Properties and applications. J Mod Appl Stat Methods 13: 201-225.
- ALEXANDER C, CORDEIRO GM, ORTEGA EMM and SARABIA JM. 2012. Generalized beta-generated distributions. Comput Stat Data Anal 56: 1880-1897.
- ALIZADEH M, CORDEIRO GM, NASCIMENTO ADC, LIMA MCS and ORTEGA EMM. 2017. Odd-Burr generalized family of distributions with some applications. J Stat Comput Simul 87: 367-389.
- ALZAATREH A, FAMOYE F and LEE C. 2013. A new method for generating families of continuous distributions. Metron 71: 63-79.
- ALZAATREH A, FAMOYE F and LEE C. 2014a. The gamma-normal distribution: Properties and applications. Comput Stat Data Anal 69: 67-80.
- ALZAATREH A, FAMOYE F and LEE C. 2014b. T-normal family of distributions: A new approach to generalize the normal distribution. J Stat Dist Applic 1: Art 16.
- ALZAATREH A, MANSOOR M, TAHIR MH, ZUBAIR M and GHAZALI SSA. 2016. The gamma half-Cauchy distribution: properties and applications. Hacet J Math Stat 45: 1143-1159.
- AZZALINI A. 1985. A class of distributions which includes the normal ones. Scand J Statist 12: 171-178.
- BRAGA AS, CORDEIRO GM, ORTEGA EMM and DA-CRUZ JN. 2016. The odd log-logistic normal distribution: Theory and applications. J Stat Theory Prac 10: 311-335.
- BOURGUIGNON M, SILVA RB and CORDEIRO GM. 2014. The Weibull-G family of probability distributions. J Data Sci 12: 53-68.
- COMTET L. 1974. Advanced Combinatorics. The art of finite and infinite expansions. Revised and enlarged edition. Dordrecht: D. Reidel Publishing Company.
- COORAY K and ANANDA MMA. 2008. A generalization of the half-normal distribution with applications to lifetime data. Commun Stat Theory Methods 37: 1323-1337.
- CORDEIRO GM, ALIZADEH M, RAMIRES TG and ORTEGA EMM. 2017. The generalized odd half-Cauchy family of distributions: Properties and applications. Commun Stat Theory Methods 46: 5685-5705.
- CORDEIRO GM, CINTRA RJ, RÊGO LC and ORTEGA EMM. 2012. The McDonald normal distribution. Pak J Stat Oper Res 8: 301-329.
- CORDEIRO GM and DE-CASTRO M. 2011. A new family of generalized distributions. J Stat Comput Simul 81: 883-893.
- CORDEIRO GM and LEMONTE AJ. 2011. The beta-half-Cauchy distribution. J Probab Statist ArtID-904705: 18 p.
- CORREA MA, NOGUEIRA DA and FERREIRA EB. 2012. Kumaraswamy Normal and Azzalini’s skew Normal modeling asymmetry. Sigmae 1: 65-83.
- EUGENE N, LEE C and FAMOYE F. 2002. Beta-normal distribution and its applications. Commun Stat Theory Methods 31: 497-512.
- FAMOYE F, LEE C and EUGENE N. 2004. Beta-normal Distribution: Bimodality properties and application. J Mod Appl Stat Methods 3: 85-103.
- GHOSH I. 2014. The Kumaraswamy-half Cauchy distribution: Properties and applications. J Stat Theory Applic 13: 122-134.
- GLEATON JU and LYNCH JD. 2006. Properties of generalized log-logistic families of lifetime distributions. J Probab Stat Sci 4: 51-64.
- GRADSHTEYN IS and RYZHIK IM. 2000. Table of Integrals, Series and Products. 6th ed. San Diego: Academic Press.
- GUI W, CHEN P-C and WU H. 2013. A folded normal slash distribution and its applications to non-negative measurements. J Data Sci 11: 231-247.
- GURLAND J. 1954. On regularity conditions for maximum likelihood estimators. Scand Actuarial J 1: 71-76.
- LEONE FC, NELSON LS and NOTTINGHAM RB. 1961. The folded normal distribution. Technometrics 3: 543-550.
- LIMA MCS, CORDEIRO GM and ORTEGA EMM. 2015. A new extension of the normal distribution. J Data Sci 13: 385-408.
- MÜLLER JW. 1987. New light on powers of power series. Rapport BIPM 87: 15 p.
- NELSON WB. 1972. Graphical analysis of accelerated life test data with the inverse power law model. IEEE Trans Reliab R-21: 2-11.
- O'HAGAN A and LEONARD T. 1976. Bayes estimation subject to uncertainty about parameter constraints. Biometrika 63: 201-202.
- ORTEGA EMM, LEMONTE AJ, CORDEIRO GM and DA-CRUZ JN. 2016. The odd Birnbaum-Saunders regression model with applications to lifetime data. J Stat Theory Prac 10: 780-804.
- ROGERS WH and TUKEY JW. 1972. Understanding some long-tailed symmetrical distributions. Stat Neerl 26: 211-226.
- ROOKS A, SCHUMACHER A and COORAY K. 2010. The power Cauchy distribution: derivation, description, and composite models. NSF-REU Program Reports. Available from http://www.cst.cmich.edu/mathematics/research/REU_and_LURE.shtml
- SLATER LJ. 1966. Generalized Hypergeometric Functions. Cambridge: Cambridge University Press.
- SRIVASTAVA HM and MANOCHA HL. 1984. A treatise on generating functions. Chichester: Ellis Horwood Limited Publishers.
- TAHIR MH, CORDEIRO GM, ALIZADEH M, MANSOOR M, ZUBAIR M and HAMEDANI GG. 2015. The odd generalized exponential family of distributions with applications. J Stat Dist Applic 2: Art. 1.
- TAHIR MH, ZUBAIR M, CORDEIRO GM, ALZAATREH A and MANSOOR M. 2016. The Poisson-X family of distributions. J Stat Comput Simul 86: 2901-2921.
- TORABI H and MONTAZERI NH. 2012. The gamma-uniform distribution and its application. Kybernetika 48: 16-30.
- TORABI H and MONTAZERI NH. 2014. The logistic-uniform distribution and its application. Commun Stat Simul Comput 43: 2551-2569.
- WANG Y, HOSSAIN AM and ZIMMER WJ. 2003. Monotone log-odds rate distribution in reliability analysis. Commun Stat Theory Methods 32: 2227-2244.
- WIPER MP, GIRÓN FJ and PEWSEY A. 2008. Objective Bayesian inference for the half-normal and half-t distributions. Commun Stat Theory Methods 37: 3165-3185.
- ZOGRAFOS K and BALAKRISHNAN N. 2009. On families of beta- and generalized gamma-generated distributions and associated inference. Stat Methodol 6: 344-362.
APPENDIX A
For the sake of simplicity denote which satisfies being . Then mutatis mutandis (Abramowitz and Stegun, 1972, pp. 87-8, Eqs. 4.6.22, 4.6.33)
where ${}_2F_1$ stands for the familiar Gaussian hypergeometric function. The representation holds true for all whenever the baseline function .
The hypergeometric form of the error function reads (Abramowitz and Stegun, 1972, p. 297, Eq. 7.1.5)
For numerical evaluations in both cases, the partial sums, which turn out to be polynomials of hypergeometric type, can be used.
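As a numerical check of this remark for the error function, the partial sums of its Maclaurin series, erf(x) = (2/√π) Σ_{n≥0} (−1)^n x^{2n+1} / (n! (2n+1)), can be compared with a library value. The sketch below is in Python; the function name `erf_partial_sum` and the default truncation order are illustrative choices, not taken from the paper.

```python
import math


def erf_partial_sum(x, terms=30):
    """Partial sum of erf(x) = (2/sqrt(pi)) * sum_n (-1)^n x^(2n+1) / (n! (2n+1))."""
    total = 0.0
    power = x          # holds x^(2n+1)
    factorial = 1.0    # holds n!
    for n in range(terms):
        if n > 0:
            power *= x * x
            factorial *= n
        total += (-1) ** n * power / (factorial * (2 * n + 1))
    return 2.0 / math.sqrt(math.pi) * total


# Compare the truncated series with the standard-library implementation.
approx = erf_partial_sum(0.5)
reference = math.erf(0.5)
```

For moderate arguments a few dozen terms already agree with `math.erf` to near machine precision; for large arguments the alternating series suffers from cancellation, so a different expansion would be preferable there.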
APPENDIX B
The power series raised to positive integer powers reads as follows (Gradshteyn and Ryzhik, 2000; p. 17, Sect. 0.314)
where the coefficients are given recursively
The real power case is investigated by Müller (1987; p. 3, Eq. (6a))
Here the summation should be used and upon all . On the other hand is written in Bell polynomial terminology.
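For the positive integer case, the recursion of Gradshteyn and Ryzhik (2000, Sect. 0.314) is straightforward to implement. In notation that may differ from the one used here, if (Σ_{k≥0} a_k x^k)^n = Σ_{k≥0} c_k x^k with a_0 ≠ 0, then c_0 = a_0^n and c_m = (1/(m a_0)) Σ_{k=1}^{m} (kn − m + k) a_k c_{m−k} for m ≥ 1. The following Python sketch uses exact rational arithmetic; the function name `power_series_power` is hypothetical.

```python
from fractions import Fraction


def power_series_power(a, n, num_terms):
    """Coefficients c_0..c_{num_terms-1} of (sum_k a_k x^k)^n, assuming a_0 != 0.

    Uses c_0 = a_0^n and, for m >= 1,
    c_m = (1 / (m a_0)) * sum_{k=1}^{m} (k n - m + k) a_k c_{m-k}.
    Coefficients of `a` beyond its length are treated as zero.
    """
    a = [Fraction(v) for v in a]
    if not a or a[0] == 0:
        raise ValueError("the recursion requires a_0 != 0")
    c = [a[0] ** n]
    for m in range(1, num_terms):
        acc = Fraction(0)
        for k in range(1, m + 1):
            a_k = a[k] if k < len(a) else Fraction(0)
            acc += (k * n - m + k) * a_k * c[m - k]
        c.append(acc / (m * a[0]))
    return c


# (1 + x)^3 = 1 + 3x + 3x^2 + x^3
cube = power_series_power([1, 1], 3, 5)
# (1 + x + x^2)^2 = 1 + 2x + 3x^2 + 2x^3 + x^4
square = power_series_power([1, 1, 1], 2, 5)
```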
Briefly, the exponential partial Bell polynomials (Comtet 1974) are described by
where
and the summation takes place over all integers , which verify and . These polynomials can be computed in Mathematica using the BellY function and in Maple using the IncompleteBellB function.
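Where neither system is available, the exponential partial Bell polynomials can also be evaluated from their standard recurrence, B_{n,k} = Σ_{i=1}^{n−k+1} C(n−1, i−1) x_i B_{n−i,k−1}, with B_{0,0} = 1 and B_{n,0} = B_{0,k} = 0 for n, k ≥ 1. The Python sketch below is an illustration under that assumption, not the routine used by the authors; the function name `partial_bell` is hypothetical.

```python
from math import comb


def partial_bell(n, k, x):
    """Exponential partial Bell polynomial B_{n,k}(x_1, ..., x_{n-k+1}).

    `x` is a sequence with x[0] = x_1, x[1] = x_2, and so on; it must
    contain at least n - k + 1 entries when k <= n.
    """
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    total = 0
    # The range is empty when k > n, so the polynomial is 0 in that case.
    for i in range(1, n - k + 2):
        total += comb(n - 1, i - 1) * x[i - 1] * partial_bell(n - i, k - 1, x)
    return total


# B_{4,2}(x_1, x_2, x_3) = 4 x_1 x_3 + 3 x_2^2
value = partial_bell(4, 2, [2, 3, 5])
```

With all arguments equal to one, B_{n,k} reduces to the Stirling number of the second kind S(n, k), which gives a convenient sanity check.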
Publication Dates
Publication in this collection: 01 July 2019
Date of issue: 2019
History
Received: 20 Mar 2018
Accepted: 17 July 2018