ABSTRACT
Fractional calculus is the branch of mathematics that studies the possibilities of generalizing the derivative and integral of a function to noninteger order. Recent studies in the literature have confirmed the importance of fractional calculus for minimization problems. However, the use of fractional calculus in interior point methods for solving optimization problems is still new. In this study, inspired by applications of fractional calculus in many fields, the so-called fractional order log barrier interior point algorithm is developed by replacing some integer-order derivatives with the corresponding fractional ones in the first order Karush-Kuhn-Tucker optimality conditions, in order to solve polynomial regression models in the ℓp-norm for 1 < p < 2. Finally, numerical experiments are performed to illustrate the proposed algorithm.
Keywords: nonlinear programming; polynomial regression; ℓp-norm; interior point method; fractional derivative
1 INTRODUCTION
It is fundamentally important to make predictions based upon scientific data. The problem of fitting curves to data points has many practical applications Bard (1974); Sevaux & Mineur (2007); Chatterjee & Hadi (2012).
Given a set of m data points $(a_i, b_i) \in \mathbb{R}^2$, $i = 1, 2, \ldots, m$, where $a_i$ is an argument value and $b_i$ a corresponding dependent value, with $a_i \neq a_j$ for all $i \neq j$, the curve fitting procedure tries to build a linear or nonlinear function y = f(x), defined for all possible choices of x, that approximately fits the data set. The curves fitted to the data by f are most often chosen to be polynomials Süli & Mayers (2003).
Let y = f(x) be a polynomial function of degree n − 1 of the form

$$f(x) = x_1 + x_2 x + x_3 x^2 + \cdots + x_n x^{n-1},$$

the candidate function to fit the data. The procedure to fit the polynomial function to the data, in the ℓp-norm, determines the vector $x = (x_1, x_2, \ldots, x_n)^T \in \mathbb{R}^n$, where the superscript T represents transpose, that minimizes the ℓp-norm of the residual error as follows:

$$\min_{x \in \mathbb{R}^n} \; \|b - Ax\|_p, \quad (1)$$
where $\|\cdot\|_p$ denotes the ℓp-norm, $b = (b_1, b_2, \ldots, b_m)^T$, and $A \in \mathbb{R}^{m \times n}$ is a Vandermonde matrix, with rank(A) = min(m, n) since the $a_i$ are distinct, which can be written as

$$A = \begin{bmatrix} 1 & a_1 & a_1^2 & \cdots & a_1^{n-1} \\ 1 & a_2 & a_2^2 & \cdots & a_2^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & a_m & a_m^2 & \cdots & a_m^{n-1} \end{bmatrix}. \quad (2)$$
This is a nonlinear (or linear) regression model if n > 2 (or n = 2). If m ≤ n, there is an (n − 1)th degree polynomial satisfying $Ax = b$ exactly. If m > n, the problem cannot, in general, be solved exactly, and one seeks the vector x that solves (1).
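As an illustration, the design matrix A can be assembled with `numpy.vander`; in the minimal Python sketch below the data values are hypothetical:

```python
import numpy as np

# Hypothetical sample data: m = 5 points (a_i, b_i).
a = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
b = np.array([1.0, 1.4, 2.1, 2.9, 4.2])

n = 3  # fit a polynomial of degree n - 1 = 2
# Vandermonde matrix A with rows [1, a_i, a_i^2, ..., a_i^(n-1)],
# matching the column ordering used in the text.
A = np.vander(a, N=n, increasing=True)
print(A.shape)                       # (m, n) = (5, 3)
print(np.linalg.matrix_rank(A))      # min(m, n) = 3, since the a_i are distinct
```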
In many industrial applications, missing data or anomalous values can arise from errors of measurement instruments during data generation. In these cases, polynomial regression models in the ℓp-norm, with 1 < p < 2, are more robust Forsythe (1972). It is important to choose an appropriate value for p, and several criteria for the choice of p have been studied Rice & White (1964).
The calculus of integrals and derivatives of arbitrary order, known as fractional calculus, has been conceptualized in connection with infinitesimal calculus since 1695 Oldham & Spanier (1974). Some of its application areas include viscoelasticity, signal processing, probability, statistics, electrochemistry, diffusion in porous media, fluid flow, backpropagation training of neural networks, fuzzy control, and so on Kilbas et al. (2006); Dalir & Bashour (2010); Mohammadzadeh & Kayacan (2020); Wang et al. (2017a); Chen et al. (2017a); Grigoletto & Oliveira (2020). Its importance in minimization problems has also been confirmed by recent works in the literature. For example, Chen et al. (2017b) presented the fractional order gradient methods (FOGMs) by writing the Riemann-Liouville and Caputo fractional derivatives as Taylor series, and Wang et al. (2017b) proposed the fractional gradient descent method employing the Caputo derivative for the backpropagation training of neural networks. However, the study of fractional calculus in the field of interior point methods for solving optimization problems is still new.
The goal of this study is to investigate the fractional order log barrier interior point algorithm, involving the Caputo fractional derivative. It is based on replacing some integer-order derivatives with the corresponding fractional ones in the first order optimality conditions for solving polynomial regression models in the ℓp-norm for 1 < p < 2. Functions of the form $g(x) = (x + \kappa)^p$, with κ > 0 and 1 < p < 2, arise when solving problem (1), according to the approach discussed throughout this study. The Caputo fractional derivative, in addition to generalizing the integer order derivative, is a useful tool for differentiating functions such as g(x), since the exponent p lies in the interval 1 < p < 2 and is therefore not an integer.
This paper is organized as follows. Preliminary concepts of fractional calculus are presented in Section 2. In Section 3, the fractional order log barrier interior point algorithm for solving polynomial regression models in the ℓp-norm is discussed. Numerical experiments are performed to illustrate the proposed algorithm in Section 4, and Section 5 contains the conclusions.
2 PRELIMINARY CONCEPTS OF FRACTIONAL CALCULUS
Some basic concepts and definitions involving special functions and the Caputo fractional derivatives will be presented in this section.
Definition 1. The gamma function Γ(z) is originally defined by Erdélyi et al. (1981); Andrews et al. (1999):

$$\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\, dt, \quad \mathrm{Re}(z) > 0. \quad (3)$$

For the gamma function, the reduction formula

$$\Gamma(z+1) = z\,\Gamma(z) \quad (4)$$

holds. In particular, when $z = n \in \mathbb{N}$, then $\Gamma(n+1) = n!$.
Definition 2. The beta function B(z, w) is defined by Erdélyi et al. (1981); Andrews et al. (1999):

$$B(z, w) = \int_0^1 t^{z-1} (1-t)^{w-1}\, dt, \quad \mathrm{Re}(z) > 0, \; \mathrm{Re}(w) > 0. \quad (5)$$

It is connected with the gamma function by the following relation:

$$B(z, w) = \frac{\Gamma(z)\,\Gamma(w)}{\Gamma(z+w)}. \quad (6)$$
Fractional calculus defines several integral and derivative operators of arbitrary order. The Caputo fractional derivative is one of these fractional derivative operators Kilbas et al. (2006).
Definition 3. The Caputo left-sided fractional derivative with respect to x, of order 0 < α < 1, of a function f on an interval Ω = [a, b] of the real axis ℝ, with −∞ < a < b < ∞, denoted by ${}^C D^{\alpha}_{a+} f$, is given by

$$\left({}^C D^{\alpha}_{a+} f\right)(x) = \frac{1}{\Gamma(1-\alpha)} \int_a^x \frac{f'(t)}{(x-t)^{\alpha}}\, dt, \quad x \in [a, b]. \quad (7)$$
Note that the Caputo fractional derivative is a nonlocal operator: it depends on the choice of the order α, the function f, and the point x, and also on the behavior of f over the whole interval [a, x]. This is usually called the memory effect.
In particular, the Caputo fractional derivative of a constant function is zero, i.e., ${}^C D^{\alpha}_{a+} C = 0$.
Property 1. Let $g(x) = (x + \kappa)^p$ be defined for $x > -\kappa$, with κ ≥ 0 and 1 < p < 2, and let 0 < α < 1. Then

$$\left({}^C D^{\alpha}_{-\kappa+}\, g\right)(x) = \frac{\Gamma(p+1)}{\Gamma(p+1-\alpha)}\,(x+\kappa)^{p-\alpha}. \quad (8)$$

Proof. Applying the Caputo left-sided fractional derivative (7) to the function g(x) with a = −κ,

$$\left({}^C D^{\alpha}_{-\kappa+}\, g\right)(x) = \frac{1}{\Gamma(1-\alpha)} \int_{-\kappa}^{x} \frac{p\,(t+\kappa)^{p-1}}{(x-t)^{\alpha}}\, dt. \quad (9)$$

Using the change of variables $t = -\kappa + s\,(x+\kappa)$, then $t + \kappa = s\,(x+\kappa)$, $x - t = (1-s)(x+\kappa)$ and $dt = (x+\kappa)\, ds$. The integral (9) under this change of variables and by means of (4)-(6) takes the form

$$\left({}^C D^{\alpha}_{-\kappa+}\, g\right)(x) = \frac{p\,(x+\kappa)^{p-\alpha}}{\Gamma(1-\alpha)} \int_0^1 s^{p-1}(1-s)^{-\alpha}\, ds = \frac{p\,(x+\kappa)^{p-\alpha}}{\Gamma(1-\alpha)}\, B(p, 1-\alpha) = \frac{\Gamma(p+1)}{\Gamma(p+1-\alpha)}\,(x+\kappa)^{p-\alpha}.$$
□
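The closed form (8) can be checked numerically against definition (7). In the Python sketch below the values of p, α, κ and x are arbitrary, and SciPy's algebraic-weight quadrature handles the integrable singularity at t = x:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

# Numerical check of Property 1 (a sketch; p, alpha, kappa, x chosen arbitrarily).
p, alpha, kappa, x = 1.3, 0.5, 0.7, 2.0

# Closed form (8): Gamma(p+1)/Gamma(p+1-alpha) * (x+kappa)^(p-alpha).
closed = gamma(p + 1) / gamma(p + 1 - alpha) * (x + kappa) ** (p - alpha)

# Definition (7) with a = -kappa: weight='alg' with wvar=(0, -alpha) multiplies
# the integrand by (x - t)^(-alpha), treating the endpoint singularity exactly.
integral, _ = quad(lambda t: p * (t + kappa) ** (p - 1), -kappa, x,
                   weight='alg', wvar=(0.0, -alpha))
numeric = integral / gamma(1 - alpha)

print(closed, numeric)  # the two values agree to quadrature accuracy
```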
3 FRACTIONAL ORDER LOG BARRIER INTERIOR POINT ALGORITHM
Raising the objective to the power p, the problem (1) can be rewritten in the alternative form

$$\min_{x \in \mathbb{R}^n} \; \|b - Ax\|_p^p, \quad (10)$$

where x ∈ ℝⁿ. The problem (10) is equivalent to the following nonlinear optimization problem:

$$\min \; \|r\|_p^p \quad \text{subject to} \quad Ax + r = b, \quad (11)$$

where $r = b - Ax$ is the residual vector of the regression, and the ℓp-norm is defined by $\|r\|_p = \big(\sum_{i=1}^m |r_i|^p\big)^{1/p}$, so that $\|r\|_p^p = \sum_{i=1}^m |r_i|^p$.
The values of p most commonly used in problem (11) are p = 1, p = 2 and p → ∞. Linear programming procedures Charnes et al. (1955); Oliveira et al. (2000); Oliveira & Lyra (2004) can be used for the p = 1 and p → ∞ cases. For p = 2, the direct solution is given by x = (A⊺A)⁻¹A⊺b. For other values of p, unconstrained minimization procedures can be used Dennis Jr. & Schnabel (1983); Li (1993); Cantante et al. (2012).
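For illustration, the p = 2 shortcut and the resulting ℓp residual errors can be computed as follows. The data below are synthetic stand-ins for the regression data of Section 4, with the coefficients 4.5 and 5.9 borrowed from the linear fit reported there:

```python
import numpy as np

a = np.linspace(0.0, 1.0, 50)                  # hypothetical normalized data
b = 4.5 + 5.9 * a + 0.1 * np.random.default_rng(0).standard_normal(50)
A = np.vander(a, N=2, increasing=True)         # linear regression, n = 2

# For p = 2 the solution is x = (A^T A)^{-1} A^T b; numerically, a
# least-squares solver is preferable to forming the normal equations.
x2 = np.linalg.solve(A.T @ A, A.T @ b)
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x2, x_ls)

r = b - A @ x_ls
for p in (1.1, 1.5, 1.9):
    print(p, np.sum(np.abs(r) ** p) ** (1.0 / p))   # residual error ||r||_p
```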
Writing the unrestricted residual term $r_i$ as the difference of two nonnegative variables $u_i$ and $v_i$, that is, $r_i = u_i - v_i$, with $u_i$ and $v_i$ defined by

$$u_i = \frac{|r_i| + r_i}{2} \quad \text{and} \quad v_i = \frac{|r_i| - r_i}{2}, \quad (12)$$

for all i = 1, 2, . . . , m, then $|r_i| = u_i + v_i$, and the problem (11) can be converted into a convex programming problem Charnes et al. (1955); Cantante et al. (2012):

$$\min \; \sum_{i=1}^{m} (u_i + v_i)^p \quad \text{subject to} \quad Ax + u - v = b, \quad u, v \in \mathbb{R}^m_+, \quad (13)$$
where $\mathbb{R}^m_+$ denotes the set of m-dimensional nonnegative vectors.
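The split (12) is cheap to compute and easy to verify; a minimal sketch with illustrative values:

```python
import numpy as np

# Variable split r = u - v with u, v >= 0 and |r| = u + v.
r = np.array([1.5, -0.2, 0.0, -3.1])
u = np.maximum(r, 0.0)   # u_i = (|r_i| + r_i) / 2
v = np.maximum(-r, 0.0)  # v_i = (|r_i| - r_i) / 2
assert np.allclose(u - v, r) and np.allclose(u + v, np.abs(r))
```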
Interior point methods are widely used to solve convex optimization problems because of their good performance in practice Biegler (2010); Gondzio (2012); Lilian et al. (2016). Adding a logarithmic barrier function to the objective function in (13) yields the barrier problem

$$\min \; \sum_{i=1}^{m} (u_i + v_i)^p - \mu \sum_{i=1}^{m} (\ln u_i + \ln v_i) \quad \text{subject to} \quad Ax + u - v = b, \quad (14)$$
where µ > 0 is the barrier parameter. An optimal solution of (13) can be found by solving a sequence of barrier problems of the form (14) while µ decreases to zero. The Lagrangian function associated with the problem (14) is

$$L(x, \lambda, u, v) = \sum_{i=1}^{m} (u_i + v_i)^p - \mu \sum_{i=1}^{m} (\ln u_i + \ln v_i) + \lambda^{\top}(Ax + u - v - b), \quad (15)$$

where $\lambda \in \mathbb{R}^m$ is a Lagrange multiplier vector.
Let ϕ and φ be the functions given by

$$\phi(u, v) = \sum_{i=1}^{m} (u_i + v_i)^p - \mu \sum_{i=1}^{m} (\ln u_i + \ln v_i) \quad (16)$$

and

$$\varphi(x, \lambda, u, v) = \lambda^{\top}(Ax + u - v - b); \quad (17)$$

then the Lagrangian function can be written as

$$L = \phi + \varphi. \quad (18)$$
The fractional order log barrier interior point algorithm, involving the Caputo fractional derivative, is based on replacing some integer-order derivatives with the corresponding fractional ones in the first order optimality conditions (𝛻L = 𝛻ϕ + 𝛻φ = 0), which then take the form

$$\nabla_x L = A^{\top}\lambda = 0, \qquad \nabla_\lambda L = Ax + u - v - b = 0,$$
$$\nabla^{\alpha}_{u} L = g_\alpha + \lambda - \mu U^{-1}e = 0, \qquad \nabla^{\alpha}_{v} L = g_\alpha - \lambda - \mu V^{-1}e = 0, \quad (19)$$

where 𝛻x, 𝛻λ, 𝛻u and 𝛻v represent the gradient with respect to x, λ, u and v, respectively, and $\nabla^{\alpha}_{u}$ and $\nabla^{\alpha}_{v}$ represent the fractional order gradient with respect to u and v, respectively, given by the Caputo fractional derivative (7). Note that only the derivatives of the term $\sum_{i}(u_i + v_i)^p$ are replaced by fractional ones; the barrier and linear terms keep their classical derivatives.
Furthermore, $U = \mathrm{diag}(u)$ and $V = \mathrm{diag}(v)$, where diag(w) denotes the diagonal matrix built from a vector w, $e = (1, 1, \ldots, 1)^{\top} \in \mathbb{R}^m$, and $g_\alpha \in \mathbb{R}^m$. The ith component of the vector $g_\alpha$ is the Caputo fractional derivative, of order α, of $(u_i + v_i)^p$. By Property 1, for i = 1, 2, . . . , m, $(g_\alpha)_i$ is given by

$$(g_\alpha)_i = \frac{\Gamma(p+1)}{\Gamma(p+1-\alpha)}\,(u_i + v_i)^{p-\alpha}. \quad (20)$$
The nonlinear system of equations (19) can be rewritten, multiplying the last two equations by U and V, respectively, in the alternative form

$$A^{\top}\lambda = 0, \qquad Ax + u - v = b,$$
$$U(g_\alpha + \lambda) - \mu e = 0, \qquad V(g_\alpha - \lambda) - \mu e = 0, \quad (21)$$

where the last two equations play the role of perturbed complementarity conditions.
For given µ > 0, if α = 1 in the system (21), then it recovers the first order optimality conditions, since $(g_1)_i = \frac{\Gamma(p+1)}{\Gamma(p)}(u_i + v_i)^{p-1} = p\,(u_i + v_i)^{p-1}$ is the classical gradient component, and it has a unique solution $(x^{*}_{\mu}, \lambda^{*}_{\mu}, u^{*}_{\mu}, v^{*}_{\mu})$. When the classical gradient is replaced by the fractional one (0 < α < 1), then, if the system (21) has a solution $(x^{*}_{\mu,\alpha}, \lambda^{*}_{\mu,\alpha}, u^{*}_{\mu,\alpha}, v^{*}_{\mu,\alpha})$, it can be called a fractional solution, and for α = 1 it coincides with the classical one.
For given µ > 0 and 0 < α ≤ 1, suppose that the system (21) has a solution. Given an initial point $(x^0, \lambda^0, u^0, v^0)$ such that $u^0, v^0 \in \mathbb{R}^m_{++}$, where $\mathbb{R}^m_{++}$ denotes the set of m-dimensional positive vectors, and $Ax^0 + u^0 - v^0 = b$, applying the Newton method to the system (21), the search direction (Δx, Δλ, Δu, Δv) can be found by solving the following Newton system:

$$\begin{bmatrix} 0 & A^{\top} & 0 & 0 \\ A & 0 & I_m & -I_m \\ 0 & U & D_u & U H_\alpha \\ 0 & -V & V H_\alpha & D_v \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta\lambda \\ \Delta u \\ \Delta v \end{bmatrix} = \begin{bmatrix} -A^{\top}\lambda \\ b - Ax - u + v \\ \mu e - U(g_\alpha + \lambda) \\ \mu e - V(g_\alpha - \lambda) \end{bmatrix}, \quad (22)$$

where $I_m$ is the identity matrix of order m and

$$D_u = G_\alpha + \Lambda + U H_\alpha, \qquad D_v = G_\alpha - \Lambda + V H_\alpha, \quad (23)$$

with Λ = diag(λ), $G_\alpha = \mathrm{diag}(g_\alpha)$ and $H_\alpha = \mathrm{diag}(h_\alpha)$, where $h_\alpha \in \mathbb{R}^m$ and, for i = 1, 2, . . . , m, the ith component of the vector $h_\alpha$ is given by

$$(h_\alpha)_i = \frac{\Gamma(p+1)}{\Gamma(p-\alpha)}\,(u_i + v_i)^{p-\alpha-1}. \quad (24)$$
The $(h_\alpha)_i$ component is obtained by evaluating the derivative

$$(h_\alpha)_i = \frac{\partial (g_\alpha)_i}{\partial u_i} = \frac{\partial (g_\alpha)_i}{\partial v_i} = (p - \alpha)\,\frac{\Gamma(p+1)}{\Gamma(p+1-\alpha)}\,(u_i + v_i)^{p-\alpha-1}, \quad (25)$$

and, taking (4) into account, Γ(p + 1 − α) can be rewritten in the form (p − α)Γ(p − α), so that $(h_\alpha)_i$ in equation (24) is obtained.
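Formulas (20) and (24) are componentwise and cheap to evaluate. The sketch below (the function name is ours) computes both vectors and verifies that α = 1 recovers the classical first and second derivatives of $(u_i + v_i)^p$:

```python
import numpy as np
from scipy.special import gamma

def fractional_gradients(u, v, p, alpha):
    """(g_alpha)_i and (h_alpha)_i from (20) and (24); a sketch of the
    formulas in the text, valid for u, v > 0, 1 < p < 2, 0 < alpha <= 1."""
    s = u + v
    g_alpha = gamma(p + 1) / gamma(p + 1 - alpha) * s ** (p - alpha)
    h_alpha = gamma(p + 1) / gamma(p - alpha) * s ** (p - alpha - 1)
    return g_alpha, h_alpha

u = np.array([0.5, 1.0]); v = np.array([0.2, 0.3]); p = 1.3
g1, h1 = fractional_gradients(u, v, p, alpha=1.0)
# alpha = 1 recovers the classical derivatives of (u_i + v_i)^p:
assert np.allclose(g1, p * (u + v) ** (p - 1))
assert np.allclose(h1, p * (p - 1) * (u + v) ** (p - 2))
```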
For given µ > 0, the new barrier parameter is updated in the following form Wright (1996):

$$\mu \leftarrow \frac{\mu}{\beta}, \quad (26)$$

where β > 1 controls the decay of µ and improves the convergence process.
To keep the next iterates û and v̂ away from the boundary of the feasible region, the step size $\alpha_{uv}$ is shortened by a parameter σ, with 0 < σ < 1, often chosen close to 1, as follows Biegler (2010); Vanderbei (2020):

$$\alpha_{uv} = \min\left\{1,\; \sigma \min_{\Delta u_i < 0}\left(-\frac{u_i}{\Delta u_i}\right),\; \sigma \min_{\Delta v_i < 0}\left(-\frac{v_i}{\Delta v_i}\right)\right\}. \quad (27)$$

Given a point (x, λ, u, v), the new iterate $(\hat{x}, \hat{\lambda}, \hat{u}, \hat{v})$ is given by

$$(\hat{x}, \hat{\lambda}, \hat{u}, \hat{v}) = (x, \lambda, u, v) + \alpha_{uv}\,(\Delta x, \Delta\lambda, \Delta u, \Delta v).$$
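A sketch of the ratio test (27); the default value of σ is an illustrative choice:

```python
import numpy as np

def step_size(u, v, du, dv, sigma=0.9995):
    """Ratio test (27), as sketched here: the largest step in (0, 1] that keeps
    u + a*du > 0 and v + a*dv > 0, damped by sigma to stay off the boundary."""
    a = 1.0
    for w, dw in ((u, du), (v, dv)):
        neg = dw < 0
        if np.any(neg):
            a = min(a, sigma * np.min(-w[neg] / dw[neg]))
    return a
```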
3.1 Search direction
The search direction (∆x, ∆λ, ∆u, ∆v) is unique and can be obtained by solving the Newton system (22), written componentwise as

$$A^{\top}\Delta\lambda = -A^{\top}\lambda, \quad (28)$$

$$A\Delta x + \Delta u - \Delta v = b - Ax - u + v, \quad (29)$$

$$U\Delta\lambda + D_u\Delta u + U H_\alpha \Delta v = \rho_u, \quad (30)$$

$$-V\Delta\lambda + V H_\alpha \Delta u + D_v \Delta v = \rho_v, \quad (31)$$

where $\rho_u = \mu e - U(g_\alpha + \lambda)$, $\rho_v = \mu e - V(g_\alpha - \lambda)$, and the right-hand side of (29) vanishes at a primal feasible point. Isolating ∆u in (30),

$$\Delta u = D_u^{-1}\big(\rho_u - U\Delta\lambda - U H_\alpha \Delta v\big). \quad (32)$$

Substituting (32) into (31) and isolating ∆v,

$$\Delta v = D^{-1}\big[\rho + \big(V + V H_\alpha D_u^{-1} U\big)\Delta\lambda\big], \quad (33)$$

where

$$D = D_v - V H_\alpha D_u^{-1} U H_\alpha \quad \text{and} \quad \rho = \rho_v - V H_\alpha D_u^{-1}\rho_u. \quad (34)$$

Substituting (32) into (29), at a feasible point,

$$A\Delta x - D_u^{-1} U\Delta\lambda - \big(I_m + D_u^{-1} U H_\alpha\big)\Delta v = -D_u^{-1}\rho_u. \quad (35)$$

Now, substituting (33) into (35) and isolating ∆λ,

$$\Delta\lambda = S^{-1}\big(A\Delta x + D_u^{-1}\rho_u - E D^{-1}\rho\big), \quad (36)$$

where

$$S = D_u^{-1} U + E D^{-1} F, \quad (37)$$

with E and F given in (39) below. Then, ∆x can be obtained by replacing ∆λ given by (36) into (28). In this case, ∆x can be written as follows:

$$\Delta x = -\big(A^{\top} S^{-1} A\big)^{-1} A^{\top}\Big[\lambda + S^{-1}\big(D_u^{-1}\rho_u - E D^{-1}\rho\big)\Big], \quad (38)$$

where

$$E = I_m + D_u^{-1} U H_\alpha \quad \text{and} \quad F = V + V H_\alpha D_u^{-1} U. \quad (39)$$
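Since all matrices in (32)-(39) except A are diagonal, the elimination can be carried out with 1-D arrays, and the only dense solve is the n × n system in (38). A Python sketch under these conventions (it assumes a primal feasible point, which the initialization of Section 3.2 provides and the linear rows (28)-(29) preserve):

```python
import numpy as np

def newton_direction(A, lam, u, v, g_a, h_a, mu):
    """Search direction via the block elimination (32)-(38); a sketch that
    exploits the fact that D_u, D_v, U, V, G_alpha, H_alpha are all diagonal
    (stored here as 1-D arrays)."""
    du_ = g_a + lam + u * h_a          # diagonal of D_u
    dv_ = g_a - lam + v * h_a          # diagonal of D_v
    rho_u = mu - u * (g_a + lam)       # residual of U(g_a + lam) - mu e = 0
    rho_v = mu - v * (g_a - lam)       # residual of V(g_a - lam) - mu e = 0

    D = dv_ - v * h_a * u * h_a / du_              # (34)
    rho = rho_v - v * h_a * rho_u / du_            # (34)
    E = 1.0 + u * h_a / du_                        # (39)
    F = v + v * h_a * u / du_                      # (39)
    S = u / du_ + E * F / D                        # (37), diagonal

    # (38): (A^T S^{-1} A) dx = -A^T [lam + S^{-1}(D_u^{-1} rho_u - E D^{-1} rho)]
    w = (rho_u / du_ - E * rho / D) / S
    dx = np.linalg.solve(A.T @ (A / S[:, None]), -A.T @ (lam + w))
    dlam = (A @ dx + rho_u / du_ - E * rho / D) / S            # (36)
    dv = (rho + F * dlam) / D                                  # (33)
    du = (rho_u - u * dlam - u * h_a * dv) / du_               # (32)
    return dx, dlam, du, dv
```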
3.2 Initialization
The initial point $(x^0, \lambda^0, u^0, v^0)$ is chosen so that $u^0, v^0 \in \mathbb{R}^m_{++}$ and $Ax^0 + u^0 - v^0 = b$, and is given by Coleman & Li (1992); Oliveira & Cantante (2004):

$$x^0 = (A^{\top}A)^{-1}A^{\top}b, \quad \lambda^0 = 0, \quad u_i^0 = \max(r_i^0, 0) + \varepsilon_1, \quad v_i^0 = \max(-r_i^0, 0) + \varepsilon_1, \quad i = 1, \ldots, m, \quad (40)$$

with $r^0 = b - Ax^0$,
where ε 1 is a positive value close to zero.
The choice of x⁰ is the direct solution to the problem (11) for p = 2 and has proved to be a good choice in numerical experiments performed for polynomial regression models in the ℓp-norm for values of p reasonably close to 2 Cantante et al. (2012); Oliveira & Cantante (2004); Cantane (2004); Grigoletto (2011).
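A sketch of this initialization; the exact positivity shift applied to u⁰ and v⁰ is our assumption, chosen so that Ax⁰ + u⁰ − v⁰ = b holds exactly, and λ⁰ = 0 is taken so that A⊺λ⁰ = 0:

```python
import numpy as np

def initial_point(A, b, eps1=1e-2):
    """Starting point in the spirit of (40): x0 is the p = 2 solution, and
    (u0, v0) split the residual with a shift eps1 > 0 to enforce positivity."""
    x0, *_ = np.linalg.lstsq(A, b, rcond=None)
    r0 = b - A @ x0
    u0 = np.maximum(r0, 0.0) + eps1     # u0 - v0 = r0, so Ax0 + u0 - v0 = b
    v0 = np.maximum(-r0, 0.0) + eps1
    lam0 = np.zeros(A.shape[0])         # A^T lam0 = 0 holds trivially
    return x0, lam0, u0, v0
```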
3.3 Termination criteria
The fractional order log barrier interior point algorithm terminates if at least one of the two following convergence criteria is satisfied:

$$\frac{\big|\,\|\hat{r}\|_p - \|r\|_p\,\big|}{m} < \varepsilon_2, \quad (41)$$

where $\hat{r} = b - A\hat{x}$ and $r = b - Ax$ are the new and current residual vectors and m is the number of rows of the matrix A, and

$$\big\|\nabla\hat{L}\big\| < \varepsilon_3, \quad (42)$$

where ε₂ and ε₃ are positive values close to zero and $\nabla\hat{L}$ is the new iterate of the gradient of the Lagrangian, given by

$$\nabla\hat{L} = \nabla L(\hat{x}, \hat{\lambda}, \hat{u}, \hat{v}),$$

where $\hat{\lambda}$ is obtained by considering the new iterates û and v̂ in equation (19).
3.4 Algorithm
The fractional order log barrier interior point algorithm works as follows. For given p, α, µ, β, σ, ε₁, ε₂, ε₃, δ, and an initial point (x, λ, u, v) from (40), the Newton system (22) is solved and its solution (Δx, Δλ, Δu, Δv) is obtained. The step size $\alpha_{uv}$ is determined by (27) and the new iterate is computed. The new barrier parameter is updated by (26). This procedure is repeated until a termination criterion is satisfied.
In particular, when α = 1, the fractional order log barrier interior point algorithm recovers the classical log barrier interior point algorithm.
Fractional order log barrier interior point algorithm for polynomial regression models in the ℓ p −norm.
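A compact sketch of the complete loop, assembled from the helper functions sketched above (fractional_gradients, newton_direction, step_size, initial_point); the parameter defaults and the simplified stopping test are illustrative assumptions rather than the paper's exact choices:

```python
import numpy as np

def frac_log_barrier_ipm(A, b, p, alpha, mu=1.0, beta=10.0, sigma=0.9995,
                         eps1=1e-2, eps2=1e-8, eps3=1e-8, max_iter=100):
    """Sketch of the fractional order log barrier interior point algorithm
    for min ||b - Ax||_p^p with 1 < p < 2 and 0 < alpha <= 1."""
    x, lam, u, v = initial_point(A, b, eps1)
    res_old = np.sum(np.abs(b - A @ x) ** p) ** (1.0 / p)
    for k in range(max_iter):
        g_a, h_a = fractional_gradients(u, v, p, alpha)          # (20), (24)
        dx, dlam, du, dv = newton_direction(A, lam, u, v, g_a, h_a, mu)  # (22)
        a = step_size(u, v, du, dv, sigma)                       # (27)
        x, lam, u, v = x + a * dx, lam + a * dlam, u + a * du, v + a * dv
        mu /= beta                                               # (26)
        res = np.sum(np.abs(b - A @ x) ** p) ** (1.0 / p)
        if abs(res - res_old) / len(b) < eps2 or mu < eps3:      # cf. Section 3.3
            break
        res_old = res
    return x, k + 1, res

# Illustrative usage:
# x_hat, iters, res_err = frac_log_barrier_ipm(A, b, p=1.3, alpha=0.5)
```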
4 NUMERICAL EXPERIMENTS
In order to illustrate the proposed fractional order log barrier interior point algorithm for solving polynomial regression models in the ℓp-norm, numerical experiments were performed to compare it with the classical log barrier interior point algorithm. The implementation of the fractional order log barrier interior point algorithm was performed under Windows 10 and Matlab (R2016a), running on a desktop with a 2.20 GHz Intel Core i5-5200 central processing unit (CPU) and 4 GB of random-access memory (RAM).
A data set containing daily interest rates observed over 40 years was used for the analysis of polynomial regressions. The data set contains 10958 observations $(a_i, b_i)$, $i = 1, 2, \ldots, 10958$, where $a_i$ represents the day of the week for a given date and $b_i$ the interest rate (in percentage) for the specific day $a_i$. The $a_i$ values were normalized to the interval [0, 1] of the real line ℝ to avoid numerical stability problems in the algorithm Oliveira & Cantante (2004); Cantane (2004). Figure 1 shows the data set.
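The normalization is a standard min-max rescaling; for instance (the variable name `a_raw` is a hypothetical stand-in for the raw abscissae):

```python
import numpy as np

a_raw = np.arange(1.0, 10959.0)   # e.g. day indices 1..10958
a = (a_raw - a_raw.min()) / (a_raw.max() - a_raw.min())   # rescaled to [0, 1]
```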
For the numerical results below, a comparison of the fractional order log barrier interior point algorithm for different values of the order α considers: “It.”, the number of iterations; “Res. Err.”, the residual error, given by ∥r∥p = ∥b−Ax∥p ; and “Time (s)”, the CPU time in seconds.
The number of iterations and the residual error for different values of α and p, obtained from the fractional order log barrier interior point algorithm for linear regression (n = 2) in the ℓp-norm, are shown in Tables 1-3. If the algorithm does not converge (divergence of Newton's method for solving the nonlinear system of equations (21)), or if the algorithm fails (ill-conditioned matrix), the results are shown in Tables 1-3 as “-” or “*”, respectively. When α = 1, the results of the classical log barrier interior point algorithm are recovered.
Numerical results for linear regression in the ℓ p −norm (p = 1.1, 1.2, 1.3) for different values of α.
Numerical results for linear regression in the ℓ p −norm (p = 1.4, 1.5, 1.6) for different values of α.
Numerical results for linear regression in the ℓ p −norm (p = 1.7, 1.8, 1.9) for different values of α.
The algorithm failed for α = 0.5 and p = 1.1, as can be seen in Table 1, but for example, for α = 0.55 and p = 1.1, the algorithm converges.
According to Tables 1-3, the fractional order log barrier interior point algorithm for linear regression models in the ℓp-norm (p = 1.1, 1.2, . . . , 1.7) yields smaller residual errors when α = p − 0.8, a relationship between p and α that can also be written as p − α = 0.8. In these cases, the algorithm takes more iterations until convergence.
For linear regression in the ℓ1.2-norm, the smallest residual error, 5242, was obtained with 16 iterations when α = 0.39 (p − α = 0.81). For α = 0.36, the residual error is 5256 and the number of iterations is 22. For α = 0.37, the residual error is 5252 and the number of iterations is 28. For α = 0.38, the residual error is 5247 and the number of iterations is 14.
For linear regression in the ℓ1.8-norm, the smallest residual error, 509.967, was obtained with 26 iterations when α = 0.96 (p − α = 0.84). For α = 0.94, the residual error is 510.011 and the number of iterations is 28. For α = 0.95, the residual error is 509.987 and the number of iterations is 16. For α = 0.97, the residual error is 510.015 and the number of iterations is 17.
For linear regression in the ℓ1.9-norm, the algorithm does not converge for α ≤ 0.92. For α = 0.93, the residual error is 403.48 and the number of iterations is 4. For α = 0.95, the residual error is 403.44 and the number of iterations is 5. For α = 0.97, the residual error is 403.42 and the number of iterations is 6. For α = 0.99, the residual error is 403.41 and the number of iterations is 6. When α = 0.99, the algorithm takes fewer iterations than when α = 1 (see Table 3), and the residual error is very close to that obtained when α = 1.
The number of iterations, the residual error and the CPU time for (n − 1)th degree polynomial regressions (n = 2, 5, 10, 15) in the ℓ1.3-norm, for different values of α, obtained from the fractional order log barrier interior point algorithm, are given in Tables 4 and 5.
The smallest residual errors for n = 2 and n = 5 were obtained with a greater number of iterations when α = 0.5 (see Table 4), while for n = 10 and n = 15 (see Table 5), the residual errors were smallest for α = 0.4.
The performance of the fractional order log barrier interior point algorithm appears to be computationally consistent. For example, the values of the residual error at each iteration k for linear regression in the ℓ 1.3 −norm are shown in Figure 2.
In Figure 2, one can observe that the use of any of the fractional order derivatives (α = 0.4, 0.5, . . . , 0.9) produces smaller residual errors than the use of the integer order derivative (α = 1). The use of fractional derivatives does not noticeably affect the running time per iteration, which is similar in all cases.
The approximate solution for (n − 1)th degree polynomial regression in the ℓp-norm, obtained at the kth iteration of the fractional order log barrier interior point algorithm and denoted by $x^k$, provides the coefficients of the (n − 1)th degree polynomial that approximately fits the data set, given by $f(x) = x_1^k + x_2^k x + \cdots + x_n^k x^{n-1}$. In particular, the approximate solutions $x^k$, obtained at the kth iteration of the fractional order log barrier interior point algorithm for (n − 1)th degree polynomial regressions (n = 2, 5, 10, 15) in the ℓ1.3-norm with the smallest residual errors, are reported below; for the linear case, $x^{20} = (4.5, \; 5.9)^{\top}$,
i.e., at the 20th iteration of the algorithm with α = 0.5, the linear regression (n = 2) provides the polynomial coefficients from the vector x²⁰ above, so the fitted line is f(x) = 4.5 + 5.9x.
The polynomials from the solutions described are shown in Figure 3.
5 CONCLUSIONS
The fractional order log barrier interior point algorithm for polynomial regression models in the ℓp-norm (1 < p < 2), obtained by replacing some integer-order derivatives with the corresponding fractional ones in the first order optimality conditions, was investigated in this paper. The algorithm appears to be computationally consistent, although, depending on the fractional order α, it may fail to converge or break down. The numerical experiments showed that the use of fractional derivatives can be beneficial for solving optimization problems. Further studies on this subject could be undertaken in the future.
Acknowledgments
The author thanks the reviewers for their numerous helpful suggestions.
References
- ANDREWS GE, ASKEY R & ROY R. 1999. Special Functions. Cambridge: Cambridge University Press.
- BARD Y. 1974. Nonlinear Parameter Estimation. New York: Academic Press.
- BIEGLER LT. 2010. Nonlinear Programming. Philadelphia: SIAM.
- CANTANE DR. 2004. Métodos de Pontos Interiores Aplicados ao Problema de Regressão pela Norma Lp. São Carlos: Dissertação de Mestrado, Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo.
- CANTANTE DR, GRIGOLETTO EC & OLIVEIRA ARL. 2012. Método de pontos interiores barreira logarítmica preditor-corretor especializado para o problema de regressão pela norma Lp. Tendências em Matemática Aplicada e Computacional - TEMA, 13: 219-231.
- CHARNES A, COOPER WW & FERGUSON R. 1955. Optimal estimation of executive compensation by linear programming. Management Sci., 1: 138-151.
- CHATTERJEE S & HADI AS. 2012. Regression Analysis by Example. Hoboken: Wiley.
- CHEN Y, GAO Q, WEI Y & WANG Y. 2017a. Study on fractional order gradient methods. Appl. Math. Comput., 314: 310-321.
- CHEN Y, GAO Q, WEI Y & WANG Y. 2017b. Study on fractional order gradient methods. Appl. Math. Comput., 314: 310-321.
- COLEMAN TF & LI Y. 1992. A globally and quadratically convergent affine scaling method for linear ℓ1 problems. Math. Programming, 56: 189-222.
- DALIR M & BASHOUR M. 2010. Applications of fractional calculus. Appl. Math. Sci., 4: 1021-1032.
- DENNIS JR JE & SCHNABEL RB. 1983. Numerical Methods for Unconstrained Optimization. NJ: Prentice-Hall.
- ERDÉLYI A, MAGNUS W, OBERHETTINGER F & TRICOMI F. 1981. Higher Transcendental Functions. vol. I-III. Melbourne, Florida: Krieger Pub.
- FORSYTHE AB. 1972. Robust estimation of straight line regression coefficients by minimizing pth power deviations. Technometrics, 14: 159-166.
- GONDZIO J. 2012. Interior point methods 25 years later. Eur. J. Oper. Res., 218: 587-601.
- GRIGOLETTO EC. 2011. Implementação Eficiente dos Métodos de Pontos Interiores Especializados para o Problema de Regressão pela Norma Lp. Campinas: Dissertação de Mestrado, Instituto de Matemática, Estatística e Computação Científica, Universidade Estadual de Campinas.
- GRIGOLETTO EC & OLIVEIRA ARL. 2020. Fractional order gradient descent algorithm. In: Proceeding Series of the Brazilian Society of Computational and Applied Mathematics - XXXIX CNMAC. p. 010387. São Carlos: SBMAC.
- KILBAS AA, SRIVASTAVA HM & TRUJILLO JJ. 2006. Theory and Applications of Fractional Differential Equations. Amsterdam: Elsevier.
- LI Y. 1993. A globally convergent method for ℓp problems. SIAM J. Optimiz., 3: 609-629.
- LILIAN FB, OLIVEIRA ARL & GHIDINI CTLS. 2016. Use of continued iteration on the reduction of iterations of the interior point method. Pesqui. Oper., 36: 487-501.
- MOHAMMADZADEH A & KAYACAN E. 2020. A novel fractional-order type-2 fuzzy control method for online frequency regulation in ac microgrid. Eng. Appl. Artif. Intell., 90: 103483.
- OLDHAM KB & SPANIER J. 1974. The Fractional Calculus. New York: Academic Press.
- OLIVEIRA ARL & CANTANTE DR. 2004. Método de pontos interiores aplicados ao problema de regressão pela norma Lp. Tendências em Matemática Aplicada e Computacional - TEMA, 5: 281-291.
- OLIVEIRA ARL & LYRA C. 2004. Interior point methods for the polynomial ℓ∞ fitting problems. Int. Trans. Oper. Res., 11: 309-322.
- OLIVEIRA ARL, NASCIMENTO MA & LYRA C. 2000. Efficient implementation and benchmark of interior point methods for the polynomial ℓ1 fitting problem. Comput. Statist. Data Anal., 35: 119-135.
- RICE JR & WHITE JS. 1964. Norms for Smoothing and Estimation. SIAM Review, 6: 243-256.
- SEVAUX M & MINEUR Y. 2007. A curve-fitting genetic algorithm for a styling application. Eur. J. Oper. Res., 179: 895-905.
- SÜLI E & MAYERS DF. 2003. An Introduction to Numerical Analysis. New York: Cambridge University Press.
- VANDERBEI RJ. 2020. Linear Programming: Foundations and Extensions. Springer Nature.
- WANG J, WEN Y, GOU Y, YE Z & CHEN H. 2017a. Fractional-order gradient descent learning of BP neural networks with Caputo derivative. Neural Netw., 89: 19-30.
- WANG J, WEN Y, GOU Y, YE Z & CHEN H. 2017b. Fractional-order gradient descent learning of BP neural networks with Caputo derivative. Neural Netw., 89: 19-30.
- WRIGHT SJ. 1996. Primal-Dual Interior-Point Methods. Philadelphia, USA: SIAM.