Search | arXiv e-print repository

An investigation of stochastic trust-region based algorithms for finite-sum minimization

Authors: Stefania Bellavia, Benedetta Morini, Simone Rebegoldi

Abstract: This work elaborates on the TRust-region-ish (TRish) algorithm, a stochastic optimization method for finite-sum minimization problems proposed by Curtis et al. in [Curtis2019, Curtis2022]. A theoretical analysis that complements the results in the literature is presented, and the issue of tuning the involved hyper-parameters is investigated. Our study also focuses on a practical version of the method, which computes the stochastic gradient by means of the inner product test and the orthogonality test proposed by Bollapragada et al. in [Bollapragada2018]. It is shown experimentally that this implementation improves the performance of TRish and reduces its sensitivity to the choice of the hyper-parameters.

Submitted 20 April, 2024; originally announced April 2024.

MSC Class: 65K05; 90C30; 90C15

arXiv:2402.12069 [pdf, other]

Inexact Restoration via random models for unconstrained noisy optimization

Authors: Benedetta Morini, Simone Rebegoldi

Abstract: We study the Inexact Restoration framework with random models for minimizing functions whose evaluation is subject to errors. We propose a constrained formulation that includes well-known stochastic problems and an algorithm applicable when the evaluation of both the function and its gradient is random and a specified accuracy of such evaluations is guaranteed with sufficiently high probability. The proposed algorithm combines the Inexact Restoration framework with a trust-region methodology based on random first-order models. We analyse the properties of the algorithm and provide the expected number of iterations performed to reach an approximate first-order optimality point. Numerical experiments show that the proposed algorithm compares well with a state-of-the-art competitor.

Submitted 19 February, 2024; originally announced February 2024.

MSC Class: 65K05; 90C30; 90C15

arXiv:2310.16580 [pdf, ps, other]

An optimally fast objective-function-free minimization algorithm using random subspaces

Authors: S. Bellavia, S. Gratton, B. Morini, Ph. L. Toint

Abstract: An algorithm for unconstrained non-convex optimization is described, which does not evaluate the objective function and in which minimization is carried out, at each iteration, within a randomly selected subspace. It is shown that this random approximation technique does not affect the method's convergence nor its evaluation complexity for the search of an $ε$-approximate first-order critical point, which is $\mathcal{O}(ε^{-(p+1)/p})$, where $p$ is the order of derivatives used. A variant of the algorithm using approximate Hessian matrices is also analyzed and shown to require at most $\mathcal{O}(ε^{-2})$ evaluations. Preliminary numerical tests show that the random-subspace technique can significantly improve performance on some problems, albeit, unsurprisingly, not for all.

Submitted 25 October, 2023; originally announced October 2023.

Comments: 23 pages

MSC Class: 60G99; 65K05; 68M20; 68Q17; 90C26

ACM Class: G.6.1; F.2.1

arXiv:2310.05501 [pdf, other]

Inexact Newton methods with matrix approximation by sampling for nonlinear least-squares and systems

Authors: Stefania Bellavia, Greta Malaspina, Benedetta Morini

Abstract: We develop and analyze stochastic inexact Gauss-Newton methods for nonlinear least-squares problems and inexact Newton methods for nonlinear systems of equations. Random models are formed using suitable sampling strategies for the matrices involved in the deterministic models. The analysis of the expected number of iterations needed in the worst case to achieve a desired level of accuracy in the first-order optimality condition provides guidelines for applying sampling and enforcing, with fixed probability, a suitable accuracy in the random approximations. Results of the numerical validation of the algorithms are presented.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2205.06710 [pdf, ps, other]

Linesearch Newton-CG methods for convex optimization with noise

Authors: Stefania Bellavia, Eugenio Fabrizi, Benedetta Morini

Abstract: This paper studies the numerical solution of strictly convex unconstrained optimization problems by linesearch Newton-CG methods. We focus on methods employing inexact evaluations of the objective function and inexact and possibly random gradient and Hessian estimates. The derivative estimates are not required to satisfy suitable accuracy requirements at each iteration but with sufficiently high probability. Concerning the evaluation of the objective function, we first assume that the noise in the objective function evaluations is bounded in absolute value. Then, we analyze the case where the error satisfies prescribed dynamic accuracy requirements. We provide for both cases a complexity analysis and derive expected iteration complexity bounds. We finally focus on the specific case of finite-sum minimization, which is typical of machine learning applications.

Submitted 13 May, 2022; originally announced May 2022.

arXiv:2112.06176 [pdf, ps, other]

Trust-region algorithms: probabilistic complexity and intrinsic noise with applications to subsampling techniques

Authors: S. Bellavia, G. Gurioli, B. Morini, Ph. L. Toint

Abstract: A trust-region algorithm is presented for finding approximate minimizers of smooth unconstrained functions whose values and derivatives are subject to random noise. It is shown that, under suitable probabilistic assumptions, the new method finds (in expectation) an $ε$-approximate minimizer of arbitrary order $q \geq 1$ in at most $\mathcal{O}(ε^{-(q+1)})$ inexact evaluations of the function and its derivatives, providing the first such result for general optimality orders. The impact of intrinsic noise limiting the validity of the assumptions is also discussed and it is shown that difficulties are unlikely to occur in the first-order version of the algorithm for sufficiently large gradients. Conversely, should these assumptions fail for specific realizations, then "degraded" optimality guarantees are shown to hold when failure occurs. These conclusions are then discussed and illustrated in the context of subsampling methods for finite-sum optimization.

Submitted 30 December, 2021; v1 submitted 12 December, 2021; originally announced December 2021.

MSC Class: 65K05; 65C50; 90C26

ACM Class: F.2.1; G.1.6

arXiv:2107.03129 [pdf, other]

A stochastic first-order trust-region method with inexact restoration for finite-sum minimization

Authors: Stefania Bellavia, Natasa Krejic, Benedetta Morini, Simone Rebegoldi

Abstract: We propose a stochastic first-order trust-region method with inexact function and gradient evaluations for solving finite-sum minimization problems. Using a suitable reformulation of the given problem, our method combines the inexact restoration approach for constrained optimization with the trust-region procedure and random models. Differently from other recent stochastic trust-region schemes, our proposed algorithm improves feasibility and optimality in a modular way. We provide the expected number of iterations for reaching a near-stationary point by imposing some probability accuracy requirements on random functions and gradients which are, in general, less stringent than the corresponding ones in the literature. We validate the proposed algorithm on some nonconvex optimization problems arising in binary classification and regression, showing that it performs well in terms of cost and accuracy, and makes it possible to reduce the burdensome tuning of the hyper-parameters involved.

Submitted 22 October, 2022; v1 submitted 7 July, 2021; originally announced July 2021.

MSC Class: 65K05; 90C26; 68T05

arXiv:2104.02519 [pdf, ps, other]

The Impact of Noise on Evaluation Complexity: The Deterministic Trust-Region Case

Authors: Stefania Bellavia, Gianmarco Gurioli, Benedetta Morini, Philippe L. Toint

Abstract: Intrinsic noise in objective function and derivatives evaluations may cause premature termination of optimization algorithms. Evaluation complexity bounds taking this situation into account are presented in the framework of a deterministic trust-region method. The results show that the presence of intrinsic noise may dominate these bounds, in contrast with what is known for methods in which the inexactness in function and derivatives' evaluations is fully controllable. Moreover, the new analysis provides estimates of the optimality level achievable, should noise cause early termination. It finally sheds some light on the impact of inexact computer arithmetic on evaluation complexity.

Submitted 6 April, 2021; originally announced April 2021.

MSC Class: 90C26; 90C30; 90C56; 90C59; 49M37; 49M05

ACM Class: F.2.1; G.1.6

arXiv:2104.00592 [pdf, ps, other]

Quadratic and Cubic Regularisation Methods with Inexact function and Random Derivatives for Finite-Sum Minimisation

Authors: Stefania Bellavia, Gianmarco Gurioli, Benedetta Morini, Philippe L. Toint

Abstract: This paper focuses on regularisation methods using models up to the third order to search for up to second-order critical points of a finite-sum minimisation problem. The variant presented belongs to the framework of [3]: it employs random models with accuracy guaranteed with a sufficiently large prefixed probability and deterministic inexact function evaluations within a prescribed level of accuracy. Without assuming unbiased estimators, the expected number of iterations is $\mathcal{O}\bigl(ε_1^{-2}\bigr)$ or $\mathcal{O}\bigl(ε_1^{-{3/2}}\bigr)$ when searching for a first-order critical point using a second or third order model, respectively, and $\mathcal{O}\bigl(\max[ε_1^{-{3/2}},ε_2^{-3}]\bigr)$ when seeking second-order critical points with a third order model, in which $ε_j$, $j\in\{1,2\}$, is the $j$th-order tolerance. These results match the worst-case optimal complexity for the deterministic counterpart of the method. Preliminary numerical tests for first-order optimality in the context of nonconvex binary classification in imaging, with and without Artificial Neural Networks (ANNs), are presented and discussed.

Submitted 2 April, 2021; v1 submitted 30 March, 2021; originally announced April 2021.

Comments: 9 pages

arXiv:2005.05851 [pdf, other]

Solving nonlinear systems of equations via spectral residual methods: stepsize selection and applications

Authors: Enrico Meli, Benedetta Morini, Margherita Porcelli, Cristina Sgattoni

Abstract: Spectral residual methods are derivative-free and low-cost per iteration procedures for solving nonlinear systems of equations. They are generally coupled with a nonmonotone linesearch strategy and compare well with Newton-based methods for large nonlinear systems and sequences of nonlinear systems. The residual vector is used as the search direction and choosing the steplength has a crucial impact on the performance. In this work we address both theoretically and experimentally the steplength selection and provide results on a real application such as a rolling contact problem.

Submitted 17 September, 2021; v1 submitted 12 May, 2020; originally announced May 2020.

arXiv:2005.04639 [pdf, ps, other]

Adaptive Regularization for Nonconvex Optimization Using Inexact Function Values and Randomly Perturbed Derivatives

Authors: S. Bellavia, G. Gurioli, B. Morini, Ph. L. Toint

Abstract: A regularization algorithm allowing random noise in derivatives and inexact function values is proposed for computing approximate local critical points of any order for smooth unconstrained optimization problems. For an objective function with Lipschitz continuous $p$-th derivative and given an arbitrary optimality order $q \leq p$, it is shown that this algorithm will, in expectation, compute such a point in at most $O\left(\left(\min_{j\in\{1,\ldots,q\}}ε_j\right)^{-\frac{p+1}{p-q+1}}\right)$ inexact evaluations of $f$ and its derivatives whenever $q\in\{1,2\}$, where $ε_j$ is the tolerance for $j$th order accuracy. This bound becomes at most $O\left(\left(\min_{j\in\{1,\ldots,q\}}ε_j\right)^{-\frac{q(p+1)}{p}}\right)$ inexact evaluations if $q>2$ and all derivatives are Lipschitz continuous. Moreover these bounds are sharp in the order of the accuracy tolerances. An extension to convexly constrained problems is also outlined.

Submitted 6 April, 2021; v1 submitted 10 May, 2020; originally announced May 2020.

Comments: 22 pages

MSC Class: 49K10; 49M37; 65K05; 68W40; 90C15

ACM Class: G.1.6; F.2.1

arXiv:1902.01710 [pdf, other]

Inexact restoration with subsampled trust-region methods for finite-sum minimization

Authors: Stefania Bellavia, Natasa Krejic, Benedetta Morini

Abstract: Convex and nonconvex finite-sum minimization arises in many scientific computing and machine learning applications. Recently, first-order and second-order methods where objective functions, gradients and Hessians are approximated by randomly sampling components of the sum have received great attention. We propose a new trust-region method which employs suitable approximations of the objective function, gradient and Hessian built via random subsampling techniques. The choice of the sample size is deterministic and ruled by the inexact restoration approach. We discuss local and global properties for finding approximate first- and second-order optimal points and function evaluation complexity results. Numerical experience shows that the new procedure is more efficient, in terms of overall computational cost, than the standard trust-region scheme with subsampled Hessians.

Submitted 10 May, 2020; v1 submitted 5 February, 2019; originally announced February 2019.

arXiv:1811.03831 [pdf, ps, other]

Adaptive Regularization Algorithms with Inexact Evaluations for Nonconvex Optimization

Authors: S. Bellavia, G. Gurioli, B. Morini, Ph. L. Toint

Abstract: A regularization algorithm using inexact function values and inexact derivatives is proposed and its evaluation complexity analyzed. This algorithm is applicable to unconstrained problems and to problems with inexpensive constraints (that is, constraints whose evaluation and enforcement has negligible cost) under the assumption that the derivative of highest degree is $β$-Hölder continuous. It features a very flexible adaptive mechanism for determining the inexactness which is allowed, at each iteration, when computing objective function values and derivatives. The complexity analysis covers arbitrary optimality order and arbitrary degree of available approximate derivatives. It extends results of Cartis, Gould and Toint (2018) on the evaluation complexity to the inexact case: if a $q$th order minimizer is sought using approximations to the first $p$ derivatives, it is proved that a suitable approximate minimizer within $ε$ is computed by the proposed algorithm in at most $O(ε^{-\frac{p+β}{p-q+β}})$ iterations and at most $O(|\log(ε)|ε^{-\frac{p+β}{p-q+β}})$ approximate evaluations. An algorithmic variant, although more rigid in practice, can be proved to find such an approximate minimizer in $O(|\log(ε)|+ε^{-\frac{p+β}{p-q+β}})$ evaluations. While the proposed framework remains so far conceptual for high degrees and orders, it is shown to yield simple and computationally realistic inexact methods when specialized to the unconstrained and bound-constrained first- and second-order cases. The deterministic complexity results are finally extended to the stochastic context, yielding adaptive sample-size rules for subsampling methods typical of machine learning.

Submitted 19 April, 2019; v1 submitted 9 November, 2018; originally announced November 2018.

Comments: 32 pages

MSC Class: 49K10; 49M37; 65K05; 68T05; 68W40

ACM Class: F.1.3; F.2.1; G.1.6; I.2.6

arXiv:1808.06239 [pdf, ps, other]

Adaptive Cubic Regularization Methods with Dynamic Inexact Hessian Information and Applications to Finite-Sum Minimization

Authors: Stefania Bellavia, Gianmarco Gurioli, Benedetta Morini

Abstract: We consider the Adaptive Regularization with Cubics approach for solving nonconvex optimization problems and propose a new variant based on inexact Hessian information chosen dynamically. The theoretical analysis of the proposed procedure is given. The key property of the ARC framework, constituted by optimal worst-case function/derivative evaluation bounds for first- and second-order critical points, is guaranteed. Application to large-scale finite-sum minimization based on subsampled Hessian is discussed and analyzed in both a deterministic and probabilistic manner and equipped with numerical experiments on synthetic and real datasets.

Submitted 3 December, 2019; v1 submitted 19 August, 2018; originally announced August 2018.

arXiv:1504.03442 [pdf, ps, other]

On an adaptive regularization for ill-posed nonlinear systems and its trust-region implementation

Authors: Stefania Bellavia, Benedetta Morini, Elisa Riccietti

Abstract: In this paper we address the stable numerical solution of nonlinear ill-posed systems by a trust-region method. We show that an appropriate choice of the trust-region radius gives rise to a procedure that has the potential to approach a solution of the unperturbed system. This regularizing property is shown theoretically and validated numerically.

Submitted 14 April, 2015; originally announced April 2015.

Comments: arXiv admin note: text overlap with arXiv:1410.2780

arXiv:1410.2780

Improved regularizing iterative methods for ill-posed nonlinear systems

Authors: Stefania Bellavia, Benedetta Morini

Abstract: In this paper we address the numerical solution of nonlinear ill-posed systems by iterative regularization methods in the classes of Levenberg-Marquardt, trust-region and adaptive quadratic regularization procedures. Both with exact and noisy data, our focus is on the potential to approach a solution of the unperturbed systems without assumptions on its vicinity to the initial guess. Regularizing properties of the methods proposed are shown theoretically and validated numerically along with enhanced convergence.

Submitted 16 April, 2015; v1 submitted 10 October, 2014; originally announced October 2014.

Comments: It has been significantly improved and the new version with a new title is available at arXiv:1504.03442

arXiv:1312.0047 [pdf, ps, other]

Updating constraint preconditioners for KKT systems in quadratic programming via low-rank corrections

Authors: S. Bellavia, V. De Simone, D. di Serafino, B. Morini

Abstract: This work focuses on the iterative solution of sequences of KKT linear systems arising in interior point methods applied to large convex quadratic programming problems. This task is the computational core of the interior point procedure and an efficient preconditioning strategy is crucial for the efficiency of the overall method. Constraint preconditioners are very effective in this context; nevertheless, their computation may be very expensive for large-scale problems, and resorting to approximations of them may be convenient. Here we propose a procedure for building inexact constraint preconditioners by updating a "seed" constraint preconditioner computed for a KKT matrix at a previous interior point iteration. These updates are obtained through low-rank corrections of the Schur complement of the (1,1) block of the seed preconditioner. The updated preconditioners are analyzed both theoretically and computationally. The results obtained show that our updating procedure, coupled with an adaptive strategy for determining whether to reinitialize or update the preconditioner, can enhance the performance of interior point methods on large problems.

Submitted 20 September, 2015; v1 submitted 29 November, 2013; originally announced December 2013.

Comments: 22 pages

MSC Class: 65F08; 65F10; 90C20; 90C51

Showing 1–17 of 17 results for author: Morini, B