A secant-based Nesterov method for convex functions

Alli-Oke, Razak O.; Heath, William P.

doi:10.1007/s11590-015-0991-3

Original Paper
Published: 07 March 2016

Volume 11, pages 81–105 (2017)

Optimization Letters

Abstract

A simple secant-based fast gradient method is developed for problems whose objective function is convex and well-defined. The proposed algorithm extends the classical Nesterov gradient method by updating the estimate-sequence parameter with secant information whenever possible. This is achieved by imposing a secant condition on the choice of search point. Furthermore, the proposed algorithm embodies an "update rule with reset" that parallels the restart rule recently suggested in O’Donoghue and Candes (Found Comput Math, 2013). The proposed algorithm applies to a large class of problems, including the logistic and least-squares losses commonly found in the machine learning literature. Numerical results demonstrating the efficiency of the proposed algorithm are analyzed with the aid of performance profiles.
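For orientation, the baseline that the abstract builds on can be sketched as follows. This is not the secant-based update of the paper, whose details are not given in the text available here. It is a minimal sketch of the classical Nesterov iteration combined with a gradient-based restart test in the spirit of O’Donoghue and Candes, assuming the Lipschitz constant L of the gradient is known. The function names, the diagonal quadratic test problem and the iteration count are all hypothetical choices made for illustration.

```python
import math


def nesterov_with_restart(grad, x0, lipschitz, iters=500):
    """Classical Nesterov iteration with a gradient-based restart test.

    grad      -- callable returning the gradient as a list of floats
    x0        -- starting point (list of floats)
    lipschitz -- Lipschitz constant L of the gradient, assumed known
    """
    x = list(x0)   # main iterate x_k
    y = list(x0)   # search point y_k
    t = 1.0        # momentum parameter
    for _ in range(iters):
        g = grad(y)
        # Gradient step of size 1/L taken from the search point.
        x_new = [yi - gi / lipschitz for yi, gi in zip(y, g)]
        # Restart test: the move x_new - x points uphill with respect to grad(y).
        uphill = sum(gi * (xn - xi) for gi, xn, xi in zip(g, x_new, x)) > 0.0
        if uphill:
            t = 1.0
            y = list(x_new)  # discard the accumulated momentum
        else:
            t_next = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
            beta = (t - 1.0) / t_next
            y = [xn + beta * (xn - xi) for xn, xi in zip(x_new, x)]
            t = t_next
        x = x_new
    return x


# Hypothetical test problem: f(x) = 0.5 * sum(a_i * (x_i - c_i)^2), so L = max(a_i).
A_DIAG = [1.0, 100.0]
CENTER = [1.0, -2.0]


def grad_quadratic(x):
    return [a * (xi - c) for a, xi, c in zip(A_DIAG, x, CENTER)]


solution = nesterov_with_restart(grad_quadratic, [0.0, 0.0], lipschitz=max(A_DIAG))
print(solution)
```

The restart branch resets the momentum parameter and restarts the search point from the latest iterate, which is the behaviour the "update rule with reset" is said to parallel.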

Notes

The simplified Nesterov gradient method Algorithm 1b is used for computations.
In the case of \(\mu =0\) (Examples 2 and 4), fixed restart is done after \(k=\max \{\text {N},\sqrt{\text {L}}\}\) iterations.
In the case of \(\mu =0\) (Examples 2 and 4), the algorithm resets with \(\gamma _{k+1} = \text {min}\, \left( \dfrac{\gamma _{k+1}^F}{\text {L}^2},\,10^{-6}\gamma _{k+1}^F \right) \), see Remark 8.
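The two rules for the \(\mu =0\) case can be transcribed directly. The sketch below is illustrative only: the helper names are hypothetical, the meaning of \(\text {N}\) is not defined in the text available here and is simply passed in as an integer, and rounding \(\sqrt{\text {L}}\) up to a whole iteration count is an assumption. For positive \(\gamma _{k+1}^F\), the term \(\gamma _{k+1}^F/\text {L}^2\) is the smaller of the two only when \(\text {L} > 10^3\).

```python
import math


def fixed_restart_interval(n, lipschitz):
    """Iterations between fixed restarts, k = max{N, sqrt(L)}, rounded up."""
    return max(n, math.ceil(math.sqrt(lipschitz)))


def reset_gamma(gamma_f, lipschitz):
    """Reset value min(gamma_F / L^2, 1e-6 * gamma_F) used when mu = 0."""
    return min(gamma_f / lipschitz ** 2, 1e-6 * gamma_f)


# Hypothetical values: with L = 100 the 1e-6 factor is the active term,
# with L = 1e4 the 1/L^2 factor takes over.
print(fixed_restart_interval(50, 100.0))
print(reset_gamma(2.0, 100.0))
print(reset_gamma(2.0, 1e4))
```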

References

Amini, K., Ahookhosh, M., Nosratipour, H.: An inexact line search approach using modified nonmonotone strategy for unconstrained optimization. Numer. Algorithms 66(1), 49–78 (2013)
Barzilai, J., Borwein, J.: Two-point step size gradient methods. IMA J. Numer. Anal. 8, 141–148 (1988)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Becker, S.R., Candes, E.J., Grant, M.C.: Templates for convex cone problems with applications to sparse signal recovery. Math. Program. Comput. 3(3), 165–218 (2011)
Bhaya, A., Kaszkurewicz, E.: Steepest descent with momentum for quadratic functions is a version of the conjugate gradient method. Neural Netw. 17(1), 65–71 (2004)
Birgin, E.G., Martinez, J.M., Raydan, M.: Nonmonotone spectral projected gradient methods on convex sets. SIAM J. Optim. 10(4), 1196–1211 (2000)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2009)
Collins, M., Schapire, R.E., Singer, Y.: Logistic regression, AdaBoost and Bregman distances. Mach. Learn. 48, 253–285 (2002)
Dolan, E.D., More, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002)
Fletcher, R.: On the Barzilai–Borwein method. In: Qi, L., Teo, K., Yang, X. (eds.) Optimization and Control with Applications. Applied Optimization, vol. 96, pp. 235–256. Springer, US (2005)
Goldstein, T., O’Donoghue, B., Setzer, S.: Fast alternating direction optimization methods. Technical Report, UCLA (May 2012, revised January 2014)
Gonzaga, C.C., Karas, E.W.: Fine tuning Nesterov’s steepest descent algorithm for differentiable convex programming. Math. Program. Ser. A 138, 141–166 (2013)
Gu, M., Lim, L.H., Wu, C.: ParNes: a rapidly convergent algorithm for accurate recovery of sparse and approximately sparse signals. Numer. Algorithms 64(2), 321–347 (2013)
He, R., Tan, T., Wang, L.: Robust recovery of corrupted low-rank matrix by implicit regularizers. IEEE Trans. Pattern Anal. Mach. Intell. 36(4), 770–783 (2014)
Hu, S.L., Huang, Z.H., Lu, N.: A nonmonotone line search algorithm for unconstrained optimization. J. Sci. Comput. 42, 38–53 (2010)
Kozma, A., Conte, C., Diehl, M.: Benchmarking large-scale distributed convex quadratic programming algorithms. Optim. Methods Softw. 30(1), 191–214 (2015)
Kozma, A., Frasch, J.V., Diehl, M.: A distributed method for convex quadratic programming problems arising in optimal control of distributed systems. In: Proceedings of the 52nd IEEE Conference on Decision and Control, Florence, Italy (December 2013)
Lan, G., Monteiro, R.D.: Iteration-complexity of first-order penalty methods for convex programming. Math. Program. 138(1–2), 115–139 (2013)
Lin, Q., Xiao, L.: An adaptive accelerated proximal gradient method and its homotopy continuation for sparse optimization. In: Proceedings of The 31st International Conference on Machine Learning, Beijing, China (2014)
Maros, I., Meszaros, C.: A repository of convex quadratic programming problems. Optim. Methods Softw. 11(1–4), 671–681 (1999)
Meng, X., Chen, H.: Accelerating Nesterov’s method for strongly convex functions with Lipschitz gradient. Math. Optim. Control 90C25, 1–13 (2011). arXiv:1109.6058v1
Nemirovski, A.S.: Efficient methods in convex programming, Lecture Notes, Technion-Israel Institute of Technology (1994)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Doklady 27(2), 372–376 (1983)
Nesterov, Y.: Introductory Lectures on Convex Programming: A Basic Course. Kluwer Academic Publishers, Dordrecht (2004)
Nesterov, Y.: Gradient methods for minimizing composite objective function. CORE discussion paper (2007)
Le Roux, N., Schmidt, M., Bach, F.: A stochastic gradient method with an exponential convergence rate for finite training sets. Math. Optim. Control 1–34 (2013). arXiv:1202.6258v4
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (2006)
O’Donoghue, B., Candes, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. (2013)
Patrinos, P., Bemporad, A.: An accelerated dual gradient-projection algorithm for linear model predictive control. In: Proceedings of the 51st IEEE Conference on Decision and Control. Maui, US (December 2012)
Pedregosa, F.: Numerical optimizers for logistic regression (2013). http://fa.bianp.net/blog/2013/numerical-optimizers-for-logistic-regression/#fn:2
Polyak, B.: Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 4(5), 1–17 (1964)
Richter, S., Jones, C.N., Morari, M.: Real-time input-constrained MPC using fast gradient methods. In: Proceedings of the 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference, Shanghai, China (December 2009)
Shah, B., Buehler, R., Kempthorne, O.: Some algorithms for minimizing a function of several variables. J. Soc. Ind. Appl. Math. 12(1), 74–92 (1964)
Telgarsky, M.: Steepest descent analysis for unregularized linear prediction with strictly convex penalties. In: Proceedings of the 4th International Workshop on Optimization for Machine Learning (OPT), held as a part of the NIPS workshops series (December 2011)
Torii, M., Hagan, M.T.: Stability of steepest descent with momentum for quadratic functions. IEEE Trans. Neural Netw. 13(3), 752–756 (2002)
Vogl, T.P., Mangis, J., Rigler, A., Zink, W., Alkon, D.: Accelerating the convergence of the back-propagation method. Biol. Cybern. 59(4–5), 257–263 (1988)
Worthington, P.L., Hancock, E.R.: Surface topography using shape-from-shading. Pattern Recognit. 34(4), 823–840 (2001)

Acknowledgments

This work was funded by EPSRC under Grant EP/H016600/1.

Author information

Authors and Affiliations

Control Systems Center, School of Electrical and Electronic Engineering, University of Manchester, Sackville Street Building, Manchester, M13 9PL, UK
Razak O. Alli-Oke & William P. Heath

Corresponding author

Correspondence to Razak O. Alli-Oke.

Appendix

See Table 1.

Table 1 Performance of solvers \(\mathcal {S}\) on Example 4

About this article

Cite this article

Alli-Oke, R.O., Heath, W.P. A secant-based Nesterov method for convex functions. Optim Lett 11, 81–105 (2017). https://doi.org/10.1007/s11590-015-0991-3

Received: 13 March 2015
Accepted: 14 December 2015
Published: 07 March 2016
Issue Date: January 2017
DOI: https://doi.org/10.1007/s11590-015-0991-3