Neural Network Learning Using an Improved Levenberg-Marquardt Algorithm

Document Type: Original Article

Authors

1 MSc. Student, Department of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

2 Assistant Professor, Department of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

3 Associate Professor, Department of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

DOI: 10.22034/abmir.2025.22500.1082

Abstract

This paper presents a new method to improve the convergence speed and efficiency of the Levenberg-Marquardt algorithm. The Levenberg-Marquardt algorithm is a Newton-based method that, compared with other approaches such as gradient-descent backpropagation, is efficient at optimizing and determining the weights of neural networks. However, its performance depends significantly on the selection of an appropriate damping factor. Among the various methods for determining the damping factor are the Marquardt line-search method and methods based on the error norm or on the Jacobian norm. This paper examines the strengths and weaknesses of these methods and presents a combined method that increases the convergence speed. In the proposed method, the search range of the damping factor, and consequently the value of the adjustment-rate parameter, is reduced. These corrections increase the accuracy of the damping-factor search and, in turn, the convergence speed of the proposed method. To evaluate the method, the learning of a neural network was simulated on several identification problems, including a complex nonlinear function, a regression task, and a classification task. The results show that the proposed method performs well compared with the other methods studied: it reduces the learning error to an acceptable level and achieves a higher convergence speed.
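
The abstract describes the baseline Levenberg-Marquardt update and its damping schedule only at a high level. As a minimal sketch of that baseline (not the authors' proposed combined method), the following Python code fits a small nonlinear model with the classical damped update (J^T J + mu I) dw = -J^T e and Marquardt-style damping adaptation. The test model y = a*exp(b*x), the constants mu and factor (the latter playing the role of the adjustment-rate parameter mentioned above), and all function names are illustrative assumptions.

import numpy as np

# Minimal Levenberg-Marquardt sketch with Marquardt-style damping adaptation,
# fitting the illustrative model y = a * exp(b * x) by nonlinear least squares.
# Each step solves the damped normal equations (J^T J + mu * I) dw = -J^T e.

def residuals(w, x, y):
    a, b = w
    return a * np.exp(b * x) - y

def jacobian(w, x):
    a, b = w
    return np.stack([np.exp(b * x),           # d(residual)/da
                     a * x * np.exp(b * x)],  # d(residual)/db
                    axis=1)

def levenberg_marquardt(w, x, y, mu=1e-2, factor=10.0, iters=100, tol=1e-10):
    for _ in range(iters):
        e = residuals(w, x, y)
        J = jacobian(w, x)
        dw = np.linalg.solve(J.T @ J + mu * np.eye(w.size), -J.T @ e)
        if np.sum(residuals(w + dw, x, y) ** 2) < np.sum(e ** 2):
            w, mu = w + dw, mu / factor  # success: move toward Gauss-Newton
        else:
            mu *= factor                 # failure: move toward gradient descent
        if np.linalg.norm(dw) < tol:
            break
    return w

# Synthetic data generated with a = 2.0, b = -1.5, recovered from a rough guess.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 50)
y = 2.0 * np.exp(-1.5 * x) + 0.01 * rng.standard_normal(x.size)
print(levenberg_marquardt(np.array([1.0, -1.0]), x, y))

In this classical schedule, a successful step divides mu by factor (pushing the update toward Gauss-Newton) while a failed step multiplies it (pushing toward gradient descent); the paper's stated contribution is to narrow this search range so the damping factor is located more accurately at each iteration.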
