Steel Price Time Series Forecasting Using a Patch-based Transformer Architecture

Document Type: Original Article

Authors

1 Assistant Professor, Computer Engineering Department, Meybod University, Meybod, Iran

2 MSc Student, Computer Engineering Department, Meybod University, Meybod, Iran

10.22034/abmir.2025.23536.1152

Abstract

Steel is one of the most critical materials across industries and plays a vital role in the global economy. However, its high price volatility, its complex and non-linear dependencies, and the inability of traditional models to capture long-term correlations make accurate steel price forecasting a serious challenge. To address these shortcomings, this research proposes a deep learning approach for steel price time series forecasting, structured in three phases. First, the data are preprocessed, which includes handling missing values, normalization, and creating time-window inputs. Second, several deep learning models, including Recurrent Neural Networks (RNNs), a Transformer network (PatchTST), and Convolutional Networks, are trained and optimized. Third, the models' performance is evaluated and compared using the R², RMSE, and MAE metrics. The evaluation results show that PatchTST, which combines an attention mechanism with the processing of data in segments or "patches," identified complex, long-term dependencies more accurately than the other models. Specifically, PatchTST achieved the highest R² value, 0.9815, and the lowest RMSE, 0.0304, in steel price prediction. Conversely, standard RNN models performed significantly worse, which is attributed to their sequential nature and inherent structural limitations. These findings indicate a clear advantage of Transformer-based models for accurate forecasting of complex time series data.
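The three phases described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the forward-fill strategy for missing values, the min-max scaling, the window and patch sizes, and the toy price values are all assumptions chosen for illustration, and the model training step itself is omitted.

```python
import math


def fill_missing(series):
    """Forward-fill None entries (assumed strategy); leading gaps take the first observed value."""
    first = next(v for v in series if v is not None)
    filled, last = [], first
    for v in series:
        last = v if v is not None else last
        filled.append(last)
    return filled


def min_max_scale(series):
    """Scale values to [0, 1]. In practice, fit min/max on the training split only."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    return [(v - lo) / span for v in series]


def make_windows(series, lookback, horizon):
    """Pair each run of `lookback` inputs with the `horizon` values that follow it."""
    pairs = []
    for start in range(len(series) - lookback - horizon + 1):
        x = series[start:start + lookback]
        y = series[start + lookback:start + lookback + horizon]
        pairs.append((x, y))
    return pairs


def make_patches(window, patch_len, stride):
    """Split one input window into (possibly overlapping) fixed-length patches."""
    return [window[i:i + patch_len]
            for i in range(0, len(window) - patch_len + 1, stride)]


def regression_metrics(y_true, y_pred):
    """Return R², RMSE, and MAE for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
    }


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    prices = [100.0, None, 104.0, 103.0, None, 108.0, 110.0, 109.0, 112.0, 115.0]
    scaled = min_max_scale(fill_missing(prices))
    windows = make_windows(scaled, lookback=6, horizon=1)
    first_input, first_target = windows[0]
    print(len(windows), "windows")
    print(make_patches(first_input, patch_len=3, stride=3))
    print(regression_metrics([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

In a patch-based Transformer, each patch (rather than each individual time step) becomes one input token, which shortens the sequence the attention mechanism has to process. The metrics function mirrors the R², RMSE, and MAE comparison used in the third phase.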


