Semi-supervised Sparse Feature Selection based on Graph Autoencoder by Preservation of Broad and Local Data Structures

Document Type: Original Article

Authors

1 PhD Candidate, Computer Engineering Department, Yazd University, Yazd, Iran

2 Associate Professor, Computer Engineering Department, Yazd University, Yazd, Iran

3 Associate Professor, Department of Computer Engineering, Faculty of Engineering, Ardakan University, Ardakan, Iran

Abstract

Processing and analyzing high-dimensional data is a significant challenge in many domains, and feature selection, as an effective dimensionality reduction method, plays a key role in improving the performance of machine learning models. Because labeling large volumes of real-world data is costly and time-consuming, semi-supervised feature selection methods, which can exploit the valuable information in unlabeled data alongside the labeled data, have gained considerable importance. This paper introduces a novel sparse semi-supervised feature selection framework that simultaneously preserves the broad and local structures of the data as well as the information carried by the available labels. By optimizing a comprehensive objective function comprising an autoencoder reconstruction term, an ℓ2,1-norm regularization term that enforces sparsity, and a term based on a semi-supervised spectral graph, the proposed framework selects an optimal subset of features. The resulting optimization problem is solved with a gradient-based backpropagation algorithm, whose convergence has been empirically investigated and confirmed. Extensive evaluations on six standard datasets, with comparisons against several prominent earlier methods, demonstrate that the proposed framework is significantly superior in improving classification accuracy and in selecting more effective features under semi-supervised conditions.
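The three-term objective described above can be written compactly as

    min over (W, θ):  ||X − X̂||²_F + λ·||W||₂,₁ + γ·tr(Zᵀ L Z)

where W is the encoder's input weight matrix, Z the latent codes, X̂ the autoencoder's reconstruction of X, and L a semi-supervised graph Laplacian. The PyTorch sketch below is one plausible rendering of such an objective under these assumptions, not the authors' implementation; the layer sizes, the hyperparameters lambda_sparse and gamma_graph, and the construction of the Laplacian (supplied here as an input) are all illustrative choices.

import torch
import torch.nn as nn

class SparseGraphAutoencoder(nn.Module):
    """One-hidden-layer autoencoder; the encoder's input weights double as feature scores."""
    def __init__(self, n_features, n_hidden):
        super().__init__()
        self.encoder = nn.Linear(n_features, n_hidden, bias=False)
        self.decoder = nn.Linear(n_hidden, n_features)

    def forward(self, x):
        z = torch.relu(self.encoder(x))
        return self.decoder(z), z

def objective(model, x, laplacian, lambda_sparse, gamma_graph):
    x_hat, z = model(x)
    recon = ((x_hat - x) ** 2).sum() / x.shape[0]      # autoencoder reconstruction term
    # nn.Linear stores W with shape (n_hidden, n_features); column j holds the
    # weights of feature j, so the l2,1 norm over features is the sum of column norms.
    sparsity = model.encoder.weight.norm(dim=0).sum()
    smoothness = torch.trace(z.T @ laplacian @ z) / x.shape[0]  # semi-supervised spectral graph term
    return recon + lambda_sparse * sparsity + gamma_graph * smoothness

def train(model, x, laplacian, epochs=200, lr=1e-3,
          lambda_sparse=1e-3, gamma_graph=1e-2):
    # Plain gradient-based backpropagation, as the abstract describes.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = objective(model, x, laplacian, lambda_sparse, gamma_graph)
        loss.backward()
        opt.step()
    return model

# After training, rank features by the norm of their encoder weights and keep the top k:
#   scores = model.encoder.weight.norm(dim=0)   # one score per input feature
#   selected = torch.topk(scores, k=50).indices

Building L typically starts from a kNN affinity graph over all samples, with edge weights strengthened for same-label pairs and weakened for different-label pairs; that construction is deliberately left outside the sketch.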

Keywords

Main Subjects

