Multi-label feature selection method based on redundancy minimization

Document Type: Original Article

Authors

1 Department of Applied Mathematics, Faculty of Sciences and Modern Technologies, Graduate University of Advanced Technology, Kerman, Iran

2 Department of Computer Engineering, Faculty of Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

3 Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

Abstract

Feature selection methods are known to be effective in improving the learning process. The goal of feature selection is to identify relevant features and remove irrelevant ones so as to obtain a suitable feature subset in which redundancy among the selected features is minimized. In multi-label data, correlation among features can increase the redundancy within the feature set. Feature redundancy, combined with the high dimensionality of multi-label data, can increase the computational cost, reduce accuracy, and ultimately raise the probability of errors in the prediction and classification of multi-label data. In this article, a multi-label feature selection algorithm based on a least squares regression model with sparse regularization is proposed, with the aim of minimizing feature redundancy. Finally, the efficiency of the proposed method is verified on a number of well-known multi-label data sets, and the results are compared with several common multi-label feature selection methods.
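The abstract does not give the full formulation, so the sketch below is only an illustration of the general approach it describes: a least squares regression term on the label matrix combined with a sparse regularizer, whose row-sparse weight matrix is then used to rank features. The objective min_W ||XW - Y||_F^2 + λ||W||_{2,1}, the iteratively reweighted solver, and all names (l21_ls_feature_ranking, X, Y, lam) are illustrative assumptions, not the authors' exact method.

```python
# Illustrative sketch only (assumed objective, not the paper's exact algorithm):
# rank features for multi-label data by solving
#   min_W ||X W - Y||_F^2 + lam * ||W||_{2,1}
# with a standard iteratively reweighted least squares update, then score each
# feature by the l2 norm of its corresponding row of W.
import numpy as np

def l21_ls_feature_ranking(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """X: (n_samples, n_features) data matrix, Y: (n_samples, n_labels) label matrix."""
    n_features = X.shape[1]
    D = np.eye(n_features)          # reweighting matrix, initialized to the identity
    XtX = X.T @ X
    XtY = X.T @ Y
    for _ in range(n_iter):
        # Closed-form update of W for the current reweighting matrix D
        W = np.linalg.solve(XtX + lam * D, XtY)
        # Standard l2,1 reweighting step based on the row norms of W
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + eps
        D = np.diag(1.0 / (2.0 * row_norms))
    # Features whose rows of W have larger norms are ranked as more relevant
    scores = np.sqrt((W ** 2).sum(axis=1))
    return np.argsort(-scores)      # feature indices, best first

# Usage example on random data (shapes only; the paper evaluates on standard
# multi-label benchmark data sets instead)
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
Y = (rng.standard_normal((100, 5)) > 0).astype(float)
top_features = l21_ls_feature_ranking(X, Y, lam=0.5)[:10]
print(top_features)
```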

Keywords

