A Feature Selection Method Based on the Fuzzy Integral in Multi-Label Learning

Article type: Research paper

Authors

Department of Computer Engineering, Faculty of Engineering, Lorestan University, Khorramabad, Iran

Abstract

Multi-label learning algorithms face many challenges because multi-label data are large, high-dimensional, and noisy. Feature selection is an effective technique for addressing these challenges. This paper presents a feature selection method for multi-label data based on an ensemble approach. In the proposed method, three decision matrices are built from different feature evaluation criteria, taking into account both the relevance of features to the class labels and the redundancy of features with respect to one another. These three decision matrices are then combined by an ensemble approach based on the fuzzy integral, so that features are evaluated by their aggregated value. To evaluate the proposed algorithm, it is compared with several similar algorithms on several different datasets. The experimental results indicate that the proposed algorithm performs well compared with the other algorithms.

Keywords


Article title [English]

A feature selection algorithm based on fuzzy integral in multi-label learning

Abstract [English]

Multi-label learning algorithms face many challenges because multi-label data are large, high-dimensional, and contain noise. Feature selection is an effective technique for addressing these challenges. This paper presents a feature selection method for multi-label data based on an ensemble approach. The method builds three decision matrices from different feature evaluation criteria, taking into account the relevance of features to the class labels and their redundancy relative to each other. The three decision matrices are then combined using the fuzzy integral, and features are evaluated according to the aggregated value. The proposed method is compared with several similar algorithms to illustrate its performance.
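The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which fuzzy integral (Choquet or Sugeno) or which fuzzy measure is used, so the code below assumes a Choquet integral and a hand-picked measure over three criteria. It also assumes that each decision matrix has already been reduced to one score in [0, 1] per feature. The names `choquet_integral`, `rank_features`, and `MEASURE`, and all numeric values, are hypothetical and are not taken from the paper.

```python
def choquet_integral(scores, measure):
    """Choquet integral of per-criterion scores with respect to a fuzzy measure.

    scores  : list of values, one per criterion (assumed to lie in [0, 1])
    measure : dict mapping frozenset of criterion indices to a weight in [0, 1]
    """
    # Visit criteria from the lowest score to the highest.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    total, previous = 0.0, 0.0
    for rank, idx in enumerate(order):
        # Criteria whose score is at least as large as the current one.
        coalition = frozenset(order[rank:])
        total += (scores[idx] - previous) * measure[coalition]
        previous = scores[idx]
    return total


# Hypothetical fuzzy measure over three criteria (indices 0, 1, 2).
# It is monotone: adding a criterion never lowers the weight.
MEASURE = {
    frozenset(): 0.0,
    frozenset({0}): 0.4,
    frozenset({1}): 0.3,
    frozenset({2}): 0.3,
    frozenset({0, 1}): 0.6,
    frozenset({0, 2}): 0.8,
    frozenset({1, 2}): 0.5,
    frozenset({0, 1, 2}): 1.0,
}


def rank_features(score_rows, measure=MEASURE):
    """Return feature indices sorted by aggregated score, best first.

    score_rows : one row per feature, each row holding the three criterion scores
    """
    aggregated = [choquet_integral(row, measure) for row in score_rows]
    return sorted(range(len(aggregated)), key=lambda j: aggregated[j], reverse=True)


# Illustrative scores for three features under three criteria.
example_scores = [
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.3],
    [0.5, 0.5, 0.5],
]
ranking = rank_features(example_scores)
```

Unlike a weighted average, the measure assigns a weight to every subset of criteria, so it can reward or penalize criteria that agree with each other. In this sketch the pair {0, 2} carries more weight than the sum of the two singleton weights, which models two criteria that reinforce each other.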

Keywords [English]

  • Feature selection
  • Multi-label learning
  • Fuzzy integral
  • Ensemble approach