IJEM Vol. 16, No. 1, 8 Feb. 2026
Keywords: Multi-Label Classification, Low-Rank Subspace Learning, Feature Selection, Schatten-p Norm, Manifold Regularization, Global-Local Correlation
Multi-label classification faces significant challenges from high-dimensional features and complex label dependencies. Traditional feature selection methods often fail to capture these dependencies effectively or suffer from high computational costs. This paper proposes a novel Robust Low-Rank Subspace Learning (RLRSL) framework for multi-label feature selection. Our method integrates global label correlations and local feature structures within a unified objective function, employing the Schatten-p norm for low-rank subspace learning, the ℓ2,1-norm for joint feature sparsity, and manifold regularization for preserving local geometry. We develop an efficient optimization algorithm to solve the resulting non-convex problem. Comprehensive experiments on seven benchmark datasets demonstrate that RLRSL consistently outperforms state-of-the-art methods across multiple evaluation metrics, including ranking loss, multi-label accuracy, and F1-score, with both ML-*k*NN and SVM classifiers. The results confirm the robustness, efficiency, and superior generalization capability of the proposed approach.
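The abstract names the three regularizers but not how they are assembled, so the following is a minimal sketch of the common form such an objective takes, not the paper's exact formulation. Here X is the feature matrix, Y the label matrix, W the projection whose rows are sparsified for feature selection, L a graph Laplacian built from the data, ‖·‖_{S_p} the Schatten-p norm, and α, β, γ assumed trade-off parameters:

\[
\min_{W}\; \|XW - Y\|_F^2 \;+\; \alpha \|W\|_{S_p}^p \;+\; \beta \|W\|_{2,1} \;+\; \gamma\, \operatorname{tr}\!\left(W^\top X^\top L X W\right)
\]

Under this reading, the Schatten-p term (the ℓ_p norm of the singular values, 0 < p ≤ 1) pushes W toward a low-rank subspace that captures global label correlations, the ℓ2,1 term zeroes out entire rows of W so features can be ranked and selected jointly, and the trace term is the standard manifold regularizer that keeps nearby samples close after projection.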
Emmanuel Ntaye, Xiang-Jun Shen, Andrew Azaabanye Bayor, Fadilul-lah Yassaanah Issahaku, "Robust Low-Rank Subspace Learning for Multi-Label Feature Selection with Global-Local Correlation Modeling", International Journal of Engineering and Manufacturing (IJEM), Vol.16, No.1, pp. 1-18, 2026. DOI:10.5815/ijem.2026.01.01