Robust Low-Rank Subspace Learning for Multi-Label Feature Selection with Global-Local Correlation Modeling

Author(s)

Emmanuel Ntaye 1,*, Xiang-Jun Shen 1, Andrew Azaabanye Bayor 2, Fadilul-lah Yassaanah Issahaku 3

1. School of Computer Science and Communication Engineering, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China

2. Department of Computer Science, S.D. Dombo University of Business and Integrated Development Studies, Wa, 00233, Upper West, Ghana

3. School of Mathematics and Big Data, Anhui University of Science and Technology, Anhui, 232001, China

* Corresponding author.

DOI: https://doi.org/10.5815/ijem.2026.01.01

Received: 2 Jun. 2025 / Revised: 14 Oct. 2025 / Accepted: 21 Nov. 2025 / Published: 8 Feb. 2026

Index Terms

Multi-Label Classification, Low-Rank Subspace Learning, Feature Selection, Schatten-p Norm, Manifold Regularization, Global-Local Correlation

Abstract

Multi-label classification faces significant challenges from high-dimensional features and complex label dependencies. Traditional feature selection methods often fail to capture these dependencies effectively or suffer from high computational costs. This paper proposes a novel Robust Low-Rank Subspace Learning (RLRSL) framework for multi-label feature selection. Our method integrates global label correlations and local feature structures within a unified objective function, using the Schatten-p norm for low-rank subspace learning, the ℓ2,1-norm for joint feature sparsity, and manifold regularization for local geometry preservation. We develop an efficient optimization algorithm to solve the resulting non-convex problem. Comprehensive experiments on seven benchmark datasets demonstrate that RLRSL consistently outperforms state-of-the-art methods across multiple evaluation metrics, including ranking loss, multi-label accuracy, and F1-score, with both ML-kNN and SVM classifiers. The results confirm the robustness, efficiency, and superior generalization capability of the proposed approach.
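
To make the three regularizers named in the abstract concrete, the NumPy sketch below computes each one in isolation and combines them into one plausible unified objective. All symbols, shapes, and weights here (X, Y, W, alpha, beta, gamma, the heat-kernel graph) are illustrative assumptions; the paper's actual RLRSL formulation and its optimization algorithm are not reproduced.

```python
import numpy as np

# Minimal sketch of the regularizers the abstract names, under assumed
# shapes (X: n x d data, Y: n x c labels, W: d x c projection).
# Illustrative only, not the authors' RLRSL implementation.

def schatten_p(M, p=0.5):
    # Schatten-p quasi-norm raised to p: sum_i sigma_i(M)^p. For p < 1
    # this approximates rank(M) more tightly than the nuclear norm
    # (the p = 1 case), motivating its use for low-rank learning.
    sigma = np.linalg.svd(M, compute_uv=False)
    return np.sum(sigma ** p)

def l21(W):
    # l2,1-norm: sum of the l2 norms of the rows of W. Penalizing row
    # norms zeros out entire rows, so each feature is kept or discarded
    # jointly across all labels (joint feature sparsity).
    return np.sum(np.linalg.norm(W, axis=1))

def manifold_term(X, W, L):
    # tr((XW)^T L (XW)) for a graph Laplacian L: small values mean
    # samples close in the input space stay close after projection,
    # preserving local geometry.
    XW = X @ W
    return np.trace(XW.T @ L @ XW)

# Toy data: n = 50 samples, d = 20 features, c = 5 labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
W = rng.standard_normal((20, 5))
Y = (rng.standard_normal((50, 5)) > 0).astype(float)

# Heat-kernel similarity graph and its Laplacian (one common
# construction; the paper's exact graph is not specified here).
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
S = np.exp(-d2 / np.median(d2))
np.fill_diagonal(S, 0.0)
L = np.diag(S.sum(axis=1)) - S

# One plausible composition into a unified objective; the weights and
# the placement of each term are our assumptions.
alpha, beta, gamma = 0.1, 0.1, 0.01
objective = (0.5 * np.linalg.norm(X @ W - Y, "fro") ** 2
             + alpha * schatten_p(W) + beta * l21(W)
             + gamma * manifold_term(X, W, L))
print(f"objective = {objective:.2f}")
```

Note that for p < 1 the Schatten-p term is non-convex, which is consistent with the abstract's reference to solving a non-convex problem.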

Cite This Paper

Emmanuel Ntaye, Xiang-Jun Shen, Andrew Azaabanye Bayor, Fadilul-lah Yassaanah Issahaku, "Robust Low-Rank Subspace Learning for Multi-Label Feature Selection with Global-Local Correlation Modeling", International Journal of Engineering and Manufacturing (IJEM), Vol. 16, No. 1, pp. 1-18, 2026. DOI: 10.5815/ijem.2026.01.01
