Dimensionality Reduction for Classification and Clustering

Full Text (PDF, 590 KB), pp. 61-68



D. Asir Antony Gnana Singh 1,*, E. Jebamalar Leavline 2

1. Department of Computer Science and Engineering, Anna University, BIT-Campus, Tiruchirappalli, India

2. Department of Electronics and Communication Engineering, Anna University, BIT-Campus, Tiruchirappalli, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2019.04.06

Received: 28 Apr. 2018 / Revised: 5 Jun. 2018 / Accepted: 14 Jul. 2018 / Published: 8 Apr. 2019

Index Terms

Wrapper-based dimensionality reduction, naïve Bayes classifier, random forest classifier, OneR classifier, variable selection


Abstract

Nowadays, data are generated massively in various sectors such as medicine, education, and commerce. Processing these data is a challenging task, since massive datasets take more time to process and to support decision making. Therefore, reducing the size of the data before processing is a pressing need. The size of the data can be reduced using dimensionality reduction methods. Dimensionality reduction, also known as feature selection or variable selection, reduces the number of features in a dataset by removing irrelevant and redundant variables, thereby improving the accuracy of classification and clustering tasks. Classification and clustering techniques play a significant role in decision making, and improving their accuracy is essential for improving the quality of the resulting decisions. Therefore, this paper presents a wrapper-based dimensionality reduction method to improve the accuracy of classification and clustering.
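The wrapper approach described above evaluates candidate feature subsets by training the target learner itself and keeping the subset that scores best. A minimal sketch of greedy forward wrapper selection is shown below; it wraps a simple leave-one-out 1-nearest-neighbour classifier for illustration (the paper itself wraps naïve Bayes, random forest, and OneR), and the function names and synthetic data are hypothetical.

```python
# Sketch of wrapper-based forward feature selection (illustrative only).
# The "wrapped" learner here is leave-one-out 1-NN, not the paper's classifiers.
import random

def knn_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN restricted to the given feature indices."""
    correct = 0
    for i in range(len(X)):
        best_d, best_j = float("inf"), None
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_d, best_j = d, j
        if y[best_j] == y[i]:
            correct += 1
    return correct / len(X)

def forward_select(X, y, max_features):
    """Greedily add the feature that most improves the wrapped learner's accuracy."""
    selected, remaining = [], list(range(len(X[0])))
    best_acc = 0.0
    while remaining and len(selected) < max_features:
        acc, f = max((knn_accuracy(X, y, selected + [f]), f) for f in remaining)
        if acc <= best_acc:
            break  # no remaining feature improves accuracy; stop early
        selected.append(f)
        remaining.remove(f)
        best_acc = acc
    return selected, best_acc

# Synthetic data: feature 0 separates the classes; features 1 and 2 are noise.
random.seed(0)
X = [[i % 2 + random.gauss(0, 0.1), random.gauss(0, 1), random.gauss(0, 1)]
     for i in range(40)]
y = [i % 2 for i in range(40)]
selected, acc = forward_select(X, y, max_features=2)
```

Because the search re-trains the learner for every candidate subset, wrapper methods are more expensive than filter methods but tend to find subsets tailored to the specific classifier being used.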

Cite This Paper

D. Asir Antony Gnana Singh, E. Jebamalar Leavline, "Dimensionality Reduction for Classification and Clustering", International Journal of Intelligent Systems and Applications (IJISA), Vol. 11, No. 4, pp. 61-68, 2019. DOI: 10.5815/ijisa.2019.04.06

