Hybrid Ensemble Learning Technique for Software Defect Prediction

Full Text (PDF, 615KB), PP.1-10



Mohammad Zubair Khan 1,*

1. Department of Computer Science, College of Computer Science and Engineering, Taibah University, Madinah, KSA.

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2020.01.01

Received: 5 Nov. 2019 / Revised: 20 Nov. 2019 / Accepted: 27 Dec. 2019 / Published: 8 Feb. 2020

Index Terms

Ensemble Learning, RF, AdaBoost, Bagging, Software, Defects, MLT, HELT


The reliability of software depends on its ability to function without error. Unfortunately, errors can be introduced during any phase of software development. In software engineering, predicting software defects during the initial stages of development has therefore become a top priority. Scientific data are used to make predictions about the software's future releases. Studies show that machine learning and hybrid algorithms are changing the benchmarks in defect prediction. During the past two decades, various approaches to software defect prediction that rely on software metrics have been proposed. This paper explores and compares well-known supervised machine learning and hybrid ensemble classifiers on eight PROMISE datasets. The experimental results show that AdaBoost support vector machines and bagging support vector machines were the best-performing classifiers in terms of accuracy, AUC, recall, and F-measure.
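The four evaluation measures named above can be illustrated with a minimal sketch. It uses the standard textbook definitions of accuracy, recall, F-measure, and AUC; the paper's own formulas and tooling are not shown in this section, so the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions made for illustration, not the author's implementation or data.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as 'defective' and 0 as 'clean'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of modules classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly defective modules that were flagged as defective."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of flagged modules that are truly defective."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(y_true, y_pred), recall(y_true, y_pred)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random defective module outscores a random clean one.

    Ties count as half a win (the pairwise, Mann-Whitney form of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one defective and one clean module")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented values): 1 = defective module, 0 = clean module.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]  # hypothetical classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed threshold of 0.5

print(f"accuracy  = {accuracy(y_true, y_pred):.4f}")
print(f"recall    = {recall(y_true, y_pred):.4f}")
print(f"F-measure = {f_measure(y_true, y_pred):.4f}")
print(f"AUC       = {auc(y_true, scores):.4f}")
```

Accuracy, recall, and F-measure depend on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds. This is one reason defect prediction studies commonly report it alongside the other three.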

Cite This Paper

Mohammad Zubair Khan, "Hybrid Ensemble Learning Technique for Software Defect Prediction", International Journal of Modern Education and Computer Science (IJMECS), Vol.12, No.1, pp. 1-10, 2020. DOI: 10.5815/ijmecs.2020.01.01

