An Improved Machine Learning-Based Short Message Service Spam Detection System

Full Text (PDF, 848KB), PP.40-48



Odukoya Oluwatoyin 1,*, Akinyemi Bodunde 1, Gooding Titus 1, Aderounmu Ganiyu 1

1. Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria

* Corresponding author.


Received: 12 Nov. 2019 / Revised: 20 Nov. 2019 / Accepted: 27 Nov. 2019 / Published: 8 Dec. 2019

Index Terms

Short Message Service (SMS), Spam Detection, Ensemble Method, Machine Learning


The use of Short Message Service (SMS) as a means of communication has resulted in the loss of sensitive information such as credit card details, medical information, and bank account details (user name and password). Several machine learning-based approaches have been proposed to address this problem, but they still struggle to detect modified SMS spam messages accurately. Thus, in this research, a stacked ensemble of four machine learning algorithms, namely Random Forest (RF), Logistic Regression (LR), Multilayer Perceptron (MLP), and Support Vector Machine (SVM), was employed to detect SMS spam more accurately. The simulation was carried out using the Python scikit-learn library. The performance of the proposed model was evaluated by benchmarking it against an existing model. The evaluation results showed that, relative to the existing model, the proposed model achieves an increase of 3.03% in accuracy, 8.94% in recall, and 2.17% in F-measure, and a decrease of 4.55% in precision. This indicates that the proposed model reduces the false alarm rate and thus detects spam more accurately. In conclusion, the ensemble method performed better than any individual algorithm and can be adopted by network service providers for better Quality of Service.
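The four evaluation measures reported above are all derived from the same confusion matrix, so a short worked example may help in reading them together. The following is a minimal sketch only: the `Confusion` class, the `spam_metrics` function, and the counts are hypothetical illustrations and are not taken from the paper's experiments. It assumes the usual definitions, with spam as the positive class and the false alarm rate taken as the fraction of legitimate messages flagged as spam.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts with spam as the positive class (hypothetical helper)."""
    tp: int  # spam correctly flagged as spam
    fp: int  # legitimate message flagged as spam (false alarm)
    fn: int  # spam that was missed
    tn: int  # legitimate message correctly passed


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def spam_metrics(c: Confusion) -> dict:
    """Compute accuracy, precision, recall, F-measure and false alarm rate."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": precision,
        "recall": recall,
        # F-measure is the harmonic mean of precision and recall.
        "f_measure": _ratio(2 * precision * recall, precision + recall),
        # Assumed definition: share of legitimate messages flagged as spam.
        "false_alarm_rate": _ratio(c.fp, c.fp + c.tn),
    }


# Made-up counts for illustration only; these are NOT results from the paper.
example = Confusion(tp=90, fp=10, fn=30, tn=870)
results = spam_metrics(example)

for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

Because recall depends on missed spam (`fn`) while precision and the false alarm rate depend on wrongly flagged legitimate messages (`fp`), a model can improve on one of these measures while losing ground on another. This is why the four measures are reported side by side.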

Cite This Paper

Odukoya Oluwatoyin, Akinyemi Bodunde, Gooding Titus, Aderounmu Ganiyu, "An Improved Machine Learning-Based Short Message Service Spam Detection System", International Journal of Computer Network and Information Security (IJCNIS), Vol.11, No.12, pp.40-48, 2019. DOI: 10.5815/ijcnis.2019.12.05

