Najla Odeh; Sherin Hijazi

Detection and Prevention of Phishing Short URLs Using Machine Learning and Blacklist Approaches

Author(s)

1. Palestine Technical University – Kadoorie, Computer Science Department, Faculty of Information Technology, Tulkarm, P.O. Box 305, Palestine

* Corresponding author.

DOI: https://doi.org/10.5815/ijwmt.2025.03.03

Received: 18 Dec. 2024 / Revised: 9 Jan. 2025 / Accepted: 5 Mar. 2025 / Published: 8 Jun. 2025

Index Terms

Cybersecurity, Machine Learning Algorithms, Short URLs, Security and Privacy, Phishing Attacks

Abstract

Phishing attacks are a common and serious issue in the digital age. Short uniform resource locators (short URLs) are frequently used in these attacks to trick unwary visitors into visiting malicious websites, because they hide a link's true destination and make it harder for visitors to judge whether a link is legitimate or phishing. This poses a significant challenge for individuals and organizations attempting to protect themselves from phishing attempts. This research introduces a novel system that integrates machine learning algorithms with a blacklist approach to enhance phishing detection. The system's objective is to support transparency, protect user privacy, and increase the precision and efficiency of identifying phishing attacks hidden behind short URLs, thereby granting users real-time protection. The findings demonstrate that the proposed system is highly effective. Several machine learning algorithms were applied and compared; Gradient Boosting emerged as the best among those tested, with an accuracy of 97.1%. It outperformed the other algorithms in distinguishing between legitimate and phishing URLs, demonstrating strong capabilities in the face of the growing threat of phishing attacks delivered via short URLs. By addressing gaps in prior research, particularly in detecting phishing that uses short URLs, this study provides a valuable contribution to cybersecurity.
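
The combination of a blacklist with a machine learning classifier described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the feature set, the blacklist source, or the decision threshold, so the function names (`normalize_host`, `extract_features`, `classify_url`), the four lexical features, the 0.5 threshold, and the `toy_model` stand-in are all assumptions made for illustration. The sketch also assumes the short URL has already been expanded to its destination. In practice, `toy_model` would be replaced by a trained classifier such as Gradient Boosting.

```python
from typing import Callable, List, Set
from urllib.parse import urlparse


def normalize_host(url: str) -> str:
    """Return the lowercase hostname of a URL, without a leading 'www.'."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return host[4:] if host.startswith("www.") else host


def extract_features(url: str) -> List[float]:
    """Hypothetical lexical features (assumed; not taken from the paper)."""
    host = normalize_host(url)
    return [
        float(len(url)),                          # total URL length
        float(url.count(".")),                    # number of dots
        float("@" in url),                        # '@' symbol present
        float(any(ch.isdigit() for ch in host)),  # digit in the hostname
    ]


def classify_url(
    url: str,
    blacklist: Set[str],
    model: Callable[[List[float]], float],
    threshold: float = 0.5,  # assumed cut-off, not from the paper
) -> str:
    """Check the blacklist first; fall back to the model for unknown hosts."""
    if normalize_host(url) in blacklist:
        return "phishing (blacklist)"
    score = model(extract_features(url))
    return "phishing (model)" if score >= threshold else "legitimate"


def toy_model(features: List[float]) -> float:
    """Stand-in scoring rule, NOT a trained classifier."""
    _length, _dots, has_at, host_has_digit = features
    return 0.9 if has_at or host_has_digit else 0.1


if __name__ == "__main__":
    known_bad = {"bad.example"}  # illustrative blacklist entry
    for link in (
        "http://www.bad.example/login",
        "http://paypa1.example/@verify",
        "https://example.org/docs",
    ):
        print(link, "->", classify_url(link, known_bad, toy_model))
```

Checking the blacklist before invoking the model keeps known-bad hosts cheap to reject and reserves the classifier for URLs the blacklist has not seen, which is one plausible reading of how the two approaches complement each other.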

Cite This Paper

Najla Odeh, Sherin Hijazi, "Detection and Prevention of Phishing Short URLs Using Machine Learning and Blacklist Approaches", International Journal of Wireless and Microwave Technologies (IJWMT), Vol.15, No.3, pp. 37-53, 2025. DOI:10.5815/ijwmt.2025.03.03

International Journal of Wireless and Microwave Technologies (IJWMT)

MECS Press Journal