Tuan Dung Pham

Work place: VNU University of Engineering and Technology, E3 Building, 144 Xuan Thuy Street, Cau Giay District, 11310, Ha Noi, Vietnam

E-mail: dungpt98@vnu.edu.vn

Website: https://orcid.org/0000-0002-2183-4640

Research Interests: Computational Learning Theory, Models of Computation


Pham Tuan Dung graduated from Vietnam National University, University of Engineering and Technology. With his Bachelor's degree in Computer Science, he continues to work in the Faculty of Engineering and Technology, UET, as an assistant lecturer while studying for his Master's degree in Computer Science. His research interests include privacy learning with deep learning and generative models for text and images.

Author Articles
Evaluation of GAN-based Models for Phishing URL Classifiers

By Thi Thanh Thuy Pham, Tuan Dung Pham, Viet Cuong Ta

DOI: https://doi.org/10.5815/ijcnis.2023.02.01, Pub. Date: 8 Apr. 2023

Phishing attacks via malicious URLs or web links are common nowadays. User data, such as login credentials and credit card numbers, can be stolen when users carelessly click on these links. Moreover, this can lead to the installation of malware on the target systems to freeze their activities, perform ransomware attacks, or reveal sensitive information. Recently, GAN-based models have attracted attention for anti-phishing URL detection. The general idea is to use a Generator network (G) to generate fake URL strings and a Discriminator network (D) to distinguish real URL samples from fake ones. G and D are trained adversarially so that the URL samples synthesized by G become more and more similar to the real ones. From the perspective of cybersecurity defense, the GAN approach can be exploited to use D as a phishing URL detector or classifier: after training the GAN on both malicious and benign URL strings, a strong classifier/detector D can be obtained. From the perspective of cyberattack, attackers would like to create fake URLs that are as close to the real ones as possible to perform phishing attacks, which makes it easier for them to fool users and detectors. In related proposals, GAN-based models are mainly exploited for anti-phishing URL detection. There have been no evaluations specific to GAN-generated fake URLs, which an attacker can use for phishing attacks. In this work, we propose to use TLD (Top-level Domain) and SSIM (Structural Similarity Index Score) scores to evaluate the GAN-synthesized URL strings in terms of their structural similarity to the real ones. The more similar the structure of the GAN-generated URLs is to that of the real ones, the more likely they are to fool the classifiers. Different GAN models, from the basic GAN to the GAN extensions DCGAN, WGAN, and SeqGAN, are explored in this work. We show from intensive experiments that the D classifiers of the basic GAN and DCGAN surpass those of the other GAN models, WGAN and SeqGAN.
The fake URL patterns generated by SeqGAN are the most effective among the GAN models, both in structural similarity and in their ability to deceive the phishing URL classifiers LSTM (Long Short-Term Memory) and RF (Random Forest).

Other Articles