Information Engineering for Fake Job Postings Classification in Electronic Business Based on Machine Learning Technology



Author(s)

Markiian-Mykhailo Paprotskyi 1, Victoria Vysotska 2, Lyubomyr Chyrun 3, Yuriy Ushenko 4,*, Zhengbing Hu 5, Dmytro Uhryn 4

1. Department of Information Systems and Networks, Lviv Polytechnic National University, Lviv, 79013, Ukraine

2. Department of Information Systems and Networks, Institute of Computer Sciences and Information Technologies, Lviv Polytechnic National University, Lviv, 79013, Ukraine

3. Ivan Franko National University of Lviv, Lviv, 79000, Ukraine

4. Department of Computer Science, Educational and Research Institute of Physical, Technical and Computer Sciences, Yuriy Fedkovych Chernivtsi National University, 58012, Ukraine

5. School of Computer Science, Hubei University of Technology, Wuhan, China

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2025.05.07

Received: 6 Mar. 2025 / Revised: 11 May 2025 / Accepted: 16 Jun. 2025 / Published: 8 Oct. 2025

Index Terms

Machine Learning, Natural Language Processing, E-Business, Fake Jobs, Classification, Logistic Regression, Accuracy, Recall, F1-Measure, Information Engineering

Abstract

This study investigates the application of machine learning methods to the classification of fraudulent job postings on e-business platforms. Using the publicly available fake_job_postings.csv dataset, textual and categorical features of vacancies were processed and vectorised through TF-IDF, HashingVectorizer, and optimised TF-IDF. Eight machine learning algorithms were compared: Logistic Regression, Random Forest, Gradient Boosting, Decision Tree, Multinomial Naive Bayes, Linear SVC, K-Nearest Neighbours, and XGBoost. The experiments demonstrate that XGBoost achieved the best performance (Accuracy = 0.990, Precision = 0.982, Recall = 0.998, F1 = 0.990) across all feature representations. Its superior results can be attributed to the ability of boosted ensembles to capture complex non-linear relationships in high-dimensional feature spaces while maintaining robustness against noise and class imbalance.
However, it should be noted that the evaluation was performed on a single static dataset. While the high recall shows the model’s ability to reliably detect fraudulent ads in this context, questions remain about its generalisability. Fraud tactics evolve rapidly, and new job scams may differ significantly from the patterns in the training data. This creates a potential risk of overfitting to dataset-specific features, which limits direct transfer to real-world scenarios without continuous retraining and monitoring. The practical contribution of the study is a reproducible framework that integrates text and categorical processing, vectorisation, hyperparameter optimisation, and comparative model benchmarking. Such a framework could be embedded into online job platforms to support automated filtering of suspicious ads. Still, its deployment requires additional measures: periodic retraining with updated data, integration with platform APIs, and the inclusion of explainability modules to ensure transparency and user trust. Overall, the research demonstrates that ensemble-based models, particularly XGBoost, offer strong potential for fraud detection in the e-business labour market. At the same time, further work is necessary to validate model robustness on unseen and evolving fraudulent job posting strategies, ensuring scalability and reliability in production environments.
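
As a quick consistency check on the metrics quoted in the abstract, recall that F1 is the harmonic mean of precision and recall. The minimal Python sketch below recomputes F1 from the reported Precision and Recall values. The function name `f1_from_precision_recall` is illustrative and is an assumption of this sketch, not something taken from the paper's own code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Both metrics are zero, so F1 is defined as zero by convention.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for XGBoost.
reported_precision = 0.982
reported_recall = 0.998

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")
```

Rounded to three decimals, the result is 0.990, which agrees with the F1 value stated in the abstract.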

Cite This Paper

Markiian-Mykhailo Paprotskyi, Victoria Vysotska, Lyubomyr Chyrun, Yuriy Ushenko, Zhengbing Hu, Dmytro Uhryn, "Information Engineering for Fake Job Postings Classification in Electronic Business Based on Machine Learning Technology", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.17, No.5, pp. 93-146, 2025. DOI:10.5815/ijieeb.2025.05.07
