Diana Vyshotravka; Victoria Vysotska; Zhengbing Hu; Dmytro Uhryn; Yuriy Ushenko; Kyrylo Smelyakov

Smart Application for Recruiting Based on Natural Language Processing Methods, Transformer Models and Siamese Neural Network Architecture

PP. 95-157

Author(s)

Diana Vyshotravka ¹, Victoria Vysotska ¹, Zhengbing Hu ², Dmytro Uhryn ³, Yuriy Ushenko ³,⁴, Kyrylo Smelyakov ⁵

1. Department of Information Systems and Networks, Institute of Computer Sciences and Information Technologies, Lviv Polytechnic National University, Lviv, 79013, Ukraine

2. School of Computer Science, Hubei University of Technology, Wuhan, China

3. Department of Computer Science, Educational and Research Institute of Physical, Technical and Computer Sciences, Yuriy Fedkovych Chernivtsi National University, 58012, Ukraine

4. Department of Physics, Shaoxing University, Shaoxing, Zhejiang Province 312000, China

5. Department of Software Engineering, Kharkiv National University of Radio Electronics, Nauky Ave. 14, Kharkiv, 61166, Ukraine

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2025.05.07

Received: 6 Jun. 2025 / Revised: 28 Jul. 2025 / Accepted: 26 Aug. 2025 / Published: 8 Oct. 2025

Index Terms

Candidate Ranking, Recommendation Systems, Precision@10, Top-N Recommendations, Relevance Evaluation, Information Retrieval, Transformers, Semantic Mapping, RoBERTa, SimCSE, MSELoss, Siamese Networks

Abstract

This study presents a deep learning-based approach to automated resume-to-job matching that relies on semantic similarity between texts. The solution combines SimCSE RoBERTa transformer embeddings with a Siamese neural architecture trained with a mean squared error loss (MSELoss). Unlike traditional systems that filter or match by keywords or attributes, the proposed model learns to place semantically compatible resume–vacancy pairs in a common vector space, capturing deep semantic alignment between resumes and job descriptions. To evaluate the effectiveness of this architecture, we conducted extensive experiments on a labelled dataset of over 7,000 resume–vacancy pairs obtained from the HuggingFace repository. The dataset includes three classes (Good Fit, Potential Fit, No Fit), which we restructured into a binary classification task. The annotation labels reflect textual compatibility in terms of skills, job responsibilities, and experience, ensuring task relevance.
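As a rough illustration of the training objective described above, the sketch below scores a resume–vacancy pair by the cosine similarity of two embeddings and penalises the squared difference from a binary label. The abstract does not state how the similarity is computed or how the labels are encoded, so the cosine scoring, the 1.0/0.0 targets, and the toy three-dimensional vectors (standing in for SimCSE RoBERTa embeddings) are all assumptions rather than the authors' implementation.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat a zero vector as "no similarity"
    return dot / (norm_u * norm_v)

def mse_loss(pairs, labels):
    """Mean squared error between pair similarities and target labels."""
    errors = [
        (cosine_similarity(resume, vacancy) - target) ** 2
        for (resume, vacancy), target in zip(pairs, labels)
    ]
    return sum(errors) / len(errors)

# Hypothetical toy embeddings; a real system would obtain these from
# one shared encoder applied to both texts (the "Siamese" part).
resume_a, vacancy_a = [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]  # same direction
resume_b, vacancy_b = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]  # orthogonal

pairs = [(resume_a, vacancy_a), (resume_b, vacancy_b)]
labels = [1.0, 0.0]  # assumed encoding: 1.0 = fit, 0.0 = no fit

loss = mse_loss(pairs, labels)
```

In this toy case the similarities already match the labels, so the loss is zero; swapping the labels makes every pair maximally wrong and raises the loss to 1.0. Training would adjust the shared encoder so that the loss over real pairs decreases.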
Restructuring the three classes into a binary task resulted in a moderately imbalanced dataset with approximately 66% positive and 34% negative examples. Our model achieved accuracy = 72%, precision = 70%, recall = 74%, F1-score = 72%, and Precision@10 = 75%. To justify the use of a complex Siamese architecture, the system was compared to two baselines: (1) a classical TF-IDF + cosine similarity method, and (2) a pretrained Sentence-BERT model without task-specific fine-tuning. The proposed model significantly outperformed both baselines across all evaluation metrics, confirming that its complexity translates into meaningful performance gains and validating the architecture for candidate ranking and selection. A basic self-learning mechanism is implemented and functional: recruiters can provide binary feedback (Fit / No Fit) for each recommended candidate, which is stored in a feedback table. This feedback can be used to retrain or fine-tune the model periodically, enabling adaptive behaviour over time. While initial retraining experiments were conducted offline, full automation and continuous integration of feedback into training pipelines remain a goal for future development. The system offers sub-5-second response times, integration with vector databases, and a web-based user interface. It is designed for use in HR departments, recruiting agencies, and employment platforms, with potential for broader commercial deployment and domain adaptation.
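For readers unfamiliar with the metrics reported above, the following sketch shows how accuracy, precision, recall, F1-score, and Precision@10 are conventionally computed. The function names and the toy labels are illustrative assumptions; they are not the authors' evaluation code or data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def precision_at_k(ranked_relevance, k=10):
    """Share of relevant items (1s) among the first k ranked results."""
    if k <= 0:
        return 0.0
    return sum(ranked_relevance[:k]) / k

# Hypothetical toy data: 1 = fit, 0 = no fit.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)

# Relevance of candidates in the order the system ranked them.
ranked = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1]
p_at_10 = precision_at_k(ranked, k=10)
```

F1 is the harmonic mean of precision and recall, so the reported precision of 70% and recall of 74% give 2 × 0.70 × 0.74 / (0.70 + 0.74) ≈ 0.72, which agrees with the reported F1-score.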
While the UI and vector retrieval infrastructure were developed to support prototyping and deployment, the primary research innovation centres on the modelling framework, learning setup, and comparative evaluation methodology. This work contributes to the advancement of semantically aware intelligent recruiting systems and offers a replicable baseline for future studies in neural recommendation for HR applications. The risks of algorithmic bias are addressed separately: even when the input data contain no explicit demographic characteristics, the model can implicitly reproduce social or historical inequalities inherent in the data. In this regard, the study outlines areas for further development, in particular fairness auditing, bias reduction techniques, and the integration of human validation in decision-making.

Cite This Paper

Diana Vyshotravka, Victoria Vysotska, Zhengbing Hu, Dmytro Uhryn, Yuriy Ushenko, Kyrylo Smelyakov, "Smart Application for Recruiting Based on Natural Language Processing Methods, Transformer Models and Siamese Neural Network Architecture", International Journal of Intelligent Systems and Applications (IJISA), Vol.17, No.5, pp.95-157, 2025. DOI:10.5815/ijisa.2025.05.07

International Journal of Intelligent Systems and Applications (IJISA)