Wallace A. Pinheiro; Ricardo Q. A. Fernandes

Framework for Incident Identification Based on LLMs and Cybersecurity Ontologies

pp. 125-138


Author(s)

Wallace A. Pinheiro ^1,2,*, Ricardo Q. A. Fernandes ^3

1. Center for Systems Development, Brazilian Army, Brazil

2. Military Institute of Engineering (IME), Brazilian Army, Brazil

3. National Government CSIRT, Brazilian Presidency, Brazil

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2026.02.09

Received: 11 Sep. 2025 / Revised: 7 Nov. 2025 / Accepted: 6 Jan. 2026 / Published: 8 Apr. 2026

Index Terms

LLM, Ontology, Security, Event, Incident

Abstract

Accurate and immediate incident identification is essential in cybersecurity: it enables timely detection of threats, together with countermeasures and mitigation, and so helps keep organizations and individuals secure. It also reduces false positives, allowing effort to be concentrated on real risks. This paper presents a framework that integrates ontologies and Large Language Models (LLMs) to identify incidents from events in the context of security threats. Ontology rules are used to infer probable incidents, producing an initial set of incidents for analysis. In addition, the ontologies provide contextual information, which is combined with event data to formulate queries for LLMs. These interactions with LLMs produce a second set of probable incidents. The outputs of the ontology-based inferences and the LLM-driven responses are then compared, and the discrepancies are used to refine the ontology rules and adjust the LLM responses. Experimental results, focusing on context generation and incident detection, show that integrating ontologies and LLMs significantly improves the accuracy of incident identification compared with using LLMs alone.
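
The comparison step described in the abstract can be pictured as a set operation over two lists of candidate incidents. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `compare_incident_sets` and the incident labels are hypothetical, and the abstract does not say how incidents are represented or matched.

```python
def compare_incident_sets(ontology_incidents, llm_incidents):
    """Split two collections of candidate incident labels into
    agreement and discrepancies (hypothetical helper, for illustration)."""
    onto = set(ontology_incidents)
    llm = set(llm_incidents)
    return {
        "agreed": onto & llm,          # inferred by both sources
        "ontology_only": onto - llm,   # inferred only by ontology rules
        "llm_only": llm - onto,        # suggested only by the LLM
    }


# Hypothetical labels; real incident identifiers would come from the
# ontology and from parsed LLM responses.
result = compare_incident_sets(
    ["phishing", "malware_infection"],
    ["phishing", "data_exfiltration"],
)
print(result)
```

In this reading, the `ontology_only` and `llm_only` entries are the discrepancies that would be reviewed to refine the ontology rules or adjust the LLM queries.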

Cite This Paper

Wallace A. Pinheiro, Ricardo Q. A. Fernandes, "Framework for Incident Identification Based on LLMs and Cybersecurity Ontologies", International Journal of Intelligent Systems and Applications(IJISA), Vol.18, No.2, pp.125-138, 2026. DOI:10.5815/ijisa.2026.02.09


International Journal of Intelligent Systems and Applications (IJISA)