Methods and Tools for Identifying Human Resource Lesions in Emergency Based on Multimodal Analysis and Deep Learning

PDF (1053KB), PP.17-53

Author(s)

Yurii Ushenko 1,2,*; Dmytro Uhryn 3; Victoria Vysotska 4,5; Lyubomyr Chyrun 4,6,7; Zhengbing Hu 8; Tetiana Rekunenko 9,10

1. Department of Physics, Shaoxing University, Shaoxing, Zhejiang Province 312000, China

2. Department of Computer Science, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, 58012, Ukraine

3. Department of Computer Science of the Yuriy Fedkovych Chernivtsi National University, Chernivtsi, 58012, Ukraine

4. Information Systems and Networks Department, Lviv Polytechnic National University, Lviv, 79013, Ukraine

5. Combating Cybercrime Department, Kharkiv National University of Internal Affairs, Kharkiv, 61080, Ukraine

6. Computer Science Department, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, 58012, Ukraine

7. Applied Mathematics Department, Ivan Franko National University of Lviv, Lviv, 79000, Ukraine

8. School of Computer Science and Artificial Intelligence, Hubei University of Technology, Wuhan, China

9. Department of Education Quality Assurance, Kharkiv National University of Internal Affairs, 27, L. Landau Avenue, 61080 Kharkiv, Ukraine

10. Department of Administrative Law and Process, Donetsk State University of Internal Affairs, 1, Velyka Perspektyvna Street, Kropyvnytskyi, 25000, Ukraine

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2026.03.02

Received: 15 Jan. 2026 / Revised: 23 Feb. 2026 / Accepted: 18 Mar. 2026 / Published: 8 Jun. 2026

Index Terms

Multimodal Analysis, Deep Learning, Lesion Identification, Emergencies, Transformer Cross-Attention, Medical Triage, Triage, Neural Networks, Decision Support

Abstract

Emergencies of natural, technological, and military origin require rapid and accurate assessment of victims' conditions to support effective rescue and medical response. Traditional visual examination methods are often limited by stress, time pressure, and incomplete information, leading to delayed or inaccurate decisions. This study proposes a multimodal deep learning approach for automated identification of human resource lesions in emergency scenarios. The developed framework integrates visual, audio, and text/sensory data using convolutional neural networks, Transformer-based models, and a Transformer Cross-Attention fusion mechanism. The proposed architecture enables effective extraction and integration of heterogeneous features for lesion classification, severity estimation, and automated medical triage. Experimental evaluation was conducted on multimodal datasets containing injury images, audio recordings, and symptom descriptions. The model was trained using a combined loss function and evaluated with classification, regression, and triage metrics. The results demonstrate high system performance, achieving a macro-F1 score of 0.87, validation accuracy of 86–87%, and triage accuracy above 90%, including 95% for the RED category. The regression model for severity prediction achieved an R² value of 0.92, while modality importance analysis confirmed the dominant contribution of visual information. The experiments also showed stable model convergence and strong generalisation ability without significant overfitting. The proposed multimodal framework confirms the effectiveness of deep learning and cross-attention mechanisms for automated lesion identification and emergency medical triage. The developed approach can be applied in decision-support systems for rescue operations, emergency medicine, and intelligent VR/AR training simulators.
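The cross-attention fusion named in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture, which the abstract does not specify: it is plain single-head scaled dot-product attention with identity projections (no learned weights), and the names `cross_attention`, `text_tokens`, and `image_patches`, as well as the toy embeddings, are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries: token vectors from one modality (e.g. text), each of length d
    keys, values: token vectors from another modality (e.g. image patches)
    Returns (fused, weights): one fused vector and one weight row per query.
    """
    d = len(queries[0])
    dv = len(values[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors gives the fused representation.
        fused.append(
            [sum(w[j] * values[j][c] for j in range(len(values))) for c in range(dv)]
        )
    return fused, weights


# Hypothetical toy embeddings, not data from the paper.
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

fused, weights = cross_attention(text_tokens, image_patches, image_patches)
for row in weights:
    print([round(x, 3) for x in row])
```

Each row of `weights` sums to 1 and shows how strongly one text token attends to each image patch. A real model would add learned query, key, and value projections, multiple heads, and further modalities such as audio.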

Cite This Paper

Yurii Ushenko, Dmytro Uhryn, Victoria Vysotska, Lyubomyr Chyrun, Zhengbing Hu, Tetiana Rekunenko, "Methods and Tools for Identifying Human Resource Lesions in Emergency Based on Multimodal Analysis and Deep Learning", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 18, No. 3, pp. 17-53, 2026. DOI: 10.5815/ijigsp.2026.03.02
