Vineela Krishna. Suri; Prasad. GVSNRV

A Hybrid CNN-Transformer Model for Multimodal Fake News Detection Using Feature Fusion

PP. 132-146

Author(s)

Vineela Krishna. Suri ¹,*, Prasad. GVSNRV ²

1. Department of CSE, Jawaharlal Nehru Technological University, Kakinada, Andhra Pradesh, India

2. Department of CSE, Seshadri Rao Gudlavalleru Engineering College, Gudlavalleru, Andhra Pradesh, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2026.02.08

Received: 1 Apr. 2025 / Revised: 6 Sep. 2025 / Accepted: 31 Jan. 2026 / Published: 8 Apr. 2026

Index Terms

Multimodal Data, Fake News Detection, Convolutional Neural Networks, Multimodal Fusion, Transformers

Abstract

The widespread distribution of fake news poses a critical societal challenge by influencing public opinion and shaping political discourse. Addressing this problem requires models that capture multimodal cues beyond text alone. This work proposes a lightweight Multimodal Cross-attention Fusion-based Fake News Detection (MCAF-FND) model that combines textual and visual features through a cross-attention strategy. The study evaluates MCAF-FND on the Fakeddit benchmark, a large-scale dataset comprising 682,996 multimodal samples collected from social media. Textual features are extracted using DistilBERT, while spatially aware image representations are derived from the convolutional layers of VGG-19. The cross-attention module enables semantic alignment between text tokens and image patches, modeling inter-modal dependencies more effectively than conventional fusion strategies. The fused representation is classified by a Multilayer Perceptron (MLP) with a softmax output, so that both modalities contribute to the prediction. Experimental results demonstrate that MCAF-FND consistently outperforms unimodal baselines and traditional fusion methods, achieving 93.2% accuracy with strong precision, recall, and F1-score. Cross-attention-based visualizations illustrate how the model aligns textual cues with salient visual regions, enhancing interpretability. By combining computational efficiency with robust multimodal reasoning, the proposed approach provides a reliable and extensible solution for automated fake news detection.
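
The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: random vectors stand in for the DistilBERT token embeddings and the VGG-19 patch features, all weights are untrained, and the toy dimensions, the concatenate-and-mean-pool step, and names such as `cross_attention` and `classify` are assumptions made for illustration. Text tokens act as queries over image patches, and a small MLP with softmax produces class probabilities. Plain Python lists are used so the sketch runs without a deep-learning framework.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def cross_attention(text_tokens, image_patches):
    """Scaled dot-product attention: text tokens are the queries,
    image patches are the keys and values (no learned projections here)."""
    dim = len(text_tokens[0])
    scores = matmul(text_tokens, transpose(image_patches))  # tokens x patches
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    attended = matmul(weights, image_patches)                # tokens x dim
    return attended, weights

def classify(text_tokens, image_patches, w_hidden, w_out):
    """Fuse both modalities, then apply a one-hidden-layer MLP with softmax."""
    attended, weights = cross_attention(text_tokens, image_patches)
    # Assumed fusion: concatenate each token with its attended visual context,
    # then mean-pool over tokens to get one vector per sample.
    fused = [t + a for t, a in zip(text_tokens, attended)]   # tokens x 2*dim
    pooled = [sum(col) / len(fused) for col in zip(*fused)]
    hidden = [max(0.0, h) for h in matmul([pooled], w_hidden)[0]]  # ReLU
    logits = matmul([hidden], w_out)[0]
    return softmax(logits), weights

# Toy sizes (placeholders, far smaller than real encoder outputs).
NUM_TOKENS, NUM_PATCHES, DIM, HIDDEN = 5, 9, 8, 6
rng = random.Random(0)
text_tokens = random_matrix(NUM_TOKENS, DIM, rng)      # stand-in for text features
image_patches = random_matrix(NUM_PATCHES, DIM, rng)   # stand-in for image features
w_hidden = random_matrix(2 * DIM, HIDDEN, rng)         # untrained MLP weights
w_out = random_matrix(HIDDEN, 2, rng)                  # two classes: real, fake

probs, attn = classify(text_tokens, image_patches, w_hidden, w_out)
print("class probabilities:", probs)
print("attention of token 0 over patches:", attn[0])
```

Each row of `attn` is a probability distribution over image patches for one text token. Weights of this kind are what attention visualizations like those mentioned in the abstract would typically display.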

Cite This Paper

Vineela Krishna. Suri, Prasad. GVSNRV, "A Hybrid CNN-Transformer Model for Multimodal Fake News Detection Using Feature Fusion", International Journal of Modern Education and Computer Science (IJMECS), Vol.18, No.2, pp. 132-146, 2026. DOI:10.5815/ijmecs.2026.02.08

International Journal of Modern Education and Computer Science (IJMECS)