Nhat-Kha Nguyen

Workplace: School of Biomedical Engineering, International University, Vietnam National University Ho Chi Minh City, Ho Chi Minh City, 70000, Vietnam

E-mail: bebeiu19071@student.hcmiu.edu.vn


Research Interests: Artificial Intelligence

Biography

Nhat-Kha Nguyen received his B.S. degree in Biomedical Engineering and Biomechanics from International University, Vietnam National University Ho Chi Minh City, Vietnam, in 2023. His current research interests include wearable devices and applied Artificial Intelligence (AI) models for the early detection of disease.

Author Articles
Transformer-Based vs. CNN-Based Deep Learning for Alzheimer’s Disease Classification: Performance and Deployment

By Nhat-Kha Nguyen, Thi-Thu-Hien Pham, Nhat-Minh Nguyen, Tan-Nhu Nguyen, Ngoc-Bich Le

DOI: https://doi.org/10.5815/ijisa.2026.02.04, Pub. Date: 8 Apr. 2026

Diagnosing Alzheimer's disease (AD) accurately and early remains a major clinical challenge, especially when brain MRI data must differentiate between subtle stages of cognitive decline. This study investigated the efficacy of two deep learning models for the classification of AD stages: Vision Transformer (ViT), a transformer-based architecture, and EfficientNetB7, a convolutional neural network. To enhance classification performance and address class imbalance, extensive data preprocessing and augmentation techniques were applied to the publicly accessible 'Alzheimer’s Dataset (4 class of Images)' from Kaggle, which comprises 6,400 brain MRI images categorized into four AD stages: Non-Demented, Very Mild Demented, Mild Demented, and Moderate Demented. The techniques included cropping, horizontal and vertical flipping, 20-degree rotations, histogram equalization, Gaussian noise addition, Gaussian blurring, and thresholding, all aimed at improving the representation of underrepresented classes. Hyperparameters were optimized in two phases: an initial grid search to determine parameter ranges, followed by Bayesian optimization with an upper confidence bound acquisition function to refine learning rates, batch sizes, momentum, and weight decay. Experimental results showed that EfficientNetB7 attained a classification accuracy of 93.5% with F1-scores above 92% for early-stage classes, whereas ViT recorded a lower accuracy of 88.7% and exhibited diminished sensitivity to early-stage instances. This performance gap is attributed to ViT's dependence on extensive training data, which may limit its generalization on comparatively small medical imaging datasets. The findings suggest that, in dataset-constrained scenarios, CNN-based architectures such as EfficientNetB7 may provide more consistent and effective performance. Generalization, training stability, and computational efficiency were assessed using distinct training, validation, and test sets. The top-performing model, EfficientNetB7, was deployed as a web-based application with an intuitive user interface to provide real-time supportive predictions for research demonstration. Overall, this comparative analysis demonstrated that the CNN-based EfficientNetB7 was more robust with constrained medical imaging data and more computationally economical, while the transformer-based ViT was more sensitive to dataset size and required longer training to reach comparable convergence. The resulting validated and deployable AI-based diagnostic solution shows strong promise for clinical use in Alzheimer's disease.
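The augmentation operations named in the abstract (flipping, histogram equalization, Gaussian noise, thresholding) can be sketched in NumPy as below. This is an illustrative sketch only, not the authors' actual pipeline; all function names and parameter values (noise sigma, threshold level) are assumptions:

```python
import numpy as np

def flip_horizontal(img: np.ndarray) -> np.ndarray:
    """Mirror the image left-right."""
    return img[:, ::-1]

def flip_vertical(img: np.ndarray) -> np.ndarray:
    """Mirror the image top-bottom."""
    return img[::-1, :]

def add_gaussian_noise(img: np.ndarray, sigma: float = 10.0, seed: int = 0) -> np.ndarray:
    """Add zero-mean Gaussian noise and clip back to the 8-bit range."""
    rng = np.random.default_rng(seed)
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Spread grey levels over the full 0-255 range via the image CDF."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min()) * 255.0
    return cdf[img].astype(np.uint8)

def threshold(img: np.ndarray, level: int = 128) -> np.ndarray:
    """Binarize the image at a fixed grey level."""
    return np.where(img >= level, 255, 0).astype(np.uint8)

# Example: apply each augmentation to one synthetic 8-bit "MRI slice".
slice_ = (np.arange(64 * 64).reshape(64, 64) % 256).astype(np.uint8)
augmented = [
    flip_horizontal(slice_),
    flip_vertical(slice_),
    add_gaussian_noise(slice_),
    equalize_histogram(slice_),
    threshold(slice_),
]
```

In practice these transforms would be applied selectively to the minority classes (e.g. Moderate Demented) until the class distribution is balanced; rotation and Gaussian blurring are typically delegated to a library such as scipy.ndimage or torchvision rather than written by hand.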

Other Articles