International Journal of Image, Graphics and Signal Processing (IJIGSP)

IJIGSP Vol. 18, No. 1, Feb. 2026

Cover page and Table of Contents: PDF (size: 962KB)

Table of Contents

REGULAR PAPERS

Uncertainty-Aware Source-Free Domain Adaptation for Dental CBCT Image Segmentation

By Sviatoslav Dziubenko, Tymur Dovzhenko, Andriy Kyrylyuk, Kamila Storchak

DOI: https://doi.org/10.5815/ijigsp.2026.01.01, Pub. Date: 8 Feb. 2026

The aim of this study is to evaluate the efficacy of combining source-free domain adaptation techniques with quantitative uncertainty assessment to enhance image segmentation in new domains. The research employs an uncertainty-aware source-free domain adaptation strategy encompassing the generation of pseudo-labels, their filtering based on the entropy and variance of predictions, an Exponential Moving Average (EMA) teacher, and a tailored loss function. For validation, segmentation models pre-trained on one image dataset were adapted to another. A comprehensive comparative and ablation analysis was conducted, together with visualization of the relationship between segmentation errors and the degree of uncertainty. The ablation study confirmed that the complete configuration with the EMA teacher yielded the best results, and the visualizations showed a direct correlation between high uncertainty and an increased risk of segmentation errors. The findings substantiate the viability of employing uncertainty assessment within the source-free domain adaptation process for clinical dentistry. The proposed methodology enables models to adapt to new conditions without retraining on the source data, while making the decision-making process more transparent. Future studies should assess the efficacy of the proposed approach in additional dental visualization tasks, such as implant planning or orthodontic analysis.
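
As a rough illustration of the entropy-filtered pseudo-labelling and EMA teacher described above, here is a minimal sketch; the paper's own implementation is not shown, and the PyTorch framing, function names, and entropy threshold are all assumptions.

```python
# Hedged sketch of entropy-filtered pseudo-labelling with an EMA teacher
# (illustrative names and threshold; PyTorch assumed, not the authors' code).
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    # Teacher weights track an exponential moving average of the student.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)

@torch.no_grad()
def filtered_pseudo_labels(teacher_logits, entropy_thresh=0.5):
    # teacher_logits: (N, C, H, W) predictions on target-domain images.
    probs = teacher_logits.softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)  # (N, H, W)
    labels = probs.argmax(dim=1)                                 # (N, H, W)
    mask = entropy < entropy_thresh  # keep only low-uncertainty pixels
    return labels, mask
```

Prediction-variance filtering over several stochastic forward passes could be combined with the same mask, with the tailored loss evaluated only where the mask is true.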

Fuzzy-Enhanced U-Net with Dual Attention for Histopathological Image Analysis in High Grade Serous Ovarian Cancer

By Anandakumar K., Chandrasekar C.

DOI: https://doi.org/10.5815/ijigsp.2026.01.02, Pub. Date: 8 Feb. 2026

High-quality image reconstruction plays an important part in histopathological image analysis, especially for high grade serous ovarian cancer (HGSOC) diagnosis, where a great number of fine cellular structures must remain clearly visible. In practice, however, medical images often suffer from acquisition limitations that can obscure significant diagnostic features. This work presents FUDA-NET, a new image denoising framework that enhances noisy histopathological images while preserving structural and textural integrity. The architecture is based on an improved U-Net design integrated with a dual attention mechanism (channel and spatial attention), which enables the network to selectively emphasize meaningful features and suppress background noise. Additionally, a fuzzy logic layer is incorporated at the bottleneck to handle uncertainty and enhance contextual reasoning during feature extraction. FUDA-NET combines a Mean Squared Error (MSE) and Structural Similarity Index Measure (SSIM) based loss function to ensure both pixel-wise accuracy and perceptual similarity. Experiments conducted on an HGSOC histopathological dataset of 12,019 training images and 1,188 testing images show that FUDA-NET achieves superior denoising performance, outperforming traditional and recent deep learning methods such as DnCNN, U-Net, U-Net with Attention, and Noise2Noise in terms of PSNR, SSIM, MSE, MAE, and FSIM. The approach contributes to improved visual clarity and diagnostic reliability in medical imaging.
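
To make the combined loss concrete, a minimal sketch follows; the simplified global (non-windowed) SSIM and the weighting factor alpha are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a combined MSE + SSIM denoising loss; alpha and the
# whole-image SSIM simplification are illustrative assumptions.
import torch
import torch.nn.functional as F

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    # Simplified SSIM computed over whole images (per-sample statistics).
    mu_x, mu_y = x.mean(dim=(1, 2, 3)), y.mean(dim=(1, 2, 3))
    var_x = x.var(dim=(1, 2, 3), unbiased=False)
    var_y = y.var(dim=(1, 2, 3), unbiased=False)
    cov = ((x - mu_x[:, None, None, None]) *
           (y - mu_y[:, None, None, None])).mean(dim=(1, 2, 3))
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))

def denoise_loss(pred, target, alpha=0.84):
    # Pixel-wise fidelity (MSE) traded against perceptual structure (SSIM).
    return alpha * F.mse_loss(pred, target) + \
           (1 - alpha) * (1 - ssim_global(pred, target).mean())
```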

An Effective Semi-Supervised Feature Extraction Model with Reduced Architectural Complexity for Image Forgery Classification

By Jisha K. R., Sabna N.

DOI: https://doi.org/10.5815/ijigsp.2026.01.03, Pub. Date: 8 Feb. 2026

This paper presents a generalized deep learning approach that tracks image forgeries of any category with reduced architectural complexity and without compromising performance. A convolutional encoder-decoder image reconstruction model is framed to extract all pertinent information from the images; comparing similar networks of varying architectural complexity led to the selection of this design. The best reconstruction feature extractor showed faster convergence and improved accuracy, as observed from the training and validation performance curves. The dimensionally compressed information from the reconstruction model is consumed by dense layers and then classified. Experimenting with forgery datasets covering different forgery types ensured the generalizability of the model. Compared with reconstruction models adopting transfer learning on the encoder side using MobileNet, ResNet50, and VGG19, the proposed model exhibited competitive and consistently improved mean Precision and F1-score across multiple datasets, as validated through multi-seed experimentation. Additionally, with its reduced architecture, the proposed model performed on par with the state-of-the-art approaches against which it was compared.
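
A hedged sketch of the general pattern (a reconstruction encoder-decoder whose compressed bottleneck feeds dense classification layers) is given below; all layer sizes are illustrative and not the authors' selected architecture.

```python
# Illustrative sketch, not the authors' exact design: a small convolutional
# encoder-decoder trained for reconstruction, with the bottleneck features
# reused by dense layers for forgery classification.
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(  # trained so decoder(encoder(x)) reconstructs x
    nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 3, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
)
classifier = nn.Sequential(  # consumes flattened bottleneck features
    nn.Flatten(), nn.LazyLinear(64), nn.ReLU(), nn.Linear(64, 2),
)
```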

Multi-Scale and Auxiliary-Supervised Architectures for Accurate Road Network Mapping

By Mohamed El Mehdi Imam, Lila Meddeber, Tarik Zouagui

DOI: https://doi.org/10.5815/ijigsp.2026.01.04, Pub. Date: 8 Feb. 2026

Automated road network extraction from satellite imagery represents a critical advancement for Geographic Information Systems (GIS) applications in infrastructure management and urban planning. This paper introduces two novel deep learning architectures based on LinkNet: RoadNet-MS (Multi-Scale) and RoadNet-AUX (Multi-Scale with Auxiliary Supervision), specifically designed to enhance road segmentation performance. RoadNet-MS incorporates Multi-Scale Contextual Blocks (CMS-Blocks) and hybrid blocks to effectively capture diverse contextual features at multiple scales, achieving F1-scores of 78.87% on the challenging DeepGlobe dataset and 82.30% on the Boston & Los Angeles (Boston-LA) dataset. RoadNet-AUX extends this architecture with auxiliary supervision, further improving performance with F1-scores of 79.14% on DeepGlobe and 82.33% on Boston-LA. Both proposed architectures demonstrate competitive performance and consistent improvements over existing methods, including the state-of-the-art NL-LinkNet, across both evaluation datasets. Notably, RoadNet-MS achieves the highest precision (83.55%) among all compared methods on DeepGlobe. These contributions provide a pathway toward more accurate and scalable road network mapping, essential for modern urban planning and infrastructure monitoring applications.
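
As an illustration of the multi-scale idea, the sketch below uses parallel dilated convolutions fused by a 1x1 convolution; this is an assumption about the spirit of the CMS-Blocks, not their published design.

```python
# Hedged sketch of a multi-scale contextual block: parallel dilated
# convolutions capture context at several receptive-field sizes.
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # One 3x3 branch per dilation rate; padding=d preserves spatial size.
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in (1, 2, 4)
        )
        self.fuse = nn.Conv2d(3 * channels, channels, 1)  # merge branches

    def forward(self, x):
        # Concatenate the multi-scale responses, fuse, and add a residual.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1)) + x
```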

Segment Wise EEG Signal Compression Using LSTM Auto Encoder for Enhanced Efficiency

By Uma M., Mohammed Javidh S., Ruchi Shah, Prabhu Sethuramalingam, M. M. Reddy

DOI: https://doi.org/10.5815/ijigsp.2026.01.05, Pub. Date: 8 Feb. 2026

Efficient compression of electroencephalogram (EEG) signals is crucial for enabling real-time monitoring, storage, and transmission in medical and non-medical applications alike. This paper presents a segment-wise processing approach using temporal-modeling-based autoencoders for EEG signal compression. By leveraging models such as Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Recurrent Neural Network (RNN), and Self-Attention, the proposed method effectively captures temporal dependencies in the EEG data. Segment-wise processing not only enhances compression efficiency but also significantly reduces the processing time of these sequence models. Extensive experiments demonstrate that GRU-based autoencoders offer the best performance, particularly at lower Data Reduction Factors (DRFs), achieving a minimal signal loss of 0.2% at a 50% compression ratio and making them suitable for medical applications. For non-medical scenarios, a higher compression ratio of 75% with a signal loss of 5.4% is found to be acceptable. The results indicate that the proposed approach achieves a favorable balance between compression efficiency, signal fidelity, and computational performance.
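
A minimal sketch of a segment-wise recurrent autoencoder of this kind follows; the GRU variant shown, the segment handling, and the latent size are illustrative assumptions rather than the paper's settings.

```python
# Hedged sketch of a segment-wise GRU autoencoder for EEG compression;
# segment length and latent size are illustrative assumptions.
import torch
import torch.nn as nn

class GRUAutoencoder(nn.Module):
    def __init__(self, n_channels=1, latent=16):
        super().__init__()
        self.enc = nn.GRU(n_channels, latent, batch_first=True)
        self.dec = nn.GRU(latent, n_channels, batch_first=True)

    def forward(self, x):              # x: (batch, segment_len, n_channels)
        _, h = self.enc(x)             # h[-1]: (batch, latent) compressed code
        code = h[-1].unsqueeze(1).expand(-1, x.size(1), -1)
        recon, _ = self.dec(code)      # reconstruct the segment from the code
        return recon
```

The compression ratio here is governed by the latent size relative to the raw segment, and each segment is encoded independently, which is what shortens the sequence lengths the recurrent model must process.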

Automated PCOS Detection Using Fine-Grained Deep Feature Extraction and Explainable AI: A Transformer-Based Ensemble Approach

By Ifra Bilal Shah, Pramod Kumar Yadav

DOI: https://doi.org/10.5815/ijigsp.2026.01.06, Pub. Date: 8 Feb. 2026

Polycystic Ovary Syndrome (PCOS) is a prevalent endocrine condition affecting women of reproductive age, characterized by hormonal abnormalities, ovarian cysts, and metabolic issues. Early diagnosis is essential to prevent long-term effects such as infertility, diabetes, and cardiovascular disease. Conventional diagnostic approaches relying on manual interpretation of ultrasound images are time-consuming and error-prone. To overcome these limitations, we propose an automated diagnostic framework leveraging deep feature extraction and ensemble learning. Initially, ResNet50 is used as a convolutional feature extractor, and its features are classified using an ensemble of Random Forest (RF) and Gradient Boosting (GB) classifiers. Subsequently, we also employ the Swin Transformer, a hierarchical vision transformer, to extract deep features from the ultrasound images, which are likewise fed to RF and GB classifiers. These features were handled separately from those of ResNet50, with no feature concatenation. Compared to the ResNet50-based ensemble, which achieved a classification accuracy of 99.2%, the Swin Transformer-based ensemble performed better, attaining an accuracy of 99.87%. Furthermore, Explainable AI techniques (Grad-CAM) were applied to both models to highlight the key regions contributing to their predictions. This scalable and interpretable system offers encouraging potential for advancing PCOS detection and other medical imaging applications.
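
The deep-features-plus-classical-ensemble pipeline can be sketched as follows; the specific Swin variant, the timm and scikit-learn tooling, and the soft-voting combination are assumptions rather than the authors' exact setup.

```python
# Hedged sketch: transformer features feeding a classical ensemble;
# model name and ensemble settings are illustrative assumptions.
import timm
import torch
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)

backbone = timm.create_model("swin_base_patch4_window7_224",
                             pretrained=True, num_classes=0)  # pooled features
backbone.eval()

@torch.no_grad()
def extract_features(images):          # images: (N, 3, 224, 224) tensor
    return backbone(images).numpy()    # (N, feature_dim)

ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=300)),
                ("gb", GradientBoostingClassifier())],
    voting="soft",                     # average class probabilities
)
# ensemble.fit(extract_features(train_images), train_labels)
```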

Weighted Late Fusion based Deep Attention Neural Network for Detecting Multi-Modal Emotion

By Srinivas P. V. V. S., Shaik Nazeera Khamar, Nohith Borusu, Mohan Guru Raghavendra Kota, Harika Vuyyuru, Sampath Patchigolla

DOI: https://doi.org/10.5815/ijigsp.2026.01.07, Pub. Date: 8 Feb. 2026

In affective computing research, multi-modal emotion detection has gained popularity as a way to boost recognition robustness and overcome the constraints of processing a single type of data. Human emotions are characterized through a variety of modalities, including physiological indicators, facial expressions, and neuroimaging techniques. Here, a novel deep attention mechanism is used for detecting multi-modal emotions. Initially, data are collected as audio and video features. The audio features are extracted, with dimensionality reduction, using the Constant-Q chromagram and Mel-Frequency Cepstral Coefficients (MM-FC2), after which audio feature generation is carried out by a Convolutional Dense Capsule Network (Conv_DCN). For video data, key frame extraction is carried out using enhanced spatial-temporal and second-order Gaussian kernels; second-order Gaussian kernels are a powerful tool for extracting features from video and converting them into a format suitable for image-based analysis. DenseNet-169 is then used for video feature generation. Finally, all the extracted features are fused, and emotions are detected using a Weighted Late Fusion Deep Attention Neural Network (WLF_DAttNN). The method is implemented in Python and achieves an accuracy of 97% on the RAVDESS dataset and 96% on the CREMA-D dataset.
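
A minimal sketch of weighted late fusion at the probability level follows; the learned deep attention weighting of WLF_DAttNN is simplified here to fixed modality weights, which is purely an illustrative assumption.

```python
# Hedged sketch of weighted late fusion; fixed weights stand in for the
# paper's learned attention-based weighting.
import torch

def weighted_late_fusion(audio_probs, video_probs, w_audio=0.4, w_video=0.6):
    # Each input: (batch, n_classes) softmax outputs from one modality branch.
    fused = w_audio * audio_probs + w_video * video_probs
    return fused.argmax(dim=1)         # predicted emotion class per sample
```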

A Novel Hybrid Approach Using MRMR-based Feature Selection and Bayesian Optimized Random Forest Classification for Accurate Fabric Defect Detection

By Ritu Juneja, Anil Dudy

DOI: https://doi.org/10.5815/ijigsp.2026.01.08, Pub. Date: 8 Feb. 2026

The textile industry holds a central position in India's economy, contributing substantially to both employment and GDP. Despite technological advancements, maintaining stringent quality standards remains a persistent challenge due to defects such as cracks, stains, and inconsistencies in fabrics. Traditional manual inspection methods, while effective to a degree, are labor-intensive, time-consuming, and prone to human error. This paper proposes an innovative approach to these challenges through the application of machine learning and computer vision techniques to fabric defect detection. Specifically, the research integrates advanced texture feature extraction methods (Gray-Level Co-occurrence Matrix (GLCM), Local Binary Patterns (LBP), and Histogram of Oriented Gradients (HOG)) with a robust classification framework using a Bayesian-optimized Random Forest. The methodology emphasizes efficient feature selection via Minimum Redundancy Maximum Relevance (MRMR), enhancing the system's accuracy and efficiency. By leveraging a comprehensive Kaggle dataset encompassing diverse fabric defects, the proposed system aims to significantly improve defect detection accuracy, reduce manual intervention, and ensure consistent product quality across textile manufacturing processes. The highest accuracy achieved in the evaluation is 99.52%.
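
To make the texture-feature stage concrete, a hedged sketch using scikit-image follows; all descriptor parameters are illustrative assumptions, not the tuned values from the paper.

```python
# Hedged sketch of GLCM + LBP + HOG texture descriptors concatenated per
# image (scikit-image >= 0.19 assumed; parameters are illustrative).
import numpy as np
from skimage.feature import (graycomatrix, graycoprops,
                             hog, local_binary_pattern)

def fabric_features(gray_img):         # gray_img: 2-D uint8 array
    # GLCM statistics at distance 1, angle 0.
    glcm = graycomatrix(gray_img, distances=[1], angles=[0],
                        levels=256, normed=True)
    glcm_feats = [graycoprops(glcm, p)[0, 0]
                  for p in ("contrast", "homogeneity", "energy")]
    # Uniform LBP histogram (P=8 gives codes 0..9).
    lbp = local_binary_pattern(gray_img, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    # HOG descriptor over the whole image.
    hog_feats = hog(gray_img, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    return np.concatenate([glcm_feats, lbp_hist, hog_feats])
```

The concatenated vector would then pass through MRMR selection before reaching the Random Forest, whose hyperparameters are tuned by Bayesian optimization.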

Diabetic Kidney Disease Prediction Using Hybrid Deep Learning Model

By Konne Madhavi, Harwant Singh Arri

DOI: https://doi.org/10.5815/ijigsp.2026.01.09, Pub. Date: 8 Feb. 2026

Diabetic Kidney Disease (DKD) is a significant microvascular complication of diabetes. Many researchers have worked on distinguishing DKD from non-diabetic kidney disease (NDKD), but the required accuracy has not yet been achieved. This study aims to enhance diagnostic accuracy using a hybrid Deep Learning (DL) method combining a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM). Clinical DKD data were collected and preprocessed to address issues such as missing values, duplicates, and outliers; key preprocessing steps included imputation, z-score and min-max normalization, and feature encoding. Feature selection based on a correlation matrix identified the most relevant variables. Both the CNN-LSTM and a plain Convolutional Neural Network (CNN) were then trained on the processed data with identical hyperparameters, as detailed in the methodology. Accuracy, Sensitivity, Specificity, Precision, F1-score, and ROC plots were employed to assess model performance. The CNN-LSTM model achieved a high Accuracy of 98%, surpassing the CNN model's Accuracy of 96.5%, and outperformed the CNN on all other metrics as well.
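
A minimal sketch of a CNN-LSTM hybrid over preprocessed clinical feature vectors is shown below; the layer sizes and the binary sigmoid head are illustrative assumptions, not the authors' configuration.

```python
# Hedged sketch of a CNN-LSTM hybrid for tabular clinical features;
# sizes are illustrative assumptions.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        # 1-D convolution scans the ordered feature vector for local patterns.
        self.conv = nn.Sequential(nn.Conv1d(1, 16, 3, padding=1), nn.ReLU())
        self.lstm = nn.LSTM(16, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # binary DKD vs. NDKD output

    def forward(self, x):                  # x: (batch, n_features)
        z = self.conv(x.unsqueeze(1))      # (batch, 16, n_features)
        out, _ = self.lstm(z.transpose(1, 2))
        return torch.sigmoid(self.head(out[:, -1]))
```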

Multimodal Emotion Recognition Using EEG and Facial Expressions with Potential Applications in Driver Monitoring

By Ch. Raga Madhuri, Anideep Seelam, Fatima Farheen Shaik, Aadi Siva Kartheek Pamarthi, Mohan Kireeti Krovi

DOI: https://doi.org/10.5815/ijigsp.2026.01.10, Pub. Date: 8 Feb. 2026

Mental conditions such as fatigue, distraction, and cognitive overload are known to contribute significantly to traffic accidents. Accurate recognition of these cognitive and emotional states is therefore important for the development of intelligent monitoring systems. In this study, a multimodal emotion recognition framework using electroencephalography (EEG) signals and facial expression features is proposed, with potential applications in driver monitoring. The approach integrates Long Short-Term Memory (LSTM) networks and Transformer architectures for EEG-based temporal feature extraction, along with Vision Transformers (ViT) for facial feature representation. Feature-level fusion combines the physiological and visual modalities, improving emotion classification performance over unimodal approaches. The model is evaluated using accuracy, precision, recall, and F1-score, achieving an overall accuracy of 96.38% and demonstrating the effectiveness of multimodal learning. Although the experiments are conducted on general-purpose emotion datasets, the results indicate that the proposed framework can serve as a reliable foundation for driver monitoring applications, such as fatigue detection, distraction detection, and cognitive state assessment, in intelligent transportation systems.
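
Feature-level fusion of the two modalities can be sketched as follows; the LSTM EEG encoder, the torchvision ViT backbone, and all dimensionalities are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of feature-level EEG + face fusion; encoder choices and
# sizes are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16

class FusionClassifier(nn.Module):
    def __init__(self, eeg_channels=32, n_classes=7):
        super().__init__()
        self.eeg_enc = nn.LSTM(eeg_channels, 128, batch_first=True)
        vit = vit_b_16(weights=None)       # swap in pretrained weights in practice
        vit.heads = nn.Identity()          # keep the 768-d visual representation
        self.face_enc = vit
        self.head = nn.Linear(128 + 768, n_classes)

    def forward(self, eeg, face):          # eeg: (B, T, C); face: (B, 3, 224, 224)
        _, (h, _) = self.eeg_enc(eeg)      # h[-1]: (B, 128) EEG summary
        fused = torch.cat([h[-1], self.face_enc(face)], dim=1)
        return self.head(fused)            # fused features -> emotion logits
```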
