ISSN: 2074-9074 (Print)
ISSN: 2074-9082 (Online)
DOI: https://doi.org/10.5815/ijigsp
Website: https://www.mecs-press.org/ijigsp
Published By: MECS Press
Frequency: 6 issues per year
Number(s) Available: 140
IJIGSP is committed to bridging the theory and practice of images, graphics, and signal processing. From innovative ideas to specific algorithms and full system implementations, IJIGSP publishes original, peer-reviewed, and high-quality articles in the areas of images, graphics, and signal processing. IJIGSP is a well-indexed scholarly journal and indispensable reading and reference for those working at the cutting edge of images, graphics, and signal processing applications.
IJIGSP has been abstracted or indexed by several world-class databases: Scopus, Google Scholar, Microsoft Academic Search, CrossRef, Baidu Wenku, IndexCopernicus, IET Inspec, EBSCO, JournalSeek, ULRICH's Periodicals Directory, WorldCat, Scirus, Academic Journals Database, Stanford University Libraries, Cornell University Library, UniSA Library, CNKI Scholar, ProQuest, J-Gate, ZDB, BASE, OhioLINK, iThenticate, Open Access Articles, Open Science Directory, National Science Library of Chinese Academy of Sciences, The HKU Scholars Hub, etc.
IJIGSP Vol. 17, No. 6, Dec. 2025
REGULAR PAPERS
Environmental pollution resulting from waste is a critical global challenge that significantly affects both the environment and public health, especially in countries like Indonesia. Effective waste management and recycling depend on accurately detecting and classifying different waste types. This study tackles this challenge by evaluating the YOLOv8s algorithm for object detection and conducting a comparative analysis of two mobile-optimized convolutional neural networks (CNNs), MobileNetV2 and EfficientNet, for waste classification. The YOLOv8s model established a promising baseline for detection, achieving a mean Average Precision (mAP@50) of 0.621 on the hold-out test set. MobileNetV2 proved to be the superior architecture in the classification task, attaining a higher accuracy of 94.4% compared to EfficientNet’s 87.8%. Additionally, MobileNetV2 demonstrated significantly greater computational efficiency, with a processing time of 229 ms per step, in contrast to EfficientNet’s 606 ms per step. These findings confirm that combining YOLOv8s for detection and MobileNetV2 for classification provides a robust and efficient pathway for developing automated waste management systems.
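The detection metric reported above, mAP@50, counts a predicted box as correct when its Intersection over Union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that overlap criterion (illustrative only, not the authors' code), assuming boxes in (x1, y1, x2, y2) corner format:

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

For mAP@50, precision-recall curves are built over detections matched at this 0.5 IoU cutoff and averaged across classes.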
According to the World Health Organization (WHO) 2022 report, tuberculosis is the second most deadly infectious disease after COVID-19, and around one-fourth of the global population is estimated to have been infected with it. Timely detection and prevention of tuberculosis are essential to contain its harmful effects. The method most often used to ascertain whether a patient has tuberculosis is examining his or her sputum sample. In this process, the isolation of the bacilli is done manually and is hence prone to error. Segmentation delineates objects or particles within an image, thus extracting the Region of Interest (ROI). The present study uses the TransUNet architecture to segment tuberculosis bacilli from sputum images to increase diagnostic accuracy and performance. The attention mechanism in the TransUNet model helps identify the spatial hierarchies present in the image, whereas the inherent complexity of sputum images is extremely difficult for naive or traditional segmentation algorithms to handle. Hence, this study introduces an approach that captures the intrinsic features and dependencies needed to segment Mycobacterium (TB) bacilli by leveraging the TransUNet model. The model achieved an average Dice score of 92.795%, a mean Intersection over Union (IoU) of 88.845%, and a segmentation accuracy of 99.19% on the Mosaic and Ziehl-Neelsen datasets. These results surpassed several existing state-of-the-art methods such as UNet, clustering, and thresholding, demonstrating the superior capability of TransUNet in segmenting TB bacilli. This underscores the potential of transformer-based CNN models, especially TransUNet, for improving the diagnosis of tuberculosis and supporting disease management.
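The Dice score and IoU reported above both measure overlap between a predicted mask and the ground truth; Dice is 2·TP/(|P|+|T|). A minimal sketch for binary masks (illustrative only), assuming flat 0/1 pixel lists:

```python
def dice_and_iou(pred, target):
    # pred, target: flat sequences of 0/1 pixel labels (e.g. bacilli vs background).
    tp = sum(p and t for p, t in zip(pred, target))  # true-positive pixels
    p_sum, t_sum = sum(pred), sum(target)
    dice = 2 * tp / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    union = p_sum + t_sum - tp
    iou = tp / union if union else 1.0
    return dice, iou
```

Dice is always at least as large as IoU, which is why the paper's Dice (92.795%) exceeds its IoU (88.845%).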
Breast cancer is one of the most common and serious types of cancer. It can affect people of all ages and genders around the world. The increasing incidence of breast cancer, coupled with its complexity, has placed a significant burden on healthcare systems and patients alike. Traditional diagnostic methods, while effective, often face limitations in early detection and accurate prognosis, which are critical for improving patient outcomes. In recent years, artificial intelligence (AI) and machine learning (ML) have been changing the way problems are solved and decisions are made in medical diagnostics, enhancing the ability to detect, diagnose, and predict breast cancer. However, challenges remain, such as the need for large and diverse datasets to train these models, integrating AI tools smoothly into hospital workflows, and addressing ethical concerns in healthcare. This paper examines how AI and ML are used in breast cancer care, especially in analyzing real-world medical data such as images, histopathology, and other datasets such as doctors' notes and discharge summaries, to identify patterns that may be unnoticeable to medical experts. Large Language Models (LLMs) using embeddings are highlighted for their capacity to improve the accuracy of image-related interpretations, potentially detect early-stage tumours, and predict disease progression and treatment responses. Real-world medical datasets were collected and analysed using different models: a publicly available Convolutional Neural Network (CNN) and a custom-built Large Language Model (LLM) with embeddings. The Generative AI model achieved 98.44% accuracy, significantly higher than the traditional AI model's 61.72%. Future research can explore how Generative AI can help classify patients based on risk levels, which could lead to personalized treatment plans, reducing unnecessary treatments and improving patients' quality of life.
Given that the research is primarily focused on breast cancer, it attempts to show that, by harnessing the power of AI and ML, there is potential to significantly reduce the global burden of breast cancer, offering new avenues for early detection, accurate diagnosis, and tailored therapeutic strategies. Continued research and collaboration among oncologists, data scientists, and policymakers are essential to fully realize the benefits of AI in the fight against breast cancer, ultimately leading to better patient outcomes and a decrease in breast cancer-related mortality.
A noninvasive blood hemoglobin monitoring device was designed specifically for monitoring anemia and polycythemia. Invasive techniques, which are painful and expensive, are commonly used to estimate blood hemoglobin concentrations. This paper presents a noninvasive method for monitoring blood hemoglobin values. A photodiode and a near-infrared (NIR) LED with a wavelength of 940 nm were used to construct a finger probe. At 940 nm, oxygenated and deoxygenated hemoglobin show distinct absorption differences, and a single-wavelength system significantly reduces hardware complexity, cost, power consumption, and size. Continuous-wave NIR LED light was transmitted through the finger to assess sensitivity to different hemoglobin concentrations. A total of 100 patients participated in the evaluation of the proposed device, and both invasive and noninvasive hemoglobin concentration values were collected from each participant. The correlation coefficient between the predicted (noninvasive) hemoglobin value and the reference (invasive) hemoglobin value was 0.9496, with a normalized root mean squared error (NRMSE) of 0.6504 and a mean absolute percentage error (MAPE) of 0.0505. The noninvasive blood hemoglobin level was classified using the k-nearest neighbour (kNN) classifier, and the accuracy of the proposed device was calculated at 90%. The Bland-Altman methodology was used to evaluate differences between invasive and noninvasive blood hemoglobin concentrations. The absolute mean difference was 0.1124 (95% confidence interval [CI] -0.01535 to 0.2401), with an upper agreement limit of 1.374 (95% CI 1.153 to 1.595) and a lower agreement limit of -1.149 (95% CI -1.371 to -0.9282).
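The Bland-Altman agreement limits quoted above are conventionally the mean paired difference plus or minus 1.96 standard deviations of the differences. A minimal sketch of that computation (illustrative only, not the authors' analysis code):

```python
from statistics import mean, stdev

def bland_altman_limits(reference, predicted):
    # Paired differences between the two measurement methods.
    diffs = [a - b for a, b in zip(reference, predicted)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample standard deviation of the differences
    # Limits of agreement: mean difference +/- 1.96 SD.
    return d_mean, d_mean - 1.96 * d_sd, d_mean + 1.96 * d_sd
```

The confidence intervals the paper also reports around the mean and each limit require additional standard-error terms not shown in this sketch.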
Recognizing human actions from video sequences with deep learning approaches, which automatically derive significant representations, has demonstrated effective results on unprocessed video data. Artificial intelligence (AI) systems, including monitoring, automation, and human-computer interfaces, have become crucial for security and human behaviour analysis. For the visual representation of video clips during the training phase, existing action identification algorithms mostly use pre-trained weights of various AI architectures, which affects feature consistency and persistence, including the separation between visual and temporal cues. To overcome this problem, this research proposes a 3-dimensional Convolutional Neural Network and Long Short-Term Memory (3D-CNN-LSTM) network that strategically concentrates on useful information in the input frames to recognize various human behaviours in video. The process utilizes stochastic gradient descent (SGD) optimization to identify the model parameters that best match the expected and observed outcomes. The proposed framework is trained, validated, and tested on the publicly accessible UCF11 benchmark dataset. According to the experimental findings, the accuracy rate was 93.72%, which is 2.42% higher than the previous state-of-the-art best result. Compared to several other relevant techniques already in use, the suggested approach achieved outstanding accuracy.
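SGD underlies the parameter fitting described above. A minimal sketch of one SGD-with-momentum update (the hyperparameter values here are hypothetical; the paper's actual training configuration is not reproduced):

```python
def sgd_step(params, grads, lr=0.01, momentum=0.9, velocity=None):
    # One SGD-with-momentum update: v <- mu*v - lr*g; w <- w + v.
    if velocity is None:
        velocity = [0.0] * len(params)
    velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    params = [w + v for w, v in zip(params, velocity)]
    return params, velocity
```

Repeated over mini-batches, this rule drives the loss between expected and observed outputs downward along the stochastic gradient.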
Onion size is a crucial physiological characteristic that can be described by a number of factors, including diameter, weight, volume, and length. Determining the size of onions is frequently necessary for sorting them for a variety of reasons, including processing machine specifications, legal sorting standards, and consumer preferences. Size is also a crucial quantitative feature in onion phenotyping. Traditionally, algorithms based on morphology, colour, thresholding, and geometric approaches have been used to estimate the shape and size of onions. However, research that relies on these geometric or colour-based functions is limited to approximations and frequently produces erroneous results unless conducted at precisely controlled heights. Images of healthy onions were collected and used as the input dataset for this paper. The gathered images are pre-processed to reduce noise and improve contrast by applying a circular adaptive median filter and homomorphic filtering with Elk-herd optimization. Next, object detection is performed on the pre-processed images using a dilated and deformable feature pyramid network. To remove unwanted portions, the onion is segmented from the image with an edge-based segmentation algorithm, the edge-attention guidance network. The dual attention fusion-net then classifies the data into labelled groups and measures onion size. Accuracy, confusion metrics, FDR, hit rate, and other performance metrics are assessed for both the existing and proposed models. The suggested onion size detection approach outperforms the current algorithms, producing 97.6% accuracy, 2.9% FDR, 96% hit rate, 98.5% selectivity, and 97.3% NPV. Thus, the proposed approach is a strong choice for detecting onion size.
Dialect Identification, a specialization of Language Identification (LID), addresses specific challenges related to the linguistic similarity between dialects. Various approaches are currently used for dialect identification, but automated prediction is difficult because voice clarity is not always within an ideal range and features may be selected inaccurately. It is essential to utilize an appropriate feature subset that contains sufficient signal information for the learning model to correctly recognize language dialects. To eradicate these issues, an optimized stacking-based ensemble learning approach is developed. The identification process begins with pre-processing using an adaptive least mean square filter and a fractional bandpass filter. Features are extracted from the pre-processed audio signal using Gammatone Frequency Cepstral Coefficients (GFCC) and Shifted Delta Cepstral Coefficients (SDCC). The extracted features are then reduced with the help of Independent Component Analysis (ICA). Further, the selected features are given to a Recurrent Neural Network (RNN), which acts as a meta-classifier and additionally receives information from a pair of distinct base classifiers, the Radial Basis Function Neural Network (RBFNN) and the Deep Belief Network (DBN). The hyperparameters of the RNN classifier were tuned using the Deer Hunting Optimization Algorithm (DHOA). The proposed approach achieves an accuracy of 97%, a precision of 96%, and an F1-score of 97%. Therefore, the suggested approach is a strong option for automatic dialect identification.
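In the stacking arrangement described above, the predictions of the base classifiers (RBFNN and DBN in the paper) become input features for a meta-classifier (an RNN in the paper). A toy sketch of that data flow, with a hypothetical linear scorer standing in for the meta-classifier:

```python
def stack_features(base_models, sample):
    # Base-level predictions become the meta-classifier's input features.
    return [model(sample) for model in base_models]

def meta_predict(weights, bias, meta_features):
    # A linear scorer standing in for the RNN meta-learner (illustration only).
    score = bias + sum(w * f for w, f in zip(weights, meta_features))
    return 1 if score >= 0 else 0
```

The point of stacking is that the meta-learner can weigh base classifiers differently per region of the feature space rather than taking a flat vote.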
Semantic understanding of camera-captured scene text images is an important problem in computer vision. Scene character recognition is the pivotal task in this problem, and deep learning is nowadays the most promising approach. However, the limited sample size of scene character datasets appears to be a major hindrance to training deep networks. In this paper, we present (i) various augmentation techniques for increasing the sample size of such datasets, along with associated insights, (ii) an extended version of the popular Chars74k dataset (herein referred to as E-Chars74k), and (iii) benchmark performance on the developed E-Chars74k dataset. Experiments on various sets of data, such as digits, alphabets, and their combination, belonging to the usual as well as wild scenarios, clearly reflect a significant performance gain (a 20%-30% increase in scene character recognition accuracy). It is noteworthy that in all these experiments, a deep convolutional neural network with two conv-pool pairs is trained with a uniform training/test partition to foster comparison on an equal footing.
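Simple label-preserving transforms such as small translations are typical of the augmentation techniques used to enlarge character datasets (this sketch is illustrative; the paper's exact augmentation set is not reproduced here). Assuming an image as a 2-D list of pixel values:

```python
def shift_right(img, dx, fill=0):
    # Translate each row dx pixels to the right, padding with background.
    # A small shift keeps the character's label unchanged, unlike mirroring,
    # which can turn one glyph into another.
    return [[fill] * dx + row[:len(row) - dx] for row in img]
```

Each such transform applied to every sample multiplies the effective dataset size while preserving labels.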
Because of the nature of the Internet and the growing number of people using digital media, copyright protection is becoming more important. One of the most common protections is digital image watermarking, which safeguards the image from unauthorized access. We propose a powerful watermarking technique based on the Gorilla Troop Optimization Algorithm (GTO), a new evolutionary algorithm. Initially, we applied the Discrete Wavelet Transform (DWT) to the cover image, followed by Singular Value Decomposition (SVD) for enhanced security; finally, we applied SVD to the watermark image for its embedding into the cover image. In this process, we optimize the multiple scaling factors (MSFs) with the GTO algorithm and test the proposed algorithm in the MATLAB environment on standard images. We then evaluated the experiment using performance metrics such as Normalized Cross-Correlation (NCC), the Structural Similarity Index (SSIM), and the Peak Signal-to-Noise Ratio (PSNR). These metrics demonstrated the imperceptibility of the watermark and the proposed algorithm's robustness against different attacks.
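In DWT-SVD watermarking schemes of this kind, the watermark's singular values are commonly embedded into the cover's singular values, scaled by a factor that the optimizer (here, GTO) tunes to balance imperceptibility and robustness. A hedged sketch of one common additive embedding rule (the paper's exact rule may differ), operating on precomputed singular-value lists:

```python
def embed(cover_sv, watermark_sv, alpha):
    # Additive embedding: each cover singular value is perturbed by a
    # scaled watermark singular value; alpha is the scaling factor.
    return [s + alpha * w for s, w in zip(cover_sv, watermark_sv)]

def extract(embedded_sv, cover_sv, alpha):
    # Inverse rule used to recover the watermark singular values,
    # given the original cover singular values.
    return [(e - s) / alpha for e, s in zip(embedded_sv, cover_sv)]
```

A larger alpha makes the watermark more robust to attacks but more visible, which is exactly the trade-off the multiple scaling factors are optimized over.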
In this paper, we present a Discrete Wavelet Transform (DWT) based Fast Output Generating Set Partitioning In Hierarchical Trees (FOGSPIHT) algorithm for MRI brain image compression. FOGSPIHT is a scalable, fast, and robust algorithm. Image compression is an important technique that enables fast and high-throughput imaging applications by reducing storage space or transmission bandwidth. The DWT transforms the image into a set of coefficients that are used for efficient compression. The Set Partitioning In Hierarchical Trees (SPIHT) algorithm is an efficient algorithm for DWT-based image compression; its limitations are its complexity and memory requirements. To reduce the complexity, we propose the FOGSPIHT algorithm, which works on the basic principles of SPIHT. The FOGSPIHT algorithm operates on coefficients that are converted to bit planes, eliminating the comparison operations in the compression process of SPIHT through simple logical operations on bits. The values of Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR), and Structural Similarity Index Measure (SSIM) are calculated and plotted against Compression Ratio (CR). The results obtained with the FOGSPIHT algorithm are equal to or better than those of the SPIHT algorithm, and FOGSPIHT is faster, with reduced encoding and decoding time. An FPGA implementation of the FOGSPIHT algorithm on an 8x8 block of image DWT coefficients requires fewer resources and less power than the SPIHT algorithm.
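FOGSPIHT operates on bit planes of the coefficients, so that significance tests reduce to logical operations on bits. A minimal sketch of bit-plane decomposition for non-negative integer coefficients (illustrative only; the sign handling and scanning order of the actual coder are omitted):

```python
def bit_planes(coeffs, n_bits=8):
    # Decompose each coefficient into n_bits planes,
    # most-significant plane first, as scanned by bit-plane coders.
    return [[(c >> b) & 1 for c in coeffs] for b in range(n_bits - 1, -1, -1)]
```

Scanning planes from most to least significant is what gives SPIHT-family coders their embedded, progressively refinable bitstream.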
Mushrooms are a familiar, delicious food that is cholesterol-free and rich in vitamins and minerals. Though nearly 45,000 species of mushrooms are known throughout the world, many of them are poisonous and a few are lethally so. Identifying an edible or poisonous mushroom with the naked eye is quite difficult, and there is no simple machine learning rule for edibility identification that works for all types of data. Our aim is to find a robust method for identifying mushroom edibility with better performance than existing works. In this paper, three ensemble methods are used to detect the edibility of mushrooms: bagging, boosting, and random forest. Using the most significant features, five feature sets are constructed to build five base models for each ensemble method. The accuracy of the ensemble methods is measured using both fixed feature-set-based models and randomly selected feature-set-based models, on two types of test sets. The results show that better performance is obtained for methods built from fixed feature-set-based models than from randomly selected ones. The highest accuracy is obtained by the proposed random-forest-based model on both test sets.
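Bagging-style ensembles such as those above typically aggregate the base models' labels by majority vote. A minimal sketch of that aggregation step (illustrative; the paper's exact aggregation rule is not detailed in the abstract):

```python
from collections import Counter

def majority_vote(predictions):
    # predictions: one label per base model; the most frequent label wins.
    return Counter(predictions).most_common(1)[0][0]
```

With five base models per ensemble, a mushroom is labelled edible only when at least three of the five models agree.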
Image processing is the art of examining, identifying, and judging the significance of images. Image enhancement refers to the attenuation or sharpening of image features such as edgels, boundaries, or contrast to make the processed image more useful for analysis. Image enhancement procedures use computers to provide improved images for study by human interpreters. In this paper, we propose a novel method that uses a Genetic Algorithm with multi-objective criteria to find a more enhanced version of an image; a simple Genetic Algorithm may not explore the search space sufficiently to find one. The proposed method has been verified on benchmark images in image enhancement. Three objectives are taken into consideration: intensity, entropy, and the number of edgels. The proposed algorithm achieves automatic image enhancement by incorporating these objectives. We review some existing image enhancement techniques and compare the results of our algorithm with other Genetic Algorithm based techniques. We expect that further improvements can be achieved by combining the method with other techniques.
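Of the three objectives, entropy can be computed from the grey-level histogram; higher entropy indicates a richer spread of intensities after enhancement. A minimal sketch of that objective (illustrative only), assuming integer pixel values:

```python
from math import log2

def histogram_entropy(pixels, levels=256):
    # Shannon entropy of the grey-level histogram, in bits.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    n = len(pixels)
    # Empty bins contribute nothing (0 * log 0 is taken as 0).
    return -sum((h / n) * log2(h / n) for h in hist if h)
```

A multi-objective GA would evaluate each candidate enhancement on this entropy term alongside the intensity and edgel-count objectives.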
This paper presents the design and development of an Artificial Intelligence (AI) based mobile application to detect the type of skin disease. Skin diseases are a serious hazard to everyone throughout the world, yet accurate diagnosis of skin diseases is difficult. In this work, a deep learning algorithm, the Convolutional Neural Network (CNN), is proposed to classify skin diseases on the HAM10000 dataset. An extensive review of research articles on object identification methods, with a comparison of their relative qualities, was conducted to find a method that would work well for detecting skin diseases; the CNN-based technique was recognized as the best. A mobile application was then built for quick and accurate action: by looking at an image of the afflicted area at the onset of a skin illness, it assists patients and dermatologists in determining the kind of disease present. Its ability to detect the affected region considerably faster, with nearly 2x fewer computations than the standard MobileNet model, results in low computing effort. This study revealed that MobileNet with transfer learning, yielding an accuracy of about 85%, is the most suitable model for automatic skin disease identification. According to these findings, the suggested approach can assist general practitioners in quickly and accurately diagnosing skin diseases using a smartphone.
In the field of medical image analysis, supervised deep learning strategies have achieved significant development, but these methods rely on large labeled datasets. Self-supervised learning (SSL) provides a new strategy for pre-training a neural network with unlabeled data; it is an unsupervised learning paradigm that has achieved significant breakthroughs in recent years. Consequently, more and more researchers are trying to utilize SSL methods for medical image analysis to meet the challenge of assembling large medical datasets. To our knowledge, there is still a shortage of reviews of self-supervised learning methods in the field of medical image analysis; this article aims to fill that gap and comprehensively review the application of self-supervised learning in the medical field. It provides the latest and most detailed overview of self-supervised learning in the medical field and promotes the development of unsupervised learning in medical imaging. The methods are divided into three categories: context-based, generation-based, and contrast-based; we show the pros and cons of each category and evaluate their performance in downstream tasks. Finally, we conclude with the limitations of the current methods and discuss future directions.
Image analysis belongs to the areas of computer vision and pattern recognition, which are in turn part of digital image processing, where researchers pay great attention to retrieving content information from various types of images with complex, low-contrast, or multi-spectral backgrounds. This content may take forms such as texture data, shapes, and objects. Text region extraction from an image is a class of problems in digital image processing that aims to provide necessary information widely used in many fields: medical imaging, pattern recognition, robotics, intelligent transport systems, etc. Extracting text information has become a challenging task, yet it is very useful for identifying and analyzing the overall information in an image. Therefore, in this paper, we propose a unified framework combining morphological operations and Genetic Algorithms for extracting and analyzing text regions that may be embedded in an image under a variety of text conditions: font, size, skew angle, distortion by slant and tilt, the shape of the object the text is on, etc. We have evaluated the proposed methods on grey-level image sets, made qualitative and quantitative comparisons with other existing methods, and concluded that the proposed method is better than the others.
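Morphological operations such as dilation grow foreground regions so that nearby character components merge into candidate text regions. A minimal 1-D sketch with a (2k+1)-wide structuring element (illustrative only; 2-D dilation applies the same idea across rows and columns):

```python
def dilate(row, k=1):
    # 1-D binary dilation: a pixel becomes foreground if any pixel
    # within distance k of it is foreground.
    n = len(row)
    return [1 if any(row[max(0, i - k):min(n, i + k + 1)]) else 0
            for i in range(n)]
```

Repeated dilation followed by connected-component analysis is a standard way to turn isolated character blobs into full text-line regions.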
Denoising is a vital aspect of image preprocessing, often explored to eliminate noise from an image to restore its proper characteristic formation and clarity. Unfortunately, noise often degrades the quality of valuable images, making them meaningless for practical applications. Several methods have been deployed to address this problem, but the quality of the recovered images still requires enhancement for efficient applications in practice. In this paper, a wavelet-based universal thresholding technique is proposed that can optimally denoise highly degraded noisy images with both uniform and non-uniform variations in illumination and contrast. The proposed method, herein referred to as modified wavelet-based universal thresholding (MWUT), was compared to three state-of-the-art denoising techniques on five noisy images. To appraise the quality of the images obtained, eight performance indicators were employed: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Structural Content (SC), Peak Signal to Noise Ratio (PSNR), Structural Similarity Index Method (SSIM), Signal-to-Reconstruction-Error Ratio (SRER), Natural Image Quality Evaluator (NIQE), and Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE). The first six indicators (RMSE, MAE, SC, PSNR, SSIM, and SRER) are reference indicators, while the remaining two (NIQE and BRISQUE) are referenceless. For superior performance of the proposed wavelet threshold algorithm, the SC, PSNR, SSIM, and SRER must be higher, while lower values of NIQE, BRISQUE, RMSE, and MAE are preferred. Higher PSNR, SSIM, and SRER values in the final results show the superior performance of the proposed MWUT denoising technique over the preliminaries, and lower NIQE, BRISQUE, RMSE, and MAE values likewise indicate better image quality with the proposed technique than with the existing schemes. The modified wavelet-based universal thresholding technique would find practical applications in digital image processing and enhancement.
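Wavelet universal thresholding in the Donoho-Johnstone sense sets a threshold of sigma * sqrt(2 ln N), with the noise level sigma estimated from the median absolute detail coefficient, and shrinks coefficients toward zero. A minimal sketch of that classical baseline (the paper's MWUT modifies this scheme; the modification itself is not reproduced here):

```python
from math import sqrt, log

def universal_threshold(detail_coeffs, n):
    # Universal threshold: sigma * sqrt(2 ln N), with sigma estimated
    # as median(|d|) / 0.6745 from the finest-scale detail coefficients.
    med = sorted(abs(c) for c in detail_coeffs)[len(detail_coeffs) // 2]
    sigma = med / 0.6745
    return sigma * sqrt(2 * log(n))

def soft_threshold(c, t):
    # Soft thresholding: zero small coefficients, shrink large ones toward 0.
    return (abs(c) - t) * (1 if c > 0 else -1) if abs(c) > t else 0.0
```

Denoising then amounts to transforming the image, soft-thresholding the detail coefficients, and inverting the transform.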
Image deblurring aims to eliminate or decrease the degradations that occurred while the image was being acquired. In this paper, we propose a unified framework for the restoration process that enhances and better quantifies deblurred images with the help of a Genetic Algorithm. The developed method uses an iterative procedure based on evolutionary criteria and produces better images with most of the frequency content restored. We have compared the proposed method with the Lucy-Richardson restoration method, the method proposed by W. Dong [34], and the inverse filter restoration method, and demonstrated that the proposed method is more accurate, achieving high-quality restored images in terms of various statistical quality measures.
During the past few years, brain tumor segmentation in magnetic resonance imaging (MRI) has become an emergent research area in the field of medical imaging. Brain tumor detection helps in finding the exact size and location of a tumor. An efficient algorithm for tumor detection based on segmentation and morphological operators is proposed in this paper. First, the quality of the scanned image is enhanced, and then morphological operators are applied to detect the tumor in the scanned image.
This paper evaluates three different contrast enhancement methods, namely contrast stretching, histogram equalization (HE), and CLAHE, each combined with a median filter. Poor-quality images are corrected, with the median filter used for noise removal. The STARE dataset is used, whose images have different contrast values. The results are evaluated with three metrics: MSE, PSNR, and SSIM. For the grayscale images, contrast stretching, which stretches pixel values using the stretchlim technique, yields an average MSE of 9.15, a PSNR of 42.14 dB, and an SSIM of 0.88. The HE method with a median filter yields an average MSE of 18.67, a PSNR of 41.33 dB, and an SSIM of 0.77, whereas CLAHE with a median filter yields an average MSE of 28.42, a PSNR of 35.30 dB, and an SSIM of 0.86. The test results show that contrast stretching achieves the best MSE, PSNR, and SSIM values among the three methods.
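The MSE and PSNR figures quoted above are related by PSNR = 10 * log10(peak^2 / MSE). A minimal sketch for 8-bit images represented as flat pixel lists (illustrative only):

```python
from math import log10

def mse_psnr(original, processed, peak=255):
    # MSE between two images and the corresponding PSNR in dB.
    mse = sum((o - p) ** 2 for o, p in zip(original, processed)) / len(original)
    psnr = 10 * log10(peak ** 2 / mse) if mse else float("inf")
    return mse, psnr
```

This relation explains why the method with the lowest MSE (contrast stretching, 9.15) also reports the highest PSNR (42.14 dB).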
This article proposes a receiving device in which arbitrary input signals are subject to pre-detector processing for the subsequent implementation of the idea of compressing broadband modulated pulses with a matched filter, to increase the signal-to-noise ratio and improve resolution. For this purpose, a model of a dispersive delay line is developed based on series-connected high-frequency time delay lines with taps in the form of bandpass filters, and this model is analyzed as part of a radio receiving device with chirp signal compression. The article presents the mathematical description of the processes of formation and compression of chirp signals based on matched filtering using the developed model, and proposes the block diagram of a radio receiving device using the principle of compression of received signals. The proposed model can be implemented in devices for receiving unknown signals, in particular in passive radar. It can also be used for studying signal compression processes based on linear frequency modulation in traditional radar systems.
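Matched filtering correlates the received signal with the known transmitted waveform; for a chirp, this compresses the long pulse into a narrow peak, raising the signal-to-noise ratio and sharpening range resolution. A minimal real-valued, discrete-time sketch of that correlation (illustrative only; the article's dispersive-delay-line model is an analog realization of the same principle):

```python
def matched_filter(signal, template):
    # Slide the known template over the received signal and correlate;
    # the output peaks at the lag where the pulse is located.
    n, m = len(signal), len(template)
    return [sum(signal[lag + k] * template[k] for k in range(m))
            for lag in range(n - m + 1)]
```

The sharper and taller the output peak relative to the sidelobes, the better the compression.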
[...] Read more.Mushrooms are the most familiar delicious food which is cholesterol free as well as rich in vitamins and minerals. Though nearly 45,000 species of mushrooms have been known throughout the world, most of them are poisonous and few are lethally poisonous. Identifying edible or poisonous mushroom through the naked eye is quite difficult. Even there is no easy rule for edibility identification using machine learning methods that work for all types of data. Our aim is to find a robust method for identifying mushrooms edibility with better performance than existing works. In this paper, three ensemble methods are used to detect the edibility of mushrooms: Bagging, Boosting, and random forest. By using the most significant features, five feature sets are made for making five base models of each ensemble method. The accuracy is measured for ensemble methods using five both fixed feature set-based models and randomly selected feature set based models, for two types of test sets. The result shows that better performance is obtained for methods made of fixed feature sets-based models than randomly selected feature set-based models. The highest accuracy is obtained for the proposed model-based random forest for both test sets.
[...] Read more.Image Processing is the art of examining, identifying and judging the significances of the Images. Image enhancement refers to attenuation, or sharpening, of image features such as edgels, boundaries, or contrast to make the processed image more useful for analysis. Image enhancement procedures utilize the computers to provide good and improved images for study by the human interpreters. In this paper we proposed a novel method that uses the Genetic Algorithm with Multi-objective criteria to find more enhance version of images. The proposed method has been verified with benchmark images in Image Enhancement. The simple Genetic Algorithm may not explore much enough to find out more enhanced image. In the proposed method three objectives are taken in to consideration. They are intensity, entropy and number of edgels. Proposed algorithm achieved automatic image enhancement criteria by incorporating the objectives (intensity, entropy, edges). We review some of the existing Image Enhancement technique. We also compared the results of our algorithms with another Genetic Algorithm based techniques. We expect that further improvements can be achieved by incorporating linear relationship between some other techniques.
Denoising is a vital aspect of image preprocessing, often used to eliminate noise from an image and restore its proper characteristic formation and clarity. Noise frequently degrades the quality of valuable images, rendering them useless for practical applications. Several methods have been deployed to address this problem, but the quality of the recovered images still requires enhancement for efficient use in practice. In this paper, a wavelet-based universal thresholding technique is proposed that can optimally denoise highly degraded noisy images with both uniform and non-uniform variations in illumination and contrast. The proposed method, herein referred to as modified wavelet-based universal thresholding (MWUT), was compared with three state-of-the-art denoising techniques on five noisy images. To appraise the quality of the resulting images, eight performance indicators were employed: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Structural Content (SC), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), Signal-to-Reconstruction-Error Ratio (SRER), Naturalness Image Quality Evaluator (NIQE), and Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE). The first six (RMSE, MAE, SC, PSNR, SSIM, and SRER) are reference indicators, while the remaining two (NIQE and BRISQUE) are referenceless. For superior performance of the proposed wavelet thresholding algorithm, SC, PSNR, SSIM, and SRER should be higher, while lower values of NIQE, BRISQUE, RMSE, and MAE are preferred. Higher PSNR, SSIM, and SRER values in the final results demonstrate the superior performance of the proposed MWUT denoising technique over the earlier methods, and lower NIQE, BRISQUE, RMSE, and MAE values likewise indicate better image quality with the proposed technique than with the existing schemes. The modified wavelet-based universal thresholding technique will find practical applications in digital image processing and enhancement.
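The classical baseline that MWUT modifies, wavelet universal (VisuShrink) thresholding, can be sketched in plain NumPy with a single-level Haar transform; this illustrates the standard technique only, not the authors' modification.

```python
import numpy as np

def haar2(x):
    """One level of the 2-D orthonormal Haar transform: returns LL, LH, HL, HH."""
    s = np.sqrt(2.0)
    a, d = (x[:, ::2] + x[:, 1::2]) / s, (x[:, ::2] - x[:, 1::2]) / s
    ll, lh = (a[::2] + a[1::2]) / s, (a[::2] - a[1::2]) / s
    hl, hh = (d[::2] + d[1::2]) / s, (d[::2] - d[1::2]) / s
    return ll, lh, hl, hh

def ihaar2(ll, lh, hl, hh):
    """Inverse of haar2 (perfect reconstruction)."""
    s = np.sqrt(2.0)
    a = np.empty((ll.shape[0] * 2, ll.shape[1])); d = np.empty_like(a)
    a[::2], a[1::2] = (ll + lh) / s, (ll - lh) / s
    d[::2], d[1::2] = (hl + hh) / s, (hl - hh) / s
    x = np.empty((a.shape[0], a.shape[1] * 2))
    x[:, ::2], x[:, 1::2] = (a + d) / s, (a - d) / s
    return x

def denoise_universal(noisy):
    """Soft-threshold detail bands at the universal threshold (VisuShrink)."""
    ll, lh, hl, hh = haar2(noisy)
    sigma = np.median(np.abs(hh)) / 0.6745              # robust noise estimate
    t = sigma * np.sqrt(2.0 * np.log(noisy.size))       # universal threshold
    soft = lambda c: np.sign(c) * np.maximum(np.abs(c) - t, 0.0)
    return ihaar2(ll, soft(lh), soft(hl), soft(hh))

# Demo: a piecewise-constant image corrupted by Gaussian noise.
rng = np.random.default_rng(1)
clean = np.zeros((64, 64)); clean[:, 32:] = 128.0
noisy = clean + rng.normal(0.0, 25.0, clean.shape)
denoised = denoise_universal(noisy)
```

Multi-level decompositions and alternative threshold rules (the kind of changes a "modified" scheme introduces) follow the same skeleton.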
In the field of medical image analysis, supervised deep learning strategies have achieved significant progress, but these methods rely on large labeled datasets. Self-supervised learning (SSL) provides a strategy for pre-training a neural network with unlabeled data, an unsupervised learning paradigm that has achieved significant breakthroughs in recent years. Consequently, more and more researchers are trying to apply SSL methods to medical image analysis to meet the challenge of assembling large medical datasets. To our knowledge, there is still a shortage of reviews of self-supervised learning methods in medical image analysis; this article aims to fill that gap with a comprehensive review of the application of self-supervised learning in the medical field. It provides the latest and most detailed overview of SSL in medical imaging and promotes the development of unsupervised learning in the field. The methods are divided into three categories, context-based, generation-based, and contrast-based; we then show the pros and cons of each category and evaluate their performance in downstream tasks. Finally, we conclude with the limitations of current methods and discuss future directions.
Ultrasound-based breast screening has been gaining attention recently, especially for dense breasts. Technological advancement, cancer awareness, and cost, safety, and availability benefits have led to a rapid rise in the breast ultrasound market. The irregular shape, intensity variation, and additional blood vessels of malignant cancer distinguish it from the benign phase in ultrasound images. However, classifying breast cancer from ultrasound images is difficult owing to speckle noise and the complex texture of the breast. In this paper, a breast cancer classification method is presented using a VGG16-based transfer learning approach. A median filter is used to despeckle the images. The convolutional and max-pooling layers of the pretrained VGG16 model serve as the feature extractor, and a proposed two-layer fully connected deep neural network is designed as the classifier. The Adam optimizer is used with a learning rate of 0.001, and binary cross-entropy is chosen as the loss function for model optimization. Dropout in the hidden layers is used to avoid overfitting. Breast ultrasound images from two databases (897 images in total) are combined to train, validate, and test the performance and generalization strength of the classifier. Experimental results show a training accuracy of 98.2% and a testing accuracy of 91% on blind test data, with reduced computational complexity. The gradient class activation mapping (Grad-CAM) technique is used to visualize and check the localization of targeted regions at the final convolutional layer, and the results are found to be noteworthy. The outcomes of this work may be useful for clinical applications of breast cancer diagnosis.
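The despeckling step mentioned above can be sketched as a 3x3 median filter in plain NumPy; the kernel size is an assumption, as the abstract does not state it.

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter: a common despeckling step for ultrasound images."""
    padded = np.pad(img, 1, mode="edge")
    # Collect the 9 shifted views of the image, one per kernel position.
    stack = [padded[i:i + img.shape[0], j:j + img.shape[1]]
             for i in range(3) for j in range(3)]
    return np.median(np.stack(stack), axis=0)

# Demo: a single speckle spike on a uniform background is removed.
spiky = np.full((8, 8), 100.0)
spiky[4, 4] = 255.0
smoothed = median_filter3(spiky)
```

The filtered images would then be resized to the VGG16 input resolution and passed through the frozen convolutional layers to extract features for the two-layer classifier.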
This paper presents the design and development of an artificial intelligence (AI) based mobile application to detect the type of a skin disease. Skin diseases are a serious hazard to people throughout the world, yet accurate diagnosis is difficult. In this work, a deep learning approach using convolutional neural networks (CNNs) is proposed to classify skin diseases on the HAM10000 dataset. An extensive review of research articles on object identification methods, and a comparison of their relative merits, was carried out to find a method that would work well for detecting skin diseases; the CNN-based technique was identified as the best. A mobile application is then built for quick and accurate action: by looking at an image of the afflicted area at the onset of a skin illness, it assists patients and dermatologists in determining the kind of disease present. Its ability to detect the affected region considerably faster, with nearly 2x fewer computations than the standard MobileNet model, results in low computing effort. This study revealed that MobileNet with transfer learning, yielding an accuracy of about 85%, is the most suitable model for automatic skin disease identification. According to these findings, the suggested approach can assist general practitioners in quickly and accurately diagnosing skin diseases using a smartphone.
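MobileNet's low computational cost comes from replacing standard convolutions with depthwise separable ones; a back-of-envelope multiply-accumulate comparison makes the saving concrete (the layer dimensions below are illustrative, not taken from the paper).

```python
def standard_conv_macs(h, w, c_in, c_out, k=3):
    """Multiply-accumulates for a standard k x k convolution layer."""
    return h * w * c_in * c_out * k * k

def depthwise_separable_macs(h, w, c_in, c_out, k=3):
    """Depthwise k x k conv plus 1x1 pointwise conv (the MobileNet block)."""
    return h * w * c_in * k * k + h * w * c_in * c_out

# Illustrative layer: 28x28 feature map, 256 input and 256 output channels.
ratio = (standard_conv_macs(28, 28, 256, 256)
         / depthwise_separable_macs(28, 28, 256, 256))
```

For a 3x3 kernel and many channels, the saving approaches a factor of k*k + small overhead, which is why MobileNet-style blocks suit on-device inference.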
Image analysis belongs to the area of computer vision and pattern recognition. These areas are also part of digital image processing, where researchers pay great attention to content-based information retrieval from various types of images with complex, low-contrast, or multi-spectral backgrounds. This content may take forms such as texture, shape, and objects. Text region extraction from an image is a class of problems in digital image processing applications that aims to provide information widely used in fields such as medical imaging, pattern recognition, robotics, and intelligent transport systems. Extracting text information is a challenging task, yet text extraction is very useful for identifying and analyzing the information an image carries. In this paper, therefore, we propose a unified framework combining morphological operations and genetic algorithms for extracting and analyzing text regions embedded in an image under a variety of text conditions: font, size, skew angle, distortion by slant and tilt, the shape of the object the text lies on, etc. We evaluate the proposed method on gray-level image sets, make qualitative and quantitative comparisons with other existing methods, and conclude that the proposed method outperforms them.
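The morphological stage of such a pipeline can be sketched in NumPy: gradient edgels are dilated to merge text strokes into one connected region, whose bounding box is then reported. The edge threshold, number of dilations, and structuring element size are illustrative assumptions, and the genetic-algorithm refinement is omitted.

```python
import numpy as np

def dilate3(mask):
    """Binary dilation with a 3x3 square structuring element."""
    p = np.pad(mask, 1)
    return np.any([p[i:i + mask.shape[0], j:j + mask.shape[1]]
                   for i in range(3) for j in range(3)], axis=0)

def text_region_bbox(img, edge_thresh=30.0, n_dilations=3):
    """Locate a high-contrast (text-like) region via edgels + dilation."""
    gy, gx = np.gradient(img.astype(float))
    mask = np.hypot(gx, gy) > edge_thresh          # candidate edgels
    for _ in range(n_dilations):                   # merge nearby strokes
        mask = dilate3(mask)
    ys, xs = np.nonzero(mask)
    return ys.min(), xs.min(), ys.max(), xs.max()

# Synthetic image: uniform background with a striped "text" band.
img = np.zeros((40, 60))
img[15:25, 10:50] = np.tile([0.0, 255.0], 20)      # high-contrast strokes
bbox = text_region_bbox(img)
```

In the proposed framework, a GA would then tune such parameters (thresholds, element sizes) to handle varied fonts, sizes, and skew angles.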
Diabetic retinopathy is one of the most serious eye diseases and can lead to permanent blindness if not diagnosed early; its main cause is diabetes. Not every diabetic will develop diabetic retinopathy, but the risk is undeniable, which makes early diagnosis essential. Segmentation is one approach useful for detecting the blood vessels in a retinal image. This paper proposes three models based on a deep learning approach for recognizing blood vessels in retinal images using region-based segmentation techniques. Each proposed model consists of four steps: preprocessing, augmentation, model training, and performance measurement. The augmented retinal images are fed to the three models for training, and finally the segmented image is obtained. The three models are applied to the publicly available DRIVE, STARE, and HRF datasets. It is observed that more thin blood vessels are segmented in the retinal images of the HRF dataset using model 3. The performance of the three proposed models is compared with other state-of-the-art blood vessel segmentation methods on the DRIVE, STARE, and HRF datasets.
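The performance-measurement step for vessel segmentation is commonly reported with overlap scores such as the Dice coefficient; a minimal sketch, assuming binary masks (the abstract does not name the exact metrics used).

```python
import numpy as np

def dice(pred, truth, eps=1e-8):
    """Dice similarity between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + eps)

# Toy vessel masks: ground truth and a half-overlapping prediction.
truth = np.zeros((32, 32), dtype=bool); truth[10:20, 10:20] = True
pred = np.zeros_like(truth); pred[10:20, 15:25] = True
score = dice(pred, truth)
```

On benchmarks like DRIVE, STARE, and HRF, such a score is computed against the manually annotated vessel maps that ship with each dataset.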
Image reconstruction is the process of generating an image of an object from the signals captured by a scanning machine. Medical imaging is an interdisciplinary field combining physics, biology, mathematics, and computational sciences. This paper provides a complete overview of the image reconstruction process in MRI (magnetic resonance imaging), reviewing its computational aspects. MRI is one of the most commonly used medical imaging techniques. The data collected by an MRI scanner for image reconstruction is called k-space data. For reconstructing an image from k-space data there are various algorithms, such as the homodyne algorithm, the zero-filling method, dictionary learning, and the projections onto convex sets method. The characteristics of k-space data and the MRI data collection technique are reviewed in detail, and the reconstruction algorithms are discussed along with their pros and cons. Modern magnetic resonance imaging techniques such as functional MRI and diffusion MRI are also introduced. The concepts of classical techniques such as expectation maximization, sensitivity encoding, and the level set method, and of recent techniques such as alternating minimization, signal modeling, and the sphere-shaped support vector machine, are also reviewed. It is observed that most of these techniques enhance gradient encoding and reduce scanning time. Classical algorithms produce an undesirable blurring effect when the degree of phase variation in partial k-space is high; modern reconstruction algorithms such as dictionary learning work well even with high phase variation because they are iterative procedures.
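The simplest of the listed algorithms, the zero-filling method, can be sketched with NumPy's FFT: acquire only the central region of k-space, fill the rest with zeros, and invert. The square phantom and keep-fraction below are illustrative.

```python
import numpy as np

def zero_fill_recon(image, keep_frac=0.5):
    """Simulate partial k-space acquisition and zero-filled reconstruction."""
    k = np.fft.fftshift(np.fft.fft2(image))        # full k-space, DC centred
    n = image.shape[0]
    lo = int(n * (1 - keep_frac) / 2)
    hi = int(n * (1 + keep_frac) / 2)
    partial = np.zeros_like(k)
    partial[lo:hi, lo:hi] = k[lo:hi, lo:hi]        # keep central lines only
    return np.abs(np.fft.ifft2(np.fft.ifftshift(partial)))

# Toy "anatomy": a bright square on a dark background.
phantom = np.zeros((64, 64))
phantom[20:44, 20:44] = 1.0
recon = zero_fill_recon(phantom, keep_frac=0.5)
```

Discarding the outer (high-frequency) k-space lines shortens the scan but blurs edges and introduces ringing, which is exactly the degradation the more sophisticated iterative methods above aim to avoid.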
Nowadays, a primary concern of any society is individual safety, and it is very hard to recognize human behaviour and identify whether it is suspicious or normal. Deep learning approaches have paved the way for various advances in machine learning and artificial intelligence. The proposed system detects real-time human activity using a convolutional neural network. The objective of the study is to develop a real-time application for activity recognition with and without transfer learning. The proposed system considers criminal, suspicious, and normal categories of activity, with videos of different suspicious behaviours collected from different people (men and women), and is used to detect a person's suspicious activities. A novel 2D-CNN and pretrained VGG16 and ResNet50 models are trained on video frames of normal and suspicious human activities; likewise, VGG16 and ResNet50 with transfer learning are trained on the human suspicious activity datasets. The results show that the novel 2D-CNN, VGG16, and ResNet50 without transfer learning achieve accuracies of 98.96%, 97.84%, and 99.03%, respectively. On Kaggle/real-time video, the proposed system employing the 2D-CNN outperforms the pretrained VGG16 model. The trained model is used to classify the activity in real-time captured video; with transfer learning, ResNet50 achieves an accuracy of 99.18%, higher than VGG16's 98.36%.
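Since the models above classify individual video frames, a video-level label is typically obtained by aggregating per-frame predictions; a majority-vote sketch with dummy softmax outputs (the voting scheme is an illustrative assumption, not necessarily the authors' aggregation).

```python
import numpy as np

LABELS = ["normal", "suspicious", "criminal"]     # categories from the paper

def video_label(frame_probs):
    """Majority vote over per-frame class predictions (rows = frames)."""
    frame_preds = np.argmax(frame_probs, axis=1)  # hard label per frame
    counts = np.bincount(frame_preds, minlength=len(LABELS))
    return LABELS[int(np.argmax(counts))]

# Dummy per-frame softmax outputs for a 5-frame clip.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.3, 0.5, 0.2],
                  [0.6, 0.3, 0.1]])
label = video_label(probs)
```

In a live system the rows of `probs` would come from running the trained CNN on frames captured from the camera stream.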