IJIGSP Vol. 15, No. 5, Oct. 2023
Denoising is a vital step in image preprocessing: it removes noise from an image to restore its structure and clarity. Noise often degrades valuable images to the point that they are useless for practical applications. Several methods have been deployed to address this problem, but the quality of the recovered images still needs improvement for efficient use in practice. This paper proposes a wavelet-based universal thresholding technique capable of denoising highly degraded images with both uniform and non-uniform variations in illumination and contrast. The proposed method, referred to as modified wavelet-based universal thresholding (MWUT), was used to denoise five noisy images and compared against three state-of-the-art denoising techniques. To appraise the quality of the resulting images, eight performance indicators were employed: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Structural Content (SC), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), Signal-to-Reconstruction-Error Ratio (SRER), Natural Image Quality Evaluator (NIQE), and Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE). The first six (RMSE, MAE, SC, PSNR, SSIM, and SRER) are full-reference indicators, while the remaining two (NIQE and BRISQUE) are referenceless. Higher values of SC, PSNR, SSIM, and SRER and lower values of NIQE, BRISQUE, RMSE, and MAE indicate better denoising. In the final results, MWUT achieved higher PSNR, SSIM, and SRER and lower NIQE, BRISQUE, RMSE, and MAE than the existing schemes, indicating better image quality. The technique would find practical applications in digital image processing and enhancement.
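For orientation, the classical universal threshold that wavelet shrinkage methods build on, together with three of the reference indicators named above, can be sketched as follows. This is a minimal illustration of the textbook formulas, not the authors' MWUT modification, which the abstract does not specify. The function names, the MAD-based noise estimate, and the 8-bit peak value of 255 are assumptions made for the example.

```python
import math
from statistics import median


def universal_threshold(detail_coeffs):
    """Classical universal threshold: sigma * sqrt(2 * ln(N))."""
    n = len(detail_coeffs)
    # Assumed noise estimate: median absolute coefficient divided by 0.6745.
    sigma = median(abs(c) for c in detail_coeffs) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))


def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def rmse(ref, test):
    """Root mean square error between a reference and a test signal."""
    return math.sqrt(sum((r - x) ** 2 for r, x in zip(ref, test)) / len(ref))


def mae(ref, test):
    """Mean absolute error between a reference and a test signal."""
    return sum(abs(r - x) for r, x in zip(ref, test)) / len(ref)


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = rmse(ref, test)
    return math.inf if err == 0 else 20.0 * math.log10(peak / err)
```

The inputs here are flat lists of wavelet detail coefficients or pixel values; a 2D image would be flattened first. Lower RMSE and MAE and higher PSNR correspond to a closer match with the reference image.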
Illegal activities such as illegitimate mining and sand theft in river dredging areas lead to economic losses. Manual monitoring, however, is expensive and time-consuming, so automated surveillance systems are preferred because they are accurate and available at all times. Monitoring river dredging areas involves two essential steps: vehicle detection and license plate recognition. Most current vehicle detection frameworks use plain feed-forward Convolutional Neural Networks (CNNs) as backbone architectures. These are scale-sensitive and cannot handle variations in vehicle scale across consecutive video frames. To address this, a Scale Invariant Hybrid Convolutional Neural Network (SIH-CNN) architecture is proposed in this study for real-time vehicle detection. The publicly available UA-DETRAC benchmark is used to validate the proposed architecture. Results show that the SIH-CNN model achieved a mean average precision (mAP) of 77.76% on UA-DETRAC, 3.94% higher than the baseline detector, with real-time performance of 48.4 frames per second.
Self-supervised learning has emerged as an effective paradigm for learning universal feature representations from vast amounts of unlabeled data. Its remarkable success in recent years has been demonstrated in both natural language processing and computer vision. As a cornerstone of the development of large-scale models, self-supervised learning has propelled machine intelligence to new heights. In this paper, we draw inspiration from Siamese networks and Masked Autoencoders to propose a denoising self-distilling Masked Autoencoder model for self-supervised learning. The model is composed of a Masked Autoencoder and a teacher network, which work together to restore input image blocks corrupted by random Gaussian noise. Our objective function incorporates both a pixel-level loss and a high-level feature loss, allowing the model to extract complex semantic features. We evaluated the proposed method on three benchmark datasets, CIFAR-10, CIFAR-100, and STL-10, and compared it with classical self-supervised learning techniques. The experimental results show that our pre-trained model achieves slightly better fine-tuning performance on STL-10, surpassing MAE by 0.1%. Overall, our method yields results comparable to other masked image modeling methods. Ablation experiments validate the rationale behind the architecture. The proposed method can serve as a complementary technique within the existing family of self-supervised learning approaches for masked image modeling, with the potential to be applied to larger datasets.
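The shape of such an objective, a pixel-level reconstruction term plus a feature-level term matched against a teacher, can be sketched in plain Python. This is only an illustration of the idea described in the abstract. The mean-squared-error form of both terms, the `feat_weight` parameter, and all function names are assumptions, and the networks themselves are omitted.

```python
import random


def add_gaussian_noise(block, sigma, rng):
    """Corrupt a flat list of pixel values with zero-mean Gaussian noise."""
    return [p + rng.gauss(0.0, sigma) for p in block]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(recon, clean, student_feat, teacher_feat, feat_weight=1.0):
    """Pixel-level loss plus a weighted feature-level (distillation) loss.

    feat_weight is a hypothetical balancing factor, not taken from the paper.
    """
    pixel_loss = mse(recon, clean)
    feature_loss = mse(student_feat, teacher_feat)
    return pixel_loss + feat_weight * feature_loss


# Usage: a perfect reconstruction still incurs the feature term.
rng = random.Random(0)
noisy_block = add_gaussian_noise([0.5, 0.5, 0.5, 0.5], sigma=0.1, rng=rng)
loss = combined_loss([1.0, 2.0], [1.0, 2.0], [0.0], [2.0], feat_weight=0.5)
```

In the usage lines, the pixel term is zero and the feature term is 4.0, so the weighted total is 2.0.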
Drowsiness and fatigue are primary factors contributing to road accidents. They also diminish productivity in work environments and raise the likelihood of accidents. The analysis of bio-signals is crucial for examining an individual's physical condition and physiological state. Various physiological signals have been used to identify driver or operator fatigue and the drowsiness associated with it. Among these non-invasive signals, the electrooculogram (EOG) gives well-accepted results for detecting drowsiness and fatigue. An EOG-based study allows real-time monitoring of the muscle and mental fatigue of human subjects as they go about their everyday activities. The present studies applied statistical analysis of EOGs to ascertain the stress levels of participants and provide insight into their state of fatigue and drowsiness. Two experimental studies were performed with 120 and 80 healthy male and female research scholars of the National Institute of Technology Durgapur, India. EOGs were recorded with the Biopac MP 45 data acquisition system in two sessions (study-I) and three sessions (study-II) of a day, with heavy cognitive tasks in between. Several entropies are evaluated in the time and frequency domains. Other complexity parameters are also incorporated to enrich the results. In study-I, inferential statistical analysis based on the parametric t-test and the non-parametric Wilcoxon test was used to compare stress levels between morning and evening sessions. Similarly, in study-II, the parametric ANOVA test and the non-parametric Friedman test were carried out to monitor stress levels across three sessions of a day. The Tukey-Kramer post-hoc test was also used to compare outcomes among the three sessions and find statistical differences at a 5% significance level. Most complexity parameters show clear differences in fatigue states in both experiments, and these analyses indicate the onset of fatigue among the subjects under consideration.
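As a rough illustration of the study-I pipeline, the sketch below computes a histogram-based Shannon entropy for a signal and the paired t statistic for a per-subject feature measured in two sessions. The abstract does not say which entropies or binning were used, so the 8-bin histogram and the function names are assumptions. The t statistic would still need to be compared against a t distribution with n - 1 degrees of freedom to obtain a p-value.

```python
import math
from statistics import mean, stdev


def shannon_entropy(signal, bins=8):
    """Shannon entropy (bits) of a signal's amplitude histogram."""
    lo, hi = min(signal), max(signal)
    # Fall back to a unit bin width when the signal is constant.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in signal:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(signal)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def paired_t_statistic(before, after):
    """t statistic of the paired differences (after minus before)."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Usage with made-up numbers: one entropy value per subject and session.
morning = [1.0, 2.0, 3.0, 4.0]
evening = [2.0, 4.0, 5.0, 5.0]
t_value = paired_t_statistic(morning, evening)
```

A large absolute t value suggests that the feature differs systematically between the two sessions.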
This work aims to develop an improved face recognition system for person-dependent and person-independent variants. A convolutional neural network (CNN) is used to extract relevant facial features, which allow faces of different subjects to be compared in an optimized manner. In one approach, the training module first recognizes the different subjects of the dataset; in another, it processes a different set of new images. Using a CNN alone for face recognition has achieved promising recognition rates; however, many other works have shown a decline in recognition rate on complex datasets. Further, a CNN alone exhibits a reduced recognition rate on large-scale databases. To overcome this problem, we propose a modified spatial texture pattern extraction technique, the modified Histogram of Oriented Gradients (m-HOG), which extracts facial image features along three gradient directions and is combined with a CNN that classifies the face image based on those features. In the preprocessing stage, the face region is captured by removing the background from the input face image and is resized to 100×100. The m-HOG features are computed using histogram channels evenly distributed between 0 and 180 degrees. The resulting features are reshaped into a 66×198 matrix and passed to the CNN, which extracts robust and discriminative features that are classified by a softmax layer. The recognition rates obtained for the L-Spacek, NIR, JAFFE, and YALE databases are 99.80%, 91.43%, 95.00%, and 93.33%, respectively, and are better than those of existing methods.
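The phrase "histogram channels evenly distributed between 0 and 180 degrees" refers to the orientation binning used in standard HOG. The sketch below shows that step for a single cell. It illustrates plain HOG only, not the m-HOG modification; the nine-bin default, the central-difference gradients, and the function name are assumptions.

```python
import math


def orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient angles in [0, 180)."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    rows, cols = len(cell), len(cell[0])
    # Central differences need both neighbours, so border pixels are skipped.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            # Fold signed angles into the unsigned range [0, 180).
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Usage: a vertical edge has horizontal gradients, so all of the
# gradient magnitude falls in the first bin (angles near 0 degrees).
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
hist = orientation_histogram(vertical_edge)
```

A full descriptor would concatenate such histograms over all cells and normalize them over blocks before passing them on.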
At present, all developed polarization methods use "single-point" statistical analysis algorithms for laser fields. A relevant task is to generalize these traditional techniques by incorporating new correlation-based "two-point" algorithms for the analysis of polarization images. Theoretical foundations are given for the mutual and autocorrelation processing of phase maps of polarization-structural images of dehydrated serum film samples. Maps of new polarization-correlation parameters, namely the complex degree of coherence (CDC) and the complex degree of mutual polarization (CDMP) of the boundary field of a soft matter layer, are investigated using dehydrated serum film samples as an example. Two groups of representative samples were considered: uterine myoma patients (control group 1) and patients with external genital endometriosis (study group 2). A complex algorithm of analytical data processing was applied to the maps of polarization-correlation parameters: statistical (1st and 4th central statistical moments), correlation (Gram-Charlier expansion coefficients of autocorrelation functions), and fractal (fractal dimensions) parameters. Objective markers for diagnosing extragenital endometriosis were found.
E-healthcare systems (EHSD) and Digital Imaging and Communications in Medicine (DICOM) have gained popularity over the past decade, becoming top contenders for interoperability and adoption as a global standard for transmitting and communicating medical data. Security is a growing concern as EHSD and DICOM have become usable on any-to-any devices. The goal of this research is to create a privacy-preserving encryption technique for rapid EHSD communication with minimal storage. The proposed encryption method is built on a new 2D logistic-sine chaotic map (2DLSCM) and is designed specifically for peer-to-peer communications with unique keys. Seeded with initial values from a 3D Lorenz map, the 2DLSCM provides a unique keyspace of 2^544 (544 bits) in each round of paired peer-to-peer transmission. The encryption process follows a permutation-diffusion design, and the 2DLSCM together with the 3D Lorenz system generates unique initial values for the keys. The approach can quickly encrypt any EHSD image or DICOM object without interfering with real-time medical transmission. To assess the method, five distinct EHSD images of different kinds, sizes, and quality were selected. The findings indicate strong protection, speed, and scalability compared with similar existing methods in the literature.
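The permutation-diffusion structure mentioned above can be illustrated with a toy example. The abstract does not give the equations of the 2DLSCM, so this sketch substitutes the ordinary 1D logistic map as the chaotic source and omits the Lorenz system. The key parameters `x0` and `r`, the burn-in length, and all function names are assumptions, and the code is not secure enough for real medical data.

```python
def logistic_sequence(x0, r, n, burn_in=100):
    """Iterate x -> r * x * (1 - x) and return n values after a burn-in."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    out = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(x)
    return out


def _keys(n, x0, r):
    """Derive a pixel permutation and a byte keystream from one sequence."""
    seq = logistic_sequence(x0, r, 2 * n)
    perm = sorted(range(n), key=lambda i: seq[i])
    stream = [int(k * 256) % 256 for k in seq[n:]]
    return perm, stream


def encrypt(pixels, x0=0.3141, r=3.99):
    """Permute pixel positions, then chain-XOR with the keystream."""
    perm, stream = _keys(len(pixels), x0, r)
    shuffled = [pixels[i] for i in perm]  # permutation stage
    cipher, prev = [], 0
    for p, k in zip(shuffled, stream):  # diffusion stage
        prev = p ^ k ^ prev
        cipher.append(prev)
    return cipher


def decrypt(cipher, x0=0.3141, r=3.99):
    """Undo the diffusion, then invert the permutation."""
    perm, stream = _keys(len(cipher), x0, r)
    shuffled, prev = [], 0
    for c, k in zip(cipher, stream):
        shuffled.append(c ^ k ^ prev)
        prev = c
    pixels = [0] * len(cipher)
    for pos, i in enumerate(perm):
        pixels[i] = shuffled[pos]
    return pixels
```

Both sides must use the same `x0` and `r`; here they play the role of the shared key. The diffusion stage chains each cipher byte into the next, so a change in one plaintext pixel propagates to every byte that follows it.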