Analysis of the Error Pattern of HMM based Bangla ASR

By Shourin R. Aura, Md. Jakaria Rahimi, Oli Lowna Baroi

DOI: https://doi.org/10.5815/ijigsp.2020.01.01, Pub. Date: 8 Feb. 2020

Speech recognition research has been ongoing for more than 80 years. Various attempts have been made around the world to develop and improve the speech recognition process. Automatic speech recognition (ASR) by machine has attracted much attention over the last few decades. Bengali is widely spoken around the world, yet much remains to be explored in research on offline automatic Bangla speech recognition systems. In this work, a moderate-size speech corpus and an HMM-based speech recognizer have been built to analyze the error pattern. Audio recordings have been collected from different speakers in both quiet and noisy environments. A live test has also been carried out to check the performance of the models individually. The percentage of error and the percentage of correction with the created models are presented in this paper, along with the results obtained during the live test. Finally, the results are analyzed to obtain the error pattern needed for future development.
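The abstract does not say how the error and correction percentages are computed. A minimal sketch, assuming the conventional approach of aligning the reference and recognized word sequences by minimum edit distance and counting substitutions, deletions and insertions, could look like this. The function names and the sample tokens are hypothetical.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) of a minimum-edit alignment."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = (total_edits, subs, dels, ins) for ref[:i] against hyp[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)          # delete every reference word
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)          # insert every hypothesis word
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, k)        # match, no edit
            else:
                diag = (t + 1, s + 1, d, k)
            t, s, d, k = cost[i - 1][j]
            dele = (t + 1, s, d + 1, k)
            t, s, d, k = cost[i][j - 1]
            ins = (t + 1, s, d, k + 1)
            cost[i][j] = min(diag, dele, ins, key=lambda c: c[0])
    _, subs, dels, inss = cost[n][m]
    return subs, dels, inss


def percent_correct_and_error(ref, hyp):
    """Percent correct ignores insertions; percent error counts all three edit types."""
    subs, dels, inss = align_counts(ref, hyp)
    n = len(ref)
    correct = 100.0 * (n - subs - dels) / n
    error = 100.0 * (subs + dels + inss) / n
    return correct, error


# Hypothetical reference and recognizer output
reference = "a b c d".split()
recognized = "a x c".split()
correct, error = percent_correct_and_error(reference, recognized)
```

Keeping the three edit counts separate, rather than a single distance, is what makes an error-pattern analysis possible: it shows whether the recognizer mostly confuses, drops or invents words.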

Secure Data Transmission in Video Format Based on LSB and Huffman Coding

By Shwe Sin Myat Than

DOI: https://doi.org/10.5815/ijigsp.2020.01.02, Pub. Date: 8 Feb. 2020

The growing need to transmit large amounts of data securely over the internet encourages research on steganography techniques, especially for video files. Steganographic techniques for video offer many advantages for transporting important data, because video files are part of people’s daily lives and attackers cannot easily notice the hidden data. The high embedding capacity of video files increases the popularity of video steganography among the various media types. Therefore, this paper proposes the least significant bit (LSB) method, the simplest form but one with many advantages, reinforced with the high-compression Huffman chunk coding method, to embed data in video files in multi-step cryptography embedding schemes. The intention is to make the system more secure and to increase its embedding capacity. Experiments are carried out with various video file sizes and text file sizes to show the effectiveness of the proposed methods. The results show superior performance for the proposed algorithm; performance parameters such as Peak Signal-to-Noise Ratio (PSNR), Mean Square Error (MSE) and Bit Error Rate (BER) are calculated to test the quality of the stego video.
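As an illustration of the building blocks named in the abstract, the following sketch shows plain LSB embedding on a flat list of 8-bit samples together with the three quality metrics. It is not the paper's multi-step scheme and omits the Huffman coding stage; the function names and sample values are assumptions made for the example.

```python
import math


def embed_lsb(cover, bits):
    """Write each payload bit into the least significant bit of successive samples."""
    if len(bits) > len(cover):
        raise ValueError("payload is longer than the cover")
    stego = list(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to the payload bit
    return stego


def extract_lsb(stego, n_bits):
    """Read the payload back from the first n_bits samples."""
    return [sample & 1 for sample in stego[:n_bits]]


def mse(a, b):
    """Mean square error between two equal-length sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    err = mse(a, b)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


def ber(sent, received):
    """Fraction of payload bits that differ after extraction."""
    return sum(x != y for x, y in zip(sent, received)) / len(sent)


# Hypothetical cover samples (e.g. pixel values of one frame) and payload bits
cover = [200, 201, 202, 203]
payload = [1, 0, 1, 1]
stego = embed_lsb(cover, payload)
recovered = extract_lsb(stego, len(payload))
```

Because LSB embedding changes each sample by at most 1, the MSE against the cover is at most 1, so the PSNR for 8-bit samples cannot fall below roughly 48 dB.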

Island Loss for Improving the Classification of Facial Attributes with Transfer Learning on Deep Convolutional Neural Network

By Shuvendu Roy

DOI: https://doi.org/10.5815/ijigsp.2020.01.03, Pub. Date: 8 Feb. 2020

Classifying human facial attributes is hard because of the similarities between classes. Emotion classification and age estimation are two important applications. The differences between a person's emotions are very small, and different people express the same emotion in different ways. The same holds for age classification: there is little difference between consecutive ages. Another problem is image resolution. Small images contain less information, while large images require a large model and a lot of data to train properly. To solve both problems, this work proposes transfer learning on a pre-trained model combined with a custom loss function called Island Loss, which reduces intra-class variation and increases inter-class variation. Experiments show that this method achieves higher accuracies than previous methods on several benchmark datasets for both applications.
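The abstract does not give the loss formula. A sketch under the assumption that Island Loss takes its commonly described form, a center loss term (pulling features toward their class center) plus a pairwise cosine penalty between class centers (pushing centers apart), is shown below. The weight `lam` and all function names are hypothetical, and a real training setup would combine this with a classification loss inside a deep learning framework.

```python
import math


def center_loss(features, labels, centers):
    """Half the summed squared distance from each feature to its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def island_loss(features, labels, centers, lam=1.0):
    """Center loss plus lam times the sum of (cosine + 1) over ordered center pairs."""
    penalty = 0.0
    for j, cj in enumerate(centers):
        for k, ck in enumerate(centers):
            if j != k:
                penalty += cosine(cj, ck) + 1.0   # 0 when centers point in opposite directions
    return center_loss(features, labels, centers) + lam * penalty


# Hypothetical 2-D features sitting exactly on their class centers
features = [[1.0, 0.0], [-1.0, 0.0]]
labels = [0, 1]
separated = island_loss(features, labels, [[1.0, 0.0], [-1.0, 0.0]])
collapsed = island_loss(features, labels, [[1.0, 0.0], [1.0, 0.0]])
```

In this toy case the loss is zero when the two centers point in opposite directions and grows when they coincide, which matches the stated goal of lower intra-class and higher inter-class variation.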

Segment-wise Quality Evaluation for Identification of Face Spoofing

By Akhilesh Kumar Pandey, Rajoo Pandey

DOI: https://doi.org/10.5815/ijigsp.2020.01.04, Pub. Date: 8 Feb. 2020

The non-intrusive nature of face-based recognition technology makes it popular on handheld devices. Spoof detection in face-based recognition systems has been an important research topic over the last decade. Among the several liveness detection techniques available in the literature, image quality measure (IQM) based techniques are particularly attractive due to their computational efficiency. In this paper, an approach based on segment-wise computation of image quality measures is proposed to improve detection accuracy. Two types of non-overlapping segments are considered: 1) rectangular segments of identical size, and 2) segments based on neighborhood variance. Both approaches are found to perform better than other techniques without greatly increasing computational complexity. The experiments are carried out on the well-known Replay-Attack database to demonstrate robustness under different conditions.
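To make the first segment type concrete, the sketch below splits a grayscale image into non-overlapping rectangular blocks of identical size and computes one quality score per block. The abstract does not list which quality measures are used; MSE against a reference image (for example a smoothed copy of the input) is an assumption chosen for illustration, as are the function name and the sample data.

```python
def block_scores(image, reference, block_h, block_w):
    """Return the MSE of each non-overlapping block, in row-major block order."""
    rows, cols = len(image), len(image[0])
    if rows % block_h or cols % block_w:
        raise ValueError("image size must be a multiple of the block size")
    scores = []
    for top in range(0, rows, block_h):
        for left in range(0, cols, block_w):
            err = 0.0
            for r in range(top, top + block_h):
                for c in range(left, left + block_w):
                    err += (image[r][c] - reference[r][c]) ** 2
            scores.append(err / (block_h * block_w))
    return scores


# Hypothetical 4x4 image compared against an all-zero reference, using 2x2 blocks
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [5, 5, 5, 5],
    [5, 5, 5, 5],
]
reference = [[0] * 4 for _ in range(4)]
scores = block_scores(image, reference, 2, 2)
```

The resulting per-block scores form a feature vector, which lets a classifier see local quality differences that a single whole-image score would average away.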

Lung Tumor Segmentation and Staging from CT Images Using Fast and Robust Fuzzy C-Means Clustering

By Rupak Bhakta, A. B. M. Aowlad Hossain

DOI: https://doi.org/10.5815/ijigsp.2020.01.05, Pub. Date: 8 Feb. 2020

A lung tumor is the result of abnormal and uncontrolled cell division and growth in the lung region. Early detection and staging of lung tumors is of great importance for increasing the survival rate of patients. In this paper, a fast and robust fuzzy c-means clustering method is used to segment the tumor region from lung CT images. Morphological reconstruction is performed prior to fuzzy c-means clustering to achieve robustness against noise. Computational efficiency is improved through median filtering of the membership partition. Tumor masks are then reconstructed using surface-based and shape-based filtering. Different features, including maximum diameter, are extracted from the segmented tumor region, and the tumor stage is determined according to the tumor staging system of the American Joint Committee on Cancer. The 3D shape of the segmented tumor is reconstructed from a series of 2D CT slices for volume measurement. The accuracy of the proposed system is found to be 92.72% for 55 randomly selected images from the RIDER Lung CT dataset of The Cancer Imaging Archive. Lower complexity in terms of iterations and connected components, as well as better noise robustness, is found in comparison with conventional fuzzy c-means and k-means clustering techniques.
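For orientation, here is a minimal sketch of conventional fuzzy c-means on 1-D intensity values, the baseline the abstract compares against. It does not include the morphological reconstruction or membership filtering of the fast and robust variant; the function names, the fuzzifier `m = 2`, the fixed iteration count and the sample intensities are all assumptions.

```python
def memberships_for(values, centers, m, eps=1e-9):
    """Membership of each value in each cluster; every row sums to 1."""
    power = 2.0 / (m - 1.0)
    table = []
    for x in values:
        dists = [abs(x - c) + eps for c in centers]   # eps avoids division by zero
        table.append([1.0 / sum((di / dj) ** power for dj in dists) for di in dists])
    return table


def fuzzy_c_means(values, init_centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(init_centers)
    for _ in range(iters):
        table = memberships_for(values, centers, m)
        for i in range(len(centers)):
            weights = [row[i] ** m for row in table]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    # Recompute memberships so they match the final centers
    return centers, memberships_for(values, centers, m)


# Hypothetical intensities: a dark background group and a bright region
values = [10, 12, 11, 200, 205, 198]
centers, table = fuzzy_c_means(values, init_centers=[0.0, 255.0])
labels = [row.index(max(row)) for row in table]   # hard labels from soft memberships
```

Unlike k-means, each value keeps a graded membership in every cluster, and it is this membership table that the fast and robust variant filters instead of recomputing neighborhood terms at every iteration.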

International Journal of Image, Graphics and Signal Processing (IJIGSP)

MECS Press Journal
