Estimation and Statistical Analysis of Physical Task Stress on Human Speech Signal




Saloni 1, R. K. Sharma 1, Anil K. Gupta 1

1. Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra, Haryana

* Corresponding author.


Received: 24 Jun. 2016 / Revised: 29 Jul. 2016 / Accepted: 6 Sep. 2016 / Published: 8 Oct. 2016

Index Terms

Speech signal, Physical task stress, Glottal flow parameters, MFCC, Energy, Speech processing


The human speech signal is an acoustic wave that conveys information about the words or message being spoken, the identity of the speaker, the language spoken, the presence and type of speech pathologies, and the physical and emotional state of the speaker. Speech produced under physical task stress differs from speech in the neutral state and thus degrades the performance of speech systems. In this paper, we characterize voice samples recorded under physical stress and compare their acoustic parameters with those of neutral-state voice. Traditional voice measures, glottal flow parameters, mel-frequency cepstral coefficients (MFCC), and energy in various frequency bands are used for this characterization. A t-test is performed to check the statistical significance of the parameters. Significant variations are observed in the parameters between the two states. Pitch, intensity, and energy values are higher for the physically stressed voice, whereas glottal parameter values decrease. Cepstral coefficients shift upward relative to those of neutral-state voice samples. Energy in the lower frequency bands is more sensitive to physical stress. By analyzing the unwanted effect of physical stress on the voice, this study can help improve the performance of various speech processing applications.
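The significance check mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which t-test variant was used, so the example below assumes Welch's two-sample t-test (unequal variances). The function name `welch_t` and the pitch values are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Made-up mean pitch values in Hz, for illustration only (not data from the paper).
neutral_pitch = [118.0, 122.0, 120.0, 119.0, 121.0]
stressed_pitch = [131.0, 135.0, 129.0, 133.0, 132.0]

t_stat, dof = welch_t(stressed_pitch, neutral_pitch)
print(f"t = {t_stat:.2f}, dof = {dof:.1f}")
```

A large absolute t value relative to Student's t distribution with `dof` degrees of freedom suggests that the parameter differs between the two states. If SciPy is available, `scipy.stats.ttest_ind(stressed_pitch, neutral_pitch, equal_var=False)` runs the same test and also returns a p-value.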

Cite This Paper

Saloni, R. K. Sharma, Anil K. Gupta, "Estimation and Statistical Analysis of Physical Task Stress on Human Speech Signal", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 8, No. 10, pp. 29-34, 2016. DOI: 10.5815/ijigsp.2016.10.04

