Speech/Nonspeech Detection Using Minimal Walsh Basis Functions

Pwint, Moe; Sattar, Farook

doi:10.1155/2007/39546

Research Article
Open access
Published: 18 October 2006


EURASIP Journal on Audio, Speech, and Music Processing volume 2007, Article number: 039546 (2006)


Abstract

This paper presents a new method for detecting the speech and nonspeech components of a given noisy signal. The original noisy speech signal is first modified using a combination of binary Walsh basis functions and an analysis-synthesis scheme. From the modified signals, the speech components are distinguished from the nonspeech components by a simple decision scheme. The minimal number of Walsh basis functions to apply is determined using singular value decomposition (SVD). The main advantages of the proposed method are low computational complexity, fewer parameters to adjust, and simple implementation. The use of Walsh basis functions makes the proposed algorithm efficient enough for real-world situations where processing time is crucial. Simulation results indicate that the proposed algorithm achieves high speech and nonspeech detection rates while maintaining a low error rate under different noisy conditions.
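The abstract names two building blocks, binary Walsh basis functions and an SVD-based choice of how many to use, without giving their construction. The Python sketch below is only an illustration of those two ideas under stated assumptions: the Walsh functions are built from a Sylvester Hadamard matrix reordered by sequency, and the "minimal number" is taken as the smallest count of singular values that captures a chosen fraction of the energy. The function names, the 0.95 energy threshold, and the energy criterion itself are hypothetical and are not taken from the paper.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def walsh_basis(n):
    """Rows of the Hadamard matrix sorted by sequency (count of sign changes)."""
    H = hadamard(n)
    sign_changes = np.count_nonzero(np.diff(H, axis=1), axis=1)
    return H[np.argsort(sign_changes, kind="stable")]

def minimal_rank(matrix, energy=0.95):
    """Smallest k whose leading singular values hold `energy` of the total.

    The energy criterion is an assumption made for illustration; the paper's
    own selection rule is not stated in the abstract.
    """
    s = np.linalg.svd(matrix, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(s))

# Example: a matrix built from two Walsh functions has effective rank 2.
W = walsh_basis(8)
demo = np.outer([3, 0, 0, 0], W[1]) + np.outer([0, 1, 0, 0], W[2])
k = minimal_rank(demo, energy=0.95)
```

Because every Walsh function takes only the values +1 and -1, projecting a frame onto the basis needs only additions and subtractions, which is consistent with the low computational complexity claimed in the abstract.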

[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]

References

ITU-T Recommendation G.729 Annex B : A silence compression scheme for G.729 optimized for terminals conforming to recommendation v.70. 1996
Beritelli F, Casale S, Cavallaro A: A robust voice activity detector for wireless communications using soft computing. IEEE Journal on Selected Areas in Communications 1998,16(9):1818-1829. 10.1109/49.737650
ETSI GSM 06.94, "Digital cellular telecommunications system (phase 2+); voice activity detectors (VAD) for adaptive multi-rate (AMR) speech traffic channels; european telecommunications standards institute," 1999
McKinley BL, Whipple GH: Model based speech pause detection. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '97), April 1997, Munich, Germany 2: 1179-1182.
Sohn J, Kim NS, Song W: A statistical model-based voice activity detection. IEEE Signal Processing Letters 1999,6(1):1-3. 10.1109/97.736233
Cho YD, Kondoz A: Analysis and improvement of a statistical model-based voice activity detector. IEEE Signal Processing Letters 2001,8(10):276-278. 10.1109/97.957270
Gazor S, Zhang W: A soft voice activity detector based on a Laplacian-Gaussian model. IEEE Transactions on Speech and Audio Processing 2003,11(5):498-505. 10.1109/TSA.2003.815518
Marzinzik M, Kollmeier B: Speech pause detection for noise spectrum estimation by tracking power envelope dynamics. IEEE Transactions on Speech and Audio Processing 2002,10(2):109-118. 10.1109/89.985548
Sheikhzadeh H, Brennan RL, Sameti H: Real-time implementation of HMM-based MMSE algorithm for speech enhancement in hearing aid applications. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '95), May 1995, Detroit, Mich, USA 1: 808-811.
Rezayee A, Gazor S: An adaptive KLT approach for speech enhancement. IEEE Transactions on Speech and Audio Processing 2001,9(2):87-95. 10.1109/89.902276
Wei J, Du L, Yan Z, Zeng H: A new algorithm for voice activity detection. Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '03), May 2003, Bangkok, Thailand 2: 588-591.
Jelinek M, Labonté F: Robust signal/noise discrimination for wideband speech and audio coding. Proceedings of the IEEE Workshop on Speech Coding, September 2000, Delavan, Wis, USA 151-153.
Srinivasan K, Gersho A: Voice activity detection for cellular networks. Proceedings of the IEEE Workshop on Speech Coding for Telecommunications, October 1993, Sainte-Adele, Quebec, Canada 85-86.
Freeman DK, Cosier G, Southcott CB, Boyd I: The voice activity detector for the Pan-European digital cellular mobile telephone service. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '89), May 1989, Glasgow, Scotland, UK 1: 369-372.
Tanyer SG, Özer H: Voice activity detection in nonstationary noise. IEEE Transactions on Speech and Audio Processing 2000,8(4):478-482. 10.1109/89.848229
Wu Y, Li Y: Robust speech/non-speech detection in adverse conditions using the fuzzy polarity correlation method. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC '04), October 2000, The Hague, The Netherlands 4: 2935-2939.
Quddus A, Gabbouj M: Wavelet-based corner detection technique using optimal scale. Pattern Recognition Letters 2002,23(1–3):215-220.
Arfib D, Keiler F, Zölzer U: DAFX - Digital Audio Effects. John Wiley & Sons, New York, NY, USA; 2002.
Adjouadi M, Candocia F, Riley J: Exploiting Walsh-based attributes to stereo vision. IEEE Transactions on Signal Processing 1996,44(2):409-420. 10.1109/78.485936


Author information

Authors and Affiliations

School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore
Moe Pwint & Farook Sattar

Corresponding author

Correspondence to Moe Pwint.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Pwint, M., Sattar, F. Speech/Nonspeech Detection Using Minimal Walsh Basis Functions. J AUDIO SPEECH MUSIC PROC. 2007, 039546 (2006). https://doi.org/10.1155/2007/39546


Received: 01 November 2005
Revised: 30 May 2006
Accepted: 12 June 2006
