Barker J, Vincent E, Ma N, Christensen C, Green P: The PASCAL CHiME speech separation and recognition challenge. Comput. Speech Lang 2013, 27(3):621-633. 10.1016/j.csl.2012.10.004
Droppo J, Acero A: Environmental robustness. In Springer Handbook of Speech Processing. Edited by: Benesty J, Sondhi MM, Huang Y. New York: Springer; 2008:653-679.
Gales MJF: Maximum likelihood linear transformations for HMM-based speech recognition. Comput. Speech Lang 1998, 12(2):75-98. 10.1006/csla.1998.0043
Omologo M, Svaizer P, Matassoni M: Environmental conditions and acoustic transduction in hands-free speech recognition. Speech Commun. 1998, 25: 75-95. 10.1016/S0167-6393(98)00030-2
Martin R: Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. Speech Audio Process 2001, 9(5):504-512. 10.1109/89.928915
Hermansky H, Morgan N: RASTA processing of speech. IEEE Trans. Speech Audio Proc 1994, 2(4):578-589. 10.1109/89.326616
Gales MJF, Young SJ: A fast and flexible implementation of parallel model combination. ICASSP, 1995, 1: 133-136.
Kim C: Signal processing for robust speech recognition motivated by auditory processing. Ph.D. Thesis, CMU 2010
Brown GJ, Palomaki KJ: A reverberation-robust automatic speech recognition system based on temporal masking. J. Acoustical Soc. Am 2008, 123(5):2978.
Ghitza O: Auditory models and human performance in tasks related to speech coding and speech recognition. IEEE Trans. Speech Audio Proc 1994, 2(1):115-132.
Kim D-S, Lee S-Y, Kil RM: Auditory processing of speech signals for robust speech recognition in real-world noisy environments. IEEE Trans Speech Audio Proc 1999, 7: 55-69. 10.1109/89.736331
Dimitriadis D, Maragos P, Potamianos A: On the effects of filterbank design and energy computation on robust speech recognition. IEEE Trans. Audio Speech Lang. Proc 2011, 19: 1504-1516.
Flynn R, Jones E: A comparative study of auditory-based front-ends for robust speech recognition using the Aurora 2 database. Paper presented at the IET Irish signals and systems conference Dublin, Ireland, 28–30 June 2006 pp. 28–30
Schluter R, Bezrukov I, Wagner H, Ney H: Gammatone features and feature combination for large vocabulary speech recognition. Paper presented at the IEEE international conference on acoustics, speech, and signal processing (ICASSP) Honolulu, HI, USA, 15–20 April 2007 pp. 649–652
Shao Y, Jin Z, Wang DL, Srinivasan S: An auditory-based feature for robust speech recognition. Paper presented at the IEEE international conference on acoustics, speech, and signal processing (ICASSP) Taipei, Taiwan, 19–24 April 2009 pp. 4625–4628
Drullman R, Festen J, Plomp R: Effect of reducing slow temporal modulations on speech reception. J. Acoustical Soc. Am 1994, 95: 2670-2680. 10.1121/1.409836
Kanedera N, Arai T, Hermansky H, Pavel M: On the importance of various modulation frequencies for speech recognition. Paper presented at Eurospeech, Rhodes, Greece, 22–25 Sept 1997 pp. 1079–1082
Falk TH, Chan WY: Modulation spectral features for robust far-field speaker identification. IEEE Trans. Audio Speech Lang. Process 2010, 18(1):90-100.
Maganti HK, Matassoni M: An auditory based modulation spectral feature for reverberant speech recognition. Paper presented at the 13th annual conference of the International Speech Communication Association (Interspeech) Makuhari, Japan, 26–30 Sept 2010 pp. 570–573
Deng L, Sheikhzadeh H: Use of temporal codes computed from a cochlear model for speech recognition, chapter 15. In Listening to Speech: An Auditory Perspective. Edited by: Greenberg S, Ainsworth W. Mahwah: Lawrence Erlbaum; 2006:237-256.
Kleinschmidt M, Tchorz J, Kollmeier B: Combining speech enhancement and auditory feature extraction for robust speech recognition. Speech Commun. 2001, 34: 75-91. 10.1016/S0167-6393(00)00047-9
Dau T, Pueschel D, Kohlrausch A: A quantitative model of the effective signal processing in the auditory system. J. Acoustical Soc. Am 1996, 99: 3615-3622. 10.1121/1.414959
Xiong X, Eng Siong C, Haizhou L: Normalization of the speech modulation spectra for robust speech recognition. IEEE Trans. Audio Speech Lang. Proc 2008, 16(8):1662-1674.
Mitra V, Franco H, Graciarena M, Mandal A: Normalized amplitude modulation features for large vocabulary noise-robust speech recognition. Paper presented at the IEEE international conference on acoustics, speech and signal processing (ICASSP) Kyoto, Japan, 25–30 March 2012, pp. 4117–4120
Valente F, Magimai-Doss M, Plahl C, Ravuri SV: Hierarchical processing of the modulation spectrum for GALE Mandarin LVCSR system. Paper presented at the meeting of the International Speech Communication Association (Interspeech) Brighton, UK, 6–10 Sept 2009, pp. 2963–2966
Chiu Y-HB, Raj B, Stern RM: Learning-based auditory encoding for robust speech recognition. Paper presented at the IEEE international conference on acoustics, speech and signal processing (ICASSP) Dallas, TX, USA, 14–19 March 2010, pp. 4278–4281
Zhao X, Shao Y, Wang DL: CASA-based robust speaker identification. IEEE Trans. Audio Speech Lang. Proc 2012, 20(5):1608-1616.
Zhao X, Wang DL: Analyzing noise robustness of MFCC and GFCC features in speaker identification. Paper presented at the IEEE international conference on acoustics, speech and signal processing (ICASSP) Vancouver, Canada, 26–31 May 2013, pp. 7204–7208
Matassoni M, Maganti HK, Omologo M: Non-linear spectro-temporal modulations for reverberant speech recognition. Paper presented at the joint workshop on hands-free speech communication and microphone arrays (HSCMA) Edinburgh, Scotland, 30 May–1 June 2011, pp. 115–120
Slaney M: An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank. In Apple technical report, Perception Group, 1993
Glasberg B, Moore B: Derivation of auditory filter shapes from notched-noise data. Hearing Res 1990, 47: 103-108. 10.1016/0378-5955(90)90170-T
Ellis DPW: Gammatone-like spectrograms. http://www.ee.columbia.edu/~dpwe/resources/matlab/gammatonegram/ Accessed 6 June 2011.
Parihar N, Picone J, Pearce D, Hirsch HG: Performance analysis of the Aurora large vocabulary baseline system. Paper presented at the 12th European signal processing conference (EUSIPCO) Vienna, Austria, 6–10 Sept 2004, pp. 553–556