TY - JOUR AU - Sakhnov, K. AU - Verteletskaya, E. AU - Simak, B. PY - 2009 DA - 2009// TI - Approach for energy-based voice detector with adaptive scaling factor JO - IAENG Int. J. Comput. Sci. VL - 36 ID - Sakhnov2009 ER - TY - CHAP AU - Beritelli, F. AU - Casale, S. AU - Ruggeri, G. PY - 2001 DA - 2001// TI - Performance evaluation and comparison of ITU-T/ETSI voice activity detectors BT - Proc. IEEE Int Conf on Acoustics, Speech, and Signal Processing PB - IEEE CY - Salt Lake City ID - Beritelli2001 ER - TY - JOUR AU - Benyassine, A. PY - 1997 DA - 1997// TI - ITU-T Recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications JO - IEEE Commun. Mag. VL - 35 UR - https://doi.org/10.1109/35.620527 DO - 10.1109/35.620527 ID - Benyassine1997 ER - TY - CHAP AU - Tong, S. AU - Chen, N. AU - Qian, Y. AU - Yu, K. PY - 2014 DA - 2014// TI - Evaluating VAD for automatic speech recognition BT - Proc. IEEE Int Conf On Signal Processing PB - IEEE CY - Hangzhou ID - Tong2014 ER - TY - JOUR AU - Graf, S. AU - Herbig, T. AU - Buck, M. AU - Schmidt, G. PY - 2015 DA - 2015// TI - Features for voice activity detection: a comparative analysis JO - EURASIP J. Adv. Sign. Process. VL - 2015 UR - https://doi.org/10.1186/s13634-015-0277-z DO - 10.1186/s13634-015-0277-z ID - Graf2015 ER - TY - JOUR AU - Rabiner, L. R. AU - Sambur, M. R. PY - 1975 DA - 1975// TI - An algorithm for determining the endpoints of isolated utterances JO - Bell Labs Tech. J. VL - 54 UR - https://doi.org/10.1002/j.1538-7305.1975.tb02840.x DO - 10.1002/j.1538-7305.1975.tb02840.x ID - Rabiner1975 ER - TY - CHAP AU - Prasad, R. V. AU - Sangwan, A. AU - Jamadagni, H. S. AU - Chiranth, M. C. AU - Sah, R. AU - Gaurav, V.
PY - 2002 DA - 2002// TI - Comparison of voice activity detection algorithms for VoIP BT - Proc 7th Int Symp on Computers and Communications PB - IEEE CY - Taormina-Giardini Naxos ID - Prasad2002 ER - TY - JOUR AU - Ramírez, J. AU - Segura, J. C. AU - Benítez, C. AU - De La Torre, A. AU - Rubio, A. PY - 2004 DA - 2004// TI - Efficient voice activity detection algorithms using long-term speech information JO - Speech Comm. VL - 42 UR - https://doi.org/10.1016/j.specom.2003.10.002 DO - 10.1016/j.specom.2003.10.002 ID - Ramırez2004 ER - TY - JOUR AU - Ishizuka, K. AU - Nakatani, T. AU - Fujimoto, M. AU - Miyazaki, N. PY - 2010 DA - 2010// TI - Noise robust voice activity detection based on periodic to aperiodic component ratio JO - Speech Comm. VL - 52 UR - https://doi.org/10.1016/j.specom.2009.08.003 DO - 10.1016/j.specom.2009.08.003 ID - Ishizuka2010 ER - TY - JOUR AU - Pek, K. AU - Arai, T. AU - Kanedera, N. PY - 2012 DA - 2012// TI - Voice activity detection in noise using modulation spectrum of speech: investigation of speech frequency and modulation frequency ranges JO - Acoust. Sci. Technol. VL - 33 UR - https://doi.org/10.1250/ast.33.33 DO - 10.1250/ast.33.33 ID - Pek2012 ER - TY - CHAP AU - Kinnunen, T. AU - Rajan, P. PY - 2013 DA - 2013// TI - A practical, self-adaptive voice activity detector for speaker verification with noisy telephone and microphone data BT - Proc. IEEE Int Conf on Acoustics, Speech and Signal Processing PB - IEEE CY - Vancouver ID - Kinnunen2013 ER - TY - JOUR AU - Sohn, J. AU - Kim, N. S. AU - Sung, W. PY - 1999 DA - 1999// TI - A statistical model-based voice activity detection JO - IEEE Signal Proc. Lett. VL - 6 UR - https://doi.org/10.1109/97.736233 DO - 10.1109/97.736233 ID - Sohn1999 ER - TY - JOUR AU - Davis, A. AU - Nordholm, S. AU - Togneri, R. PY - 2006 DA - 2006// TI - Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold JO - IEEE Trans. Audio Speech Lang. Process.
VL - 14 UR - https://doi.org/10.1109/TSA.2005.855842 DO - 10.1109/TSA.2005.855842 ID - Davis2006 ER - TY - STD TI - T. Kinnunen, E. Chernenko, M. Tuononen, P. Fränti, H. Li, in Proc Int. Conf. on Speech and Computer (SPECOM07). Voice activity detection using MFCC features and support vector machine (Moscow, 2007), pp. 556–561. ID - ref14 ER - TY - JOUR AU - Jo, Q. H. AU - Chang, J. H. AU - Shin, J. W. AU - Kim, N. S. PY - 2009 DA - 2009// TI - Statistical model-based voice activity detection using support vector machine JO - IET Sign. Process. VL - 3 UR - https://doi.org/10.1049/iet-spr.2008.0128 DO - 10.1049/iet-spr.2008.0128 ID - Jo2009 ER - TY - CHAP PY - 2002 DA - 2002// TI - Applying support vector machines to voice activity detection BT - Proc Int Conf on Signal Processing PB - IEEE CY - Beijing ID - ref16 ER - TY - CHAP PY - 2015 DA - 2015// TI - A deep neural network approach for voice activity detection in multi-room domestic scenarios BT - Proc Int Joint Conference On Neural Networks PB - IEEE CY - Killarney ID - ref17 ER - TY - JOUR PY - 2016 DA - 2016// TI - Boosting contextual information for deep neural network based voice activity detection JO - IEEE/ACM Trans. Audio Speech Lang. Process. VL - 24 UR - https://doi.org/10.1109/TASLP.2015.2505415 DO - 10.1109/TASLP.2015.2505415 ID - ref18 ER - TY - STD TI - F. Bie, Z. Zhang, D. Wang, T. Zheng, DNN-based voice activity detection for speaker recognition. CSLT Tech. Rep (2015). available online http://www.cslt.org/mediawiki/images/c/c8/Dvad.pdf. UR - http://www.cslt.org/mediawiki/images/c/c8/Dvad.pdf ID - ref19 ER - TY - CHAP PY - 2012 DA - 2012// TI - Understanding how deep belief networks perform acoustic modelling BT - Proc IEEE Int Conf on Acoustics, Speech and Signal Processing PB - IEEE CY - Kyoto ID - ref20 ER - TY - JOUR PY - 2015 DA - 2015// TI - Exploiting spectro-temporal locality in deep learning based acoustic event detection JO - EURASIP J. Audio Speech Music Process.
VL - 2015 UR - https://doi.org/10.1186/s13636-015-0069-2 DO - 10.1186/s13636-015-0069-2 ID - ref21 ER - TY - STD TI - N. Ryant, M. Liberman, J. Yuan, in INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, ed. by F. Bimbot, C. Cerisara, C. Fougeron, G. Gravier, L. Lamel, F. Pellegrino, and P. Perrier. Speech activity detection on YouTube using deep neural networks (Lyon, 2013), pp. 728–731. http://www.isca-speech.org/archive/interspeech_2013. UR - http://www.isca-speech.org/archive/interspeech_2013 ID - ref22 ER - TY - JOUR PY - 2015 DA - 2015// TI - Robust voice activity detection with deep maxout neural networks JO - Mod. Appl. Sci. VL - 9 ID - ref23 ER - TY - JOUR PY - 2013 DA - 2013// TI - Deep belief networks based voice activity detection JO - IEEE Trans. Audio Speech Lang. Process. VL - 21 UR - https://doi.org/10.1109/TASL.2012.2229986 DO - 10.1109/TASL.2012.2229986 ID - ref24 ER - TY - CHAP PY - 2013 DA - 2013// TI - Recent advances in deep learning for speech research at Microsoft BT - Proc IEEE Int Conf On Acoustics, Speech and Signal Processing PB - IEEE CY - Vancouver ID - ref25 ER - TY - JOUR PY - 2006 DA - 2006// TI - Dynamic speech models: theory, algorithms, and applications JO - Synth. Lect. Speech Audio Process. VL - 2 UR - https://doi.org/10.2200/S00028ED1V01Y200605SAP002 DO - 10.2200/S00028ED1V01Y200605SAP002 ID - ref26 ER - TY - JOUR PY - 2006 DA - 2006// TI - Noise reduction of speech signals by running spectrum filtering JO - Syst. Comput. Jpn VL - 37 UR - https://doi.org/10.1002/scj.20529 DO - 10.1002/scj.20529 ID - ref27 ER - TY - JOUR PY - 1999 DA - 1999// TI - On the relative importance of various components of the modulation spectrum for automatic speech recognition JO - Speech Commun. VL - 28 UR - https://doi.org/10.1016/S0167-6393(99)00002-3 DO - 10.1016/S0167-6393(99)00002-3 ID - ref28 ER - TY - JOUR PY - 1994 DA - 1994// TI - RASTA processing of speech JO - IEEE Trans. Speech Audio Process.
VL - 2 UR - https://doi.org/10.1109/89.326616 DO - 10.1109/89.326616 ID - ref29 ER - TY - JOUR PY - 2003 DA - 2003// TI - Joint acoustic and modulation frequency JO - EURASIP J. Adv. Sign. Process. VL - 2003 ID - ref30 ER - TY - JOUR PY - 1999 DA - 1999// TI - Syllable intelligibility for temporally filtered LPC cepstral trajectories JO - J. Acoust. Soc. Am. VL - 105 UR - https://doi.org/10.1121/1.426895 DO - 10.1121/1.426895 ID - ref31 ER - TY - CHAP PY - 2004 DA - 2004// TI - Robust speech analysis in noisy environment using running spectrum filtering BT - Proc IEEE Int Symp on Communications and Information Technology PB - IEEE CY - Sapporo ID - ref32 ER - TY - BOOK PY - 2014 DA - 2014// TI - Automatic speech recognition: a deep learning approach PB - Springer CY - London ID - ref33 ER - TY - JOUR PY - 2009 DA - 2009// TI - Exploring strategies for training deep neural networks JO - J. Mach. Learn. Res. VL - 10 ID - ref34 ER - TY - JOUR PY - 2007 DA - 2007// TI - Learning multiple layers of representation JO - Trends Cogn. Sci. VL - 11 UR - https://doi.org/10.1016/j.tics.2007.09.004 DO - 10.1016/j.tics.2007.09.004 ID - ref35 ER - TY - JOUR PY - 2005 DA - 2005// TI - On contrastive divergence learning JO - AISTATS VL - 10 ID - ref36 ER - TY - STD TI - M.A. Keyvanrad, M.M. Homayounpour, A brief survey on deep belief networks and introducing a new object oriented toolbox (DeeBNet). arXiv preprint arXiv:1408.3264. (2014). https://arxiv.org/abs/1408.3264. UR - https://arxiv.org/abs/1408.3264 ID - ref37 ER - TY - JOUR PY - 1992 DA - 1992// TI - ASJ continuous speech corpus for research JO - J. Acoust. Soc. Jpn VL - 48 ID - ref38 ER - TY - STD TI - The Rice University, “Noisex-92 Database”. http://spib.linse.ufsc.br/noise.html. Accessed 22 Feb 2017. UR - http://spib.linse.ufsc.br/noise.html ID - ref39 ER - TY - STD TI - R.O. Duda, P.E. Hart, D.G. Stork, Pattern classification, 2nd edn (New York, 2001).
ID - ref40 ER - TY - CHAP PY - 2008 DA - 2008// TI - Voice activity detection in the presence of breathing noise using neural network and hidden Markov model BT - Proc 16th European Signal Processing Conference PB - IEEE CY - Lausanne ID - ref41 ER - TY - JOUR PY - 2015 DA - 2015// TI - Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits JO - Comput. Speech Lang. VL - 29 UR - https://doi.org/10.1016/j.csl.2013.11.004 DO - 10.1016/j.csl.2013.11.004 ID - ref42 ER - TY - JOUR PY - 1999 DA - 1999// TI - Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds JO - Speech Commun. VL - 27 UR - https://doi.org/10.1016/S0167-6393(98)00085-5 DO - 10.1016/S0167-6393(98)00085-5 ID - ref43 ER - TY - STD TI - M.V. Segbroeck, A. Tsiartas, S. Narayanan, in Proc INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, ed. by F. Bimbot, C. Cerisara, C. Fougeron, G. Gravier, L. Lamel, F. Pellegrino, and P. Perrier. A robust frontend for VAD: exploiting contextual, discriminative and spectral cues of human voice (Lyon, 2013), pp. 704–708. http://www.isca-speech.org/archive/interspeech_2013. UR - http://www.isca-speech.org/archive/interspeech_2013 ID - ref44 ER -