TY - JOUR AU - Hinton, G. AU - Deng, L. AU - Yu, D. AU - Dahl, G. E. AU - Mohamed, A. AU - Jaitly, N. AU - Senior, A. AU - Vanhoucke, V. AU - Nguyen, P. AU - Sainath, T. N. AU - Kingsbury, B. PY - 2012 DA - 2012// TI - Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups JO - IEEE Signal Process. Mag. VL - 29 UR - https://doi.org/10.1109/MSP.2012.2205597 DO - 10.1109/MSP.2012.2205597 ID - Hinton2012 ER - TY - STD TI - New Electronic Friends. https://pages.arm.com/machine-learning-voice-recognition-report.html. Accessed 30 May 2018. UR - https://pages.arm.com/machine-learning-voice-recognition-report.html ID - ref2 ER - TY - STD TI - R. C. Rose, D. B. Paul, in International Conference on Acoustics, Speech, and Signal Processing. A hidden Markov model based keyword recognition system, (1990), pp. 129–1321. https://doi.org/10.1109/ICASSP.1990.115555. ID - ref3 ER - TY - STD TI - J. R. Rohlicek, W. Russell, S. Roukos, H. Gish, in International Conference on Acoustics, Speech, and Signal Processing,. Continuous hidden Markov modeling for speaker-independent word spotting, (1989), pp. 627–6301. https://doi.org/10.1109/ICASSP.1989.266505. ID - ref4 ER - TY - STD TI - J. G. Wilpon, L. G. Miller, P. Modi, in [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing. Improvements and applications for key word recognition using hidden Markov modeling techniques, (1991), pp. 309–312. https://doi.org/10.1109/ICASSP.1991.150338. http://ieeexplore.ieee.org/document/150338/. UR - http://ieeexplore.ieee.org/document/150338/ ID - ref5 ER - TY - STD TI - G. Chen, C. Parada, G. Heigold. Small-footprint keyword spotting using deep neural networks, (2014). https://doi.org/10.1109/icassp.2014.6854370. ID - ref6 ER - TY - JOUR AU - Shen, K. AU - Cai, M. AU - Zhang, W. -. Q. AU - Tian, Y. AU - Liu, J. PY - 2016 DA - 2016// TI - Investigation of DNN-based keyword spotting in low resource environments JO - Int. J. Future Comput. Commun. VL - 5 UR - https://doi.org/10.18178/ijfcc.2016.5.2.458 DO - 10.18178/ijfcc.2016.5.2.458 ID - Shen2016 ER - TY - STD TI - G. Tucker, M. Wu, M. Sun, S. Panchapagesan, G. Fu, S. Vitaladevuni. Model compression applied to small-footprint keyword spotting, (2016), pp. 1878–1882. https://doi.org/10.21437/Interspeech.2016-1393. ID - ref8 ER - TY - CHAP AU - Fernández, S. AU - Graves, A. AU - Schmidhuber, J. ED - de Sá, J. M. ED - Alexandre, L. A. ED - Duch, W. ED - Mandic, D. PY - 2007 DA - 2007// TI - An application of recurrent neural networks to discriminative keyword spotting BT - Artificial Neural Networks – ICANN 2007 PB - Springer CY - Berlin, Heidelberg UR - https://doi.org/10.1007/978-3-540-74695-9_23 DO - 10.1007/978-3-540-74695-9_23 ID - Fernández2007 ER - TY - STD TI - K. P. Li, J. A. Naylor, M. L. Rossen, in [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2. A whole word recurrent neural network for keyword spotting, (1992), pp. 81–842. https://doi.org/10.1109/ICASSP.1992.226115. ID - ref10 ER - TY - STD TI - M. Sun, A. Raju, G. Tucker, S. Panchapagesan, G. Fu, A. Mandal, S. Matsoukas, N. Strom, S. Vitaladevuni, Max-pooling loss training of long short-term memory networks for small-footprint keyword spotting. CoRR. abs/1705.02411: (2017). http://arxiv.org/abs/1705.02411. ID - ref11 ER - TY - STD TI - S. Ö,. Arik, M. Kliegl, R. Child, J. Hestness, A. Gibiansky, C. Fougner, R. Prenger, A. Coates, Convolutional recurrent neural networks for small-footprint keyword spotting. CoRR. abs/1703.05390: (2017). http://arxiv.org/abs/1703.05390. ID - ref12 ER - TY - CHAP AU - LeCun, Y. AU - Bengio, Y. PY - 1998 DA - 1998// TI - The Handbook of Brain Theory and Neural Networks BT - Chap. Convolutional Networks for Images, Speech, and Time Series PB - Press, MIT CY - Cambridge, MA, USA ID - LeCun1998 ER - TY - STD TI - T. N. Sainath, C. Parada, in INTERSPEECH. Convolutional neural networks for small-footprint keyword spotting, (2015). ID - ref14 ER - TY - STD TI - F. Chollet, Xception: deep learning with depthwise separable convolutions. CoRR. abs/1610.02357: (2016). http://arxiv.org/abs/1610.02357. ID - ref15 ER - TY - STD TI - A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, Mobilenets: efficient convolutional neural networks for mobile vision applications. CoRR. abs/1704.04861: (2017). http://arxiv.org/abs/1704.04861. ID - ref16 ER - TY - STD TI - Y. Zhang, N. Suda, L. Lai, V. Chandra, Hello edge: keyword spotting on microcontrollers. CoRR. abs/1711.07128: (2017). http://arxiv.org/abs/1711.07128. ID - ref17 ER - TY - JOUR AU - Davis, S. AU - Mermelstein, P. PY - 1980 DA - 1980// TI - Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences JO - IEEE Trans. Acoust. Speech Signal Process. VL - 28 UR - https://doi.org/10.1109/TASSP.1980.1163420 DO - 10.1109/TASSP.1980.1163420 ID - Davis1980 ER - TY - STD TI - I. Chadawan, S. Siwat, Y. Thaweesak, in International Conference on Computer Graphics, Simulation and Modeling (ICGSM’2012). Speech recognition using MFCC (Pattaya (Thailand), 2012). ID - ref19 ER - TY - STD TI - Bhadragiri Jagan Mohan, Ramesh Babu N., in 2014 International Conference on Advances in Electrical Engineering (ICAEE). Speech recognition using MFCC and DTW, (2014), pp. 1–4. https://doi.org/10.1109/ICAEE.2014.6838564. ID - ref20 ER - TY - JOUR AU - Abdel-Hamid, O. AU - Mohamed, A. AU - Jiang, H. AU - Deng, L. AU - Penn, G. AU - Yu, D. PY - 2014 DA - 2014// TI - Convolutional neural networks for speech recognition JO - IEEE/ACM Trans. Audio Speech Lang. Process. VL - 22 UR - https://doi.org/10.1109/TASLP.2014.2339736 DO - 10.1109/TASLP.2014.2339736 ID - Abdel-Hamid2014 ER - TY - STD TI - A. -R. Mohamed, Deep Neural Network acoustic models for ASR. PhD thesis (University of Toronto, 2014). https://tspace.library.utoronto.ca/bitstream/1807/44123/1/Mohamed_Abdel-rahman_201406_PhD_thesis.pdf. UR - https://tspace.library.utoronto.ca/bitstream/1807/44123/1/Mohamed_Abdel-rahman_201406_PhD_thesis.pdf ID - ref22 ER - TY - STD TI - S. Watanabe, M. Delcroix, F. Metze, J. R. Hershey, in Springer International Publishing. New era for robust speech recognition, (2017), p. 205. https://doi.org/10.1007/978-3-319-64680-0. ID - ref23 ER - TY - JOUR AU - W. Picone, J. PY - 1993 DA - 1993// TI - Signal modeling techniques in speech recognition JO - Proc. IEEE VL - 81 UR - https://doi.org/10.1109/5.237532 DO - 10.1109/5.237532 ID - W. Picone1993 ER - TY - JOUR AU - Xiao, X. AU - Li, J. AU - Li, H. AU - Lee, C. -. H. PY - 2010 DA - 2010// TI - A study on the generalization capability of acoustic models for robust speech recognition JO - IEEE Trans. Audio Speech Lang. Process. VL - 18 UR - https://doi.org/10.1109/TASL.2009.2031236 DO - 10.1109/TASL.2009.2031236 ID - Xiao2010 ER - TY - JOUR AU - Rebai, I. AU - BenAyed, Y. AU - Mahdi, W. AU - Lorré, J. -. P. PY - 2017 DA - 2017// TI - Improving speech recognition using data augmentation and acoustic model fusion JO - Procedia Comput. Sci. VL - 112 UR - https://doi.org/10.1016/j.procs.2017.08.003 DO - 10.1016/j.procs.2017.08.003 ID - Rebai2017 ER - TY - STD TI - T. Ko, V. Peddinti, D. Povey, S. Khudanpur, in INTERSPEECH. Audio augmentation for speech recognition, (2015). ID - ref27 ER - TY - JOUR AU - Yin, S. AU - Liu, C. AU - Zhang, Z. AU - Lin, Y. AU - Wang, D. AU - Tejedor, J. AU - Zheng, T. F. AU - Li, Y. PY - 2015 DA - 2015// TI - Noisy training for deep neural networks in speech recognition JO - EURASIP J. Audio Speech Music Process. VL - 2015 UR - https://doi.org/10.1186/s13636-014-0047-0 DO - 10.1186/s13636-014-0047-0 ID - Yin2015 ER - TY - STD TI - P. Gysel, M. Motamedi, S. Ghiasi, Hardware-oriented approximation of convolutional neural networks. CoRR. abs/1604.03168: (2016). http://arxiv.org/abs/1604.03168. ID - ref29 ER - TY - STD TI - D. D. Lin, S. S. Talathi, V. S. Annapureddy, Fixed point quantization of deep convolutional networks. CoRR. abs/1511.06393: (2015). http://arxiv.org/abs/1511.06393. ID - ref30 ER - TY - STD TI - D. O’Shaughnessy, Speech Communication: Human and Machine, (1987). ID - ref31 ER - TY - STD TI - M. A. Nielsen, Neural Networks and Deep Learning, (2015). http://neuralnetworksanddeeplearning.com/. Accessed 26 May 2020. UR - http://neuralnetworksanddeeplearning.com/ ID - ref32 ER - TY - STD TI - S. Ioffe, C. Szegedy, Batch normalization: accelerating deep network training by reducing internal covariate shift. CoRR. abs/1502.03167: (2015). http://arxiv.org/abs/1502.03167. ID - ref33 ER - TY - STD TI - P. Warden, Speech commands: a public dataset for single-word speech recognition (2017). Dataset available from http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz. UR - http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz ID - ref34 ER - TY - STD TI - A. Mesaros, T. Heittola, T. Virtanen, in 2016 24th European Signal Processing Conference (EUSIPCO). TUT database for acoustic scene classification and sound event detection, (2016), pp. 1128–1132. https://doi.org/10.1109/EUSIPCO.2016.7760424. ID - ref35 ER - TY - STD TI - J. Thiemann, N. Ito, E. Vincent, DEMAND: a collection of multi-channel recordings of acoustic noise in diverse environments. Supported by Inria under the Associate Team Program VERSAMUS (2013). https://doi.org/10.5281/zenodo.1227121. ID - ref36 ER - TY - STD TI - H. -G. Hirsch, FaNT -filtering and noise adding tool. Technical report. Hochschule Niederrhein (2005). http://dnt.kr.hs-niederrhein.de/download/fant_manual.pdf. Accessed 26 May 2020. UR - http://dnt.kr.hs-niederrhein.de/download/fant_manual.pdf ID - ref37 ER - TY - STD TI - N. Mellempudi, A. Kundu, D. Das, D. Mudigere, B. Kaul, Mixed low-precision deep learning inference using dynamic fixed point. CoRR. abs/1701.08978: (2017). http://arxiv.org/abs/1701.08978. ID - ref38 ER - TY - STD TI - D. Williamson, in IEEE Pacific Rim Conference on Communications, Computers and Signal Processing Conference Proceedings. Dynamically scaled fixed point arithmetic (IEEE, 1991), pp. 315–318. https://doi.org/10.1109/PACRIM.1991.160742. http://ieeexplore.ieee.org/document/160742/. UR - http://ieeexplore.ieee.org/document/160742/ ID - ref39 ER - TY - STD TI - M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng, TensorFlow: large-scale machine learning on heterogeneous systems. Software available from tensorflow.org (2015). https://www.tensorflow.org/. Accessed 26 May 2020. UR - https://www.tensorflow.org/ ID - ref40 ER - TY - JOUR AU - Smeds, K. AU - Wolters, F. AU - Rung, M. PY - 2015 DA - 2015// TI - Estimation of signal-to-noise ratios in realistic sound scenarios JO - J. Am. Acad. Audiol. VL - 26 2 UR - https://doi.org/10.3766/jaaa.26.2.7 DO - 10.3766/jaaa.26.2.7 ID - Smeds2015 ER - TY - STD TI - L. Lai, N. Suda, V. Chandra, CMSIS-NN: efficient neural network kernels for arm cortex-M CPUS. CoRR. abs/1801.06601: (2018). http://arxiv.org/abs/1801.06601. ID - ref42 ER - TY - STD TI - P. Warden, Speech commands: a dataset for limited-vocabulary speech recognition. CoRR. abs/1804.03209: (2018). http://arxiv.org/abs/1804.03209. ID - ref43 ER - TY - JOUR AU - Cheng, Z. AU - Huang, K. AU - Wang, Y. AU - Liu, H. AU - Guan, J. AU - Zhou, S. PY - 2017 DA - 2017// TI - Selecting high-quality negative samples for effectively predicting protein-RNA interactions JO - BMC Syst. Biol. VL - 11 UR - https://doi.org/10.1186/s12918-017-0390-8 DO - 10.1186/s12918-017-0390-8 ID - Cheng2017 ER - TY - JOUR AU - Kurczab, R. AU - Smusz, S. AU - Bojarski, A. J. PY - 2014 DA - 2014// TI - The influence of negative training set size on machine learning-based virtual screening, JO - J Cheminformatics VL - 6 UR - https://doi.org/10.1186/1758-2946-6-32 DO - 10.1186/1758-2946-6-32 ID - Kurczab2014 ER - TY - STD TI - P. Warden, Why GEMM is at the heart of deep learning. https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/. Accessed 19 May 2018. UR - https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/ ID - ref46 ER - TY - STD TI - S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, E. Shelhamer, cudnn: efficient primitives for deep learning. CoRR. abs/1410.0759: (2014). http://arxiv.org/abs/1410.0759. ID - ref47 ER - TY - STD TI - P. Molchanov, S. Tyree, T. Karras, T. Aila, J. Kautz, Pruning convolutional neural networks for resource efficient transfer learning. CoRR. abs/1611.06440: (2016). http://arxiv.org/abs/1611.06440. ID - ref48 ER - TY - STD TI - P. M. Sørensen, A depthwise separable convolutional neural network for keyword spotting on embedded systems. GitHub (2018). https://github.com/PeterMS123/KWS-DS-CNN-for-embedded. UR - https://github.com/PeterMS123/KWS-DS-CNN-for-embedded ID - ref49 ER -