I. McCowan, *Microphone arrays: a tutorial* (Queensland University, Australia, 2001), p. 1

F. Gustafsson, F. Gunnarsson, in *2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP’03)*. Positioning using time-difference of arrival measurements, vol 6 (2003), pp. VI–553

Z. Khan, M.M. Kamal, N. Hamzah, K. Othman, N. Khan, in *2008 IEEE International RF and Microwave Conference*. Analysis of performance for multiple signal classification (MUSIC) in estimating direction of arrival (2008), pp. 524–529

K. Nakadai, K. Nakamura, in *Wiley Encyclopedia of Electrical and Electronics Engineering*. Sound source localization and separation (New York: John Wiley & Sons, 2015), pp. 1–18

S.A. Vorobyov, Principles of minimum variance robust adaptive beamforming design. Signal Process. **93**, 3264 (2013)

M. Fuhry, L. Reichel, A new Tikhonov regularization method. Numer. Algorithms **59**, 433 (2012)

S. Amari, S.C. Douglas, A. Cichocki, H.H. Yang, in *First IEEE Signal Processing Workshop on Signal Processing Advances in Wireless Communications*. Multichannel blind deconvolution and equalization using the natural gradient (1997), pp. 101–104

M. Kawamoto, K. Matsuoka, N. Ohnishi, A method of blind separation for convolved non-stationary signals. Neurocomputing **22**, 157 (1998)

T. Takatani, T. Nishikawa, H. Saruwatari, K. Shikano, in *Seventh International Symposium on Signal Processing and Its Applications, 2003. Proceedings*. High-fidelity blind separation for convolutive mixture of acoustic signals using SIMO-model-based independent component analysis, vol 2 (2003), pp. 77–80

D.W. Schobben, P. Sommen, A frequency domain blind signal separation method based on decorrelation. IEEE Trans. Signal Process. **50**, 1855 (2002)

S. Makino, H. Sawada, S. Araki, in *Blind Speech Separation*. Frequency-domain blind source separation (Dordrecht: Springer, 2007), pp. 47–78

H. Buchner, R. Aichner, W. Kellermann, A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics. IEEE Trans. Speech Audio Process. **13**, 120 (2004)

T. Kim, I. Lee, T.-W. Lee, in *2006 Fortieth Asilomar Conference on Signals, Systems and Computers*. Independent vector analysis: definition and algorithms (2006), pp. 1393–1396

Y. Wang, D. Wang, Towards scaling up classification-based speech separation. IEEE Trans. Audio Speech Lang. Process. **21**, 1381 (2013)

S. Mobin, B. Cheung, B. Olshausen, *Generalization challenges for neural architectures in audio source separation, arXiv preprint arXiv:1803.08629* (2018)

P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, Joint optimization of masks and deep recurrent neural networks for monaural source separation. IEEE/ACM Trans. Audio Speech Lang. Process. **23**, 2136 (2015)

J.R. Hershey, Z. Chen, J. Le Roux, S. Watanabe, in *2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. Deep clustering: discriminative embeddings for segmentation and separation (2016), pp. 31–35

M. Kolbæk, D. Yu, Z.-H. Tan, J. Jensen, Multitalker speech separation with utterance-level permutation invariant training of deep recurrent neural networks. IEEE/ACM Trans. Audio Speech Lang. Process. **25**, 1901 (2017)

Y. Luo, N. Mesgarani, Conv-TasNet: Surpassing ideal time–frequency magnitude masking for speech separation. IEEE/ACM Trans. Audio Speech Lang. Process. **27**, 1256 (2019)

K. Furuya, S. Sakauchi, A. Kataoka, in *2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings*. Speech dereverberation by combining MINT-based blind deconvolution and modified spectral subtraction, vol 1 (2006), p. I–I

T. Nakatani, B.-H. Juang, T. Yoshioka, K. Kinoshita, M. Miyoshi, in *2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics*. Importance of energy and spectral features in Gaussian source model for speech dereverberation (New Paltz: IEEE, 2007), pp. 299–302

T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, B.-H. Juang, in *2008 IEEE International Conference on Acoustics, Speech and Signal Processing*. Blind speech dereverberation with multi-channel linear prediction based on short-time Fourier transform representation (2008), pp. 85–88

T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, B.-H. Juang, Speech dereverberation based on variance-normalized delayed linear prediction. IEEE Trans. Audio Speech Lang. Process. **18**, 1717 (2010)

T. Yoshioka, T. Nakatani, M. Miyoshi, H.G. Okuno, Blind separation and dereverberation of speech mixtures by joint optimization. IEEE Trans. Audio Speech Lang. Process. **19**, 69 (2010)

A. Jukić, N. Mohammadiha, T. van Waterschoot, T. Gerkmann, S. Doclo, in *2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. Multi-channel linear prediction-based speech dereverberation with low-rank power spectrogram approximation (2015), pp. 96–100

F. Weninger, S. Watanabe, Y. Tachioka, B. Schuller, in *2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. Deep recurrent de-noising auto-encoder and blind dereverberation for reverberated speech recognition (2014), pp. 4623–4627

D.S. Williamson, D. Wang, in *2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. Speech dereverberation and denoising using complex ratio masks (2017), pp. 5590–5594

J. Heymann, L. Drude, R. Haeb-Umbach, K. Kinoshita, T. Nakatani, in *ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. Joint optimization of neural network- based WPE dereverberation and acoustic model for robust online ASR (2019), pp. 6655–6659

K. Kinoshita, M. Delcroix, H. Kwon, T. Mori, T. Nakatani, in *Interspeech*. Neural network-based spectrum estimation for online WPE dereverberation (2017), pp. 384–388

M. Delcroix, T. Yoshioka, A. Ogawa, Y. Kubo, M. Fujimoto, N. Ito, K. Kinoshita, M. Espi, S. Araki, T. Hori, et al., Strategies for distant speech recognition in reverberant environments. EURASIP J. Adv. Signal Process. **2015**, 1 (2015)

W. Yang, G. Huang, W. Zhang, J. Chen, J. Benesty, in *2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC)*. Dereverberation with differential microphone arrays and the weighted-prediction-error method (2018), pp. 376–380

M. Togami, in *2015 23rd European Signal Processing Conference (EUSIPCO)*. Multichannel online speech dereverberation under noisy environments (2015), pp. 1078–1082

L. Drude, C. Boeddeker, J. Heymann, R. Haeb-Umbach, K. Kinoshita, M. Delcroix, T. Nakatani, in *Interspeech*. Integrating neural network based beamforming and weighted prediction error dereverberation (2018), pp. 3043–3047

T. Nakatani, K. Kinoshita, A unified convolutional beamformer for simultaneous denoising and dereverberation. IEEE Signal Process. Lett. **26**, 903 (2019)

G. Wichern, J. Antognini, M. Flynn, L.R. Zhu, E. McQuinn, D. Crow, E. Manilow, J. Le Roux, *WHAM!: Extending speech separation to noisy environments, arXiv preprint arXiv:1907.01160* (2019)

C. Ma, D. Li, X. Jia, *Two-stage model and optimal SI-SNR for monaural multi-speaker speech separation in noisy environment, arXiv preprint arXiv:2004.06332* (2020)

T. Yoshioka, Z. Chen, C. Liu, X. Xiao, H. Erdogan, D. Dimitriadis, in *ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. Low-latency speaker-independent continuous speech separation (2019), pp. 6980–6984

Z.-Q. Wang, D. Wang, in *Interspeech*. Integrating spectral and spatial features for multi-channel speaker separation (2018), pp. 2718–2722

J. Wu, Z. Chen, J. Li, T. Yoshioka, Z. Tan, E. Lin, Y. Luo, L. Xie, *An end-to-end architecture of online multi-channel speech separation, arXiv preprint arXiv:2009.03141* (2020)

T. Nakatani, R. Takahashi, T. Ochiai, K. Kinoshita, R. Ikeshita, M. Delcroix, S. Araki, in *ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. DNN-supported mask-based convolutional beamforming for simultaneous denoising, dereverberation, and source separation (2020), pp. 6399–6403

Y. Fu, J. Wu, Y. Hu, M. Xing, L. Xie, in *2021 IEEE Spoken Language Technology Workshop (SLT)*. DESNET: A multi-channel network for simultaneous speech dereverberation, enhancement and separation (2021), pp. 857–864

T. Ochiai, M. Delcroix, R. Ikeshita, K. Kinoshita, T. Nakatani, S. Araki, in *ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. Beam-TasNet: Time-domain audio separation network meets frequency-domain beamformer (2020), pp. 6384–6388

J. Le Roux, S. Wisdom, H. Erdogan, J.R. Hershey, in *ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. SDR–half-baked or well done? (2019), pp. 626–630

O. Ronneberger, P. Fischer, T. Brox, in *International Conference on Medical Image Computing and Computer-Assisted Intervention*. U-Net: Convolutional networks for biomedical image segmentation (Cham: Springer, 2015), pp. 234–241

O. Ernst, S.E. Chazan, S. Gannot, J. Goldberger, in *2018 26th European Signal Processing Conference (EUSIPCO)*. Speech dereverberation using fully convolutional networks (2018), pp. 390–394

V. Kothapally, W. Xia, S. Ghorbani, J.H. Hansen, W. Xue, J. Huang, *SkipConvNet: Skip convolutional neural network for speech dereverberation using optimally smoothed spectral mapping, arXiv preprint arXiv:2007.09131* (2020)

J. Yamagishi, C. Veaux, K. MacDonald, et al., *CSTR VCTK Corpus: English multi-speaker corpus for CSTR voice cloning toolkit (version 0.92)* (2019)

A.W. Rix, J.G. Beerends, M.P. Hollier, A.P. Hekstra, in *2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 01CH37221)*. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs, vol 2 (2001), pp. 749–752

C.H. Taal, R.C. Hendriks, R. Heusdens, J. Jensen, in *2010 IEEE International Conference on Acoustics, Speech and Signal Processing*. A short-time objective intelligibility measure for time-frequency weighted noisy speech (2010), pp. 4214–4217

K. Kinoshita, M. Delcroix, T. Nakatani, M. Miyoshi, Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction. IEEE Trans. Audio Speech Lang. Process. **17**, 534 (2009)

S.-I. Amari, A. Cichocki, H.H. Yang, et al., in *Advances in Neural Information Processing Systems*. A new learning algorithm for blind signal separation (1996), pp. 757–763

S. Wold, K. Esbensen, P. Geladi, Principal component analysis. Chemom. Intell. Lab. Syst. **2**, 37 (1987)

R. Gu, J. Wu, S. Zhang, L. Chen, Y. Xu, M. Yu, D. Su, Y. Zou, D. Yu, *End-to-end multi-channel speech separation, arXiv preprint arXiv:1905.06286* (2019)

Y. Zhao, Z.-Q. Wang, D. Wang, in *2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*. A two-stage algorithm for noisy and reverberant speech enhancement (2017), pp. 5580–5584

F. Chollet, in *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)*. Xception: Deep learning with depthwise separable convolutions (2017)

K. He, X. Zhang, S. Ren, J. Sun, in *Proceedings of the IEEE International Conference on Computer Vision*. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification (2015), pp. 1026–1034

Y. Bengio, J. Louradour, R. Collobert, J. Weston, in *Proceedings of the 26th Annual International Conference on Machine Learning*. Curriculum learning (2009), pp. 41–48

D.P. Kingma, J. Ba, *Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980* (2014)

F. Bahmaninezhad, J. Wu, R. Gu, S.-X. Zhang, Y. Xu, M. Yu, D. Yu, *A comprehensive study of speech separation: spectrogram vs waveform separation, arXiv preprint arXiv:1905.07497* (2019)

J.B. Allen, D.A. Berkley, Image method for efficiently simulating small-room acoustics. J. Acoust. Soc. Am. **65**, 943 (1979)

N. Zeghidour, D. Grangier, *Wavesplit: End-to-end speech separation by speaker clustering, arXiv preprint arXiv:2002.08933* (2020)
