H. Kuttruff, Room acoustics (CRC Press, Germany, 2016). https://doi.org/10.1201/9781315372150.
D. Griesinger, The psychoacoustics of apparent source width, spaciousness and envelopment in performance spaces. Acta Acustica U. Acustica. 83(4), 721–731 (1997).
J. B. Allen, D. A. Berkley, Image method for efficiently simulating small-room acoustics. J. Acoust. Soc. Am. 65(4), 943–950 (1979). https://doi.org/10.1121/1.382599.
S. Gannot, D. Burshtein, E. Weinstein, Signal enhancement using beamforming and non-stationarity with applications to speech. IEEE Trans. Signal Process. 49(8), 1614–1626 (2001). https://doi.org/10.1109/78.934132.
I. Cohen, Relative transfer function identification using speech signals. IEEE Trans. Speech Audio Process. 12(5), 451–459 (2004). https://doi.org/10.1109/TSA.2004.832975.
S. Markovich, S. Gannot, I. Cohen, Multichannel eigenspace beamforming in a reverberant noisy environment with multiple interfering speech signals. IEEE Trans. Audio Speech Lang. Process. 17(6), 1071–1086 (2009). https://doi.org/10.1109/TASL.2009.2016395.
O. Schwartz, S. Gannot, E. A. Habets, Multi-microphone speech dereverberation and noise reduction using relative early transfer functions. IEEE/ACM Trans. Audio Speech Lang. Process. 23(2), 240–251 (2014). https://doi.org/10.1109/TASLP.2014.2372335.
S. Braun, W. Zhou, E. A. Habets, in 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Narrowband direction-of-arrival estimation for binaural hearing aids using relative transfer functions, (2015), pp. 1–5. https://doi.org/10.1109/WASPAA.2015.7336917.
X. Li, L. Girin, F. Badeig, R. Horaud, in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Reverberant sound localization with a robot head based on direct-path relative transfer function, (2016), pp. 2819–2826. https://doi.org/10.1109/IROS.2016.7759437.
Q. Nguyen, L. Girin, G. Bailly, F. Elisei, D.-C. Nguyen, in Workshop on Crossmodal Learning for Intelligent Robotics in conjunction with IEEE/RSJ IROS. Autonomous sensorimotor learning for sound source localization by a humanoid robot (IEEE, New York, 2018).
B. Laufer-Goldshtein, R. Talmon, S. Gannot, et al, Data-driven multi-microphone speaker localization on manifolds. Found. Trends Signal Process. 14(1–2), 1–161 (2020).
J. L. Flanagan, A. C. Surendran, E.-E. Jan, Spatially selective sound capture for speech and audio processing. Speech Commun. 13(1–2), 207–222 (1993). https://doi.org/10.1016/0167-6393(93)90072-S.
E. E. Jan, P. Svaizer, J. L. Flanagan, in IEEE International Symposium on Circuits and Systems, vol. 2. Matched-filter processing of microphone array for spatial volume selectivity, (1995), pp. 1460–1463. https://doi.org/10.1109/ISCAS.1995.521409.
S. Affes, Y. Grenier, A signal subspace tracking algorithm for microphone array processing of speech. IEEE Trans. Speech Audio Process. 5(5), 425–437 (1997). https://doi.org/10.1109/89.622565.
P. Annibale, F. Antonacci, P. Bestagini, A. Brutti, A. Canclini, L. Cristoforetti, E. Habets, W. Kellermann, K. Kowalczyk, A. Lombard, E. Mabande, D. Markovic, P. Naylor, M. Omologo, R. Rabenstein, A. Sarti, P. Svaizer, M. Thomas, The SCENIC project: environment-aware sound sensing and rendering. Procedia Comput. Sci. 7, 150–152 (2011). https://doi.org/10.1016/j.procs.2011.09.039.
I. Dokmanić, R. Scheibler, M. Vetterli, Raking the cocktail party. IEEE J. Sel. Top. Signal Process. 9(5), 825–836 (2015). https://doi.org/10.1109/JSTSP.2015.2415761.
K. Kowalczyk, Raking early reflection signals for late reverberation and noise reduction. J. Acoust. Soc. Am. 145(3), 257–263 (2019). https://doi.org/10.1121/1.5095535.
F. Ribeiro, D. Ba, C. Zhang, D. Florêncio, in IEEE International Conference on Multimedia and Expo (ICME). Turning enemies into friends: using reflections to improve sound source localization, (2010), pp. 731–736. https://doi.org/10.1109/ICME.2010.5583886.
D. Salvati, C. Drioli, G. L. Foresti, Sound source and microphone localization from acoustic impulse responses. IEEE Signal Process. Lett. 23(10), 1459–1463 (2016). https://doi.org/10.1109/LSP.2016.2601878.
D. Di Carlo, A. Deleforge, N. Bertin, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Mirage: 2D source localization using microphone pair augmentation with echoes, (2019), pp. 775–779. https://doi.org/10.1109/ICASSP.2019.8683534.
J. Daniel, S. Kitić, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Time domain velocity vector for retracing the multipath propagation, (2020), pp. 421–425. https://doi.org/10.1109/ICASSP40776.2020.9054561.
A. Asaei, M. Golbabaee, H. Bourlard, V. Cevher, Structured sparsity models for reverberant speech separation. IEEE/ACM Trans. Audio Speech Lang. Process. 22(3), 620–633 (2014). https://doi.org/10.1109/TASLP.2013.2297012.
S. Leglaive, R. Badeau, G. Richard, Multichannel audio source separation with probabilistic reverberation priors. IEEE/ACM Trans. Audio Speech Lang. Process. 24(12), 2453–2465 (2016). https://doi.org/10.1109/TASLP.2016.2614140.
R. Scheibler, D. Di Carlo, A. Deleforge, I. Dokmanić, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Separake: source separation with a little help from echoes, (2018), pp. 6897–6901. https://doi.org/10.1109/ICASSP.2018.8461345.
L. Remaggi, P. J. Jackson, W. Wang, Modeling the comb filter effect and interaural coherence for binaural source separation. IEEE/ACM Trans. Audio Speech Lang. Process. 27(12), 2263–2277 (2019). https://doi.org/10.1109/TASLP.2019.2946043.
K. A. Al-Karawi, D. Y. Mohammed, Early reflection detection using autocorrelation to improve robustness of speaker verification in reverberant conditions. Int. J. Speech Technol. 22(4), 1077–1084 (2019). https://doi.org/10.1007/s10772-019-09648-z.
F. Antonacci, J. Filos, M. R. Thomas, E. A. Habets, A. Sarti, P. A. Naylor, S. Tubaro, Inference of room geometry from acoustic impulse responses. IEEE Trans. Audio Speech Lang. Process. 20(10), 2683–2695 (2012). https://doi.org/10.1109/TASL.2012.2210877.
I. Dokmanić, R. Parhizkar, A. Walther, Y. M. Lu, M. Vetterli, Acoustic echoes reveal room shape. Proc. Natl. Acad. Sci. U.S.A. 110(30), 12186–12191 (2013). https://doi.org/10.1073/pnas.1221464110.
M. Crocco, A. Trucco, A. Del Bue, Uncalibrated 3D room geometry estimation from sound impulse responses. J. Frankl. Inst. 354(18), 8678–8709 (2017). https://doi.org/10.1016/j.jfranklin.2017.10.024.
L. Remaggi, P. J. B. Jackson, P. Coleman, W. Wang, Acoustic reflector localization: novel image source reversion and direct localization methods. IEEE/ACM Trans. Audio Speech Lang. Process. 25(2), 296–309 (2017). https://doi.org/10.1109/TASLP.2016.2633802.
I. Szoke, M. Skacel, L. Mosner, J. Paliesek, J. H. Cernocky, Building and evaluation of a real room impulse response dataset. IEEE J. Sel. Top. Signal Process. 13(4), 863–876 (2019). https://doi.org/10.1109/JSTSP.2019.2917582.
A. F. Genovese, H. Gamper, V. Pulkki, N. Raghuvanshi, I. J. Tashev, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Blind room volume estimation from single-channel noisy speech, (2019), pp. 231–235. https://doi.org/10.1109/ICASSP.2019.8682951.
E. Hadad, F. Heese, P. Vary, S. Gannot, in 14th International Workshop on Acoustic Signal Enhancement (IWAENC). Multichannel audio database in various acoustic environments, (2014), pp. 313–317. https://doi.org/10.1109/IWAENC.2014.6954309.
N. Bertin, E. Camberlein, R. Lebarbenchon, E. Vincent, S. Sivasankaran, I. Illina, F. Bimbot, VoiceHome-2, an extended corpus for multichannel speech processing in real homes. Speech Commun. 106, 68–78 (2019). https://doi.org/10.1016/j.specom.2018.11.002.
C. Gaultier, S. Kataria, A. Deleforge, in Lecture Notes in Computer Science, vol. 10169 LNCS. VAST: the virtual acoustic space traveler dataset, (2017), pp. 68–79. https://doi.org/10.1007/978-3-319-53547-0_7.
C. Kim, A. Misra, K. Chin, T. Hughes, A. Narayanan, T. N. Sainath, M. Bacchiani, in Interspeech 2017. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home (ISCA, Stockholm, 2017), pp. 379–383.
L. Perotin, R. Serizel, E. Vincent, A. Guerin, CRNN-based multiple DoA estimation using acoustic intensity features for ambisonics recordings. IEEE J. Sel. Top. Signal Process. 13(1), 22–33 (2019). https://doi.org/10.1109/JSTSP.2019.2900164.
D. Di Carlo, C. Elvira, A. Deleforge, N. Bertin, R. Gribonval, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Blaster: an off-grid method for blind and regularized acoustic echoes retrieval, (2020), pp. 156–160. https://doi.org/10.1109/ICASSP40776.2020.9054647.
S. M. Schimmel, M. F. Muller, N. Dillier, in IEEE International Conference on Acoustics, Speech and Signal Processing. A fast and accurate “shoebox” room acoustics simulator, (2009), pp. 241–244. https://doi.org/10.1109/ICASSP.2009.4959565.
E. A. Habets, Room impulse response generator. Technische Universiteit Eindhoven, Tech. Rep. 2(2.4), 1 (2006).
R. Scheibler, E. Bezzam, I. Dokmanić, in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Pyroomacoustics: a Python package for audio room simulations and array processing algorithms (Calgary, 2018). https://doi.org/10.1109/ICASSP.2018.8461310.
D. Diaz-Guerra, A. Miguel, J. R. Beltran, gpurir: a Python library for room impulse response simulation with GPU acceleration. Multimedia Tools Appl. 80(4), 5653–5671 (2021). https://doi.org/10.1007/s11042-020-09905-3.
J. Čmejla, T. Kounovský, S. Gannot, Z. Koldovský, P. Tandeitnik, in European Signal Processing Conference (EUSIPCO). Mirage: multichannel database of room impulse responses measured on high-resolution cube-shaped grid, (2021), pp. 56–60. https://doi.org/10.23919/Eusipco47968.2020.9287646.
D. B. Paul, J. M. Baker, in Proceedings of the Workshop on Speech and Natural Language. The design for the Wall Street Journal-based CSR corpus (Association for Computational Linguistics, 1992), pp. 357–362. https://doi.org/10.3115/1075527.1075614.
O. Cramer, The variation of the specific heat ratio and the speed of sound in air with temperature, pressure, humidity, and CO2 concentration. J. Acoust. Soc. Am. 93(5), 2510–2516 (1993). https://doi.org/10.1121/1.405827.
A. Farina, Simultaneous Measurement of Impulse Response and Distortion with a Swept-Sine Technique. Journal of The Audio Engineering Society (Audio Engineering Society, New York, 2000).
A. Farina, in Audio Eng. Soc. Convention (AES), 3. Advancements in impulse response measurements by sine sweeps, (2007), pp. 1626–1646.
M. Ravanelli, A. Sosi, P. Svaizer, M. Omologo, in European Signal Processing Conference (EUSIPCO). Impulse response estimation for robust speech recognition in a reverberant environment (IEEE, New York, 2012), pp. 1668–1672.
I. Dokmanić, J. Ranieri, M. Vetterli, in European Signal Processing Conference (EUSIPCO). Relax and unfold: Microphone localization with Euclidean distance matrices (IEEE, New York, 2015), pp. 265–269. https://doi.org/10.1109/EUSIPCO.2015.7362386.
M. Crocco, A. Del Bue, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Estimation of TDOA for room reflections by iterative weighted l1 constraint, (2016), pp. 3201–3205. https://doi.org/10.1109/ICASSP.2016.7472268.
A. Plinge, F. Jacob, R. Haeb-Umbach, G. A. Fink, Acoustic microphone geometry calibration. IEEE Signal Process. Mag., 14–28 (2016). https://doi.org/10.1109/MSP.2016.2555198.
A. Beck, P. Stoica, J. Li, Exact and approximate solutions of source localization problems. IEEE Trans. Signal Process. 56(5), 1770–1778 (2008). https://doi.org/10.1109/TSP.2007.909342.
Y. E. Baba, A. Walther, E. A. P. Habets, 3D room geometry inference based on room impulse response stacks. IEEE/ACM Trans. Audio Speech Lang. Process. 26(5), 857–872 (2018). https://doi.org/10.1109/TASLP.2017.2784298.
J. Eaton, N. D. Gaubitch, A. H. Moore, P. A. Naylor, Estimation of room acoustic parameters: the ACE challenge. IEEE/ACM Trans. Audio Speech Lang. Process. 24, 1681–1693 (2016).
G. Defrance, L. Daudet, J.-D. Polack, Finding the onset of a room impulse response: straightforward? IEEE/ACM Trans. Audio Speech Lang. Process. 124(4), 248–254 (2008).
D. Di Carlo, P. Tandeitnik, C. Foy, N. Bertin, A. Deleforge, S. Gannot, Zenodo (2021). https://doi.org/10.5281/zenodo.4626590.
J. Eaton, N. D. Gaubitch, A. H. Moore, P. A. Naylor, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). The ACE challenge–corpus description and performance evaluation, (2015), pp. 1–5. https://doi.org/10.1109/WASPAA.2015.7336912.
J. M. Eargle, in Handbook of Recording Engineering. Characteristics of performance and recording spaces (Springer, New York, 1996), pp. 57–65.
P. A. Naylor, N. D. Gaubitch, Speech dereverberation (Springer, United Kingdom, 2010).
M. R. Schroeder, New method of measuring reverberation time. J. Acoust. Soc. Am. 37(6), 1187–1188 (1965).
W. T. Chu, Comparison of reverberation measurements using Schroeder’s impulse method and decay-curve averaging method. J. Acoust. Soc. Am. 63(5), 1444–1450 (1978).
N. Xiang, Evaluation of reverberation times using a nonlinear regression approach. J. Acoust. Soc. Am. 98(4), 2112–2121 (1995).
S. Gannot, E. Vincent, S. Markovich-Golan, A. Ozerov, A consolidated perspective on multi-microphone speech enhancement and source separation. IEEE/ACM Trans. Audio Speech Lang. Process. 25(4), 692–730 (2017). https://doi.org/10.1109/TASLP.2016.2647702.
H. L. Van Trees, Optimum array processing: part IV of detection, estimation, and modulation theory (Wiley, United States, 2004).
R. Scheibler, I. Dokmanić, M. Vetterli, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Raking echoes in the time domain, (2015), pp. 554–558. https://doi.org/10.1109/ICASSP.2015.7178030.
H. A. Javed, A. H. Moore, P. A. Naylor, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Spherical microphone array acoustic rake receivers, (2016), pp. 111–115. https://doi.org/10.1109/ICASSP.2016.7471647.
L. Condat, A. Hirabayashi, Cadzow denoising upgraded: a new projection method for the recovery of Dirac pulses from noisy linear measurements. Sampling Theory Signal Image Process. 14(1), 17–47 (2015). https://doi.org/10.1007/BF03549586.
M. Miyoshi, Y. Kaneda, Inverse filtering of room acoustics. IEEE/ACM Trans. Acoust. Speech Signal Process. 36(2), 145–152 (1988). https://doi.org/10.1109/29.1509.
S. Gannot, M. Moonen, Subspace methods for multimicrophone speech dereverberation. EURASIP J. Adv. Signal Process. 2003(11), 1–17 (2003). https://doi.org/10.1155/S1110865703305049.
J. Benesty, J. Chen, Y. Huang, J. Dmochowski, On microphone-array beamforming from a MIMO acoustic signal processing perspective. IEEE Trans. Audio Speech Lang. Process. 15(3), 1053–1065 (2007). https://doi.org/10.1109/TASL.2006.885251.
M. R. Thomas, I. J. Tashev, F. Lim, P. A. Naylor, in International Workshop on Acoustic Signal Enhancement (IWAENC). Optimal beamforming as a time domain equalization problem with application to room acoustics (IEEE, 2014), pp. 75–79. https://doi.org/10.1109/IWAENC.2014.6953341.
I. Kodrasi, S. Doclo, in Hands-free Speech Communications and Microphone Arrays (HSCMA). EVD-based multi-channel dereverberation of a moving speaker using different RETF estimation methods, (2017), pp. 116–120. https://doi.org/10.1109/HSCMA.2017.7895573.
N. Gößling, S. Doclo, in International Workshop on Acoustic Signal Enhancement (IWAENC). Relative transfer function estimation exploiting spatially separated microphones in a diffuse noise field, (2018), pp. 146–150. https://doi.org/10.1109/IWAENC.2018.8521295.
S. Markovich-Golan, S. Gannot, W. Kellermann, in European Signal Processing Conference (EUSIPCO). Performance analysis of the covariance-whitening and the covariance-subtraction methods for estimating the relative transfer function, (2018), pp. 2499–2503. https://doi.org/10.23919/EUSIPCO.2018.8553007.
M. Kuster, Objective sound field analysis based on the coherence estimated from two microphone signals. J. Acoust. Soc. Am. 131(4), 3284–3284 (2012). https://doi.org/10.1121/1.4708280.
O. Schwartz, S. Gannot, E. A. Habets, in 24th European Signal Processing Conference (EUSIPCO). Joint estimation of late reverberant and speech power spectral densities in noisy environments using Frobenius norm, (2016), pp. 1123–1127. https://doi.org/10.1109/EUSIPCO.2016.7760423.
T. H. Falk, C. Zheng, W.-Y. Chan, A non-intrusive quality and intelligibility measure of reverberant and dereverberated speech. IEEE/ACM Trans. Audio Speech Lang. Process. 18(7), 1766–1774 (2010). https://doi.org/10.1109/TASL.2010.2052247.
A. W. Rix, J. G. Beerends, M. P. Hollier, A. P. Hekstra, in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 2. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs, (2001), pp. 749–752. https://doi.org/10.1109/ICASSP.2001.941023.
J. S. Bradley, H. Sato, M. Picard, On the importance of early reflections for speech in rooms. J. Acoust. Soc. Am. 113(6), 3233–3244 (2003). https://doi.org/10.1121/1.1570439.
H. Peic Tukuljac, A. Deleforge, R. Gribonval, in Advances in Neural Information Processing Systems (NeurIPS), 31, ed. by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett. MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval (Curran Associates, Inc., New York, 2018). https://proceedings.neurips.cc/paper/2018/file/c9f95a0a5af052bffce5c89917335f67-Paper.pdf.
M. Crocco, A. Trucco, A. Del Bue, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Room reflector estimation from sound by greedy iterative approach, (2018), pp. 6877–6881. https://doi.org/10.1109/ICASSP.2018.8461640.
S. Tervo, T. Tossavainen, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 3D room geometry estimation from measured impulse responses, (2012), pp. 513–516. https://doi.org/10.1109/ICASSP.2012.6287929.
O. Shih, A. Rowe, in ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN). Can a phone hear the shape of a room?, (2019), pp. 277–288. https://doi.org/10.1145/3302506.3310407.
U. Saqib, S. Gannot, J. R. Jensen, Estimation of acoustic echoes using expectation-maximization methods. EURASIP J. Audio Speech Music (2020). https://doi.org/10.1186/s13636-020-00179-z.